AI Code Vulnerability Scanner

AI Code Vulnerability Scanner — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Data Science and Predictive Analytics

    The first edition of the textbook Data Science and Predictive Analytics: Biomedical and Health Applications using R, authored by Ivo D. Dinov, was published in August 2018 by Springer. The second edition of the book was printed in 2023. This textbook covers some of the core mathematical foundations, computational techniques, and artificial intelligence approaches used in data science research and applications. By using the statistical computing platform R and a broad range of biomedical case-studies, the 23 chapters of the first edition provide explicit examples of importing, exporting, processing, modeling, visualizing, and interpreting large, multivariate, incomplete, heterogeneous, and longitudinal datasets (big data).

    == Structure ==

    === First edition table of contents ===

    The first edition of the Data Science and Predictive Analytics (DSPA) textbook is divided into 23 chapters, each progressively building on the previous content.

    === Second edition table of contents ===

    The significantly reorganized second edition of the book (2023) expands and modernizes the presented mathematical principles, computational methods, data science techniques, model-based machine learning and model-free artificial intelligence algorithms. The 14 chapters of the new edition start with an introduction and progressively build foundational skills to naturally reach biomedical applications of deep learning.
      • Introduction
      • Basic Visualization and Exploratory Data Analytics
      • Linear Algebra, Matrix Computing, and Regression Modeling
      • Linear and Nonlinear Dimensionality Reduction
      • Supervised Classification
      • Black Box Machine Learning Methods
      • Qualitative Learning Methods—Text Mining, Natural Language Processing, and Apriori Association Rules Learning
      • Unsupervised Clustering
      • Model Performance Assessment, Validation, and Improvement
      • Specialized Machine Learning Topics
      • Variable Importance and Feature Selection
      • Big Longitudinal Data Analysis
      • Function Optimization
      • Deep Learning, Neural Networks

    == Reception ==

    The materials in the Data Science and Predictive Analytics (DSPA) textbook have been peer-reviewed in the Journal of the American Statistical Association, International Statistical Institute’s ISI Review Journal, and the Journal of the American Library Association. Many scholarly publications reference the DSPA textbook. As of January 17, 2021, the electronic version of the first edition of the book (ISBN 978-3-319-72347-1) is freely available on SpringerLink and has been downloaded over 6 million times. The textbook is globally available in print (hardcover and softcover) and electronic formats (PDF and EPub) in many college and university libraries and has been used for data science, computational statistics, and analytics classes at various institutions.

  • Hanna Hajishirzi

    Hannaneh Hajishirzi is an Iranian-American computer scientist specializing in natural language processing. She is Torode Family Professor in Computer Science & Engineering in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, head of the H2Lab in the Allen School, and a senior director of natural language processing in the Allen Institute for AI. == Education and career == After a bachelor's degree from the Sharif University of Technology, Hajishirzi completed her Ph.D. in computer science in 2011, at the University of Illinois Urbana-Champaign. Her dissertation, Action-Centered Reasoning for Probabilistic Dynamic Systems, was supervised by Eyal Amir. After postdoctoral research at Disney Research in Pittsburgh, Hajishirzi joined the University of Washington in 2012, as a research scientist in electrical engineering. In 2015 she became a research assistant professor in electrical engineering. She obtained a regular-rank assistant professorship in 2018, at the same time becoming an AI Fellow in the Allen Institute for AI, where she became a senior director of research in 2021. She was promoted to associate professor in 2022 and to full professor in 2025. == Recognition == Hajishirzi was named as a Fellow of the Association for Computational Linguistics in 2025, "for significant contributions to question answering, scientific applications, multimodal artificial intelligence, and fully open language models". == Personal life == Hajishirzi is married to Ali Farhadi, the CEO of the Allen Institute for AI.

  • Dissociated press

    Dissociated press is a parody generator (a computer program that generates nonsensical text). The generated text is based on another text using the Markov chain technique. The name is a play on "Associated Press" and the psychological term dissociation (although word salad is more typical of conditions like aphasia and schizophrenia – which is, however, frequently confused with dissociative identity disorder by laypeople). An implementation of the algorithm is available in Emacs. Another implementation is available as a Perl module in CPAN, Games::Dissociate.

    == The algorithm ==

    The algorithm starts by printing a number of consecutive words (or letters) from the source text. Then it searches the source text for an occurrence of the last few words or letters printed out so far. If multiple occurrences are found, it picks a random one, and proceeds with printing the text following the chosen occurrence. After a predetermined length of text is printed out, the search procedure is repeated for the newly printed ending.

    Because words and phrases tend to appear in specific grammatical contexts, the resulting text usually seems grammatically correct. If the source text is uniform in style, the result appears to be of similar style and subject, and it takes some effort on the reader's side to recognize it as not genuine. Still, the randomness of the assembly process deprives it of any logical flow: the loosely related parts are connected in a nonsensical way, creating a humorously abstract, random result.

    == Examples ==

    Here is a short example of word-based Dissociated Press applied to the Jargon File:

        wart: n. A small, crocky feature that sticks out of an array (C has no checks for this). This is relatively benign and easy to spot if the phrase is bent so as to be not worth paying attention to the medium in question.

    Here is a short example of letter-based Dissociated Press applied to the same source:

        window sysIWYG: n. A bit was named aften /bee´t@/ prefer to use the other guy's re, especially in every cast a chuckle on neithout getting into useful informash speech makes removing a featuring a move or usage actual abstractionsidered interj. Indeed spectace logic or problem!

    == History ==

    The dissociated press algorithm is described in HAKMEM (1972) Item #176. The name "dissociated press" is first known to have been associated with the Emacs implementation.

    Brian Hayes discussed a Travesty algorithm in Scientific American in November 1983. The article provided a garbled William Faulkner passage:

        When he got on the table, he come in. He never come out of my own pocket as a measure of protecting the company against riot and bloodshed. And when he said. "You tell me a bus ticket, let alone write out no case histories. Then the law come back with a knife!"

    Hugh Kenner and Joseph O'Rourke of Johns Hopkins University discussed their frequency table-based Travesty generator for microcomputers in BYTE in November 1984. The article included the Turbo Pascal source for two versions of the generator, one using Hayes' algorithm and another using Claude Shannon's Hellbat algorithm. Murray Lesser offered a compiled BASIC version in the magazine in July 1985. In September 1985 Peter Wayner offered a version that used tree data structures instead of frequency tables, and in December 1985 Neil J. Rubenking offered a version written in Turbo Pascal that stored frequency information in a B-tree.
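
    The word-based procedure described under "The algorithm" can be sketched in a few lines of Python. This is only an illustrative sketch, not the Emacs or Games::Dissociate implementation: the function name `dissociated_press` and the parameters `context` (how many trailing words are searched for), `run` (how many words are copied after each jump), `length`, and `seed` are all assumptions introduced here.

```python
import random


def dissociated_press(source, context=2, run=3, length=40, seed=None):
    """Word-based dissociated-press sketch (hypothetical helper).

    Repeatedly looks up the last `context` words printed so far in the
    source, jumps to a random occurrence, and copies the next `run` words.
    """
    rng = random.Random(seed)
    words = source.split()
    if len(words) <= context:
        return " ".join(words)

    # Map each run of `context` consecutive words to the positions
    # in the source that immediately follow an occurrence of that run.
    followers = {}
    for i in range(len(words) - context):
        key = tuple(words[i:i + context])
        followers.setdefault(key, []).append(i + context)

    # Start by printing `context` consecutive words from the source.
    start = rng.randrange(len(words) - context)
    output = words[start:start + context]

    while len(output) < length:
        positions = followers.get(tuple(output[-context:]))
        if not positions:
            break  # the current ending only occurs at the very end of the source
        pos = rng.choice(positions)          # pick a random occurrence
        output.extend(words[pos:pos + run])  # print the text that follows it

    return " ".join(output[:length])


sample = ("the cat sat on the mat and the cat ran to the mat "
          "and the dog sat on the rug")
print(dissociated_press(sample, context=1, run=2, length=20, seed=0))
```

    Because every jump lands on a real occurrence of the current ending, each window of `context + 1` adjacent output words also appears somewhere in the source, which is why the result tends to look locally grammatical while lacking any overall logic.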

  • Andrew McCallum

    Andrew McCallum is an American professor in the computer science department at the University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and social network analysis.

    == Career ==

    McCallum graduated summa cum laude from Dartmouth College in 1989. He completed his Ph.D. at the University of Rochester in 1995 under the supervision of Dana H. Ballard. McCallum was then a postdoctoral fellow, working with Sebastian Thrun and Tom M. Mitchell at Carnegie Mellon University. From 1998 to 2000, he was a Research Scientist and Research Coordinator at Justsystem Pittsburgh Research Center. From 2000 to 2002, he was Vice President of Research and Development at WhizBang Labs, and Director of its Pittsburgh office. Since 2002, he has worked as a professor of computer science at the University of Massachusetts Amherst. In 2020, he also joined Google as a part-time research scientist.

    He was elected as a fellow of the Association for the Advancement of Artificial Intelligence in 2009, and as a fellow of the Association for Computing Machinery in 2017. From 2014 to 2017, he was the President of the International Machine Learning Society (IMLS), which organizes the International Conference on Machine Learning. He is also the director of the Center for Data Science at UMass, leading a new partnership with the Chan Zuckerberg Initiative. In 2018, the initiative made an initial grant of $5.5 million to the center, supporting research to facilitate new ways for scientists to explore and discover research articles.

    == Main contributions ==

    In collaboration with John D. Lafferty and Fernando Pereira, McCallum developed conditional random fields, first described in a paper presented at the International Conference on Machine Learning (ICML). In 2011 this research paper won the ICML "Test of Time" (10-year best paper) award.
    McCallum has written several widely used open-source software toolkits for machine learning, natural language processing and other text processing, including Rainbow, Mallet, and FACTORIE. In addition, he was instrumental in publishing the Enron Corpus, a large collection of emails that has been used as a basis for a number of academic studies of social networking and language. McCallum instigated and directs the nonprofit project OpenReview.net, an online platform that aims to promote openness in scientific communication, particularly the peer review process, by providing a flexible cloud-based web interface and underlying database API.

  • WaveMaker

    WaveMaker is a Java-based low-code development platform designed for building software applications and platforms. The company, WaveMaker Inc., is based in Mountain View, California. The platform is intended to assist enterprises in speeding up their application development and IT modernization initiatives through low-code capabilities. Additionally, for independent software vendors (ISVs), WaveMaker serves as a customizable low-code component that integrates into their products.

    The WaveMaker Platform is a licensed software platform allowing organizations to establish their own end-to-end application platform-as-a-service (PaaS) for the creation and operation of custom apps. It allows developers and business users to create apps that are customizable. These applications can consume APIs, visualize data, and automatically adapt to multi-device responsive interfaces.

    WaveMaker's low-code platform allows organizations to deploy applications on either public or private cloud infrastructure. Containers can be deployed on top of virtual machines or directly on bare metal. The software features a graphical user interface (GUI) console for managing IT app infrastructure, leveraging the capabilities of Docker containerization.

    The solution offers functionalities for automating application deployment, managing the application lifecycle, overseeing release management, and controlling deployment workflows and access permissions:

      • Apps for web, tablet, and smartphone interfaces
      • Enterprise technologies like Java, Hibernate, Spring, AngularJS, JQuery
      • Docker-provided APIs and CLI
      • Software stack packaging, container provisioning, stack and app upgrading, replication, and fault tolerance

    == WaveMaker Studio ==

    WaveMaker RAD Platform is built around WaveMaker Studio, a WYSIWYG rapid development tool that allows business users to compose an application using a drag-and-drop method.
    WaveMaker Studio supports rapid application development (RAD) for the web, similar to what products like PowerBuilder and Lotus Notes provided for client-server computing. WaveMaker Studio allows developers to produce an application once, then automatically adjust it for a particular target platform, whether a PC, mobile phone, or tablet. Applications created using WaveMaker Studio follow a model–view–controller architecture.

    WaveMaker Studio has been downloaded more than two million times. The Studio community consists of 30,000 registered users. Applications generated by WaveMaker Studio are licensed under the Apache license.

    Studio 8 was released on September 25, 2015. The prior version, Studio 7, marked some notable development milestones. It was based on the AngularJS framework, whereas previous Studio versions (6.7, 6.6, 6.5) used the Dojo Toolkit.

    Some of the features of WaveMaker Studio 7 include:

      • Automatic generation of Hibernate mapping and Hibernate queries from database schema import.
      • Automatic creation of Enterprise Data Widgets based on schema import. Each widget can display data from a database table as a grid or edit form. The edit form implements create, update, and delete functions automatically.
      • WYSIWYG Ajax development studio that runs in a browser.
      • Deployment to Tomcat, IBM WebSphere, Weblogic, JBoss.
      • Mashup tool to assemble web applications based on SOAP, REST and RSS web services, Java Services and databases.
      • Support for existing CSS, HTML and Java code.
      • The ability to deploy a standard Java .war file.

    == Technologies and frameworks ==

    WaveMaker allows users to build applications that run on an "Open Systems Stack" based on the following technologies and frameworks: AngularJS, Bootstrap, NVD3, HTML, CSS, Apache Cordova, Hibernate, Spring, Spring Security, Java.
    The various supported integrations include:

      • Databases: Oracle, MySQL, Microsoft SQL Server, PostgreSQL, IBM DB2, HSQLDB
      • Authentication: LDAP, Active Directory, CAS, Custom Java Service, Database
      • Version Control: Bitbucket (or Stash), GitHub, Apache Subversion
      • Deployment: Amazon AWS, Microsoft Azure, WaveMaker Private Cloud (Docker containerization), IBM WebSphere, Apache Tomcat, SpringSource tcServer, Oracle WebLogic Server, JBoss (WildFly), GlassFish
      • App Stores: Google Play, Apple App Store, Windows Store

    == History ==

    In 2003, WaveMaker was founded as ActiveGrid. Then, in 2007, it was rebranded as WaveMaker. It was acquired by VMware in 2011. In March 2013, support for the WaveMaker project was discontinued. In May 2013, Pramati Technologies acquired the assets of WaveMaker. In February 2014, WaveMaker Studio 6.7 was released, which was the last open source version of Studio. In September 2014 WaveMaker Inc. launched the WaveMaker RAD Platform, which allowed organizations to run their own application platform for building and running apps. In March 2023, WaveMaker released version 11.5, which includes enhanced low-code development capabilities and new AI-driven tools to streamline the application development process.

  • Mehryar Mohri

    Mehryar Mohri is a professor and theoretical computer scientist at the Courant Institute of Mathematical Sciences. He is also heading the Machine Learning Theory (ML Theory) team at Google Research. == Career == Prior to joining the Courant Institute, Mohri was a research department head and later technology leader at AT&T Bell Labs, where he was a member of the technical staff for about ten years. Mohri has also taught as an assistant professor at the University of Paris 7 (1992-1993) and Ecole Polytechnique (1992-1994). == Research == Mohri's main area of research is machine learning, in particular learning theory. He is also an expert in automata theory and algorithms. He is the author of several core algorithms that have served as the foundation for the design of many deployed speech recognition and natural language processing systems. == Publications == Mohri is the author of the reference book Foundations of Machine Learning used as a textbook in many graduate-level machine learning courses. Mohri is also a member of the Lothaire group of mathematicians with the pseudonym M. Lothaire and contributed to the book on Applied Combinatorics on Words. He is the author of more than 250 conference and journal publications. == Organizational affiliations == Mohri is currently the President of the Association for Algorithmic Learning Theory (AALT) and the Steering Committee Chair for the ALT conference. He is also Editorial Board member of Machine Learning and TheoretiCS, Action Editor of the Journal of Machine Learning Research (JMLR) and a member of the advisory board for the Journal of Automata, Languages and Combinatorics.

  • Tf–idf

    In information retrieval, tf–idf (term frequency–inverse document frequency, TF*IDF, TFIDF, TF–IDF, or Tf–idf) is a measure of importance of a word to a document in a collection or corpus, adjusted for the fact that some words appear more frequently in general. Like the bag-of-words model, it models a document as a multiset of words, without word order. It is a refinement over the simple bag-of-words model, by allowing the weight of words to depend on the rest of the corpus.

    It was often used as a weighting factor in searches of information retrieval, text mining, and user modeling. A survey conducted in 2015 showed that 83% of text-based recommender systems in digital libraries used tf–idf. Variations of the tf–idf weighting scheme were often used by search engines as a central tool in scoring and ranking a document's relevance given a user query.

    One of the simplest ranking functions is computed by summing the tf–idf for each query term; many more sophisticated ranking functions are variants of this simple model.

    == Motivations ==

    Karen Spärck Jones (1972) conceived a statistical interpretation of term-specificity called Inverse Document Frequency (idf), which became a cornerstone of term weighting: the specificity of a term can be quantified as an inverse function of the number of documents in which it occurs.

    For example, consider the df (document frequency) and idf of some words in Shakespeare's 37 plays. "Romeo", "Falstaff", and "salad" appear in very few plays, so seeing these words, one could get a good idea as to which play it might be. In contrast, "good" and "sweet" appear in every play and are completely uninformative as to which play it is.

    == Definition ==

    The tf–idf is the product of two statistics, term frequency and inverse document frequency. There are various ways for determining the exact values of both statistics.
    The resulting formula aims to define the importance of a keyword or phrase within a document or a web page.

    === Term frequency ===

    Term frequency, $\mathrm{tf}(t,d)$, is the relative frequency of term $t$ within document $d$,

    $$\mathrm{tf}(t,d) = \frac{f_{t,d}}{\sum_{t' \in d} f_{t',d}},$$

    where $f_{t,d}$ is the raw count of a term in a document, i.e., the number of times that term $t$ occurs in document $d$. Note the denominator is simply the total number of terms in document $d$ (counting each occurrence of the same term separately). There are various other ways to define term frequency:

      • the raw count itself: $\mathrm{tf}(t,d) = f_{t,d}$;
      • Boolean "frequencies": $\mathrm{tf}(t,d) = 1$ if $t$ occurs in $d$ and 0 otherwise;
      • logarithmically scaled frequency: $\mathrm{tf}(t,d) = \log(1 + f_{t,d})$;
      • augmented frequency, to prevent a bias towards longer documents, e.g. raw frequency divided by the raw frequency of the most frequently occurring term in the document:

        $$\mathrm{tf}(t,d) = 0.5 + 0.5 \cdot \frac{f_{t,d}}{\max\{f_{t',d} : t' \in d\}}$$

    === Inverse document frequency ===

    The inverse document frequency is a measure of how much information the word provides, i.e., how common or rare it is across all documents. It is the logarithmically scaled inverse fraction of the documents that contain the word (obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of that quotient):

    $$\mathrm{idf}(t,D) = \log \frac{N}{n_t}$$

    with

      • $D$: the set of all documents in the corpus;
      • $N = |D|$: total number of documents in the corpus;
      • $n_t = |\{d \in D : t \in d\}|$: number of documents where the term $t$ appears (i.e., $\mathrm{tf}(t,d) \neq 0$).
    If the term is not in the corpus, this will lead to a division-by-zero. It is therefore common to adjust the numerator to $1 + N$ and the denominator to $1 + |\{d \in D : t \in d\}|$.

    === Term frequency–inverse document frequency ===

    Then tf–idf is calculated as

    $$\mathrm{tfidf}(t,d,D) = \mathrm{tf}(t,d) \cdot \mathrm{idf}(t,D)$$

    A high weight in tf–idf is reached by a high term frequency (in the given document) and a low document frequency of the term in the whole collection of documents; the weights hence tend to filter out common terms. Since the ratio inside the idf's log function is always greater than or equal to 1, the value of idf (and tf–idf) is greater than or equal to 0. As a term appears in more documents, the ratio inside the logarithm approaches 1, bringing the idf and tf–idf closer to 0.

    == Justification of idf ==

    Idf was introduced as "term specificity" by Karen Spärck Jones in a 1972 paper. Although it has worked well as a heuristic, its theoretical foundations have been troublesome for at least three decades afterward, with many researchers trying to find information theoretic justifications for it.

    Spärck Jones's own explanation did not propose much theory, aside from a connection to Zipf's law. Attempts have been made to put idf on a probabilistic footing, by estimating the probability that a given document $d$ contains a term $t$ as the relative document frequency,

    $$P(t|D) = \frac{|\{d \in D : t \in d\}|}{N},$$

    so that we can define idf as

    $$\begin{aligned}\mathrm{idf} &= -\log P(t|D)\\ &= \log \frac{1}{P(t|D)}\\ &= \log \frac{N}{|\{d \in D : t \in d\}|}\end{aligned}$$

    Namely, the inverse document frequency is the logarithm of "inverse" relative document frequency.
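
    As a hedged illustration of the definitions above, the following Python sketch computes the relative term frequency, the inverse document frequency (with the optional 1 + N over 1 + n_t adjustment), their product, and the simple ranking function that sums tf–idf over the query terms. The function names `tf`, `idf`, `tfidf`, and `score`, the `smooth` flag, and the two-document toy corpus are assumptions made for this example; documents are assumed to be pre-tokenized lists of words and the logarithm is natural.

```python
import math
from collections import Counter


def tf(term, doc):
    """Relative frequency of `term` in `doc` (a list of tokens)."""
    if not doc:
        return 0.0
    return Counter(doc)[term] / len(doc)


def idf(term, corpus, smooth=False):
    """log(N / n_t); with smooth=True, log((1 + N) / (1 + n_t))."""
    n_docs = len(corpus)
    n_t = sum(1 for doc in corpus if term in doc)
    if smooth:
        return math.log((1 + n_docs) / (1 + n_t))
    if n_t == 0:
        # Unsmoothed idf is undefined for a term absent from the corpus.
        raise ValueError(f"{term!r} occurs in no document; use smooth=True")
    return math.log(n_docs / n_t)


def tfidf(term, doc, corpus, smooth=False):
    """Product of term frequency and inverse document frequency."""
    return tf(term, doc) * idf(term, corpus, smooth)


def score(query, doc, corpus):
    """Simple ranking function: sum of smoothed tf-idf over query terms."""
    return sum(tfidf(t, doc, corpus, smooth=True) for t in query)


# Toy corpus (hypothetical data for illustration only).
corpus = [
    ["this", "is", "a", "a", "sample"],
    ["this", "is", "another", "another", "example", "example", "example"],
]
print(tfidf("example", corpus[1], corpus))  # (3/7) * ln 2, about 0.297
print(tfidf("this", corpus[0], corpus))     # 0.0: "this" is in every document
```

    In the toy corpus, "example" occurs three times among the seven tokens of the second document and in one of two documents overall, so its weight is (3/7) · ln 2, while "this" appears in both documents and gets weight 0.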
    This probabilistic interpretation in turn takes the same form as that of self-information. However, applying such information-theoretic notions to problems in information retrieval leads to problems when trying to define the appropriate event spaces for the required probability distributions: not only documents need to be taken into account, but also queries and terms.

    == Link with information theory ==

    Both term frequency and inverse document frequency can be formulated in terms of information theory; it helps to understand why their product has a meaning in terms of joint informational content of a document. A characteristic assumption about the distribution $p(d,t)$ is that:

    $$p(d|t) = \frac{1}{|\{d \in D : t \in d\}|}$$

    This assumption and its implications, according to Aizawa, "represent the heuristic that tf–idf employs."

    The conditional entropy of a "randomly chosen" document in the corpus $D$, conditional on the fact that it contains a specific term $t$ (and assuming that all documents have equal probability to be chosen) is:

    $$H(\mathcal{D} \mid \mathcal{T} = t) = -\sum_{d} p_{d|t} \log p_{d|t} = -\log \frac{1}{|\{d \in D : t \in d\}|} = \log \frac{|\{d \in D : t \in d\}|}{|D|} + \log |D| = -\mathrm{idf}(t) + \log |D|$$

    In terms of notation, $\mathcal{D}$ and $\mathcal{T}$ are "random variables" corresponding respectively to drawing a document or a term.
    The mutual information can be expressed as

    $$M(\mathcal{T};\mathcal{D}) = H(\mathcal{D}) - H(\mathcal{D} \mid \mathcal{T}) = \sum_{t} p_t \cdot \bigl(H(\mathcal{D}) - H(\mathcal{D} \mid \mathcal{T} = t)\bigr) = \sum_{t} p_t \cdot \mathrm{idf}(t)$$

    The last step is to expand $p_t$, the unconditional probability to draw a term, with respect to the (random) choice of a document, to obtain:

    $$M(\mathcal{T};\mathcal{D}) = \sum_{t,d} p_{t|d} \cdot p_d \cdot \mathrm{idf}(t) = \sum_{t,d} \mathrm{tf}(t,d) \cdot \frac{1}{|D|} \cdot \mathrm{idf}(t) = \frac{1}{|D|} \sum_{t,d} \mathrm{tf}(t,d) \cdot \mathrm{idf}(t).$$

    This expression shows that summing the tf–idf of all possible terms and documents recovers the mutual information between documents and terms, taking into account all the specificities of their joint distribution. Each tf–idf hence carries the "bit of information" attached to a term × document pair.

    == Link with statistical theory ==

    Tf–idf is closely related to the negative logarithmically transformed p-value from a one-tailed formulation of Fisher's exact test when the underlying corpus documents satisfy certain idealized assumptions. More recently, tf–idf variants were shown to arise as components in the test st

  • AI Code Generators Reviews: What Actually Works in 2026

    Trying to pick the best AI code generator? An AI code generator is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI code generator slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

  • Stanza Living

    Stanza Living is the common brand name for Dtwelve Spaces Private Limited. It provides fully-managed shared living accommodations to students and young professionals. Founded by Anindya Dutta and Sandeep Dalmia, the company is present across 23 cities in India, including Delhi, NCR, Bangalore, Visakhapatnam, Hyderabad, Chennai, Coimbatore, Indore, Pune, Baroda, Vijayawada, Dehradun, and Kota, with a capacity of 70,000 beds.

    Stanza Living is a technology-enabled housing concept which provides fully-furnished residences with amenities like meals, internet, laundry services, housekeeping, security and community engagement programmes. The company has an asset-light business model under which it engages in long-term lease agreements with property owners/developers, who convert their assets into shared living residences as per company guidelines. These assets are subsequently operated by Stanza Living.

    == Industry background ==

    A report by Cushman & Wakefield (C&W) titled 'Exploring the Student Housing Universe in India City Insights' estimates that there were over 9.08 million migrant student enrolments in India's higher educational institutions (HEIs) for the year 2018-19 who need quality accommodation facilities. According to the report, Delhi-NCR, Mumbai, and Pune are the three biggest markets for student housing in the country, and these cities require an additional 4.75 lakh beds from organized co-living operators to meet the current demand.

    == History ==

    Stanza Living provides tech-enabled, fully managed community living facilities for students and working professionals. The company was launched as a student housing business in Delhi NCR with a capacity of 100 beds, and grew to 14 cities by 2019. By early 2020, the company began catering to working professionals as well. The company has a combined inventory of 70,000 beds under management for both students and working professionals. Stanza Living is currently valued at $300 million.
    It has raised about $70 million in capital from leading global investors like Falcon Edge Capital, Sequoia Capital, Matrix Partners and Accel Partners. Its funding rounds were:

      • November 2017 – Seed funding
      • September 2018 – Series A
      • March 2019 – Debt financing
      • July 2019 – Series C round
      • December 2019 – Debt financing

    The company has invested in building technology products for business efficiency and consumer experience, like the Stanza Resident App and Stanza Real Estate App. Stanza Living has close to 1,500 employees across India. It was recognized among the Top Real Estate Tech Startups of 2020 across the globe by research and analysis company Tracxn. The company was shortlisted among the Top 25 Start-ups of India in 2019 by LinkedIn.

    == Founders ==

    Stanza Living was co-founded by Anindya Dutta and Sandeep Dalmia. Sandeep Dalmia is an alumnus of Delhi College of Engineering and IIM Ahmedabad. Prior to Stanza, he was a Principal at Boston Consulting Group, working across India, US and South East Asia markets. Anindya Dutta was previously a real estate investor with Oaktree Capital and, prior to that, worked at Goldman Sachs in London. He is an alumnus of IIT Kharagpur and IIM Ahmedabad.

  • Roni Rosenfeld

    Roni Rosenfeld (Hebrew: רוני רוזנפלד) is an Israeli-American computer scientist and computational epidemiologist, currently serving as the head of the Machine Learning Department at Carnegie Mellon University. He is an international expert in machine learning, infectious disease forecasting, statistical language modeling and artificial intelligence. == Education == Rosenfeld received his B.Sc. in mathematics and physics from Tel Aviv University in 1985. He received his Ph.D. in computer science from Carnegie Mellon University in 1994. While a graduate student, he developed and open-sourced a statistical language-modeling toolkit that lets anyone build statistical language models from their own corpora and experiment with and extend the toolkit's capabilities. The toolkit has been used by more than 100 NLP laboratories in more than 20 countries. Rosenfeld's Ph.D. thesis, A Maximum Entropy Approach to Adaptive Statistical Language Modeling, was advised by Raj Reddy and Xuedong Huang and won the 2001 Computer, Speech and Language award for "Most Influential Paper in the Last 5 Years." == Career == Shortly after receiving his Ph.D., Rosenfeld joined the faculty of the Carnegie Mellon School of Computer Science as an assistant professor. He was promoted to associate professor in 1999 and received tenure in 2001. In 2005 he was promoted to professor of language technologies, machine learning, computer science and computational biology in the School of Computer Science at Carnegie Mellon University. Rosenfeld also holds adjunct appointments in the Department of Computational and Systems Biology at the University of Pittsburgh School of Medicine. From 2002 to 2003, Rosenfeld was a visiting professor at the University of Hong Kong. Rosenfeld is the director of Carnegie Mellon's Machine Learning for Social Good (ML4SG) program. He has held educational leadership positions in a variety of programs, including the M.S. in computational finance (1997–1999), graduate computational and statistical learning (2001–2003), the M.S. in machine learning (2017) and the undergraduate minor in machine learning. Rosenfeld was appointed Head of Carnegie Mellon's Machine Learning Department in 2018. == Research == Rosenfeld's research interests include epidemiological forecasting, information and communication technologies for development (ICT4D), and machine learning for social good. === Epidemiological forecasting === Rosenfeld is a world expert in epidemiological forecasting. He founded and directs the Delphi research group, which has won most of the epidemiological forecasting challenges organized by the U.S. CDC and other U.S. government agencies. In December 2016, the CDC named his group the "Most Accurate Forecaster" for 2015–2016, and in October 2017, the Delphi group's two systems took the top two spots in the 2016–2017 flu forecasting challenge. The CDC recognized Rosenfeld's Delphi group at Carnegie Mellon University as having contributed the most accurate national-, regional-, and state-level influenza-like illness forecasts and national-level hospitalization forecasts. In 2019, the CDC recognized forecasts provided by the Delphi group at Carnegie Mellon as having been the most accurate for five seasons in a row, and named the Delphi group an Influenza Forecasting Center of Excellence, a five-year designation that includes $3 million in research funding. Rosenfeld describes his forecasting research goal as "to make epidemiological forecasting as universally accepted and useful as weather forecasting is today." His recent work in the area has focused on selecting high-value epidemiological forecasting targets (e.g., influenza and dengue); creating baseline forecasting methods for them; establishing metrics for measuring and tracking forecasting accuracy; estimating the limits of forecastability for each target; and identifying new sources of data that could be helpful to the forecasting goal. == Honors and awards == 2017 Joel and Ruth Spira Teaching Award; 2017 CDC Influenza Forecasting Challenge "Most Accurate Forecaster"; 1992 Allen Newell Medal for Research Excellence.

  • Karen Spärck Jones


    Karen Ida Boalth Spärck Jones (26 August 1935 – 4 April 2007) was a self-taught programmer and a pioneering British computer and information scientist responsible for the concept of inverse document frequency (IDF), a technology that underlies most modern search engines. She was an advocate for women in computer science, her slogan being, "Computing is too important to be left to men." In 2019, The New York Times published her belated obituary in its series Overlooked, calling her "a pioneer of computer science for work combining statistics and linguistics, and an advocate for women in the field." From 2008, to recognise her achievements in the fields of information retrieval (IR) and natural language processing (NLP), the Karen Spärck Jones Award is awarded annually to a recipient for outstanding research in one or both of her fields. == Early life and education == Karen Ida Boalth Spärck Jones was born in Huddersfield, Yorkshire, England. Her parents were Alfred Owen Jones, a chemistry lecturer, and Ida Spärck, a Norwegian who worked for the Norwegian government while in exile in London during World War II. Spärck Jones was educated at a grammar school in Huddersfield and then from 1953 to 1956 at Girton College, Cambridge, studying history, with an additional final year in Moral Sciences (philosophy). While at Cambridge, Spärck Jones joined the organisation known as the Cambridge Language Research Unit (CLRU) and met the head of CLRU Margaret Masterman, who would inspire her to go into computer science. While working at the CLRU, Spärck Jones began pursuing her PhD. At the time of submission, her PhD thesis was cast aside as uninspired and lacking original thought, but was later published in its entirety as a book. She briefly became a school teacher before moving into computer science. Spärck Jones married fellow Cambridge computer scientist Roger Needham in 1958. 
Spärck Jones's mother, Ida Spärck, had fled Norway on one of the last boats out after the German invasion in April 1940, going on to serve the Norwegian government in exile in London throughout the war. This background of displacement and resilience shaped the household in which Spärck Jones grew up. She later kept her mother's Norwegian surname professionally after marrying, stating that "it maintains a permanent existence of your own." Spärck Jones described her entry into computing as almost accidental. She had been working as a schoolteacher when she began visiting the CLRU out of curiosity about her husband's work. It was Margaret Masterman — whom she later described as "a very strange and interesting woman" — who offered her a research position and drew her fully into the field. == Career == Spärck Jones worked at the Cambridge Language Research Unit from the late 1950s, then at Cambridge University Computer Laboratory from 1974 until her retirement in 2002. From 1999, she held the post of Professor of Computers and Information. She had been given a permanent position only in 1993, and earlier in her career had been employed on a series of short-term contracts. She continued to work in the Computer Laboratory until shortly before her death. Her publications include nine books and numerous papers. A full list of her publications is available from the Cambridge Computer Laboratory. Spärck Jones' main research interests, since the late 1950s, were natural language processing and information retrieval. In 1964, Spärck Jones published "Synonymy and Semantic Classification", which is now seen as a foundational paper in the field of natural language processing. One of her most important contributions was the concept of inverse document frequency (IDF) weighting in information retrieval, which she introduced in a 1972 paper. IDF is used in most search engines today, usually as part of the term frequency–inverse document frequency (TF–IDF) weighting scheme. 
In the 1980s, Spärck Jones began her work on early speech recognition systems. In 1982 she became involved in the Alvey Programme which was an initiative to motivate more computer science research across the country. == Significance of inverse document frequency == At the time Spärck Jones was working, most computer scientists were focused on making people adapt to machines — learning precise codes and commands to retrieve information. Spärck Jones was working in the opposite direction: teaching computers to understand human language as it is actually used. Her 1972 paper introduced the concept of inverse document frequency (IDF) by observing that not all words carry equal informational value. A word like "the" appears in virtually every document and tells a retrieval system almost nothing about what any specific document is about. A rare word like "photosynthesis," by contrast, is highly specific and informative. IDF assigns each word a statistical weight based on how rarely it occurs across a document collection — the rarer the word, the higher its weight. When combined with term frequency (TF), which measures how often a word appears within a single document, the resulting TF–IDF score gives every word a relevance rating that can be used to rank documents in response to a search query. By 2007, Spärck Jones noted that "pretty much every web engine uses those principles." Her colleague John Tait remarked that "a lot of the stuff she was working on until five or ten years ago seemed like mad nonsense, and now we take it for granted." The 1972 paper remains among the most cited works in information retrieval research, with over 4,500 citations recorded in Google Scholar at the time of her death. The conceptual foundation of TF–IDF — that word meaning is statistical and contextual — has also informed later developments in machine learning and natural language processing, including transformer-based language models such as BERT. 
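The weighting idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the passage gives no formula, so the sketch uses the common textbook form `idf = log(N / df)`, naive whitespace tokenization, and three made-up toy documents; the function names `idf` and `tf_idf` are hypothetical and not taken from any library.

```python
import math
from collections import Counter

def idf(docs):
    """Inverse document frequency, assumed here as log(N / df) per word."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each word once per document
    return {word: math.log(n_docs / count) for word, count in df.items()}

def tf_idf(doc, idf_weights):
    """TF-IDF score for each word of one tokenized document."""
    tf = Counter(doc)  # term frequency: raw count within this document
    return {word: tf[word] * idf_weights[word] for word in tf}

# Toy collection (hypothetical data, for illustration only)
docs = [
    "the plant uses photosynthesis".split(),
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
]
weights = idf(docs)
scores = tf_idf(docs[0], weights)

# "the" occurs in every document, so its weight is log(3/3) = 0;
# "photosynthesis" occurs in one document, so its weight is log(3).
for word in ("the", "cat", "photosynthesis"):
    print(word, round(weights[word], 3))
```

Under this form a word that appears in every document contributes nothing to a ranking, while a word confined to one document gets the largest weight. Production systems usually add smoothing and normalization, which this sketch omits.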
== Impact on artificial intelligence == Although Spärck Jones was rather pessimistic about artificial intelligence (AI), particularly about its perceived limitations in information retrieval, her work in natural language processing and information retrieval, including the concept of inverse document frequency (IDF), contributed to the later technological development of AI. Her statistical and ranking methods helped shift AI development towards approaches that are more scalable and data-driven. Her impact on AI was indirect and conceptual, in contrast to the direct and continuing impact her work has had on search engines. == Gender and advocacy == Spärck Jones spent the majority of her career at Cambridge on short-term contracts without permanent employment, a situation she attributed directly to gender. In her 2001 IEEE oral history interview she stated that Cambridge was "in many ways not user-friendly, in the sense of women-friendly." She was frequently the only woman present in professional meetings throughout her career. She channelled this experience into active advocacy. She was a founding member of the women@cl network at Cambridge's Computer Laboratory, worked on outreach programmes aimed at encouraging girls into computing, and became widely known for her slogan: "Computing is too important to be left to men." She was the first woman ever to receive the BCS Lovelace Medal. == Honours and awards == These include: Gerard Salton Award (1988); elected a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) in 1993; President of the Association for Computational Linguistics (ACL) in 1994; honorary degree of Doctor of Science from The City University in 1997; elected a Fellow of the British Academy (FBA), where she also served as Vice-President in 2000–2002; Fellow of the European Association for Artificial Intelligence (ECCAI); Association for Information Science and Technology (ASIS&T) Award of Merit (2002); Association for Computational Linguistics (ACL) Lifetime Achievement Award (2004); ACM-AAAI Allen Newell Award (2006); BCS Lovelace Medal (2007); Association for Computing Machinery (ACM) Women's Group Athena Award (2007). == Death and legacy == Spärck Jones died of cancer on 4 April 2007, at the age of 71. In 2008, the BCS Information Retrieval Specialist Group (BCS IRSG), in conjunction with the British Computer Society, established an annual Karen Spärck Jones Award in her honour, to encourage and promote research that advances understanding of natural language processing or information retrieval. The Karen Spärck Jones lecture, sponsored by BCS, recognises the contribution that women have made to computing. In August 2017, the University of Huddersfield renamed one of its campus buildings in her honour. Formerly known as Canalside West, the Spärck Jones building houses the University's School of Computing and Engineering. When Spärck Jones died in 2007, The Times did not publish an obituary for her, despite having published one for her husband Roger Needham in 2003. In 2019, The New York Times included her in its Overlooked series under the title "Ove

  • Robert Wilensky


    Robert Wilensky (26 March 1951 – 15 March 2013) was an American computer scientist and professor at the UC Berkeley School of Information whose research focused mainly on artificial intelligence. == Academic career == In 1971, Wilensky received his bachelor's degree in mathematics from Yale University, and in 1978, a Ph.D. in computer science from the same institution. After finishing his thesis, "Understanding Goal-Based Stories", Wilensky joined the faculty of the EECS Department at UC Berkeley. In 1986, he was the doctoral advisor of Peter Norvig, who later co-authored the standard textbook of the field, Artificial Intelligence: A Modern Approach. From 1993 to 1997, Wilensky was the Berkeley Computer Science Division Chair. During this time, he also served as director of the Berkeley Cognitive Science Program, director of the Berkeley Artificial Intelligence Research Project, and board member of the International Computer Science Institute. In 1997, he became a fellow of the Association for Computing Machinery "for research contributions to the areas of natural language processing and digital libraries as well as outstanding leadership in Computer Science." He was also a Fellow of the Association for the Advancement of Artificial Intelligence. He retired from the faculty in 2007 and died on Friday, March 15, 2013, of a bacterial infection at the Alta Bates Summit Medical Center. Wilensky was married to Ann Danforth and was survived by her and their two children, Avi and Eli Wilensky. == Research == Throughout his career, Wilensky authored and co-authored over 60 scholarly articles and technical reports on AI, natural language processing, and information dissemination. In addition to his numerous technical publications, Wilensky published two books on the programming language LISP, LISPcraft and Common LISPcraft, and had almost completed another book manuscript when he suffered a cardiac arrest and stopped writing. Among his publications are: R. Wilensky, Common LISPcraft, W. W. Norton & Company, 1986, ISBN 9780393955446; T. A. Phelps and R. Wilensky, "Toward active, extensible, networked documents: Multivalent architecture and applications," in Proc. 1st ACM Intl. Conf. on Digital Libraries, E. A. Fox and G. Marchionini, Eds., New York, NY: ACM Press, 1996, pp. 100–108; J. Traupman and R. Wilensky, "Experiments in Improving Unsupervised Word Sense Disambiguation," University of California, Berkeley, Department of EECS, Computer Science Division, Tech. Rep. 03–1227, Feb. 2003; R. Wilensky, Planning and Understanding: A Computational Approach to Human Reasoning, Advanced Book Program, Reading, MA: Addison-Wesley Publishing Co., 1983; R. Wilensky, "Understanding Goal-Based Stories," Yale University, Sep. 1978; B. Kahn and R. Wilensky, "A Framework for Distributed Digital Object Services," May 1995.

  • Attention (machine learning)


    In machine learning, attention is a method that determines the importance of each component in a sequence relative to the other components in that sequence. In natural language processing, importance is represented by "soft" weights assigned to each word in a sentence. More generally, attention encodes vectors called token embeddings across a fixed-width sequence that can range from tens to millions of tokens in size. Unlike "hard" weights, which are computed during the backwards training pass, "soft" weights exist only in the forward pass and therefore change with every step of the input. Earlier designs implemented the attention mechanism in a serial recurrent neural network (RNN) language translation system, but a more recent design, namely the transformer, removed the slower sequential RNN and relied more heavily on the faster parallel attention scheme. Inspired by ideas about attention in humans, the attention mechanism was developed to address the weaknesses of using information from the hidden layers of recurrent neural networks. Recurrent neural networks favor information contained in words at the end of a sentence and thus deemed more recent, thereby tending to attenuate the significance and associated predictive weight assigned to information earlier in the sentence. Attention allows a token equal access to any part of a sentence directly, rather than only through the previous state. == History == Additional surveys of the attention mechanism in deep learning are provided by Niu et al. and Soydaner. The major breakthrough came with self-attention, where each element in the input sequence attends to all others, enabling the model to capture global dependencies. This idea was central to the Transformer architecture, which replaced recurrence with attention mechanisms. As a result, Transformers became the foundation for models like BERT, T5 and generative pre-trained transformers (GPT). 
== Overview == The modern era of machine attention was revitalized by grafting an attention mechanism (Fig. 1, orange) onto an encoder-decoder. Figure 2 shows the internal step-by-step operation of the attention block (A) in Fig. 1. === Interpreting attention weights === In translating between languages, alignment is the process of matching words from the source sentence to words of the translated sentence. Networks that perform verbatim translation without regard to word order would show the highest scores along the (dominant) diagonal of the matrix. Off-diagonal dominance shows that the attention mechanism is more nuanced. Consider an example of translating I love you to French. On the first pass through the decoder, 94% of the attention weight is on the first English word I, so the network offers the word je. On the second pass of the decoder, 88% of the attention weight is on the third English word you, so it offers t'. On the last pass, 95% of the attention weight is on the second English word love, so it offers aime. In the I love you example, the second word love is thus aligned with the third word aime. Stacking the soft row vectors for je, t', and aime yields an alignment matrix. Sometimes, alignment can be many-to-many. For example, the English phrase look it up corresponds to cherchez-le. Thus, "soft" attention weights work better than "hard" attention weights (setting one attention weight to 1 and the others to 0): the model should form a context vector as a weighted sum of the hidden vectors rather than pick "the best one", as there may not be a single best hidden vector. == Variants == Many variants of attention implement soft weights, including: fast weight programmers, or fast weight controllers (1992), in which a "slow" neural network, trained by gradient descent, outputs the "fast" weights of another neural network through outer products, a scheme later renamed "linearized self-attention"; Bahdanau-style attention, also referred to as additive attention; Luong-style attention, also known as multiplicative attention; early attention mechanisms similar to modern self-attention, proposed using recurrent neural networks (the highly parallelizable self-attention was introduced in 2017 and successfully used in the Transformer model); and positional attention and factorized positional attention. For convolutional neural networks, attention mechanisms can be distinguished by the dimension on which they operate, namely: spatial attention, channel attention, or combinations. These variants recombine the encoder-side inputs to redistribute those effects to each target output. Often, a correlation-style matrix of dot products provides the re-weighting coefficients. In the figures below, W is the matrix of context attention weights, similar to the formula in the Overview section above. == Optimizations == === Flash attention === The size of the attention matrix is proportional to the square of the number of input tokens. Therefore, when the input is long, calculating the attention matrix requires a lot of GPU memory. Flash attention is an implementation that reduces memory needs and increases efficiency without sacrificing accuracy. It achieves this by partitioning the attention computation into smaller blocks that fit into the GPU's faster on-chip memory, which avoids storing large intermediate matrices and so lowers memory usage while increasing computational efficiency. === FlexAttention === FlexAttention is an attention kernel developed by Meta that allows users to modify attention scores prior to softmax and dynamically chooses the optimal attention algorithm. == Applications == Attention is widely used in natural language processing, computer vision, and speech recognition. In NLP, it improves context understanding in tasks like question answering and summarization.
In vision, visual attention helps models focus on relevant image regions, enhancing object detection and image captioning. === Attention maps as explanations for vision transformers === Since the original paper on vision transformers (ViT), visualizing attention scores as a heat map (called a saliency map or attention map) has become an important and routine way to inspect the decision-making process of ViT models. Attention maps can be computed for any attention head at any layer, although deeper layers tend to show more semantically meaningful visualizations. Attention rollout is a recursive algorithm that combines attention scores across all layers by computing the dot product of successive attention maps. Because vision transformers are typically trained in a self-supervised manner, attention maps are generally not class-sensitive. When a classification head is attached to the ViT backbone, class-discriminative attention maps (CDAM) combine attention maps and gradients with respect to the class [CLS] token. Some class-sensitive interpretability methods originally developed for convolutional neural networks can also be applied to ViT, such as GradCAM, which back-propagates the gradients to the outputs of the final attention layer. Using attention as a basis of explanation for transformers in language and vision is not without debate. While some pioneering papers analyzed and framed attention scores as explanations, higher attention scores do not always correlate with greater impact on model performance.
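The attention rollout procedure mentioned in this section can be sketched as follows. This is a minimal sketch, not a reference implementation: it assumes each layer's attention has already been averaged over heads into one row-stochastic matrix, and it mixes in the identity matrix with equal weight to stand in for residual connections, a common convention that the passage above does not spell out. The function name `attention_rollout` and the random test matrices are hypothetical.

```python
import numpy as np

def attention_rollout(attn_per_layer):
    """Combine per-layer attention maps (each tokens x tokens, rows sum to 1)."""
    n_tokens = attn_per_layer[0].shape[0]
    rollout = np.eye(n_tokens)
    for attn in attn_per_layer:
        # Assumption: add the identity to account for residual connections,
        # then renormalize so every row still sums to 1.
        mixed = 0.5 * attn + 0.5 * np.eye(n_tokens)
        mixed = mixed / mixed.sum(axis=-1, keepdims=True)
        # Recursive step: multiply this layer's map with the rollout so far.
        rollout = mixed @ rollout
    return rollout

# Three hypothetical layers of random, row-normalized attention over 4 tokens
rng = np.random.default_rng(0)
layers = []
for _ in range(3):
    raw = rng.random((4, 4))
    layers.append(raw / raw.sum(axis=-1, keepdims=True))

rollout = attention_rollout(layers)
print(rollout.sum(axis=-1))  # each row remains a distribution over input tokens
```

Row `i` of the result can be read as an estimate of how much each input token contributes to token `i` at the last layer, which is the quantity usually drawn as a heat map.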
== Mathematical representation == === Standard scaled dot-product attention === For matrices $Q \in \mathbb{R}^{m \times d_k}$, $K \in \mathbb{R}^{n \times d_k}$ and $V \in \mathbb{R}^{n \times d_v}$, the scaled dot-product, or QKV attention, is defined as:
$$\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \in \mathbb{R}^{m \times d_v}$$
where ${}^{T}$ denotes transpose and the softmax function is applied independently to every row of its argument. The matrix $Q$ contains $m$ queries, while matrices $K, V$ jointly contain an unordered set of $n$ key-value pairs. Value vectors in matrix $V$ are weighted using the weights resulting from the softmax operation, so that the rows of the $m$-by-$d_v$ output matrix are confined to the convex hull of the points in $\mathbb{R}^{d_v}$ given by the rows of $V$. To understand the permutation invariance and permutation equivariance properties of QKV attention, let $A \in \mathbb{R}^{m \times m}$ and $B \in \mathbb{R}^{n \times n}$ be permutation matrices, and $D \in \mathbb{R}^{m \times n}$ an arbitrary matrix. The softmax function is permutation equivariant in the sense that:
$$\text{softmax}(ADB) = A\,\text{softmax}(D)\,B$$
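The definition above can be checked numerically with a short NumPy sketch. The shapes, the random inputs, and the helper names `softmax` and `attention` are illustrative assumptions; the code is meant only to mirror the formula and the stated properties, not to reflect how any particular library implements attention.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for Q (m x d_k), K (n x d_k), V (n x d_v)."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # m x n, each row sums to 1
    return weights @ V, weights

# Hypothetical sizes and random inputs
rng = np.random.default_rng(0)
m, n, d_k, d_v = 3, 5, 4, 2
Q = rng.normal(size=(m, d_k))
K = rng.normal(size=(n, d_k))
V = rng.normal(size=(n, d_v))
out, weights = attention(Q, K, V)

# Permutation matrices A (m x m) and B (n x n), built by shuffling identity rows
A = np.eye(m)[rng.permutation(m)]
B = np.eye(n)[rng.permutation(n)]
D = rng.normal(size=(m, n))

# Equivariance of row-wise softmax: softmax(A D B) = A softmax(D) B
print(np.allclose(softmax(A @ D @ B), A @ softmax(D) @ B))

# Reordering the key-value pairs together leaves the output unchanged
out_perm, _ = attention(Q, B @ K, B @ V)
print(np.allclose(out, out_perm))
```

Because each output row is a convex combination of the rows of `V`, every output coordinate also stays between the smallest and largest value of the corresponding column of `V`.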

  • Xuedong Huang


    Xuedong David Huang (born October 20, 1962) is a Chinese-American computer scientist and technology executive who has made contributions to spoken language processing and artificial intelligence, including Azure AI Services. He is Zoom's chief technology officer, having previously served for 30 years at Microsoft as a Technical Fellow and Azure AI Chief Technology Officer. Huang is a strong advocate of AI for Accessibility and AI for Cultural Heritage. == Education == Huang received his PhD from the University of Edinburgh in 1989 (sponsored by the British ORS and Edinburgh University Scholarship), his MS from Tsinghua University in 1984, and his BS from Hunan University in 1982. == Career == After receiving his PhD in 1989, Huang joined Carnegie Mellon University and worked with Raj Reddy and Kai-Fu Lee on speech recognition. At CMU, he directed the Sphinx-II speech system research, which achieved the best performance in every category of DARPA's 1992 benchmarking. Microsoft Research recruited him to found and lead Microsoft's spoken language initiatives in 1993. His co-authored book Spoken Language Processing and his historical review of speech recognition summarize several generations of spoken language research. As Microsoft's "Mr. Speech" for three decades, Huang was instrumental in creating Microsoft's Speech Application Programming Interface (SAPI), shipping Microsoft Speech Server, and modernizing spoken language and integrative AI services via Azure AI, which serves millions of third-party customers and also powers Microsoft's Windows, Office, Teams, and Azure OpenAI Services. Huang helped Microsoft and Azure Cognitive Services achieve multiple industry-first human-parity milestones on the following open research tasks: transcribing conversational speech, machine translation, conversational QnA, and computer vision image captioning. Huang has made significant contributions to the software and AI industry through his executive leadership and his scientific publications, holding more than 170 US patents and reaching billions of users through Azure AI enabled products and services. In 2016, Wired magazine named him one of 25 Geniuses. In 2021, Azure AI was named the winner of InfoWorld's Technology of the Year Award. Huang was awarded the Allen Newell research excellence medal in 1992 and the IEEE Speech Processing Best Paper award in 1993. He was recognized as an IEEE Fellow by the Institute of Electrical and Electronics Engineers in 2000, was named an ACM Fellow by the Association for Computing Machinery in 2017, and is a member of the Washington State Academy of Sciences. Huang received the 2022 Asian American Corporate Leadership Award and the IEEE Amar Bose Industrial Leader Award. In 2023, he was elected a member of the US National Academy of Engineering (NAE) and a member of the American Academy of Arts and Sciences.

  • Kurt Keutzer


    Kurt Keutzer (born November 9, 1955) is an American computer scientist. == Early life and education == Kurt Keutzer grew up in Indianapolis, Indiana. He earned a bachelor's degree in mathematics from Maharishi University of Management (formerly Maharishi International University) in 1978, and a PhD in computer science from Indiana University Bloomington in 1984. == Career == Keutzer joined Bell Labs in 1984, where he worked on logic synthesis. In 1991, he joined the electronic design automation company Synopsys, where he was promoted to chief technology officer. He subsequently joined the University of California, Berkeley as a professor in 1998. His research at Berkeley has focused on the intersection of high performance computing and machine learning. Working with a number of graduate students at Berkeley, Keutzer developed FireCaffe, which scaled the training of deep neural networks to over 100 GPUs. Later, with the LARS and LAMB optimizers, they scaled it to over 1000 servers. Keutzer and his students also developed deep neural networks such as SqueezeNet, SqueezeDet, and SqueezeSeg, which can run efficiently on mobile devices. Keutzer co-founded DeepScale with his PhD student Forrest Iandola in 2015 and served as the company's chief strategy officer. The firm focused on developing deep neural networks for advanced driver assistance systems in passenger cars. On October 1, 2019, electric vehicle manufacturer Tesla, Inc. purchased DeepScale to augment and accelerate its self-driving vehicle work. == Honors and awards == Keutzer was named a Fellow of the IEEE in 1996. He received the DAC Most Influential Paper (MIP) award (24th DAC, 1987) for his publication "Dagon: technology binding and local optimization by DAG matching". == Books by Keutzer == 1988: Dwight Hill, Don Shugard, John Fishburn, and Kurt Keutzer, Algorithms and Techniques for VLSI Layout Synthesis, Springer; 1994: Srinivas Devadas, Abhijit Ghosh, and Kurt Keutzer, Logic Synthesis, McGraw-Hill; 2002: David Chinnery and Kurt Keutzer, Closing the Gap Between ASIC & Custom: Tools and Techniques for High-Performance ASIC Design, Springer (2nd edition appeared in 2007); 2004: Pinhong Chen, Desmond A. Kirkpatrick, and Kurt Keutzer, Static Crosstalk-Noise Analysis: For Deep Sub-Micron Digital Designs, Springer; 2005: Matthias Gries and Kurt Keutzer, Building ASIPs: The Mescal Methodology, Springer.
