AI App Inventor

AI App Inventor — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • YNAB

    YNAB

    You Need a Budget (YNAB) (pronounced ) is an online personal budgeting program based on the envelope system developed by a privately owned American company of the same name. It is available via any web browser or a mobile app. == History == The program was initially developed as standalone software in 2004 by Jesse Mecham, while he was in college pursuing his master's degree in accounting, after he and his wife faced financial difficulty and decided to improve their budgeting. It evolved from a spreadsheet that he created for the budgeting process. The acronym stands for "you need a budget." In 2015 they changed their licensing model to software as a service. In 2020, YNAB had 115 employees, all working remotely. == Overview == The service encourages users to follow four principles or "rules": Give every dollar a job: Each dollar in a budget is allocated to a specific purpose. This concept is also called zero-based budgeting. Embrace true expenses: All expenses are planned for, so that there are no surprises. Roll with the punches: Being flexible when there is overspending. Age your money: Keeping money in your budget without immediately spending it. Users can either import transactions automatically from their financial institutions or input them manually. The software also displays financial reports to keep users informed about their finances at a glance. == Awards and recognition == YNAB has been named one of the best budgeting apps by U.S. News & World Report, Kiplinger's Personal Finance, CNN, HuffPost, CNBC, and hundreds of other financial reporting outlets. The Wall Street Journal – Best budgeting app for hands-on budgeters. Forbes – Best Budgeting Apps Money – Best budgeting app for college students. Lifehacker – Most popular personal finance software. Wirecutter – "Great pick for hard-core budgeters". Investopedia – Best overall budgeting app.

    Read more →
  • Top 10 AI Customer-support Bots Compared (2026)

    Top 10 AI Customer-support Bots Compared (2026)

    Trying to pick the best AI customer-support bot? An AI customer-support bot is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI customer-support bot slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Nathalie Japkowicz

    Nathalie Japkowicz

    Nathalie Japkowicz is a Canadian computer scientist specializing in machine learning. She is a professor and department chair of computer science at the American University College of Arts and Sciences. == Life == Nathalie Japkowicz completed a B.Sc. at McGill University in 1988. She earned an M.Sc. from the University of Toronto in 1990. She completed a Ph.D. at Rutgers University in 1999. Her dissertation was titled Concept-learning in the absence of counter-examples: an autoassociation-based approach to classification. Stephen José Hanson and Casimir Alexander Kulikowski were her doctoral advisors. Japkowicz worked at the University of Ottawa in the school of electrical engineering and computer science. She was the lead of its laboratory for research on machine learning for defense security. From 2003 to 2005, Japkowicz was the secretary of the Canadian Artificial Intelligence Association (CAIAC). She was CAIAC vice president from 2009 to 2014 and president from 2013 to 2015, and part-president from 2015 to 2017. Japkowicz is a professor and department chair of computer science at the American University College of Arts and Sciences. She researches artificial intelligence, machine learning, data mining, and big data analysis. == Selected works == Gao, Yong; Japkowicz, Nathalie, eds. (2009). Advances in Artificial Intelligence: 22nd Canadian Conference on Artificial Intelligence, Canadian AI 2009 Kelowna, Canada, May 25–27, 2009 Proceedings. Lecture Notes in Computer Science. Vol. 5549. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-01818-3. ISBN 978-3-642-01817-6. S2CID 27083226. Japkowicz, Nathalie; Shah, Mohak (2011). Evaluating Learning Algorithms: A Classification Perspective (1 ed.). Cambridge University Press. doi:10.1017/cbo9780511921803. ISBN 978-0-511-92180-3. Japkowicz, Nathalie; Matwin, Stan, eds. (2015). Discovery Science: 18th International Conference, DS 2015, Banff, AB, Canada, October 4–6, 2015. Proceedings. Lecture Notes in Computer Science. Vol. 9356. Cham: Springer International Publishing. doi:10.1007/978-3-319-24282-8. ISBN 978-3-319-24281-1. S2CID 1302223. Japkowicz, Nathalie; Stefanowski, Jerzy, eds. (2016). Big Data Analysis: New Algorithms for a New Society. Studies in Big Data. Vol. 16. Cham: Springer International Publishing. doi:10.1007/978-3-319-26989-4. ISBN 978-3-319-26987-0. Ceci, Michelangelo; Japkowicz, Nathalie; Liu, Jiming; Papadopoulos, George A.; Raś, Zbigniew W., eds. (2018). Foundations of Intelligent Systems: 24th International Symposium, ISMIS 2018, Limassol, Cyprus, October 29–31, 2018, Proceedings. Lecture Notes in Computer Science. Vol. 11177. Cham: Springer International Publishing. doi:10.1007/978-3-030-01851-1. ISBN 978-3-030-01850-4. S2CID 53038780.

    Read more →
  • Claire Cardie

    Claire Cardie

    Claire Cardie is an American computer scientist specializing in natural language processing. Since 2006, she has been a professor of computer science and information science at Cornell University, and from 2010 to 2011 she was the first Charles and Barbara Weiss Chair of Information Science at Cornell. Her research interests include coreference resolution and sentiment analysis. == Education and career == Cardie is a 1982 graduate of Yale University, majoring in computer science. After working for several companies as a computer programmer, she returned to graduate study in the late 1980s and completed her Ph.D. at the University of Massachusetts Amherst in 1994. Her dissertation, Domain-Specific Knowledge Acquisition for Conceptual Sentence Analysis, was supervised by Wendy Lehnert. She has been on the Cornell University faculty since 1994, initially in computer science and since 2005 also in information science. She was an assistant professor (1994–2000) and associate professor (2000–06), before being promoted to a full professorship in 2006. In 2007 she founded a start-up company, Appinions, and she was its chief scientist until 2015. Her doctoral students at Cornell have included Amit Singhal and Kiri Wagstaff. == Recognition == Cardie became a Fellow of the Association for Computational Linguistics in 2016. She was elected as an ACM Fellow in 2019 "for contributions to natural language processing, including coreference resolution, information and opinion extraction". She was named to the 2021 class of Fellows of the American Association for the Advancement of Science.

    Read more →
  • Programming tool

    Programming tool

    A programming tool or software development tool is a computer program that is used to develop another computer program, usually by helping the developer manage computer files. For example, a programmer may use a tool called a source code editor to edit source code files, and then a compiler to convert the source code into machine code files. They may also use build tools that automatically package executable program and data files into shareable packages or install kits. A set of tools that are run one after another, with each tool feeding its output to the next one, is called a toolchain. An integrated development environment (IDE) integrates the function of several tools into a single program. Usually, an IDE provides a source code editor as well as other built-in or plug-in tools that help with compiling, debugging, and testing. Whether a program is considered a development tool can be subjective. Some programs, such as the GNU compiler collection, are used exclusively for software development while others, such as Notepad, are not meant specifically for development but are nevertheless often used for programming. == Categories == Notable categories of development tools: Assembler – Converts assembly language into machine code Bug tracking system – Software application that records software bugs Build automation – Building software via an unattended fashion Code review software – Activity where one or more people check a program's code Compiler – Software that translates code from one programming language to another Compiler-compiler – Program that generates parsers or compilers, a.k.a. parser generator Debugger – Software for debugging a computer program Decompiler – Program translating executable to source code Disassembler – Computer program to translate machine language into assembly language Documentation generator – Automation technology for creating software documentation Graphical user interface builder – Software development tool Linker – Program that combines intermediate build files into an executable file Loader – Loads executable files into memory and prepares them for execution by the CPU. Memory debugger – Software memory problem finder Minifier – Removal of unnecessary characters in code without changing its functionality Pretty-printer – Formatting to make code or markup easier to readPages displaying short descriptions of redirect targets Performance profiler – Measuring the time or resources used by a section of a computer program Static code analyzer – Analysis of computer programs without executing themPages displaying short descriptions of redirect targets Source code editor – Text editor specializing in software codePages displaying short descriptions of redirect targets Source code generation – Type of computer programmingPages displaying short descriptions of redirect targets Version control system – Stores and tracks versions of files

    Read more →
  • Sparse dictionary learning

    Sparse dictionary learning

    Sparse dictionary learning (also known as sparse coding or SDL) is a representation learning method which aims to find a sparse representation of the input data in the form of a linear combination of basic elements as well as those basic elements themselves. These elements are called atoms, and they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may be an over-complete spanning set. This problem setup also allows the dimensionality of the signals being represented to be higher than any one of the signals being observed. These two properties lead to having seemingly redundant atoms that allow multiple representations of the same signal, but also provide an improvement in sparsity and flexibility of the representation. One of the most important applications of sparse dictionary learning is in the field of compressed sensing or signal recovery. In compressed sensing, a high-dimensional signal can be recovered with only a few linear measurements, provided that the signal is sparse or near-sparse. Since not all signals satisfy this condition, it is crucial to find a sparse representation of that signal such as the wavelet transform or the directional gradient of a rasterized matrix. Once a matrix or a high-dimensional vector is transferred to a sparse space, different recovery algorithms like basis pursuit, CoSaMP, or fast non-iterative algorithms can be used to recover the signal. One of the key principles of dictionary learning is that the dictionary has to be inferred from the input data. The emergence of sparse dictionary learning methods was stimulated by the fact that in signal processing, one typically wants to represent the input data using a minimal amount of components. Before this approach, the general practice was to use predefined dictionaries such as Fourier or wavelet transforms. However, in certain cases, a dictionary that is trained to fit the input data can significantly improve the sparsity, which has applications in data decomposition, compression, and analysis, and has been used in the fields of image denoising and classification, and video and audio processing. Sparsity and overcomplete dictionaries have immense applications in image compression, image fusion, and inpainting. == Problem statement == Given the input dataset X = [ x 1 , . . . , x K ] , x i ∈ R d {\displaystyle X=[x_{1},...,x_{K}],x_{i}\in \mathbb {R} ^{d}} we wish to find a dictionary D ∈ R d × n : D = [ d 1 , . . . , d n ] {\displaystyle \mathbf {D} \in \mathbb {R} ^{d\times n}:D=[d_{1},...,d_{n}]} and a representation R = [ r 1 , . . . , r K ] , r i ∈ R n {\displaystyle R=[r_{1},...,r_{K}],r_{i}\in \mathbb {R} ^{n}} such that both ‖ X − D R ‖ F 2 {\displaystyle \|X-\mathbf {D} R\|_{F}^{2}} is minimized and the representations r i {\displaystyle r_{i}} are sparse enough. This can be formulated as the following optimization problem: argmin D ∈ C , r i ∈ R n ∑ i = 1 K ‖ x i − D r i ‖ 2 2 + λ ‖ r i ‖ 0 {\displaystyle {\underset {\mathbf {D} \in {\mathcal {C}},r_{i}\in \mathbb {R} ^{n}}{\text{argmin}}}\sum _{i=1}^{K}\|x_{i}-\mathbf {D} r_{i}\|_{2}^{2}+\lambda \|r_{i}\|_{0}} , where C ≡ { D ∈ R d × n : ‖ d i ‖ 2 ≤ 1 ∀ i = 1 , . . . , n } {\displaystyle {\mathcal {C}}\equiv \{\mathbf {D} \in \mathbb {R} ^{d\times n}:\|d_{i}\|_{2}\leq 1\,\,\forall i=1,...,n\}} , λ > 0 {\displaystyle \lambda >0} C {\displaystyle {\mathcal {C}}} is required to constrain D {\displaystyle \mathbf {D} } so that its atoms would not reach arbitrarily high values allowing for arbitrarily low (but non-zero) values of r i {\displaystyle r_{i}} . λ {\displaystyle \lambda } controls the trade off between the sparsity and the minimization error. The minimization problem above is not convex because of the ℓ0-"norm" and solving this problem is NP-hard. In some cases L1-norm is known to ensure sparsity and so the above becomes a convex optimization problem with respect to each of the variables D {\displaystyle \mathbf {D} } and R {\displaystyle \mathbf {R} } when the other one is fixed, but it is not jointly convex in ( D , R ) {\displaystyle (\mathbf {D} ,\mathbf {R} )} . === Properties of the dictionary === The dictionary D {\displaystyle \mathbf {D} } defined above can be "undercomplete" if n < d {\displaystyle n d {\displaystyle n>d} with the latter being a typical assumption for a sparse dictionary learning problem. The case of a complete dictionary does not provide any improvement from a representational point of view and thus isn't considered. Undercomplete dictionaries represent the setup in which the actual input data lies in a lower-dimensional space. This case is strongly related to dimensionality reduction and techniques like principal component analysis which require atoms d 1 , . . . , d n {\displaystyle d_{1},...,d_{n}} to be orthogonal. The choice of these subspaces is crucial for efficient dimensionality reduction, but it is not trivial. And dimensionality reduction based on dictionary representation can be extended to address specific tasks such as data analysis or classification. However, their main downside is limiting the choice of atoms. Overcomplete dictionaries, however, do not require the atoms to be orthogonal (they will never have a basis anyway) thus allowing for more flexible dictionaries and richer data representations. An overcomplete dictionary which allows for sparse representation of signal can be a famous transform matrix (wavelets transform, fourier transform) or it can be formulated so that its elements are changed in such a way that it sparsely represents the given signal in a best way. Learned dictionaries are capable of giving sparser solutions as compared to predefined transform matrices. == Algorithms == As the optimization problem described above can be solved as a convex problem with respect to either dictionary or sparse coding while the other one of the two is fixed, most of the algorithms are based on the idea of iteratively updating one and then the other. The problem of finding an optimal sparse coding R {\displaystyle R} with a given dictionary D {\displaystyle \mathbf {D} } is known as sparse approximation (or sometimes just sparse coding problem). A number of algorithms have been developed to solve it (such as matching pursuit and LASSO) and are incorporated in the algorithms described below. === Method of optimal directions (MOD) === The method of optimal directions (or MOD) was one of the first methods introduced to tackle the sparse dictionary learning problem. The core idea of it is to solve the minimization problem subject to the limited number of non-zero components of the representation vector: min D , R { ‖ X − D R ‖ F 2 } s.t. ∀ i ‖ r i ‖ 0 ≤ T {\displaystyle \min _{\mathbf {D} ,R}\{\|X-\mathbf {D} R\|_{F}^{2}\}\,\,{\text{s.t.}}\,\,\forall i\,\,\|r_{i}\|_{0}\leq T} Here, F {\displaystyle F} denotes the Frobenius norm. MOD alternates between getting the sparse coding using a method such as matching pursuit and updating the dictionary by computing the analytical solution of the problem given by D = X R + {\displaystyle \mathbf {D} =XR^{+}} where R + {\displaystyle R^{+}} is a Moore-Penrose pseudoinverse. After this update D {\displaystyle \mathbf {D} } is renormalized to fit the constraints and the new sparse coding is obtained again. The process is repeated until convergence (or until a sufficiently small residue). MOD has proved to be a very efficient method for low-dimensional input data X {\displaystyle X} requiring just a few iterations to converge. However, due to the high complexity of the matrix-inversion operation, computing the pseudoinverse in high-dimensional cases is in many cases intractable. This shortcoming has inspired the development of other dictionary learning methods. === K-SVD === K-SVD is an algorithm that performs SVD at its core to update the atoms of the dictionary one by one and basically is a generalization of K-means. It enforces that each element of the input data x i {\displaystyle x_{i}} is encoded by a linear combination of not more than T 0 {\displaystyle T_{0}} elements in a way identical to the MOD approach: min D , R { ‖ X − D R ‖ F 2 } s.t. ∀ i ‖ r i ‖ 0 ≤ T 0 {\displaystyle \min _{\mathbf {D} ,R}\{\|X-\mathbf {D} R\|_{F}^{2}\}\,\,{\text{s.t.}}\,\,\forall i\,\,\|r_{i}\|_{0}\leq T_{0}} This algorithm's essence is to first fix the dictionary, find the best possible R {\displaystyle R} under the above constraint (using Orthogonal Matching Pursuit) and then iteratively update the atoms of dictionary D {\displaystyle \mathbf {D} } in the following manner: ‖ X − D R ‖ F 2 = | X − ∑ i = 1 K d i x T i | F 2 = ‖ E k − d k x T k ‖ F 2 {\displaystyle \|X-\mathbf {D} R\|_{F}^{2}=\left|X-\sum _{i=1}^{K}d_{i}x_{T}^{i}\right|_{F}^{2}=\|E_{k}-d_{k}x_{T}^{k}\|_{F}^{2}} The next steps of the algorithm include rank-1 approximation of the residual matrix E k {\displaystyle E_{k}} , updating d k {\displaystyle d_{k}} and enforcing the s

    Read more →
  • Top 10 AI Background Removers Compared (2026)

    Top 10 AI Background Removers Compared (2026)

    Curious about the best AI background remover? An AI background remover is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI background remover slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Law and Corpus Linguistics

    Law and Corpus Linguistics

    Law and corpus linguistics (LCL) is an academic sub-discipline that uses large databases of examples of language usage equipped with tools designed by linguists called corpora to better get at the meaning of words and phrases in legal texts (statutes, constitutions, contracts, etc.). Thus, LCL is the application of corpus linguistic tools, theories, and methodologies to issues of legal interpretation in much the same way law and economics is the application of economic tools, theories, and methodologies to various legal issues. == History == A 2005 law review article by Lawrence Solan noted in passing that corpus linguistics had potential for its application to interpreting legal texts. But the first systematic exploration and advocacy of applying the tools and methodologies of corpus linguistics to legal interpretive questions of law and corpus linguistics came in the fall of 2010, when the BYU Law Review published a note by Stephen Mouritsen, entitled The Dictionary is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning. The note argued that dictionaries are the primary linguistic tool used by judges to determine the plain or ordinary meaning of words and phrases, and highlighted the deficiencies of such an approach. In its stead, the note proposed using corpus linguistics. And the note would be later cited by Adam Liptak in a New York Times article on statutory construction. Law and corpus linguistics (LCL) gained greater legitimacy in July 2011 with the first judicial opinion in American history utilizing corpus linguistics to determine the meaning of a legal text: In re the Adoption of Baby E.Z. In a concurrence in part and in the judgment, Justice Thomas Lee wrote to put forth an alternative ground for the majority's holding—interpreting the phrase "custody determination" by using corpus linguistics. Justice Lee looked at 500 randomized sample sentences from the Corpus of Contemporary American English (COCA) and found that the most common sense of "custody" was in the context of divorce rather than adoption. Further, he found that "custody" is ten times more likely to co-occur (or collocate) with "divorce" than with "adoption". From that evidence Justice Lee concluded that he "would find that the custody proceedings covered by the Act are limited to proceedings resulting in the modifiable custody orders of a divorce", rather than the broader range of custody proceedings. Other jurisprudence and scholarship would follow. In a 2015 concurrence in State v. Rasabout, Justice Lee used a COCA search to determine that "discharge" when used with a firearm (or one of its synonyms) overwhelmingly referred to a single shot rather than emptying the entire magazine of the weapon. And in 2016, four of the five justices joined a footnote in a majority opinion by Justice Lee commending a party for using corpus linguistics in its briefing even though the Court found it unnecessary to resolve the related question. Finally, in 2016 the Michigan Supreme Court became the first court to use a linguist-designed corpus in a majority opinion (COCA), with both the majority and the dissent turning to COCA to determine the meaning of the word "information". In 2020, courts desiring to bolster the legal theory of original intent have sought the opportunity to undertake analyses of statutes utilizing corpus linguistics. In a Ninth Circuit Court of Appeals case, Jones v. Becerra (No. 20-56174), a case involving the Second Amendment and the constitutionality of a California statute which bans the sale of firearms to individuals under the age of 21, a Ninth Circuit panel requested that the parties address three questions: 1) “What is the original public meaning of the Second Amendment phrases: ‘A well regulated Militia’; ‘the right of the people’; and ‘shall not be infringed’? 2) How does the tool of corpus linguistics help inform the determination of the original public meaning of those Second Amendment phrases?” 3) How do the data yielded from corpus linguistics assist in the interpretation of the constitutionality of age-based restrictions under the Second Amendment? As to scholarship, in 2012, Mouritsen followed up his original work with an article in the Columbia Science and Technology Law Review, where he further refined and promoted the use of corpus-based methods for determining questions of legal ambiguity. Additionally, in 2016 two essays and an article on law and corpus linguistics were published. The Yale Law Journal Forum published Corpus Linguistics & Original Public Meaning: A New Tool to Make Originalism More Empirical. Written by Justice Lee and two co-authors, the essay urged originalists to turn to corpus linguistics to improve the rigor and accuracy of originalist scholarship. And in response, the Forum published an essay by Lawrence Solan (a Brooklyn Law professor with a PhD in linguistics), Can Corpus Linguistics Help Make Originalism Scientific? The Boston University Public Interest Law Journal published The Merciful Corpus: The Rule of Lenity, Ambiguity and Corpus Linguistics by Daniel Ortner. In the article Ortner applied corpus linguistics to determining whether sufficient ambiguity exists to trigger the rule of lenity in five Supreme Court cases. Looking forward, in 2017 two more articles are slated for publication. Lee Strang focuses on corpus linguistics and originalism in the U.C. Davis Law Review, and Lawrence Solan and Tammy Gales explore corpus linguistics in the context of finding ordinary meaning in statutory interpretation in the International Journal of Legal Discourse. Lawyers and journalists have also taken notice of corpus linguistics at it relates to the law. In 2010, Neal Goldfarb filed the first known brief in the Supreme Court using corpus linguistics (COCA) to determine whether the ordinary meaning of "personal" referred to corporations in the case FCC v. AT&T. The amicus brief looked at the top collocates (words that co-occur) of "personal" in COHA as well as BYU's Time Magazine Corpus. And writing for The Atlantic, Ben Zimmer took note of this new trend, referring to corpus linguistics in the courts as "Like Lexis on Steroids". On the academic front, in 2013 BYU Law School started the first class on law and corpus linguistics, co-taught by Mouritsen, Lee, and (now Dean) Gordon Smith. The class is currently in its fourth year. And in February 2016, BYU Law School hosted the inaugural conference on LCL, with over two dozen legal and linguistic scholars from around the country discussing and debating the next steps forward for the growing academic movement. The conference has been held regularly in subsequent years. At the 2016 conference BYU Law School announced its plans and progress on the Corpus of Founding Era American English (COFEA), a corpus that covers 1760–1799 and contains more than 120 million words have been collected from founding era letters, diaries, newspapers, non-fiction books, fiction, sermons, speeches, debates, legal cases, and other legal materials.

    Read more →
  • Ericom Connect

    Ericom Connect

    Ericom Connect is a remote access/application publishing solution produced by Ericom Software that provides secure, centrally managed access to physical or hosted desktops and applications running on Microsoft Windows and Linux systems. == Product overview == Ericom Connect is desktop virtualization and application virtualization software that allows users to run applications remotely, without installing them on the local computer or device. The software is noted for its scalability, ease of deployment, and compatibility with any type of infrastructure, cloud or physical. Ericom Connect uses AccessPad (native client for desktops), AccessToGo (native client for mobile), or AccessNow, one of the first HTML5 RDP solutions to support clientless access to Windows desktops and applications from any device with an HTML5-compatible browser, including Macintosh computers, mobile devices, and Google Chromebooks. Other notable features include performance monitoring, built-in real-time analytics & BI, support for two-factor authentication (using RSA SecurID), multi-tenancy and multi-datacenter support via a single unified web interface, and a “Launch Simulation” feature that allows users to visualize and simulate actual step-by-step user processes directly from within the administration console. In addition to scalability, by distributing configurations, logs, etc., across multiple servers there is no single point of failure, as can be the case if all configuration information is stored on one server. == History == Ericom Connect was introduced in 2015. Ericom Connect is a successor to Ericom PowerTerm Web Connect. PowerTerm Web Connect used an architecture similar to what was then current with Citrix and VMWare, relying on a centralized SQL server, a connection broker, image management for different hypervisors, and a variety of clients. Ericom Connect uses a new grid architecture that provides more scalability, reliability, and flexibility than before.

    Read more →
  • Joseph Keshet

    Joseph Keshet

    Joseph (Yossi) Keshet (Hebrew: יוסף (יוסי) קשת; born: 28 February 1973) is an Israeli professor in the Electrical and Computer Engineering Faculty of the Technion, where he is the director of the Speech, Language, and Deep Learning Lab. His research focuses on human speech processing and machine learning. == Early life and education == Keshet was born in Tel-Aviv. He graduated from the Amal School and began his academic studies at the Department of Electrical Engineering-Systems at Tel-Aviv University in 1991 and received his B.Sc. (Cum Laude) in 1994. Keshet served in the IDF Unit 8200 from 1995 to 2002 as the head of the speech processing research section in the R&D Center. During his service, he received a national award from the Administration for the Development of Weapons and Technological Infrastructure (Maf’at). Keshet was award his M.Sc. from the same department after he completed his Israel Defense Force service in 2002. His Dissertation was titled: Stop consonant spotting in continuous speech and was supervised by Dan Chazan from IBM Research Labs, Haifa. He continued his Ph.D. studies at the Hebrew University of Jerusalem until 2008. Prof. Yoram Singer supervised his thesis on Large Margin Algorithms for Discriminative Continuous Speech. == Career == Keshet was a Research Associate (postdoc) at IDIAP Research Institute, Martigny, Switzerland in 2007, and joined the TTI-Chicago and Department of Computer Science, University of Chicago, Chicago, IL in 2009 as Research Assistant Professor. In 2013, he returned to Israel and joined the Computer Science department at Bar-Ilan University as a senior lecturer and head of the Speech, Language, and Deep Learning Lab. In 2020, Keshet became a Founding Venture Partner at the Disruptive AI Venture Capital. In the same year, he also joined Amazon in Tel-Aviv as an Amazon Scholar. In 2022, Keshet joined the Faculty of Electrical and Computer Engineering at the Technion. == Research == Keshet's research work focuses on both machine learning and computational study of human speech and language. His work on speech and language concentrates on speech processing, speech recognition, acoustic phonetics, and pathological speech. In machine learning, Keshet is focused on deep learning and structured tasks. According to Google Scholar (September 2020), Keshet is one of the 15 most cited researchers in the field of spoken language processing. The algorithms that were developed in the Speech, Language, and Deep Learning Lab can analyze different pathological conditions in the throat and vocal cords based on the subject's voice. Other algorithms showed that the voice can be used to estimate physical and emotional state of the speaker. Another research led by Keshet suggested that it is possible to fool structured AI systems (like Google Voice). == Membership in professional societies == Keshet is the founder and chair of the Machine Learning for Speech and Language Processing Special Interest Group (SIGML) of the International Speech Communication Association (ISCA), from 2011. He is a senior member of the IEEE Signal Processing Society since 2018 and a member of ISCA since 2002. == Publications == Prof. Keshet has authored more than 70 scientific publications and edited one book. === Book === Joseph Keshet and Samy Bengio, Eds., Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods, John Wiley & Sons, March 2009. === Selected articles === Jacob T. Cohen, Alma Cohen, Limor Benyamini, Yossi Adi, Joseph Keshet, Predicting glottal closure insufficiency using fundamental frequency contour analysis, Head & Neck, Journal of the Sciences and Specialities of the Head and Neck, Volume 41, Issue 7, pp. 2324–2331, July 2019. Yehoshua Dissen, Jacob Goldberger, and Joseph Keshet, Formant Estimation and Tracking: A Deep Learning Approach, Journal of the Acoustical Society of America, 145 (2), February 2019. Joseph Keshet, Automatic speech recognition: A primer for speech-language pathology researchers, International Journal of Speech-Language Pathology, Vol. 20 No. 6, pp. 599–609, 2018. Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, Joseph Keshet, Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring, Usenix, 2018. Tzeviya Fuchs, Joseph Keshet, Spoken Term Detection Automatically Adjusted for a Given Threshold, IEEE Journal of Selected Topics in Signal Processing, Dec 2017, Volume 11, Issue 8, pp. 1–8. Moustapha Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet, Houdini: Fooling Deep Structured Visual and Speech Recognition Models with Adversarial Examples, Neural Information and Processing Systems (NIPS), 2017. Joseph Keshet, Subhransu Maji, Tamir Hazan, and Tommi Jaakkola, Perturbation Models and PAC-Bayesian Generalization Bounds, in Perturbations, Optimization, and Statistics, Tamir Hazan, George Papandreou, and Daniel Tarlow, Eds., The MIT Press, 2016. Matthew Goldrick, Joseph Keshet, Erin Gustafson, Jordana Heller, and Jeremy Needle, Automatic Analysis of Slips of the Tongue: Insights into the Cognitive Architecture of Speech Production, Cognition, 149, 31–39, 2016. Joseph Keshet, Optimizing the Measure of Performance in Structured Prediction, in Advanced Structured Prediction, Sebastian Nowozin, Peter V. Gehler, Jeremy January, and Christoph H. Lampert, Eds., The MIT Press, 2014. Morgan Sonderegger and Joseph Keshet, Automatic Measurement of Voice Onset Time using Discriminative Structured Prediction, Journal of the Acoustical Society of America, Vol. 132, Issue 6, pp. 3965−3979, 2012. David McAllester, Tamir Hazan and Joseph Keshet, Direct Loss Minimization for Structured Prediction, The 24th Annual Conference on Neural Information Processing Systems (NIPS), 2010. Joseph Keshet, David Grangier and Samy Bengio, Discriminative Keyword Spotting, Speech Communication, Volume 51, Issue 4, pp. 317–329, April 2009. == Personal life == Keshet is married to Lital. They have three children.

    Read more →
  • Top 10 AI Avatar Generators Compared (2026)

    Top 10 AI Avatar Generators Compared (2026)

    In search of the best AI avatar generator? An AI avatar generator is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI avatar generator slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Global Language Monitor

    Global Language Monitor

    The Global Language Monitor (GLM) is a company based in Austin, Texas, that analyzes trends in the English language. == History == Founded in Silicon Valley in 2003 by Paul J.J. Payack, the GLM describes its role as "a media analytics company that documents, analyzes and tracks cultural trends in language the world over, with a particular emphasis upon International and Global English". In April 2008, GLM moved its headquarters from San Diego to Austin. In July 2020, GLM announced that the word covid was its Top Word of 2020 for English. The company has been repeatedly criticized by linguists for promoting misinformation about language. Writing on Language Log, the linguist Ben Zimmer accused it of "hoodwink[ing] unsuspecting journalists on a range of pseudoscientific claims".

    Read more →
  • Data access layer

    Data access layer

    A data access layer (DAL) is a software architectural layer that provides access to data from one or more sources, such as a relational database, NoSQL database, SQL query engine, file system, or other persistent storage. It separates client code from the details of storage systems, query execution, connection handling, and data retrieval. Data access layers are commonly used to centralize data access logic, reduce coupling between applications and data sources, and provide a consistent interface for retrieving, writing, or querying data. Depending on the system, a data access layer may be implemented as application code, a shared library, an intermediary service, or part of a broader database abstraction layer. == In application architecture == In application software, a data access layer provides a boundary between business logic or application code and the systems used to store or retrieve data. For example, a data access layer may expose methods or interfaces for retrieving, writing, or querying data while hiding details such as connection management, SQL statements, storage APIs, error handling, and result conversion. Depending on the application, the layer may return objects, records, tabular results, documents, streams, or other representations of data. A common implementation is a set of classes, functions, or methods that directly reference database queries, stored procedures, storage APIs, or other data sources. For example, instead of using commands such as insert, delete, and update throughout an application to access a specific table, methods such as registerUser or loginUser may be implemented inside the data access layer. Business logic methods from an application can also be mapped to the data access layer. Instead of making several database queries directly, an application can call a single DAL method that abstracts those database calls. Applications using a data access layer may be either dependent on or independent from a particular database server. If the data access layer supports multiple database systems, the application can use any database system that the DAL can access. In either case, the data access layer provides a centralized location for calls into the underlying data store, which can make it easier to maintain, test, or port the application to other storage systems. == Implementation patterns == A data access layer can be implemented using several patterns and technologies, including data access objects, repositories, stored procedures, query builders, database drivers, or object–relational mapping tools. These mechanisms may implement part or all of a data access layer, but are not always equivalent to the layer itself. Object–relational mapping tools are commonly used in data access layers for object-oriented applications that map records in a relational database to objects in a programming language. Other data access layers may expose lower-level database interfaces, tabular results, document-oriented data, files, streams, or protocol-level interfaces. == Use with multiple underlying data systems == A data access layer may be used to abstract differences between multiple underlying data systems, allowing applications to access them through a more consistent interface. In such designs, applications call the DAL rather than interacting directly with each database or storage system. The layer may then handle connection management, query generation, result mapping, error handling, and other implementation details. A data access layer may be implemented as a shared library or as an intermediary service, such as a proxy or gateway. In this configuration, client applications or services connect to the data access layer, which then communicates with one or more underlying databases or query engines. This can provide a common location for authentication, authorization, logging, routing, and translation between different database interfaces. == Interfaces and protocols == Data access layers may expose or use standardized interfaces and protocols for database access. Examples include Open Database Connectivity (ODBC), Java Database Connectivity (JDBC), database-native wire protocols, and newer interfaces such as Apache Arrow Database Connectivity (ADBC) and Arrow Flight SQL. In systems that support multiple data stores, a data access layer may provide a consistent interface while using different drivers, protocols, or query mechanisms internally. == Distinction from related patterns == A data access layer is related to, but broader than, a data access object, which is usually an object-oriented design pattern for encapsulating access to a persistence mechanism. It is also related to a database abstraction layer, which focuses on hiding differences between database systems. In practice, the terms may overlap.

    Read more →
  • Mona Diab

    Mona Diab

    Mona Talat Diab (Arabic: منى طلعت دياب) is a computer science professor and director of Carnegie Mellon University's Language Technologies Institute. Previously, she was a professor at George Washington University and a research scientist with Facebook AI. Her research focuses on natural language processing, computational linguistics, cross lingual/multilingual processing, computational socio-pragmatics, Arabic language processing, and applied machine learning. == Education == Diab completed her M.Sc. in computer science with a major in machine learning and artificial intelligence at The George Washington University (1997) and her Ph.D. in computational linguistics at the University of Maryland, Linguistics Department and University of Maryland Institute for Advanced Computer Studies (UMIACS) in 2003, under the supervision of Philip Resnik. She was also a postdoctoral research scientist at Stanford University (2003–2005) under the mentorship of Dan Jurafsky, where she was a part of the Stanford NLP Group. == Career == After her postdoc at Stanford, Diab took a position as research scientist (principal investigator) at the Center for Computational Learning Systems (CCLS) in Columbia University, where she was also adjunct professor in the computer science department. In 2013 she joined the George Washington University as an associate professor, where she was promoted to full professor in 2017. Diab is the founder and director of the GW NLP lab CARE4Lang. Diab served as an elected faculty senator at Columbia University for 6 years (2007–2012) and an elected faculty senator at GW (2013–2014). She served the computational linguistics community as elected member, secretary and president of ACL SIGLEX (2005–2016) and elected president of ACL SIGSemitic. She currently serves as the elected VP-elect for ACL SIGDAT. In 2017 Diab joined Amazon AWS AI Deep Learning Group for Human Language Technologies, where she led the AWS Lex project for task oriented dialogue systems for enterprises. A couple of years later, she moved to Facebook AI as a research scientist. In the fall of 2023, she became the director of CMU's Language Technologies Institute -- the first full time director since the passing of its founder Jaime Carbonell. == Research == Diab's research interests include several areas in computational linguistics/natural language processing, like conversational AI, computational lexical semantics, multilingual and cross lingual processing, social media processing with an emphasis on computational socio- pragmatics, information extraction & text analytics, machine translation. Besides this, she also has special interests in Arabic NLP and low resource scenarios. Diab co-established two research trends in the computational linguistics field, computational approaches to linguistic code switching in 2007 and semantic textual similarity in 2010. Diab together with Nizar Habash and Owen Rambow, co-founded CADIM in 2005, a global reference point in Arabic dialect processing. In 2012, Diab together with Eneko Agirre and Johan Bos, brought together two ACL communities SIGLEX and SIGSEM and established the 1st tier conference SEM. == Awards and recognition == Selected as one of top 150 leaders and visionaries in AI nationwide to participate in White House AI Summit in Government, Washington, D.C., US, September 2019 March 2017: 3 Muslim Women in STEM You Should Know About, Teen Vogue, March 2017 May 2017: Behind Every Strong Woman Is...Another Strong Woman: Ten women give thanks to the women who supported them on the way up. Elle, May 2017. Google Faculty Research Award – Tharwa++: Building a multidialectal Arabic Lexical Repository, (PI), 09.2015 –12.2016. Google Faculty Research Award – Nuanced Sentiment and Perspective Analysis for Arabic Social Media Text, (PI), 12.2014 –12.2015 QNRF Best Poster Award – Ossama Obeid, Houda Bouamor, Wajdi Zaghouani, Mahmoud Ghoneim, Abdelati Hawwari, Mona Diab, Kemal Oflazer. (2016) MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization. Proceedings of the 2nd Workshop on Arabic Corpora and Processing Tools, LREC 2016. Best Paper Award – Aminian, Maryam, Mahmoud Ghoneim, Mona Diab. (2015) Unsupervised False Friend Disambiguation Using Contextual Word Clusters and Parallel Word Alignments. In Proceedings of Workshop 9th Semantics Syntax Statistical Translation, NAACL 2015, Denver CO, US. == Publications == Diab has over 250 publications, and she is an acting editor for several scientific journals. === Selected publications === Semeval-2012 task 6: A pilot on semantic textual similarity. E. Agirre, D. Cer, M. Diab, A. Gonzalez-Agirre. SEM 2012: The First Joint Conference on Lexical and Computational Semantics–Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012) Predictive linguistic features of schizophrenia. ES Kayi, M Diab, L Pauselli, M Compton, G Coppersmith. arXiv preprint arXiv:1810.09377 Ideological perspective detection using semantic features. H Elfardy, M Diab, C Callison-Burch – Proceedings of SEM 2015 DeSePtion: Dual sequence prediction and adversarial examples for improved fact-checking. Christopher Hidey, Tuhin Chakrabarty, Tariq Alhindi, Siddharth Varia, Kriste Krstovski, Mona Diab, Smaranda Muresan, 2020 Does Causal Coherence Predict Online Spread of Social Media? Pedram Hosseini, Mona Diab, David A Broniatowski. Proceedings of International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, 2019. Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics for Text Collections. YA Lai, X Zhu, Y Zhang, M Diab, arXiv preprint arXiv:2003.08529, 2020 Readability of written medicine information materials in Arabic language: expert and consumer evaluation. S Al Aqeel, N Abanmy, A Aldayel, H Al-Khalifa, M Al-Yahya, M Diab. BMC health services research 18 (1), 1–7, 2019 Unsupervised word mapping using structural similarities in monolingual embeddings. H Aldarmaki, M Mohan, M Diab – Transactions of the Association for Computational Linguistics, 2018 An unsupervised method for word sense tagging using parallel corpora M Diab, P Resnik. Proceedings of ACL 2002 Overview for the first shared task on language identification in code-switched data. Thamar Solorio, Elizabeth Blair, Suraj Maharjan, Steven Bethard, Mona Diab, Mahmoud Ghoneim, Abdelati Hawwari, Fahad AlGhamdi, Julia Hirschberg, Alison Chang, Pascale Fung. Proceedings of the First Workshop on Computational Approaches to Code Switching, 2014 Modeling sentences in the latent space. W Guo, M Diab – ACL 20 12 Task-based evaluation of multiword expressions: a pilot study in statistical machine translation. M Carpuat, M Diab – NAACL-HLT 2010 Rumor detection and classification for twitter data. S Hamidian, MT Diab – arXiv preprint arXiv:1912.08926, 2019 Subgroup detection in ideological discussions. A Abu-Jbara, P Dasigi, M Diab, D Radev – ACL 2012 Madamira: A fast, comprehensive tool for morphological analysis and disambiguation of arabic. A. Pasha, M. Al-Badrashiny, M. Diab, A. El Kholy, R. Eskander, N. Habash, M. Pooleery, O. Rambow, R. Roth. LREC 14, 1094–1101. 2014 Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots. A. Gupta, P. Zhang, G. Lalwani, M. Diab. EMNLP 2019 A multitask learning approach for diacritic restoration. S. Alqahtani, A. Mishra, M. Diab. ACL 2020

    Read more →
  • Myhill–Nerode theorem

    Myhill–Nerode theorem

    In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular. The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1957 (Nerode & Sauer 1957, p. ii). == Statement == Given a language L {\displaystyle L} , and a pair of strings x {\displaystyle x} and y {\displaystyle y} , define a distinguishing extension to be a string z {\displaystyle z} such that exactly one of the two strings x z {\displaystyle xz} and y z {\displaystyle yz} belongs to L {\displaystyle L} . Define a relation ∼ L {\displaystyle \sim _{L}} on strings as x ∼ L y {\displaystyle x\;\sim _{L}\ y} if there is no distinguishing extension for x {\displaystyle x} and y {\displaystyle y} . It is easy to show that ∼ L {\displaystyle \sim _{L}} is an equivalence relation on strings, and thus it divides the set of all strings into equivalence classes. The Myhill–Nerode theorem states that a language L {\displaystyle L} is regular if and only if ∼ L {\displaystyle \sim _{L}} has a finite number of equivalence classes, and moreover, that this number is equal to the number of states in the minimal deterministic finite automaton (DFA) accepting L {\displaystyle L} . Furthermore, every minimal DFA for the language is isomorphic to the canonical one (Hopcroft & Ullman 1979). Generally, for any language, the constructed automaton is a state automaton acceptor. However, it does not necessarily have finitely many states. The Myhill–Nerode theorem shows that finiteness is necessary and sufficient for language regularity. Some authors refer to the ∼ L {\displaystyle \sim _{L}} relation as Nerode congruence, in honor of Anil Nerode. == Use and consequences == The Myhill–Nerode theorem may be used to show that a language L {\displaystyle L} is regular by proving that the number of equivalence classes of ∼ L {\displaystyle \sim _{L}} is finite. This may be done by an exhaustive case analysis in which, beginning from the empty string, distinguishing extensions are used to find additional equivalence classes until no more can be found. For example, the language consisting of binary representations of numbers that can be divided by 3 is regular. Given two binary strings x , y {\displaystyle x,y} , extending them by one digit gives 2 x + b , 2 y + b {\displaystyle 2x+b,2y+b} , so 2 x + b ≡ 2 y + b mod 3 {\displaystyle 2x+b\equiv 2y+b\mod 3} iff x ≡ y mod 3 {\displaystyle x\equiv y\mod 3} . Thus, 00 {\displaystyle 00} (or 11 {\displaystyle 11} ), 01 {\displaystyle 01} , and 10 {\displaystyle 10} are the only distinguishing extensions, resulting in the 3 classes. The minimal automaton accepting our language would have three states corresponding to these three equivalence classes. Another immediate corollary of the theorem is that if for a language L {\displaystyle L} the relation ∼ L {\displaystyle \sim _{L}} has infinitely many equivalence classes, it is not regular. It is this corollary that is frequently used to prove that a language is not regular. == Generalizations == The Myhill–Nerode theorem can be generalized to tree automata.

    Read more →