AI Analytics Hub

AI Analytics Hub — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Artificial intelligence

    Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in engineering, mathematics and computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. High-profile applications of AI include advanced web search engines, chatbots, virtual assistants, autonomous vehicles, and play and analysis in strategy games (e.g., chess and Go). Since the 2020s, generative AI has become widely available to generate images, audio, and videos from text prompts. The traditional goals of AI research include learning, reasoning, knowledge representation, planning, natural language processing, and perception, as well as support for robotics. To reach these goals, AI researchers have used techniques including state space search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, operations research, and economics. AI also draws upon psychology, linguistics, philosophy, neuroscience, and other fields. Some companies, such as OpenAI, Google DeepMind and Meta, aim to create artificial general intelligence (AGI) – AI that can complete virtually any cognitive task at least as well as a human. Artificial intelligence was founded as an academic discipline in 1956, and the field went through multiple cycles of optimism throughout its history, followed by periods of disappointment and loss of funding, known as AI winters. Funding and interest increased substantially after 2012, when graphics processing units began being used to accelerate neural networks, and deep learning outperformed previous AI techniques. This growth accelerated further after 2017 with the transformer architecture. 
In the 2020s, an AI boom has coincided with advances in generative AI, which allowed for the creation and modification of media. In addition to AI safety and unintended consequences and harms from the use of AI, ethical concerns, AI's long-term effects, and potential existential risks have prompted discussions of AI regulation. == Goals == The general problem of simulating (or creating) intelligence has been broken into subproblems. These consist of particular traits or capabilities that researchers expect an intelligent system to display. The traits described below have received the most attention and cover the scope of AI research. === Reasoning and problem-solving === Early researchers developed algorithms that imitated step-by-step reasoning that humans use when they solve puzzles or make logical deductions. By the late 1980s and 1990s, methods were developed for dealing with uncertain or incomplete information, employing concepts from probability and economics. Many of these algorithms are insufficient for solving large reasoning problems because they experience a "combinatorial explosion": They become exponentially slower as the problems grow. Even humans rarely use the step-by-step deduction that early AI research could model. They solve most of their problems using fast, intuitive judgments. Accurate and efficient reasoning is an unsolved problem. === Knowledge representation === Knowledge representation and knowledge engineering allow AI programs to answer questions intelligently and make deductions about real-world facts. Formal knowledge representations are used in content-based indexing and retrieval, scene interpretation, clinical decision support, knowledge discovery (mining "interesting" and actionable inferences from large databases), and other areas. A knowledge base is a body of knowledge represented in a form that can be used by a program. 
An ontology is the set of objects, relations, concepts, and properties used by a particular domain of knowledge. Knowledge bases need to represent things such as objects, properties, categories, and relations between objects; situations, events, states, and time; causes and effects; knowledge about knowledge (what we know about what other people know); default reasoning (things that humans assume are true until they are told differently and will remain true even when other facts are changing); and many other aspects and domains of knowledge. Among the most difficult problems in knowledge representation are the breadth of commonsense knowledge (the set of atomic facts that the average person knows is enormous); and the sub-symbolic form of most commonsense knowledge (much of what people know is not represented as "facts" or "statements" that they could express verbally). There is also the difficulty of knowledge acquisition, the problem of obtaining knowledge for AI applications. === Planning and decision-making === An "agent" is any entity (artificial or not) that perceives and takes actions in the world. A rational agent has goals or preferences and takes actions to make them happen. In automated planning, the agent has a specific goal. In automated decision-making, the agent has preferences—there are some situations it would prefer to be in, and some situations it is trying to avoid. The decision-making agent assigns a number to each situation (called the "utility") that measures how much the agent prefers it. For each possible action, it can calculate the "expected utility": the utility of all possible outcomes of the action, weighted by the probability that the outcome will occur. It can then choose the action with the maximum expected utility. In classical planning, the agent knows exactly what the effect of any action will be. 
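The expected-utility rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method taken from any particular system: the action names, probabilities, and utilities are made-up values, and `expected_utility` and `best_action` are hypothetical helper names.

```python
def expected_utility(outcomes):
    """Sum of utility weighted by probability over one action's outcomes.

    `outcomes` is a list of (probability, utility) pairs; the
    probabilities for a single action are assumed to sum to 1.
    """
    return sum(p * u for p, u in outcomes)


def best_action(actions):
    """Return the action name with the maximum expected utility.

    `actions` maps an action name to its list of (probability, utility) pairs.
    """
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical decision: rain has probability 0.25 (illustrative numbers only).
actions = {
    "take_umbrella": [(0.25, 8.0), (0.75, 6.0)],
    "leave_umbrella": [(0.25, 0.0), (0.75, 10.0)],
}

for name, outcomes in actions.items():
    print(name, expected_utility(outcomes))
print("chosen:", best_action(actions))
```

With these illustrative numbers the agent leaves the umbrella at home, because the higher payoff in the likely dry outcome outweighs the loss in the unlikely wet one.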
In most real-world problems, however, the agent may not be certain about the situation they are in (it is "unknown" or "unobservable") and it may not know for certain what will happen after each possible action (it is not "deterministic"). It must choose an action by making a probabilistic guess and then reassess the situation to see if the action worked. Alongside thorough testing and improvement based on previous decisions, having an explanation for why the agent took certain decisions is a way to build trust, especially when the decisions have to be relied upon. In some problems, the agent's preferences may be uncertain, especially if there are other agents or humans involved. These can be learned (e.g., with inverse reinforcement learning), or the agent can seek information to improve its preferences. Information value theory can be used to weigh the value of exploratory or experimental actions. The space of possible future actions and situations is typically intractably large, so the agents must take actions and evaluate situations while being uncertain of what the outcome will be. A Markov decision process has a transition model that describes the probability that a particular action will change the state in a particular way and a reward function that supplies the utility of each state and the cost of each action. A policy associates a decision with each possible state. The policy could be calculated (e.g., by iteration), be heuristic, or it can be learned. Game theory describes the rational behavior of multiple interacting agents and is used in AI programs that make decisions that involve other agents. === Learning === Machine learning is the study of programs that can improve their performance on a given task automatically. It has been a part of AI from the beginning. There are several kinds of machine learning. Unsupervised learning analyzes a stream of data and finds patterns and makes predictions without any other guidance. 
Supervised learning requires labeling the training data with the expected answers, and comes in two main varieties: classification (where the program must learn to predict what category the input belongs in) and regression (where the program must deduce a numeric function based on numeric input). In reinforcement learning, the agent is rewarded for good responses and punished for bad ones. The agent learns to choose responses that are classified as "good". Transfer learning is when the knowledge gained from one problem is applied to a new problem. Deep learning is a type of machine learning that runs inputs through biologically inspired artificial neural networks for all of these types of learning. Computational learning theory can assess learners by computational complexity, by sample complexity (how much data is required), or by other notions of optimization. === Natural language processing === Natural language processing (NLP) allows programs to read, write and communicate in human languages. Specific problems include speech recognition, speech synthesis, machine translation, information extraction, information retrieval and question answering. Early work, based on Noam Chomsky's generative grammar and semantic networks, had difficulty with word-sense disambiguation unless

  • How to Choose an AI Image Generator

    Shopping for the best AI image generator? An AI image generator is software that uses machine learning to create images from text prompts, and its output improves as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI image generator should fit into your existing workflow. We tested the leading options and ranked them by quality, value, and ease of use.

  • Law and Corpus Linguistics

    Law and corpus linguistics (LCL) is an academic sub-discipline that uses corpora, large databases of examples of language usage equipped with tools designed by linguists, to better determine the meaning of words and phrases in legal texts (statutes, constitutions, contracts, etc.). Thus, LCL is the application of corpus linguistic tools, theories, and methodologies to issues of legal interpretation in much the same way law and economics is the application of economic tools, theories, and methodologies to various legal issues. == History == A 2005 law review article by Lawrence Solan noted in passing that corpus linguistics had potential for application to interpreting legal texts. But the first systematic exploration and advocacy of applying the tools and methodologies of corpus linguistics to questions of legal interpretation came in the fall of 2010, when the BYU Law Review published a note by Stephen Mouritsen, entitled The Dictionary is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning. The note argued that dictionaries are the primary linguistic tool used by judges to determine the plain or ordinary meaning of words and phrases, and highlighted the deficiencies of such an approach. In its stead, the note proposed using corpus linguistics. The note was later cited by Adam Liptak in a New York Times article on statutory construction. LCL gained greater legitimacy in July 2011 with the first judicial opinion in American history utilizing corpus linguistics to determine the meaning of a legal text: In re the Adoption of Baby E.Z. In a concurrence in part and in the judgment, Justice Thomas Lee wrote to put forth an alternative ground for the majority's holding, interpreting the phrase "custody determination" by using corpus linguistics. 
Justice Lee looked at 500 randomized sample sentences from the Corpus of Contemporary American English (COCA) and found that the most common sense of "custody" was in the context of divorce rather than adoption. Further, he found that "custody" is ten times more likely to co-occur (or collocate) with "divorce" than with "adoption". From that evidence Justice Lee concluded that he "would find that the custody proceedings covered by the Act are limited to proceedings resulting in the modifiable custody orders of a divorce", rather than the broader range of custody proceedings. Other jurisprudence and scholarship would follow. In a 2015 concurrence in State v. Rasabout, Justice Lee used a COCA search to determine that "discharge" when used with a firearm (or one of its synonyms) overwhelmingly referred to a single shot rather than emptying the entire magazine of the weapon. And in 2016, four of the five justices joined a footnote in a majority opinion by Justice Lee commending a party for using corpus linguistics in its briefing even though the Court found it unnecessary to resolve the related question. Finally, in 2016 the Michigan Supreme Court became the first court to use a linguist-designed corpus in a majority opinion (COCA), with both the majority and the dissent turning to COCA to determine the meaning of the word "information". In 2020, courts desiring to bolster the legal theory of original intent have sought the opportunity to undertake analyses of statutes utilizing corpus linguistics. In a Ninth Circuit Court of Appeals case, Jones v. Becerra (No. 20-56174), a case involving the Second Amendment and the constitutionality of a California statute which bans the sale of firearms to individuals under the age of 21, a Ninth Circuit panel requested that the parties address three questions: 1) “What is the original public meaning of the Second Amendment phrases: ‘A well regulated Militia’; ‘the right of the people’; and ‘shall not be infringed’? 
2) How does the tool of corpus linguistics help inform the determination of the original public meaning of those Second Amendment phrases?” 3) How do the data yielded from corpus linguistics assist in the interpretation of the constitutionality of age-based restrictions under the Second Amendment? As to scholarship, in 2012, Mouritsen followed up his original work with an article in the Columbia Science and Technology Law Review, where he further refined and promoted the use of corpus-based methods for determining questions of legal ambiguity. Additionally, in 2016 two essays and an article on law and corpus linguistics were published. The Yale Law Journal Forum published Corpus Linguistics & Original Public Meaning: A New Tool to Make Originalism More Empirical. Written by Justice Lee and two co-authors, the essay urged originalists to turn to corpus linguistics to improve the rigor and accuracy of originalist scholarship. And in response, the Forum published an essay by Lawrence Solan (a Brooklyn Law professor with a PhD in linguistics), Can Corpus Linguistics Help Make Originalism Scientific? The Boston University Public Interest Law Journal published The Merciful Corpus: The Rule of Lenity, Ambiguity and Corpus Linguistics by Daniel Ortner. In the article Ortner applied corpus linguistics to determining whether sufficient ambiguity exists to trigger the rule of lenity in five Supreme Court cases. Looking forward, in 2017 two more articles are slated for publication. Lee Strang focuses on corpus linguistics and originalism in the U.C. Davis Law Review, and Lawrence Solan and Tammy Gales explore corpus linguistics in the context of finding ordinary meaning in statutory interpretation in the International Journal of Legal Discourse. Lawyers and journalists have also taken notice of corpus linguistics as it relates to the law. 
In 2010, Neal Goldfarb filed the first known brief in the Supreme Court using corpus linguistics (COCA) to determine whether the ordinary meaning of "personal" referred to corporations in the case FCC v. AT&T. The amicus brief looked at the top collocates (words that co-occur) of "personal" in COHA as well as BYU's Time Magazine Corpus. And writing for The Atlantic, Ben Zimmer took note of this new trend, referring to corpus linguistics in the courts as "Like Lexis on Steroids". On the academic front, in 2013 BYU Law School started the first class on law and corpus linguistics, co-taught by Mouritsen, Lee, and (now Dean) Gordon Smith. The class is currently in its fourth year. And in February 2016, BYU Law School hosted the inaugural conference on LCL, with over two dozen legal and linguistic scholars from around the country discussing and debating the next steps forward for the growing academic movement. The conference has been held regularly in subsequent years. At the 2016 conference BYU Law School announced its plans and progress on the Corpus of Founding Era American English (COFEA), a corpus that covers 1760–1799 and contains more than 120 million words collected from founding era letters, diaries, newspapers, non-fiction books, fiction, sermons, speeches, debates, legal cases, and other legal materials.
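The collocation comparisons described above (for example, how often "custody" co-occurs with "divorce" versus "adoption") can be illustrated with a toy sketch. The three sentences below are invented for illustration and are not drawn from COCA or any real corpus; the function name `collocate_counts` and the four-word window are assumptions. Real corpus tools work on far larger samples and typically report statistical association measures rather than raw counts.

```python
import re
from collections import Counter


def collocate_counts(sentences, node, window=4):
    """Count words appearing within `window` tokens of each occurrence of `node`."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase and keep alphabetic tokens (apostrophes allowed).
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i, token in enumerate(tokens):
            if token != node:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Invented sentences, for illustration only.
toy_corpus = [
    "She won custody after the divorce.",
    "The divorce settlement left custody undecided.",
    "The adoption agency disputed custody.",
]

counts = collocate_counts(toy_corpus, "custody")
print("divorce:", counts["divorce"], "adoption:", counts["adoption"])
```

A raw count like this only shows which words appear nearby; deciding which sense of a word is "ordinary" still requires reading the sampled sentences in context, as the opinions described above did.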

  • Léon Bottou

    Léon-Yves Bottou (French pronunciation: [leɔ̃ bɔtu]; born 1965) is a researcher best known for his work in machine learning and data compression. His work presents stochastic gradient descent as a fundamental learning algorithm. He is also one of the main creators of the DjVu image compression technology (together with Yann LeCun and Patrick Haffner), and the maintainer of DjVuLibre, the open source implementation of DjVu. He is the original developer of the Lush programming language. == Life == Léon Bottou was born in France in 1965. He obtained the Diplôme d'Ingénieur from École Polytechnique in 1987, a Magistère de Mathématiques Fondamentales et Appliquées et d’Informatique from École Normale Supérieure in 1988, a Diplôme d'Études Approfondies in Computer Science in 1988, and a PhD from Université Paris-Sud in 1991. In 1988, in collaboration with Yann LeCun, he published SN, a software package for simulating artificial neural networks. His master's thesis concerned using Time Delay Neural Networks for speech recognition. He then joined the Adaptive Systems Research Department at AT&T Bell Laboratories in Holmdel, New Jersey, where he collaborated with Vladimir Vapnik on local learning algorithms. In 1992, he returned to France and founded Neuristique S.A., a company that produced machine learning tools and one of the first data mining software packages, including Lush, an object-oriented programming language based on C and Lisp designed for training and using large-scale neural networks. In 1995, he returned to Bell Laboratories, where he developed a number of new machine learning methods, such as Graph Transformer Networks (similar to conditional random fields), and applied them to handwriting recognition and OCR. The bank check recognition system that he helped develop was widely deployed by NCR and other companies, reading over 10% of all the checks in the US in the late 1990s and early 2000s. 
In 1996, he joined AT&T Labs and worked primarily on the DjVu image compression technology, which is used by some websites, notably the Internet Archive, to distribute scanned documents. Between 2002 and 2010, he was a research scientist at NEC Laboratories in Princeton, New Jersey, where he focused on the theory and practice of machine learning with large-scale datasets, on-line learning, and stochastic optimization methods. He developed the open source software LaSVM for fast large-scale support vector machines, and stochastic gradient descent software for training linear SVMs and Conditional Random Fields. In 2010 he joined the Microsoft adCenter in Redmond, Washington, and in 2012 became a Principal Researcher at Microsoft Research in New York City. In March 2015 he joined Facebook Artificial Intelligence Research, also in New York City, as a research lead. His work on gradient descent argued that both stochastic gradient descent and batch gradient descent reach similar levels of loss with the same number of training samples, but SGD is faster when running on large datasets. He also argued that second-order gradient descent methods, such as quasi-Newton methods, can be beneficial compared to plain SGD. See (Bottou et al. 2018) for a review. He was program chair of the 2013 Conference on Neural Information Processing Systems and the 2009 International Conference on Machine Learning. In 2007, he received one of the first Blavatnik Awards for Young Scientists from the Blavatnik Family Foundation and the New York Academy of Sciences.
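The difference between the batch and stochastic gradient descent updates mentioned above can be sketched on a toy problem. This is an illustrative sketch only, not code from Bottou's software or papers: it fits a single weight `w` in the model `y = w * x` on five noise-free points, and the learning rate, epoch count, seed, and function names are all assumptions. It shows how the two update rules differ, not the speed comparison on large datasets.

```python
import random


def batch_gd(xs, ys, lr=0.1, epochs=200):
    """Batch gradient descent: one update per pass, using the mean gradient."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean squared error (w*x - y)^2 with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


def sgd(xs, ys, lr=0.1, epochs=200, seed=0):
    """Stochastic gradient descent: one update per training sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    data = list(zip(xs, ys))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            # Gradient of the squared error on this single sample.
            w -= lr * 2 * (w * x - y) * x
    return w


# Toy data generated from y = 3x (illustrative values).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [3.0 * x for x in xs]

print("batch:", batch_gd(xs, ys))
print("sgd:  ", sgd(xs, ys))
```

Both routines see the same number of samples per epoch; the stochastic version simply applies an update after every sample instead of once per pass.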

  • Nagarik App

    Nagarik App (translation: Citizen App) is a mobile application launched by the Government of Nepal to provide government-related services in a single online platform. The app was developed to make the digital delivery of government services to Nepali citizens easier, more systematic, and simpler. It offers government services through a single unified platform, minimizing the need for citizens to navigate multiple channels or physical offices. Services are added gradually as needs arise. The government aims to reduce physical queues and the need to be present in person to get services from different government offices. Services are available online round the clock, even during holidays. As of now, 25 services are included in the app, ranging from Police Clearance Report to Voters Card. The app was launched on the occasion of the fourth National Information and Communication Technology Day, 2021 (2078 BS). The event marked a significant milestone in Nepal’s digital transformation journey. The app aims to reduce the bureaucratic hurdles that citizens have been facing and make government services more efficient and convenient. On October 20, 2024, an E-Chalan system was introduced for managing traffic violations, initially piloted in the Kathmandu Valley. The Kathmandu Valley Traffic Police Office announced that physical licenses would no longer be confiscated for traffic rule violations. Instead, a "Digital Chit (E-Chalan)" system was implemented, allowing drivers to pay fines electronically. Integrated with the Nagarik App, the system enables police to access drivers' licenses, record violations, and update details directly in the app. 
    == Features and Services ==
      • Inland Revenue Department (Nepal) PAN Registration
      • Election Commission (Nepal) Voter Card Pre-Registration and Details
      • Nepal Police Online Clearance Report
      • Traffic Violations and Fine Payment
      • Nepal Passport, Driving License, National Identity Card (NID), Citizenship, and Voter ID link details
      • My Municipality (includes contact info of the representatives, services such as ambulance, nearby police, and budget programs and plans)
      • The Government Press ID card
      • PF/PAN/SST/CIT statements can be viewed
      • Nagarik Pahichan Dwar (online bank accounts can be opened and KYC can be verified for selected banks using the QR)
    == Awards and honors == Each year, the World Summit Award honors outstanding digital applications and solutions across various categories. Nagarik App was selected among 180 participants and won the World Summit Award of 2022 in the Government and Citizen Engagement category.
    == Latest Statistics & Usage Trends (2082 BS / 2025 AD) == As of August 2025, over 1.5 million Nepali citizens have registered and actively use the Nagarik App, according to the National Information Technology Center (NITC). The majority of daily logins come from:
      • Kathmandu Valley – 37% of total users
      • Province 1 (Koshi) – 19% of total users
      • Bagmati Province – 15% of total users
    On average, 45,000+ transactions (service requests, document verifications, and payments) are processed through the app each day. The most-used services include:
      • PAN Card Registration – 28% of total requests
      • Police Clearance Report – 22%
      • Driving License Linking & E-Chalan Payment – 18%
      • Vehicle Tax Payment – 14%
    Source: Internal report from NITC, July 2025
    == Step-by-Step: How to Link Your Driving License with Nagarik App ==
      1. Update the App – Install the latest version from Play Store or App Store.
      2. Login or Register – Ensure your SIM is registered in your own name.
      3. Go to “Transport Services” in the menu.
      4. Select “Driving License” – Enter your license number and date of birth.
      5. Verify via OTP – Sent to your registered mobile number.
      6. Confirmation – Your digital license will appear inside the app.
    This guide is continuously updated to reflect the latest rules from the Kathmandu Valley Traffic Police Office and changes in NITC’s backend system.

  • Georgetown–IBM experiment

    The Georgetown–IBM experiment was an influential demonstration of machine translation, which was performed on January 7, 1954. Developed jointly by Georgetown University and IBM, the experiment involved completely automatic translation of more than sixty Russian sentences into English. == Background == Conceived and performed primarily in order to attract governmental and public interest and funding by showing the possibilities of machine translation, it was by no means a fully featured system: It had only six grammar rules and 250 lexical items in its vocabulary (of stems and endings). Words in the vocabulary were in the fields of politics, law, mathematics, chemistry, metallurgy, communications and military affairs. Vocabulary was punched onto punch cards. This complete dictionary was never fully shown (only the extended one from Garvin's article). Apart from general topics, the system was specialized in the domain of organic chemistry. The translation was carried out using an IBM 701 mainframe computer (launched in April 1953). The Georgetown–IBM experiment is the best-known result of the MIT conference in June 1952 to which all active researchers in the machine translation field were invited. At the conference, Duncan Harkin from the US Department of Defense suggested that his department would finance a new machine translation project. Jerome Wiesner supported the idea and offered finance from the Research Laboratory of Electronics at MIT. Leon Dostert had been invited to the project for his previous experience with the automatic correction of translations (back then 'mechanical translation'); his interpretation system had a strong impact on the Nuremberg War Crimes Tribunal. The linguistics part of the demonstration was carried out for the most part by linguist Paul Garvin, who also had a good knowledge of Russian. 
Over 60 Romanized Russian statements from a wide range of political, legal, mathematical, and scientific topics were entered into the machine by a computer operator who knew no Russian, and the resulting English translations appeared on a printer. The sentences to be translated were carefully selected. Many operations for the demonstration were fitted to specific words and sentences. In addition, there was no relational or sentence analysis which could recognize the sentence structure. The approach was mostly 'lexicographical', based on a dictionary where a specific word had a connection with specific rules and steps. == Algorithm == The algorithm first translates Russian words into numerical codes, then performs the following case-analysis on each numerical code to choose between possible English word translations, reorder the English words, or omit some English words. The flowchart of the algorithm is reproduced in (see Table 1 for the 6 rules). == Translation examples == How it analyzes Vyelyichyina ugla opryedyelyayetsya otnoshyenyiyem dlyini dugi k radyiusu (figure 2 of ). == Reception == Well publicized by journalists and perceived as a success, the experiment did encourage governments to invest in computational linguistics. The authors claimed that within three or five years, machine translation could well be a solved problem. However, the real progress was much slower, and after the ALPAC report in 1966, which found that ten years of research had failed to fulfill the expectations, funding was reduced dramatically. The demonstration was given widespread coverage in the foreign press, but only a small fraction of journalists drew attention to previous machine translation attempts.

  • Anil K. Jain (computer scientist, born 1948)

    Anil Kumar Jain (born 1948) is an Indian-American computer scientist and University Distinguished Professor in the Department of Computer Science and Engineering at Michigan State University. He is one of the most highly cited researchers in computer science, and is internationally recognized for his foundational contributions to pattern recognition, computer vision, and biometric recognition, particularly in fingerprint recognition and face recognition. Jain is a member of the United States National Academy of Engineering, a Foreign Member of the Chinese Academy of Sciences, and a Foreign Fellow of the Indian National Academy of Engineering. He is a Fellow of the ACM, IEEE, AAAS, IAPR, and SPIE. His research has shaped the field of biometrics and has been applied in systems used worldwide for identity verification, law enforcement, and border security. In 2024, he was awarded the BBVA Foundation Frontiers of Knowledge Award in the category of Information and Communication Technologies. == Early life and education == Born in Basti, India, Jain received his Bachelor of Technology in electrical engineering from the Indian Institute of Technology, Kanpur in 1969. He then moved to the United States, where he earned his M.S. in 1970 and Ph.D. in 1973 from Ohio State University. His doctoral dissertation, titled Some Aspects of Dimensionality and Sample Size Problems in Statistical Pattern Recognition, was supervised by Robert B. McGhee and laid the groundwork for his subsequent research in pattern recognition. == Career == Jain began his academic career at Wayne State University, where he taught from 1972 to 1974. In 1974, he joined the faculty of Michigan State University, where he has remained for over five decades and currently holds the position of University Distinguished Professor. Throughout his career, Jain has conducted pioneering research in data clustering, fingerprint recognition, and face recognition. 
His work has been published in leading scientific journals including Scientific American, Nature, IEEE Spectrum, and MIT Technology Review. He served as Editor-in-Chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence from 1991 to 1994. Jain has also contributed to national security and policy through his service on several advisory bodies. He served as a member of the U.S. National Academies panels on Information Technology, Whither Biometrics, and Improvised Explosive Devices (IED). He has also served on the Defense Science Board, the Forensic Science Standards Board, and the AAAS Latent Fingerprint Working Group. In 2014, Jain was named Innovator of the Year at Michigan State University for transferring several technologies on face and fingerprint recognition to major players in the biometrics industry. He holds eight U.S. and Korean patents related to biometric technologies. == Research contributions == Jain's research spans pattern recognition, computer vision, machine learning, and biometric recognition. His contributions have been particularly influential in several areas: === Biometric recognition === Jain is considered one of the foremost authorities on biometric recognition systems. His research group at Michigan State University has developed algorithms and systems for fingerprint, face, and iris recognition that have been widely adopted in both academic research and commercial applications. His work on fingerprint matching algorithms has been instrumental in establishing standards for automated fingerprint identification systems (AFIS) used by law enforcement agencies worldwide. In recent years, Jain and his research team have made significant advances in child fingerprint recognition, demonstrating that digital scans of a young child's fingerprint can be correctly recognized one year later with over 99 percent accuracy for children as young as six months old. 
This research has important implications for child identification in developing countries, where it can be used to track immunization records and provide access to medical care. === Data clustering === Jain's survey article "Data clustering: a review" (1999), co-authored with M. N. Murty and P. J. Flynn, is one of the most highly cited papers in computer science. His 2010 paper "Data Clustering: 50 Years Beyond K-Means" provided a comprehensive overview of the evolution of clustering methods and remains an essential reference in the field. === Statistical pattern recognition === Jain's work on statistical pattern recognition, including his influential survey "Statistical pattern recognition: A review" (2000) with R. P. W. Duin and Jianchang Mao, has shaped the theoretical foundations of the field. == Citation metrics and academic impact == Jain is among the most highly cited researchers in computer science. Based on his Google Scholar profile, he had an h-index of 200 in 2020, which was the highest among computer scientists identified in a survey published by UCLA at the time. As of August 2023, his h-index on Google Scholar is 211. He has since been surpassed by Yoshua Bengio, a researcher of similar subjects (neural networks and deep learning for artificial intelligence), who had an h-index of 224 as of August 2023. Another source reported that as of December 2022, he had the highest discipline h-index (D-index) in computer science. 
== Honors and awards == Jain has received numerous awards and honors recognizing his contributions to computer science and engineering. === Academy memberships === Member, United States National Academy of Engineering (2016), elected "for contributions to the engineering and practice of biometrics"; Foreign Fellow, Indian National Academy of Engineering (2016); Foreign Member, Chinese Academy of Sciences (2019); Member, The World Academy of Sciences (2019); Fellow, National Academy of Inventors. === Professional society fellowships === Fellow, ACM; Fellow, IEEE (1988), for contributions to image processing; Fellow, AAAS; Fellow, International Association for Pattern Recognition; Fellow, SPIE. === Major awards === BBVA Foundation Frontiers of Knowledge Award in Information and Communication Technologies (2024); IAPR King-Sun Fu Prize (2008); IEEE W. Wallace McDowell Award (2007), the highest technical honor awarded by the IEEE Computer Society, for pioneering contributions to theory, technique, and practice of pattern recognition, computer vision, and biometric recognition systems; IEEE Computer Society Technical Achievement Award (2003); IAPR Pierre Devijver Award (2002); Humboldt Research Award (2002); Guggenheim Fellowship (2001); Fulbright Fellowship (1998); IEEE ICDM Research Contribution Award (2008). === Best paper awards === IEEE Transactions on Neural Networks (1996); Pattern Recognition journal (1987, 1991, 2005). === Honorary doctorates === Universidad Autónoma de Madrid (2018); Hong Kong University of Science and Technology (2021). == Legacy and endowments == Two endowed funds have been established in Jain's honor at Michigan State University, recognizing his lasting impact on the field and the university. In 2015, a former visiting scholar from Jain's laboratory made an anonymous $400,000 gift to create the Anil K. Jain Endowed Graduate Fellowship, which supports doctoral-level research in pattern recognition, computer vision, and biometric recognition. 
In 2022, the Anil K. and Nandita K. Jain Endowed Professorship was established through $1 million in contributions from multiple donors, including a substantial gift from the Jain family, to support faculty recruitment and retention in the Department of Computer Science and Engineering. == Selected publications == === Books === 1988. Algorithms For Clustering Data. With Richard C. Dubes. Prentice Hall. 1993. Markov Random Fields: Theory and Applications. With Rama Chellappa eds. Academic Press. 1999. Biometrics: Personal Identification in Networked Society. With Ruud M. Bolle and Sharath Pankanti eds. Springer. 2003. Handbook of Fingerprint Recognition. (2nd edition 2009). With D. Maio, D. Maltoni, S. Prabhakar. Springer. 2005. Handbook of Face Recognition. (2nd edition 2011). With S. Z. Li ed. Springer. 2006. Handbook of Multibiometrics. With A. Ross and K. Nandakumar. Springer. 2007. Handbook of Biometrics. With P. Flynn and A. Ross eds. Springer. 2011. Introduction to Biometrics. With A. Ross and K. Nandakumar. Springer. 2015. Encyclopedia of Biometrics (Second Edition). With Stan Li. Springer. === Research articles === Cross, George R. and Anil K. Jain. "Markov random field texture models". IEEE Transactions on Pattern Analysis and Machine Intelligence (1983): 25–39. Jain, Anil K., and Farshid Farrokhnia. "Unsupervised texture segmentation using Gabor filters". Pattern Recognition 24.12 (1991): 1167–1186. Jain, Anil K., and Douglas Zongker. "Feature selection: Evaluation, application, and small sample performance". IEEE Transactions on Pattern Analysis and Machine Intelligence, 19.2 (1997): 153–158. Jain, Anil K., L. Hong, S. Pankanti, R. Bolle. "An Identity-A

  • Best AI Headshot Generators in 2026

    In search of the best AI headshot generator? An AI headshot generator is software that uses machine learning to turn a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI headshot generator should slot into your workflow and pay for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

  • You.com

    You.com is an artificial intelligence search startup that has pivoted away from consumer search engine operations toward business-focused AI tools and APIs. The company was founded in 2020 by Richard Socher, the former chief scientist at Salesforce, and Bryan McCann, a former NLP researcher at Salesforce. == History == Following its 2020 founding, You.com opened its public beta on November 9, 2021, and received $20 million in funding led by Salesforce founder and CEO Marc Benioff. Other investors include Breyer Capital, Sound Ventures, and Day One Ventures. The domain You.com was initially purchased in 1996 by Benioff. Benioff invested in You.com and transferred ownership of the You.com domain name to the company. In July 2022, You.com announced its $25 million Series A funding round led by Radical Ventures with participation from Time Ventures, Breyer Capital, Norwest Venture Partners and Day One Ventures. In September 2024, You.com raised $50 million in Series B funding led by Georgian. In September 2025, You.com raised $100 million in Series C funding led by Cox Enterprises at a $1.5 billion valuation, achieving unicorn status. == Business model == You.com generates revenue primarily through enterprise sales of search APIs and AI tools. The platform provides web search capabilities that can be integrated into enterprise applications and AI agents. == Features == On December 23, 2022, You.com was the first search engine to launch an LLM chatbot with live web results alongside its responses. Initially known as YouChat, the chatbot was primarily based on the GPT-3.5 large language model and could answer questions, suggest ideas, translate text, summarize articles, compose emails, and write code snippets, while staying up-to-date with current events and citing sources. Several further versions of YouChat were released. 
The second version, YouChat 2.0, was released on February 7, 2023, and incorporated improved conversational AI and community-built applications by blending a large language model named C-A-L (Chat, Apps, and Links). This update enabled YouChat to provide results in various formats, such as charts, photos, videos, tables, graphs, text or code, so users could find answers without leaving the search results page. YouChat 3.0, unveiled on May 4, 2023, combined chat functionality with results from Reddit, TikTok, Stack Overflow and Wikipedia. === YouPro === On June 21, 2023, You.com introduced YouPro, a paid subscription. Both free and paid versions provide access to large language models connected to the internet with citation capabilities. === ARI === In February 2025, You.com launched ARI (Advanced Research and Insights), a deep research agent that scans over 400 sources simultaneously to produce research reports with verified citations and interactive graphs, charts, and visualizations. The platform targets regulated industries where comprehensive source verification is critical, with customers including healthcare publishers and advisory firms. == Reception == You.com was named one of TIME's Best Inventions of 2022. Its ARI feature was named one of TIME's Best Inventions of 2025.

  • Collocation

    In corpus linguistics, a collocation is a series of words or terms that co-occur more often than would be expected by chance. In phraseology, a collocation is a type of compositional phraseme, meaning that it can be understood from the words that make it up. This contrasts with an idiom, where the meaning of the whole cannot be inferred from its parts, and may be completely unrelated. There are about seven main types of collocations: adjective + noun, noun + noun (such as collective nouns), noun + verb, verb + noun, adverb + adjective, verbs + prepositional phrase (phrasal verbs), and verb + adverb. Collocation extraction is a computational technique that finds collocations in a document or corpus, using various computational linguistics elements resembling data mining. == Expanded definition == Collocations are partly or fully fixed expressions that become established through repeated context-dependent use. Such terms as crystal clear, middle management, nuclear family, and cosmetic surgery are examples of collocated pairs of words. Collocations can be in a syntactic relation (such as verb–object: make and decision), lexical relation (such as antonymy), or they can be in no linguistically defined relation. Knowledge of collocations is vital for the competent use of a language: a grammatically correct sentence will stand out as awkward if collocational preferences are violated. This makes collocation a common focus for language teaching. Corpus linguists specify a key word in context (KWIC) and identify the words immediately surrounding them, to illustrate the way words are used in practice. The processing of collocations involves a number of parameters, the most important of which is the measure of association, which evaluates whether the co-occurrence is purely by chance or statistically significant. Due to the non-random nature of language, most collocations are classed as significant, and the association scores are simply used to rank the results. 
Commonly used measures of association include mutual information, t scores, and log-likelihood. Rather than select a single definition, Gledhill proposes that collocation involves at least three different perspectives: co-occurrence, a statistical view, which sees collocation as the recurrent appearance in a text of a node and its collocates; construction, which sees collocation either as a correlation between a lexeme and a lexical-grammatical pattern, or as a relation between a base and its collocative partners; and expression, a pragmatic view of collocation as a conventional unit of expression, regardless of form. These different perspectives contrast with the usual way of presenting collocation in phraseological studies. Traditionally speaking, collocation is explained in terms of all three perspectives at once, in a continuum. == In dictionaries == In 1933, Harold Palmer's Second Interim Report on English Collocations highlighted the importance of collocation as a key to producing natural-sounding language, for anyone learning a foreign language. Thus from the 1940s onwards, information about recurrent word combinations became a standard feature of monolingual learner's dictionaries. As these dictionaries became "less word-centred and more phrase-centred", more attention was paid to collocation. This trend was supported, from the beginning of the 21st century, by the availability of large text corpora and intelligent corpus-querying software, making it possible to provide a more systematic account of collocation in dictionaries. Using these tools, dictionaries such as the Macmillan English Dictionary and the Longman Dictionary of Contemporary English included boxes or panels with lists of frequent collocations. There are also a number of specialized dictionaries devoted to describing the frequent collocations in a language. 
These include (for Spanish) Redes: Diccionario combinatorio del español contemporáneo (2004), (for French) Le Robert: Dictionnaire des combinaisons de mots (2007), and (for English) the LTP Dictionary of Selected Collocations (1997) and the Macmillan Collocations Dictionary (2010). == Statistically significant collocation == Student's t-test can be used to determine whether the occurrence of a collocation in a corpus is statistically significant. For a bigram $w_1 w_2$, let $P(w_1) = \frac{\#w_1}{N}$ be the unconditional probability of occurrence of $w_1$ in a corpus of size $N$, and let $P(w_2) = \frac{\#w_2}{N}$ be the unconditional probability of occurrence of $w_2$ in the corpus. The t-score for the bigram $w_1 w_2$ is calculated as $t = \frac{\bar{x} - \mu}{\sqrt{s^2 / N}}$, where $\bar{x} = \frac{\#w_1 w_2}{N}$ is the sample mean of the occurrence of $w_1 w_2$, $\#w_1 w_2$ is the number of occurrences of $w_1 w_2$, $\mu = P(w_1) P(w_2)$ is the probability of $w_1 w_2$ under the null hypothesis that $w_1$ and $w_2$ appear independently in the text, and $s^2 = \bar{x}(1 - \bar{x}) \approx \bar{x}$ is the sample variance. With a large $N$, the t-test is equivalent to a Z-test.
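The t-score described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `bigram_t_score` and the toy corpus are hypothetical, the corpus size N is taken to be the number of tokens, and returning 0.0 for a bigram that never occurs is a convention chosen for this sketch only.

```python
import math
from collections import Counter


def bigram_t_score(tokens, w1, w2):
    """t = (x_bar - mu) / sqrt(s2 / N) for the bigram (w1, w2)."""
    n = len(tokens)                              # corpus size N
    unigrams = Counter(tokens)                   # #w for every word
    bigrams = Counter(zip(tokens, tokens[1:]))   # #w1w2 for adjacent pairs

    x_bar = bigrams[(w1, w2)] / n                # observed bigram probability
    mu = (unigrams[w1] / n) * (unigrams[w2] / n) # expected under independence
    s2 = x_bar * (1 - x_bar)                     # sample variance

    if s2 == 0:
        # The score is undefined when the bigram never occurs; this sketch
        # returns 0.0 by convention.
        return 0.0
    return (x_bar - mu) / math.sqrt(s2 / n)


tokens = "strong tea strong tea weak tea strong coffee".split()
score = bigram_t_score(tokens, "strong", "tea")
```

On a realistic corpus the scores would typically be computed for every candidate bigram and used to rank them, since, as noted earlier, association measures mainly serve to order the results.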

  • AI Photo Editors: Free vs Paid (2026)

    Trying to pick the best AI photo editor? An AI photo editor is software that uses machine learning to automate editing tasks, and it scales from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI photo editor should slot into your workflow and pay for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

  • Automatic number-plate recognition

    Automatic number-plate recognition (ANPR; see also other names below) is a technology that uses optical character recognition on images to read vehicle registration plates to create vehicle location data. It can use existing closed-circuit television, road-rule enforcement cameras, or cameras specifically designed for the task. ANPR is used by police forces around the world for law enforcement purposes, including checking if a vehicle is registered or licensed. It is also used for electronic toll collection on pay-per-use roads and as a method of cataloguing the movements of traffic, for example by highways agencies. Automatic number-plate recognition can be used to store the images captured by the cameras as well as the text from the license plate, with some configurable to store a photograph of the driver. Systems commonly use infrared lighting to allow the camera to take the picture at any time of day or night. ANPR technology must take into account plate variations from place to place. Privacy issues have caused concerns about ANPR, such as government tracking citizens' movements, misidentification, high error rates, and increased government spending. Critics have described it as a form of mass surveillance. == Other names == ANPR is also known by various other terms: Automatic (or automated) license-plate recognition (ALPR) Automatic (or automated) license-plate reader (ALPR) Automatic vehicle identification (AVI) Danish: Automatisk nummerpladegenkendelse, lit. 'Automatic number plate recognition' (ANPG) Car-plate recognition (CPR) License-plate recognition (LPR) French: Lecture automatique de plaques d'immatriculation, lit. 'Automatic reading of registration plates' (LAPI) Mobile license-plate reader (MLPR) Vehicle license-plate recognition (VLPR) Vehicle recognition identification (VRI) == Development == ANPR was invented in 1976 at the Police Scientific Development Branch in Britain. 
Prototype systems were working by 1979, and contracts were awarded to produce industrial systems, first at EMI Electronics, and then at Computer Recognition Systems (CRS, now part of Jenoptik) in Wokingham, UK. Early trial systems were deployed on the A1 road and at the Dartford Tunnel. The first arrest through detection of a stolen car was made in 1981. However, ANPR did not become widely used until new developments in cheaper and easier to use software were pioneered during the 1990s. The collection of ANPR data for future use (i.e., in solving then-unidentified crimes) was documented in the early 2000s. The first documented case of ANPR being used to help solve a murder occurred in November 2005, in Bradford, UK, where ANPR played a vital role in locating and subsequently convicting the killers of Sharon Beshenivsky. == Components == The software aspect of the system runs on standard home computer hardware and can be linked to other applications or databases. It first uses a series of image manipulation techniques to detect, normalize and enhance the image of the number plate, and then optical character recognition (OCR) to extract the alphanumerics of the license plate. ANPR systems are generally deployed in one of two basic approaches: one allows for the entire process to be performed at the lane location in real-time, and the other transmits all the images from many lanes to a remote computer location and performs the OCR process there at some later point in time. When done at the lane site, the information captured of the plate alphanumeric, date-time, lane identification, and any other information required is completed in approximately 250 milliseconds. This information can easily be transmitted to a remote computer for further processing if necessary, or stored at the lane for later retrieval. 
In the other arrangement, there are typically large numbers of PCs used in a server farm to handle high workloads, such as those found in the London congestion charge project. Often in such systems, there is a requirement to forward images to the remote server, and this can require higher-bandwidth transmission media. === Technology === ANPR uses optical character recognition (OCR) on images taken by cameras. When Dutch vehicle registration plates switched to a different style in 2002, one of the changes made was to the font, introducing small gaps in some letters (such as P and R) to make them more distinct and therefore more legible to such systems. Some license plate arrangements use variations in font sizes and positioning; ANPR systems must be able to cope with such differences to be truly effective. More complicated systems can cope with international variants, though many programs are individually tailored to each country. The cameras used can be existing road-rule enforcement or closed-circuit television cameras, as well as mobile units, which are usually attached to vehicles. Some systems use infrared cameras to take a clearer image of the plates. ==== In mobile systems ==== During the 1990s, significant advances in technology took automatic number-plate recognition (ANPR) systems from limited, expensive, hard-to-set-up fixed applications to simple "point and shoot" mobile ones. This was made possible by the creation of software that ran on cheaper, PC-based, non-specialist hardware and no longer needed to be given the pre-defined angles, direction, size and speed at which the plates would be passing the camera's field of view. Further scaled-down components at lower price points led to a record number of deployments by law enforcement agencies globally. 
Smaller cameras with the ability to read license plates at higher speeds, along with smaller, more durable processors that fit in the trunks of police vehicles, allowed law enforcement officers to patrol daily with the benefit of license plate reading in real time, so that they can interdict immediately. Despite their effectiveness, there are noteworthy challenges related to mobile ANPR systems. One of the biggest is that the processor and the cameras must work fast enough to accommodate relative speeds of more than 160 km/h (100 mph), a likely scenario in the case of oncoming traffic. This equipment must also be very efficient since the power source is the vehicle electrical system, and equipment must have minimal space requirements. Relative speed is only one issue that affects the camera's ability to read a license plate. Algorithms must be able to compensate for all the variables that can affect the ANPR's ability to produce an accurate read, such as time of day, weather and angles between the cameras and the license plates. A system's illumination wavelengths can also have a direct impact on the resolution and accuracy of a read in these conditions. Installing ANPR cameras on law enforcement vehicles requires careful consideration of the positioning of the cameras relative to the license plates they are to read. Using the right number of cameras and positioning them accurately for optimal results can prove challenging, given the various missions and environments at hand. Highway patrol requires forward-looking cameras that span multiple lanes and are able to read license plates at high speeds. City patrol needs shorter range, lower focal length cameras for capturing plates on parked cars. Parking lots with perpendicularly parked cars often require a specialized camera with a very short focal length. The most technically advanced systems are flexible and can be configured with one to four cameras, which can easily be repositioned as needed. 
States with rear-only license plates have an additional challenge since a forward-looking camera is ineffective with oncoming traffic. In this case one camera may be turned backwards. === Algorithms === There are seven primary algorithms that the software requires for identifying a license plate: (1) plate localization, which finds and isolates the plate in the picture; (2) plate orientation and sizing, which compensates for the skew of the plate and adjusts the dimensions to the required size; (3) normalization, which adjusts the brightness and contrast of the image; (4) character segmentation, which finds the individual characters on the plate; (5) optical character recognition; (6) syntactical/geometrical analysis, which checks characters and positions against country-specific rules; and (7) averaging of the recognised value over multiple fields/images to produce a more reliable or confident result, especially given that any single image may contain a reflected light flare, be partially obscured, or possess other obfuscating effects. The complexity of each of these subsections of the program determines the accuracy of the system. During the third phase (normalization), some systems use edge detection techniques to increase the picture difference between the letters and the plate backing. A median filter may also be used to reduce the visual noise on the image. Contemporary ANPR systems use multiple data sources and analytical techniques that go beyond simple number

  • Texture compression

    Texture compression is a specialized form of image compression designed for storing texture maps in 3D computer graphics rendering systems. Unlike conventional image compression algorithms, texture compression algorithms are optimized for random access. Texture compression can be applied to reduce memory usage at runtime. Texture data is often the largest source of memory usage in a mobile application. == Tradeoffs == In their seminal paper on texture compression, Beers, Agrawala and Chaddha list four features that tend to differentiate texture compression from other image compression techniques. These features are: Decoding Speed It is highly desirable to be able to render directly from the compressed texture data and so, in order not to impact rendering performance, decompression must be fast. Random Access Since predicting the order that a renderer accesses texels would be difficult, any texture compression scheme must allow fast random access to decompressed texture data. This tends to rule out many better-known image compression schemes such as JPEG or run-length encoding. Compression Rate and Visual Quality In a rendering system, lossy compression can be more tolerable than for other use cases. Some texture compression libraries, such as crunch, allow the developer to flexibly trade off compression rate vs. visual quality, using methods such as rate–distortion optimization (RDO). Encoding Speed Texture compression is more tolerant of asymmetric encoding/decoding rates as the encoding process is often done only once during the application authoring process. Given the above, most texture compression algorithms involve some form of fixed-rate lossy vector quantization of small fixed-size blocks of pixels into small fixed-size blocks of coding bits, sometimes with additional extra pre-processing and post-processing steps. Block Truncation Coding is a very simple example of this family of algorithms. 
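As a rough illustration of the fixed-rate, random-access block scheme described above, here is a small Python sketch in the spirit of Block Truncation Coding. It is a simplified variant that stores the mean of each pixel group rather than the moment-preserving levels of classic BTC, and it handles only 8-bit grayscale data whose width and height are multiples of 4. All names (`encode_block`, `encode_image`, `decode_texel`) and the 4-byte block layout are assumptions made for this example, not part of any real texture format.

```python
import struct

BLOCK = 4          # 4x4 texels per block
BLOCK_BYTES = 4    # low level, high level, 16-bit mask -> 2 bits per texel


def encode_block(pixels):
    """Encode 16 grayscale values (0..255, row-major) into 4 bytes."""
    mean = sum(pixels) / len(pixels)
    hi = [p for p in pixels if p >= mean]   # never empty: max(pixels) >= mean
    lo = [p for p in pixels if p < mean]
    hi_level = round(sum(hi) / len(hi))
    lo_level = round(sum(lo) / len(lo)) if lo else hi_level
    mask = 0
    for i, p in enumerate(pixels):
        if p >= mean:
            mask |= 1 << i                  # bit i selects the high level
    return struct.pack("<BBH", lo_level, hi_level, mask)


def encode_image(img, width, height):
    """Encode a row-major grayscale image block by block."""
    out = bytearray()
    for by in range(0, height, BLOCK):
        for bx in range(0, width, BLOCK):
            block = [img[(by + dy) * width + (bx + dx)]
                     for dy in range(BLOCK) for dx in range(BLOCK)]
            out += encode_block(block)
    return bytes(out)


def decode_texel(data, width, x, y):
    """Fetch one texel straight from the compressed data (random access)."""
    blocks_per_row = width // BLOCK
    block_index = (y // BLOCK) * blocks_per_row + (x // BLOCK)
    lo_level, hi_level, mask = struct.unpack_from(
        "<BBH", data, block_index * BLOCK_BYTES)
    bit = (y % BLOCK) * BLOCK + (x % BLOCK)
    return hi_level if (mask >> bit) & 1 else lo_level


# 8x4 test image: a two-level checkerboard block next to a flat block.
WIDTH, HEIGHT = 8, 4
image = [(200 if (x + y) % 2 else 10) if x < 4 else 50
         for y in range(HEIGHT) for x in range(WIDTH)]
compressed = encode_image(image, WIDTH, HEIGHT)
```

Because every block occupies the same number of bytes, `decode_texel` can compute a block's offset directly from the texel coordinates, which is the property that variable-rate formats such as JPEG lack.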
Because the data access patterns of these algorithms are well-defined, texture decompression may be executed on-the-fly during rendering as part of the overall graphics pipeline, reducing overall bandwidth and storage needs throughout the graphics system. As well as texture maps, texture compression may also be used to encode other kinds of rendering map, including bump maps and surface normal maps. Texture compression may also be used together with other forms of map processing such as mipmaps and anisotropic filtering. == Availability == Some examples of practical texture compression systems are S3 Texture Compression (S3TC), PVRTC, Ericsson Texture Compression (ETC) and Adaptive Scalable Texture Compression (ASTC); these may be supported by special function units in modern graphics processing units (GPUs). OpenGL and OpenGL ES, as implemented on many video accelerator cards and mobile GPUs, can support multiple common kinds of texture compression, generally through the use of vendor extensions. == Supercompression == A compressed texture can be further compressed in what is called "supercompression". Fixed-rate texture compression formats are optimized for random access and are much less efficient than image formats such as PNG. By adding further compression, a programmer can reduce the efficiency gap. The extra layer can be decompressed by the CPU so that the GPU receives a normal compressed texture, or in newer methods, decompressed by the GPU itself. Supercompression saves the same amount of VRAM as regular texture compression, but saves more disk space and download size. == Neural Texture Compression == Random-Access Neural Compression of Material Textures (Neural Texture Compression) is an Nvidia technology that enables two additional levels of detail (16× more texels, so four times higher resolution) while maintaining storage requirements similar to those of traditional texture compression methods. 
The key idea is to compress multiple material textures and their mipmap chains together and to use a small neural network, optimized for each material, to decompress them.
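The supercompression idea described above (a general-purpose lossless layer wrapped around fixed-rate block data, removed on the CPU before upload) can be sketched as follows. This is an illustration under stated assumptions: Deflate via Python's standard `zlib` module stands in for whatever codec a real pipeline would use, the function names are hypothetical, and the block data is synthetic rather than a real S3TC, ETC or ASTC payload.

```python
import zlib


def supercompress(block_data: bytes) -> bytes:
    """Wrap fixed-rate block data in a lossless layer for disk or download."""
    return zlib.compress(block_data, 9)


def load_for_gpu(stored: bytes) -> bytes:
    """Undo the lossless layer on the CPU; the result is the ordinary
    fixed-rate compressed texture that would be handed to the GPU."""
    return zlib.decompress(stored)


# Synthetic stand-in for compressed texture blocks: 1024 identical 4-byte
# blocks, as a flat or repetitive texture region might produce.
blocks = bytes([10, 200, 0x5A, 0xA5]) * 1024
stored = supercompress(blocks)
restored = load_for_gpu(stored)
```

The restored bytes are identical to the original block data, so the texture occupies the same video memory either way; only the stored or transmitted size shrinks.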

  • Richard Zemel

    Richard Stanley Zemel (born 1963) is a Canadian-American computer scientist and professor at Columbia University, Department of Computer Science, and a leading figure in the field of machine learning and computer vision. Zemel studied the history of science at Harvard University and obtained his B.A. in 1984. He continued his study at the Department of Computer Science of the University of Toronto under the supervision of Geoffrey Hinton. He obtained his M.Sc. and Ph.D. both in computer science in 1989 and 1994, respectively.

  • Yun Sing Koh

    Yun Sing Koh (born 1978) is a New Zealand computer science academic, and is a full professor at the University of Auckland, specialising in machine learning and artificial intelligence. She is a co-director of the Centre of Machine Learning for Social Good, and the Advanced Machine Learning and Data Analytics Research (MARS) Lab at Auckland. == Academic career == Koh earned a Bachelor of Science with Honours and a Master of Software Engineering at the University of Malaya. She then completed a PhD titled Generating sporadic association rules at the University of Otago in 2007. Koh joined the faculty of the University of Auckland in 2010, rising to full professor. As of 2024, she is director of the Centre of Machine Learning for Social Good at Auckland, alongside Gillian Dobbie and Daniel Wilson, and is director of the Master of AI course at the university. Koh also co-directs the Advanced Machine Learning and Data Analytics Research (MARS) Lab. Koh's research covers machine learning and artificial intelligence. She is especially interested in designing machine learning algorithms for data streams, and has led research using AI systems to identify individual stoats for pest population research. In 2018 she was awarded a Marsden grant for a research project "An Adaptive Predictive System for Life-long Learning on Data Streams", and has been part of three MBIE projects. In 2025 the stoat identification project Koh co-leads with Daniel Wilson was awarded $1 million per annum by the MBIE Smart Ideas fund. Koh was a finalist in the AI in Climate section of the Women in AI Australia and New Zealand Awards in 2022. She was a 2023 Fellow at the United States National Science Foundation-funded Convergence Research (CORE) Institute. Koh has chaired a number of sessions at international conferences on data mining. 
In March 2026 it was announced that Koh would be a member of the New Zealand Human Rights Commission's Expert Advisory Group on Artificial Intelligence, Emerging Digital Technologies and Human Rights. == Selected works == Philippe Fournier-Viger; Jerry Chun-Wei Lin; Rage Uday Kiran; Yun Sing Koh; Rincy Thomas (2017). "A Survey of Sequential Pattern Mining". Data Science and Pattern Recognition. 1 (1): 54–77. Wikidata Q138719481. Yun Sing Koh; Nathan Rountree; Richard O’Keefe (1 April 2006). "Finding Non-Coincidental Sporadic Rules Using Apriori-Inverse". International Journal of Data Warehousing and Mining. 2 (2): 38–54. doi:10.4018/JDWM.2006040102. ISSN 1548-3924. Wikidata Q125185222. Russel Pears; Sripirakas Sakthithasan; Yun Sing Koh (11 January 2014). "Detecting concept change in dynamic data streams". Machine Learning. 97 (3): 259–293. doi:10.1007/S10994-013-5433-9. ISSN 1573-0565. Zbl 1319.68186. Wikidata Q125185156. David Tse Jung Huang; Yun Sing Koh; Gillian Dobbie; Russel Pears (December 2014), Detecting Volatility Shift in Data Streams, Institute of Electrical and Electronics Engineers, doi:10.1109/ICDM.2014.50, Wikidata Q125185151. Sidney Tsang; Yun Sing Koh; Gillian Dobbie (2011). "RP-Tree: Rare Pattern Tree Mining". Lecture Notes in Computer Science: 277–288. doi:10.1007/978-3-642-23544-3_21. ISSN 0302-9743. Wikidata Q125185206. Yun Sing Koh; Sri Devi Ravana (24 May 2016). "Unsupervised Rare Pattern Mining". ACM Transactions on Knowledge Discovery from Data. 10 (4): 1–29. doi:10.1145/2898359. ISSN 1556-4681. Wikidata Q125185136. Jack Julian; Yun Sing Koh; Albert Bifet (1 October 2025), Building adaptive knowledge bases for evolving continual learning models (PDF), vol. 1, doi:10.1038/S44387-025-00028-4, Wikidata Q138719496
