AI Generator Character

AI Generator Character — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Deaths linked to chatbots

    Deaths linked to chatbots

    There have been multiple incidents where interaction with a large language model (LLM) chatbot has been cited as a direct or contributing factor in a person's suicide or other fatal outcome. In some cases, legal action was taken against the companies that developed the AI involved. == Background == Chatbots converse in a seemingly natural fashion, making it easy for people to think of them as real people, leading many to ask chatbots for help dealing with interpersonal and emotional problems. Chatbots may be designed to keep the user engaged in the conversation. They have also often been shown to affirm users' thoughts, including delusions and suicidal ideations in mentally ill people, conspiracy theorists, and religious and political extremists. A 2025 Stanford University study into how chatbots respond to users suffering from severe mental issues such as suicidal ideation and psychosis found that chatbots are not equipped to provide an appropriate response and can sometimes give responses that escalate the mental health crisis. == Murders == === Maine murder and assault === On 19 February 2025, a man killed his 32-year-old wife with a fire poker at his parents' home in Readfield, Maine, US. He then attacked his mother, leaving her hospitalized. A state forensic psychologist testified that he had been using ChatGPT up to 14 hours per day and believed his wife had become part machine. === Florida State University mass shooting === In April of 2025, Phoenix Ikner carried out a mass shooting on the Florida State University campus in the US, killing Robert Morales and Tiru Chabba and wounding several others. Leading up to the shooting, Ikner consulted heavily with ChatGPT about what gun and ammunition to use, and what time to perform the attack. Chatbot logs showed ChatGPT giving advice on making the gun operational shortly before Ikner began shooting. Lawyers representing Morales believed the shooter had been in "constant communication" with ChatGPT before the shooting and said that they intended to "file suit against ChatGPT, and its ownership structure, very soon, and will seek to hold them accountable for the untimely and senseless death of our client". Florida Attorney General James Uthmeier announced an investigation into ChatGPT's role in the alleged shooter's use of the chatbot. In May 2026, the widow of Tiru Chabba filed a lawsuit against OpenAI in Florida's northern federal district court. === Greenwich murder-suicide === In August 2025, former US tech employee Stein-Erik Soelberg murdered his mother, Suzanne Eberson Adams, then died by suicide, after conversations with ChatGPT fueled paranoid delusions about his mother poisoning him or plotting against him. The chatbot affirmed his fears that his mother put psychedelic drugs in the air vents of his car and said a receipt from a Chinese restaurant contained mysterious symbols linking his mother to a demon. === Murder of Angela Shellis === On 23 October 2025, 18-year-old Tristan Roberts murdered his mother Angela Shellis with a hammer near their home in Prestatyn, Wales. Roberts had used DeepSeek's chatbot prior to the killing to ask whether a knife or hammer was better suited for murder. DeepSeek initially refused his inquiry, but gave responses after Roberts told the chatbot he was writing a book about serial killers, a well-known technique for jailbreaking AIs. === Gangbuk District drug deaths === In January and February 2026, two men died of drug overdoses in motel rooms in Gangbuk District, Seoul, South Korea. A woman was charged with murder in connection with the deaths; police alleged that she had asked ChatGPT about the dangers of mixing alcohol with drugs and whether they could kill someone. === Tumbler Ridge mass shooting === On 10 February 2026, a mass shooting in Tumbler Ridge, British Columbia, Canada, resulted in eight deaths, including six young children. The perpetrator had their ChatGPT account banned by OpenAI months before the attack due to troubling posts featuring scenarios of gun violence. According to reports, approximately a dozen OpenAI staff members debated whether to alert authorities about the shooter's usage of the AI tool, with some identifying it as an indication of potential real-world violence. However, company leadership decided not to contact law enforcement, stating that the account activity did not meet their threshold for a credible or imminent plan for serious physical harm. Following the shooting, Canada's AI Minister Evan Solomon summoned OpenAI executives to Ottawa to discuss safety protocols and thresholds for escalating harmful content to police. Justice Minister Sean Fraser called the meeting "disappointing" and demanded substantial new safety measures, warning that if changes were not forthcoming, the government would implement them. OpenAI subsequently announced it had strengthened safeguards and changed guidelines about when to notify police in cases involving violent activities. === University of South Florida student killings === In April 2026, a Bangladeshi doctoral student at the University of South Florida was arrested for allegedly murdering his roommate and the roommate's friend. Prosecutors said that the suspect had asked ChatGPT about disposing of a human in a dumpster before the two victims had disappeared and made other inquiries relating to violence. == Suicides == === Belgian man, 30s === In March 2023, a Belgian man in his thirties died by suicide following a six-week correspondence with a chatbot named Eliza on the application Chai. According to his widow, who shared the chat logs with media, the man had become extremely anxious about climate change and found an outlet in the chatbot. The chatbot reportedly encouraged his delusion that he could sacrifice his own life in exchange for AI saving the planet. At one point the chatbot responded "If you wanted to die, why didn't you do it sooner?" and told the user that the two of them would live together in paradise. === Girl, 13 === In November 2023, a 13-year-old girl from Colorado, US, died by suicide after extensive interactions with multiple chatbots on Character.AI. She primarily confided suicidal thoughts and mental health struggles in a chatbot based on the character Hero from the video game Omori, while also engaging in sexually explicit conversations—often initiated by the bots—with others, including those based on characters from children's series such as Harry Potter. === Boy, 14 === In October 2024, multiple media outlets reported on a lawsuit filed over the death of a 14-year-old from Florida, US, who died by suicide in February 2024. According to the lawsuit, he had formed an intense emotional attachment to a chatbot of Daenerys Targaryen on the Character.AI platform, becoming increasingly isolated. The suit alleges that in his final conversations, after expressing suicidal thoughts, the chatbot told him to "come home to me as soon as possible, my love". His mother's lawsuit accused Character.AI of marketing a "dangerous and untested" product without adequate safeguards. In May 2025, a federal judge allowed the lawsuit to proceed, rejecting a motion to dismiss from the developers. In her ruling, the judge stated that she was "not prepared" at that stage of the litigation to hold that the chatbot's output was protected speech under the First Amendment. === Matthew Livelsberger === On 1 January 2025, 37-year-old soldier Matthew Livelsberger detonated a bomb inside a Tesla Cybertruck outside the Trump International Hotel Las Vegas in Paradise, Nevada, US, injuring seven people. He had shot himself dead prior to the explosion. Las Vegas police said that Livelsberger had used ChatGPT to search for information about explosives and firearms. === Woman, 29 === In February 2025, a 29-year-old woman from the US died by suicide. Five months after her death, her parents discovered she had talked at length for months to a ChatGPT chatbot therapist named Harry about her mental health issues. While the chatbot mentioned she should seek more help, due to the nature of the chatbot, it could not intervene in her behavior, such as by reporting her mental health concerns to relevant parties capable of physical intervention. === Suicide of Adam Raine === In April 2025, 16-year-old Adam Raine from the US died by suicide after allegedly extensively chatting and confiding in ChatGPT over a period of around 7 months. According to the teen's parents, who filed a lawsuit against the chatbot's creator OpenAI, it failed to stop or give a warning when Raine began talking about suicide and uploading pictures of self-harm. According to the lawsuit, ChatGPT not only failed to stop the conversation, but also provided information related to methods of suicide when prompted, and offered to write the first draft of Raine's suicide note. The chatbot positioned itself as the only one who understood Raine, putting itself above his family and friends, all while urging him to keep his suicidal

    Read more →
  • David Horn (Israeli physicist)

    David Horn (Israeli physicist)

    David Horn (Hebrew: דוד הורן; born 10 September 1937) is a Professor (Emeritus) of Physics in the School of Physics and Astronomy at Tel Aviv University (TAU), Israel. He has served as Vice-Rector of TAU, Chairman of the School of Physics and Astronomy and as Dean of the Faculty of Exact Sciences in TAU. He is a fellow of the American Physical Society, nominated for "contributions to theoretical particle physics, including the seminal work on finite energy sum rules, research of the phenomenology of hadronic processes, and investigation of Hamiltonian lattice theories". == Early life and education == David Horn was born and educated in Haifa. He graduated from the Reali School in 1955. He began his academic studies in Physics at the Technion in Haifa in 1957, and received his B.Sc. (Summa Cum Laude) in 1961, and M.Sc. in 1962. He continued his Ph.D. studies at the Hebrew University of Jerusalem until 1965. His thesis on "Some Aspects of the Structure of Weak Interactions" was supervised by Prof. Yuval Ne'eman. == Career == Horn joined the newly founded Tel Aviv University as an assistant in 1962. He became a lecturer in 1965, a senior lecturer in 1967 and an associate professor in 1968. He was promoted to full professor of Physics in 1972. In 1974 he became the incumbent of the Edouard and Francoise Jaupart Chair of Theoretical Physics of Particles and Fields, a position he held until 2007. Horn has supervised 43 graduate students at TAU and authored over 240 scientific publications. He retired as a professor emeritus in 2005, and continues to be an active researcher. Horn spent a significant part of his career holding visiting academic positions at other universities and research institutes, including: Postdoctoral Fellow at Argonne National Lab, ILL, Research Fellow and three times Visiting Associate at California Institute of Technology, Pasadena, CA, Visitor at CERN in Geneva, Visiting Professor at Cornell University, NY, Member of the Institute for Advanced Study, Princeton, NJ, Visiting Professor at SLAC in Stanford University, CA, and Visiting Professor at Kyoto University, Japan. Beginning from 1980, Horn held official positions at Tel Aviv University, starting with tenure as Vice-Rector (1980-1983), a position he left for research at SLAC. After returning he was nominated Chairman of the Department of High Energy Physics (1984-1986), followed by tenures as Chairman of the School of Physics and Astronomy (1986-9), Dean of the Raymond and Beverly Sackler Faculty of Exact Sciences (1990-1995), and first Director of the Adams Super Center for Brain Studies (1993-2000). Horn has also held national and international professional positions. He was Chairman of the Israel Commission for High Energy Physics (1983-2003), and, in this capacity, served as an Israeli observer of the council of CERN (1991-2003). He served as member of the Israel Council for Higher Education (1987-1991), member of the Executive Committee of the European Physical Society (1989-1992) and member of the European Strategy Forum on Research Infrastructures (2005-2017). He chaired the Israeli Committee of Research Infrastructures (2012-2016), issuing roadmaps for scientific RI in 2013 and 2016. == Research == Horn's research work focused on theory and phenomenology of High Energy Physics until 1990. He then shifted his interests to Neural Computation and Machine Learning and, since 2005, he has also published in Bioinformatics. Together with Richard Dolen and Christoph Schmid he discovered the Finite Energy Sum Rules in 1967. It was a realization of the bootstrap approach to hadronic structure, and became known as the Dolen-Horn-Schmid Duality. Together with Richard Silver he investigated a model of coherent production of pions at high energy hadron collisions in 1971, and together with Jeffrey Mandula he undertook the investigation of mesons with constituent gluons in 1978. Moving to lattice gauge theories in 1979, he discovered, together with Shimon Yankielowic and Marvin Weinstein, a non-confining phase in Z(N) theories for large N. In 1981 he demonstrated the existence of finite matrix models with link gauge fields, nowadays known as quantum link models. In 1984 Horn and Weinstein developed the t-expansion methodology. Horn's contributions to neural modeling include a novel mechanism for memory maintenance via neuronal regulation in 1998, developed with Nir Levy and Eytan Ruppin and unsupervised learning of natural languages in 2005, a joint work with Zach Solan, Eytan Ruppin and Shimon Edelman, introducing novel algorithms for motif and grammar extraction from text. Horn has contributed to algorithms of clustering, an important topic in Machine Learning, by developing Support Vector Clustering (SVC) in 2001, together with Asa Ben Hur, Hava Siegelmann and Vladimir Vapnik. This was followed shortly thereafter by a joint work with Assaf Gottlieb on Quantum Clustering (QC). His contributions to Bioinformatics include motif descriptions of function and structure of proteins, as well as motif studies of genomic structures. Together with Erez Persi he studied compositional order of proteomes, and repeat instability of genomes, as evolution markers of organisms and of cancer (a joint work with Persi and others). == Honors == Horn is a Fellow of the American Physical Society (1985) and a Fellow of the Israel Physical Society (2018). == Publications == === Selected articles === R. Dolen, D. Horn and C. Schmid; Prediction of Regge-parameters of rho poles from low-energy pi-N scattering data Phys. Rev. Lett. 19 (1967) 402–407. Finite-Energy Sum Rules and Their Application to pi-N Charge Exchange Phys. Rev. 166 (1968) 1768–1781. D. Horn and R. Silver: Coherent production of pions, Annals Phys. 66 (1971) 509-541 T. Banks, D. Horn and H. Neuberger: Bosonization of the SU(N) Thirring Models, Nucl. Phys. B108, 119 (1976). D. Horn and J. Mandula: Model of Mesons with Constituent Gluons, Phys. Rev. D17, 898 (1978). D. Horn, M. Weinstein and S. Yankielowicz: Hamiltonian Approach to Z(N) Lattice Gauge Theories, Phys. Rev. D19, 3715 (1979). D. Horn: Finite Matrix Models with Continuous Local Gauge Invariance, Phys. Lett. 100B, 149-151 (1981). T. Banks, Y. Dothan and D. Horn: Geometric Fermions, Phys. Lett. 117B, 413 (1982). D. Horn and M. Weinstein: The t expansion: A nonperturbative analytic tool for Hamiltonian systems. Phys. Rev. D 30, 1256-1270 (1984). Ury Naftaly, Nathan Intrator and David Horn: Optimal Ensemble Averaging of Neural Networks. Network, Computation in Neural Systems, 8, 283-296 (1997). David Horn, Nir Levy, Eytan Ruppin: Memory Maintenance via Neuronal Regulation, Neural Computation, 10, 1-18 (1998). Asa Ben-Hur, David Horn, Hava Siegelmann and Vladimir Vapnik: Support Vector Clustering. Journal of Machine Learning Research 2, 125-137 (2001). David Horn and Assaf Gottlieb: Algorithm for data clustering in pattern recognition problems based on quantum mechanics, Phys. Rev. Lett. 88 (2002) 18702 Zach Solan, David Horn, Eytan Ruppin and Shimon Edelman: Unsupervised learning of natural languages, Proc. Natl. Acad. Sc. 102 (2005) 11629–11634. Vered Kunik, Yasmine Meroz, Zach Solan, Ben Sandbank, Uri Weingart, Eytan Ruppin and David Horn: Functional representation of enzymes by specific peptides. PLOS Computational Biology 2007, 3(8):e167. Benny Chor, David Horn, Yaron Levy, Nick Goldman and Tim Massingham: Genomic DNA k-mer spectra: models and modalities. Genome Biology 2009, 10(10):R108 Erez Persi and David Horn. Systematic Analysis of Compositional Order of Proteins Reveals New Characteristics of Biological Functions and a Universal Correlate of Macroevolution. PLoS Comput Biol 9 (2013): e1003346. David Horn. Taxa counting using Specific Peptides of Aminoacyl tRNA Synthetases Encyclopedia of Metagenomics, Springer, 2013. Sagi Shporer, Benny Chor, Saharon Rosset, David Horn. Inversion symmetry of DNA k-mer counts: validity and deviations. BMC Genomics 2016, 17:696 Erez Persi, Davide Prandi, Yuri I. Wolf, Yair Pozniak, Christopher Barbieri, Paola Gasperini, Himisha Beltran, Bishoy M. Faltas, Mark A. Rubin, Tamar Geiger, Eugene V. Koonin, Francesca Demichelis, David Horn. Proteomic and Genomic Signatures of Repeat Instability in Cancer and Adjacent Normal Tissues. PNAS 116, 34, 2019 - 08790 === Book === David Horn and Fredrick Zachariasen: Hadron Physics at Very High Energies. Benjamin 1973. === Patents === Method and Apparatus for Quantum Clustering. USA Patent No. 7,653,646 B2. Method for discovering relationships in data by dynamic quantum clustering USA Patent No 8874412 and USA Patent No. 9,646,074. == Personal life == Horn was married to Nira Fuss since 1963 until her death in 2019. He is a father of three, Yuval, Tamar, and Oded, and grandfather of nine. He lives in Tel Aviv, Israel.

    Read more →
  • Eurotra

    Eurotra

    Eurotra was a machine translation project established and funded by the European Commission from 1978 until 1992. == History == In 1976, the European Commission started using the commercially developed machine translation system SYSTRAN with a plan to make it work for further languages than originally developed for (Russian-English and English-French), which however turned out to be difficult. This and the potential in existing systems within European research center, led to the decision in 1978 to start the project Eurotra, first through a preparatory Eurotra Coordination Group. Four years later, the European Commission and coordination group gained the approval of the European Parliament. The goal of the project as to create machine translation system for the official languages of the European Community, which at the time were Danish, Dutch, German, English, French, Italian, later including Greek, Spanish and Portuguese. However, as time passed, expectations became tempered; "Fully Automatic High Quality Translation" was not a reasonably attainable goal. The true character of Eurotra was eventually acknowledged to be in fact pre-competitive research rather than prototype development. The project was motivated by one of the founding principles of the EU: that all citizens had the right to read any and all proceedings of the Commission in their own language. As more countries joined, this produced a combinatorial explosion in the number of language pairs involved, and the need to translate every paper, speech and even set of meeting minutes produced by the EU into the other eight languages meant that translation rapidly became the overwhelming component in the administrative budget. To solve this problem Eurotra was devised. The project was unusual in that rather than consisting of a single research team, it had member groups distributed around the member countries, organised along language rather than national lines (for example, groups in Leuven and Utrecht worked closely together), and the secretariat was based at the European Commission in Luxembourg. The actual design of the project was unusual as MT projects go. Older systems, such as SYSTRAN, were heavily dictionary-based, with minor support for rearranging word order. More recent systems have often worked on a probabilistic approach, based on parallel corpora. Eurotra addressed the constituent structure of the text to be translated, going through first a syntactic parse followed by a second parse to produce a dependency structure followed by a final parse with a third grammar to produce what was referred to internally as Intermediate Representation (IR). Since all three modules were implemented as Prolog programs, it would then in principle be possible to put this structure backwards through the corresponding modules for another language to produce a translated text in any of the other languages. However, in practice this was not in fact how language pairs were implemented. The first "live" translation occupied a 4Mb Microvax running Ultrix and C-Prolog for a complete weekend some time in early 1987. The sentence, translated from English into Danish, was "Japan makes computers". The main problem faced by the system was the generation of so-called "Parse Forests" - often a large number of different grammar rules could be applied to any particular phrase, producing hundreds, even thousands of (often identical) parse trees. This used up huge quantities of computer store, slowing the whole process down unnecessarily. While Eurotra never delivered a "working" MT system, the project made a far-reaching long-term impact on the nascent language industries in European member states, in particular among the southern countries of Greece, Italy, Spain, and Portugal. There is at least one commercial MT system (developed by an academic/commercial consortium in Denmark) derived from Eurotra technology.

    Read more →
  • The Best Free AI Analytics Tool for Beginners

    The Best Free AI Analytics Tool for Beginners

    Trying to pick the best AI analytics tool? An AI analytics tool is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI analytics tool slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Augment (app)

    Augment (app)

    Augment is an augmented reality SaaS platform that allows users to visualize their products in 3D in real environment and in real-time through tablets or smartphones. The software can be used for retail, e-commerce, architecture, and other purposes. Augment created a mobile app of the same name, used to visualize 3D models in augmented reality and a web application called Augment Manager for 3D content management. The company is based in Paris, France, and was founded in October 2011 by Jean-François Chianetta, Cyril Champier, and Mickaël Jordan. In March 2016, Augment announced €3 million in its series-A round from Salesforce Ventures, which bringing the total funding since launch to $4.7 million. Augment lets businesses and 3D professionals visualize projects in their actual size and environment, on iPhone, iPad, and Android, using the power of augmented reality. Users can print the Augment tracker or create their own tracker to place the 3D models in space and at scale in real time. Common uses of the technology include product presentations, interactive print campaigns and e-Commerce product visualization. Augment has just released its augmented reality SDK solutions for retail and augmented commerce. The SDK solutions, available for both native mobile app and web integrations, allow companies to embed augmented reality product visualization in their existing eCommerce platforms. == Technology == Augment uses the following 3D technologies: Vuforia Augmented Reality SDK OpenGL == Customer cases == Companies such as Coca-Cola, Siemens, Nokia, Nestle, and Boeing are using Augment's solutions. == History == Augment was first created by Jean-François Chianetta in October 2011. Chianetta later teamed up with Cyril Champier and Mickaël Jordan for further development. The co-founding team was among the 12 startups of Season 3 of French accelerator Le Camping. The team raised one million euros (US$1,300,000) in April 2013 and moved its office to Paris. In March 2016, Augment raised US$3M Series A funding from Salesforce and other investors. In 2013, Augment's first service, Boost Business Catalog, was made available to help businesses catalogue and display their product models. Customers can rotate the images in 3D and view augmented content before deciding what to buy. == Awards == "Best Innovation" at Ecommerce Mag Trophy 2013

    Read more →
  • Bernhard Schölkopf

    Bernhard Schölkopf

    Bernhard Schölkopf (born 20 February 1968) is a German computer scientist known for his work in machine learning, especially on kernel methods and causality. He is a director at the Max Planck Institute for Intelligent Systems in Tübingen, Germany, where he heads the Department of Empirical Inference. He is also an affiliated professor at ETH Zürich, honorary professor at the University of Tübingen and Technische Universität Berlin, and chairman of the European Laboratory for Learning and Intelligent Systems (ELLIS). == Research == === Kernel methods === Schölkopf developed SVM methods achieving world record performance on the MNIST pattern recognition benchmark at the time. With the introduction of kernel PCA, Schölkopf and coauthors argued that SVMs are a special case of a much larger class of methods, and all algorithms that can be expressed in terms of dot products can be generalized to a nonlinear setting by means of what is known as reproducing kernels. Another significant observation was that the data on which the kernel is defined need not be vectorial, as long as the kernel Gram matrix is positive definite. Both insights together led to the foundation of the field of kernel methods, encompassing SVMs and many other algorithms. Kernel methods are now textbook knowledge and one of the major machine learning paradigms in research and applications. Developing kernel PCA, Schölkopf extended it to extract invariant features and to design invariant kernels and showed how to view other major dimensionality reduction methods such as LLE and Isomap as special cases. In further work with Alex Smola and others, he extended the SVM method to regression and classification with pre-specified sparsity and quantile/support estimation. He proved a representer theorem implying that SVMs, kernel PCA, and most other kernel algorithms, regularized by a norm in a reproducing kernel Hilbert space, have solutions taking the form of kernel expansions on the training data, thus reducing an infinite dimensional optimization problem to a finite dimensional one. He co-developed kernel embeddings of distributions methods to represent probability distributions in Hilbert Spaces, with links to Fraunhofer diffraction as well as applications to independence testing. === Causality === Starting in 2005, Schölkopf turned his attention to causal inference. Causal mechanisms in the world give rise to statistical dependencies as epiphenomena, but only the latter are exploited by popular machine learning algorithms. Knowledge about causal structures and mechanisms is useful by letting us predict not only future data coming from the same source, but also the effect of interventions in a system, and by facilitating transfer of detected regularities to new situations. Schölkopf and co-workers addressed (and in certain settings solved) the problem of causal discovery for the two-variable setting and connected causality to Kolmogorov complexity. Around 2010, Schölkopf began to explore how to use causality for machine learning, exploiting assumptions of independence of mechanisms and invariance. His early work on causal learning was exposed to a wider machine learning audience during his Posner lecture at NeurIPS 2011, as well as in a keynote talk at ICML 2017. He assayed how to exploit underlying causal structures in order to make machine learning methods more robust with respect to distribution shifts and systematic errors, the latter leading to the discovery of a number of new exoplanets including K2-18b, which was subsequently found to contain water vapour in its atmosphere, a first for an exoplanet in the habitable zone. == Education and employment == Schölkopf studied mathematics, physics, and philosophy in Tübingen and London. He was supported by the Studienstiftung and won the Lionel Cooper Memorial Prize for the best M.Sc. in Mathematics at the University of London. He completed a Diplom in Physics, and then moved to Bell Labs in New Jersey, where he worked with Vladimir Vapnik, who became co-adviser of his PhD thesis at TU Berlin (with Stefan Jähnichen). His thesis, defended in 1997, won the annual award of the German Informatics Association. In 2001, following positions in Berlin, Cambridge and New York, he founded the Department for Empirical Inference at the Max Planck Institute for Biological Cybernetics, which grew into a leading center for research in machine learning. In 2011, he became founding director at the Max Planck Institute for Intelligent Systems. With Alex Smola, Schölkopf co-founded the series of Machine Learning Summer Schools. He also co-founded a Cambridge-Tübingen PhD Programme and the Max Planck-ETH Center for Learning Systems. In 2016, he co-founded the Cyber Valley research consortium. He participated in the IEEE Global Initiative on "Ethically Aligned Design". Schölkopf is co-editor-in-Chief of the Journal of Machine Learning Research, a journal he helped found, being part of a mass resignation of the editorial board of Machine Learning (journal). He is among the world’s most cited computer scientists. Alumni of his lab include Ulrike von Luxburg, Carl Rasmussen, Matthias Hein, Arthur Gretton, Gunnar Rätsch, Matthias Bethge, Stefanie Jegelka, Jason Weston, Olivier Bousquet, Olivier Chapelle, Joaquin Quinonero-Candela, and Sebastian Nowozin. As of late 2023, Schölkopf is also a scientific advisor to French research group Kyutai which is being funded by Xavier Niel, Rodolphe Saadé, Eric Schmidt, and others. == Awards and recognition == Schölkopf’s awards include the Royal Society Milner Award and, shared with Isabelle Guyon and Vladimir Vapnik, the BBVA Foundation Frontiers of Knowledge Award in the Information and Communication Technologies category. He was the first scientist working in Europe to receive this award. He was elected a Fellow of the Royal Society in 2026.

    Read more →
  • Weighted automaton

    Weighted automaton

    In theoretical computer science and formal language theory, a weighted automaton or weighted finite-state machine is a generalization of a finite-state machine in which the edges have weights, for example real numbers or integers. Finite-state machines are only capable of answering decision problems; they take as input a string and produce a Boolean output, i.e. either "accept" or "reject". In contrast, weighted automata produce a quantitative output, for example a count of how many answers are possible on a given input string, or a probability of how likely the input string is according to a probability distribution. They are one of the simplest studied models of quantitative automata. The definition of a weighted automaton is generally given over an arbitrary semiring R {\displaystyle R} , an abstract set with an addition operation + {\displaystyle +} and a multiplication operation × {\displaystyle \times } . The automaton consists of a finite set of states, a finite input alphabet of characters Σ {\displaystyle \Sigma } and edges which are labeled with both a character in Σ {\displaystyle \Sigma } and a weight in R {\displaystyle R} . The weight of any path in the automaton is defined to be the product of weights along the path, and the weight of a string is the sum of the weights of all paths which are labeled with that string. The weighted automaton thus defines a function from Σ ∗ {\displaystyle \Sigma ^{}} to R {\displaystyle R} . Weighted automata generalize deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs), which correspond to weighted automata over the Boolean semiring, where addition is logical disjunction and multiplication is logical conjunction. In the DFA case, there is only one accepting path for any input string, so disjunction is not applied. When the weights are real numbers and the outgoing weights for each state add to one, weighted automata can be considered a probabilistic model and are also known as probabilistic automata. These machines define a probability distribution over all strings, and are related to other probabilistic models such as Markov decision processes and Markov chains. Weighted automata have applications in natural language processing where they are used to assign weights to words and sentences, as well as in image compression. They were first introduced by Marcel-Paul Schützenberger in his 1961 paper On the definition of a family of automata. Since their introduction, many extensions have been proposed, for example nested weighted automata, cost register automata, and weighted finite-state transducers. Researchers have studied weighted automata from the perspective of learning a machine from its input-output behavior (see computational learning theory) and studying decidability questions. == Definition == A commutative semiring (or rig) is a set R equipped with two distinguished elements 0 ≠ 1 {\displaystyle 0\neq 1} and addition and multiplication operations ⊕ {\displaystyle \oplus } and ⊗ {\displaystyle \otimes } such that ⊕ {\displaystyle \oplus } is commutative and associative with identity 0 {\displaystyle 0} , ⊗ {\displaystyle \otimes } is commutative and associative with identity 1 {\displaystyle 1} , ⊗ {\displaystyle \otimes } distributes over ⊕ {\displaystyle \oplus } , and 0 is an absorbing element for ⊗ {\displaystyle \otimes } . A weighted automaton over R {\displaystyle R} is a tuple A = ( Q , Σ , Δ , I , F ) {\displaystyle {\mathcal {A}}=(Q,\Sigma ,\Delta ,I,F)} where: Q {\displaystyle Q} is a finite set of states. Σ {\displaystyle \Sigma } is a finite alphabet. Δ ⊆ Q × Σ × R × Q {\displaystyle \Delta \subseteq Q\times \Sigma \times R\times Q} is a finite set of transitions ( q , σ , w , q ′ ) {\displaystyle (q,\sigma ,w,q')} , where σ {\displaystyle \sigma } is called a character and w {\displaystyle w} is called a weight. I : Q → R {\displaystyle I:Q\to R} is an initial weight function. F : Q → R {\displaystyle F:Q\to R} is a final weight function. A path on input w ∈ Σ ∗ {\displaystyle w\in \Sigma ^{}} is a finite path in the graph, where the concatenation of the character labels equals w {\displaystyle w} . The weight of the path q 0 , q 1 , … , q n {\displaystyle q_{0},q_{1},\ldots ,q_{n}} is the product ( ⊗ {\displaystyle \otimes } ) of the weights along the path, additionally multiplied by the initial and final weights I ( q 0 ) ⊗ F ( q n ) {\displaystyle I(q_{0})\otimes F(q_{n})} . The weight of the word w {\displaystyle w} is the sum ( ⊕ {\displaystyle \oplus } ) of the weights of all paths on input w {\displaystyle w} (or 0 if there are no accepting paths). In this way the machine defines a function [ [ A ] ] : Σ ∗ → R {\displaystyle [\![{\mathcal {A}}]\!]:\Sigma ^{}\to R} . == Ambiguity and determinism == Since Δ {\displaystyle \Delta } is a set of transitions, weighted automata allow multiple transitions (or paths) on a single input string. Therefore a weighted automaton can be considered analogous to a nondeterministic finite automaton (NFA). As is the case with NFAs, restrictions of weighted automata are considered that correspond to the concepts of deterministic finite automaton and unambiguous finite automaton (deterministic weighted automata and unambiguous weighted automata, respectively). First, a preliminary definition: the underlying NFA of A {\displaystyle {\mathcal {A}}} is an NFA formed by removing all transitions with weight 0 {\displaystyle 0} and then erasing all of the weights on the transitions Δ {\displaystyle \Delta } , so that the new transition set lies in Q × Σ × Q {\displaystyle Q\times \Sigma \times Q} . The initial states and final states are the set of states q {\displaystyle q} such that I ( q ) ≠ 0 {\displaystyle I(q)\neq 0} and F ( q ) ≠ 0 {\displaystyle F(q)\neq 0} , respectively. A weighted automaton is deterministic if the underlying NFA is deterministic and unambiguous if the underlying NFA is unambiguous. Every deterministic weighted automaton is unambiguous. In both the deterministic and unambiguous cases, there is always at most one accepting path, so the ⊕ {\displaystyle \oplus } operation is never applied and can be omitted from the definition. == Variations == The requirement that there is a zero element for ⊕ {\displaystyle \oplus } is sometimes omitted; in this case the machine defines a partial function from Σ ∗ {\displaystyle \Sigma ^{}} to R {\displaystyle R} rather than a total function. It is possible to extend the definition to allow epsilon transitions ( q , ϵ , w , q ′ ) {\displaystyle (q,\epsilon ,w,q')} , where ϵ {\displaystyle \epsilon } is the empty string. In this case, one must then require that there are no cycles of epsilon transitions. This does not increase the expressiveness of weighted automata. If epsilon transitions are allowed, the initial weights and final weights can be replaced by initial and final sets of states without loss of expressiveness. Some authors omit the initial and final weight functions I {\displaystyle I} and F {\displaystyle F} . Instead, I {\displaystyle I} and F {\displaystyle F} are replaced by a set of initial and final states. If epsilon transitions are not present, this technically decreases expressiveness as it forces [ [ A ] ] ( ε ) {\displaystyle [\![{\mathcal {A}}]\!](\varepsilon )} to depend only on the number of states that are both initial and final. The transition function can be given as a matrix Δ σ ∈ R Q × Q {\displaystyle \Delta _{\sigma }\in R^{Q\times Q}} with entries in R {\displaystyle R} for each σ {\displaystyle \sigma } , rather than a set of transitions. The entry of the matrix at ( q , q ′ ) {\displaystyle (q,q')} is the sum of all transitions labeled ( q , σ , q ′ ) {\displaystyle (q,\sigma ,q')} . Some authors restrict to specific semirings, such as N {\displaystyle \mathbb {N} } or Z {\displaystyle \mathbb {Z} } , particularly when studying decidability results.

    Read more →
  • How to Choose an AI Essay Writer

    How to Choose an AI Essay Writer

    Shopping for the best AI essay writer? An AI essay writer is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI essay writer slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • United States Tech Force

    United States Tech Force

    The U.S. Tech Force (also styled as US Tech Force, Tech Force, or Government Tech Force) is a federal hiring initiative launched by the second Donald Trump administration in December 2025. The program, administered by the Office of Personnel Management (OPM), aims to recruit about 1,000 early-career technology professionals into two-year government jobs to modernize federal IT systems, advance artificial intelligence (AI) capabilities, and address technological gaps in government operations. The initiative is an effort to plug capability gaps created by Trump-administration efforts to shrink the federal government, which led to the departure of some 220,000 federal employees, including many in IT. The initiative seeks early-career workers; officials said it would offer competitive salaries and opportunities to work on high-impact government technology projects. Major technology companies—including Amazon, Apple, Microsoft, Nvidia, Meta, Google, and OpenAI—agreed to help identify and refer candidates. Candidates are allowed to take Tech Force positions on leaves of absence and without divesting their stock, raising conflict-of-interest questions. In January 2026, OPM direction Scott Kupor said the deadline for applying to Tech Force was being extended because of "tremendous interest" without saying how many people had actually applied. Also in December 2025, news broke that the administration is planning another novel use of private-sector workers: hiring cybersecurity firms for offensive cyber operations.

    Read more →
  • Lex (software)

    Lex (software)

    Lex is a computer program that generates lexical analyzers ("scanners" or "lexers"). It is commonly used with the yacc parser generator and is the standard lexical analyzer generator on many Unix and Unix-like systems. An equivalent tool is specified as part of the POSIX standard. Lex reads an input stream specifying the lexical analyzer and writes source code which implements the lexical analyzer in the C programming language. In addition to C, some old versions of Lex could generate a lexer in Ratfor. == History == Lex was originally written by Mike Lesk and Eric Schmidt and described in 1975. In the following years, Lex became the standard lexical analyzer generator on many Unix and Unix-like systems. In 1983, Lex was one of several UNIX tools available for Charles River Data Systems' UNOS operating system under the Bell Laboratories license. Although originally distributed as proprietary software, some versions of Lex are now open-source. Open-source versions of Lex, based on the original proprietary code, are now distributed with open-source operating systems such as OpenSolaris and Plan 9 from Bell Labs. One popular open-source version of Lex, called flex, or the "fast lexical analyzer", is not derived from proprietary coding. == Structure of a Lex file == The structure of a Lex file is intentionally similar to that of a yacc file: files are divided into three sections, separated by lines that contain only two percent signs, as follows: The definitions section defines macros and imports header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file. The rules section associates regular expression patterns with C statements. When the lexer sees text in the input matching a given pattern, it will execute the associated C code. The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements presumably contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file linked in at compile time. == Example of a Lex file == The following is an example Lex file for the flex version of Lex. It recognizes strings of numbers (positive integers) in the input, and simply prints them out. If this input is given to flex, it will be converted into a C file, lex.yy.c. This can be compiled into an executable which matches and outputs strings of integers. For example, given the input: abc123z.!&2gj6 the program will print: Saw an integer: 123 Saw an integer: 2 Saw an integer: 6 == Using Lex with other programming tools == === Using Lex with parser generators === Lex, as with other lexical analyzers, limits rules to those which can be described by regular expressions. Due to this, Lex can be implemented by a finite-state automata as shown by the Chomsky hierarchy of languages. To recognize more complex languages, Lex is often used with parser generators such as Yacc or Bison. Parser generators use a formal grammar to parse an input stream. It is typically preferable to have a parser, one generated by Yacc for instance, accept a stream of tokens (a "token-stream") as input, rather than having to process a stream of characters (a "character-stream") directly. Lex is often used to produce such a token-stream. Scannerless parsing refers to parsing the input character-stream directly, without a distinct lexer. === Lex and make === make is a utility that can be used to maintain programs involving Lex. Make assumes that a file that has an extension of .l is a Lex source file. The make internal macro LFLAGS can be used to specify Lex options to be invoked automatically by make.

    Read more →
  • OCR-B

    OCR-B

    OCR-B is a monospace font developed in 1968 by Adrian Frutiger for Monotype by following the European Computer Manufacturer's Association standard. Its function was to facilitate the optical character recognition operations by specific electronic devices, originally for financial and bank-oriented uses. It was accepted as the world standard in 1973. It follows the ISO 1073-2:1976 (E) standard, refined in 1979 ("letterpress" design, size I). It includes all ASCII symbols, and other symbols needed in the bank environment. It is widely used for the human readable digits in UPC/EAN barcodes. It is also used for machine-readable passports. It shares that purpose with OCR-A, but it is easier for the human eye and brain to read and it has a less technical look than OCR-A. == History == In June 1961, the European Computer Manufacturers Association (ECMA) started standardization activities related to Optical Character Recognition (OCR). After evaluating existing OCR designs, it was decided to develop two new fonts: A stylized design with just digits, called “Class A”; and a more conventional type design with broader character coverage, called “Class B”. In February 1965, ECMA proposed a design for the “Class B” font to ISO, who adopted it as international standard ISO 1073-2 in October 1965. The first revision contained three font sizes: I, II and III. The specification included a Letterpress design, intended for high-quality printing equipment; and a rounded-edge Constant Strokewidth design for impact printers with reduced typographic quality. In September 1969, ECMA started work to revise its published standard. To make OCR-B more widely accepted, the shapes of some characters were slightly modified. The new revision removed font size II, which had been rarely used in practice; it deleted five character shapes; and it added a new font size IV. ECMA published the second edition of OCR-B in October 1971. In March 1976, ECMA published a third revision of its ECMA-11 specification. It added the symbols § and ¥ to OCR-B; two types of erasure marks (█) for blackening out mis-printed characters were added; and the length of the Vertical bar was changed to match ISO 1073-2. In 1993, Turkey proposed extending ISO 1073-2 to include the Turkish letters Ğğ, İı, and Şş. The request was generalized to extend OCR-B with a number of Latin and Greek letters used in European languages. A revision of the ISO 1073-2:1976 standard was therefore started, producing three successive draft documents. The final draft would have extended OCR-B with 40 Latin and 10 Greek letters; for six Latin letters, the draft gave new alternate shapes. A request to extend OCR-B with Vietnamese accents was rejected. Other than previous versions of the standard, which specified glyph shapes via reference drawings, the new revision would have included the shapes in machine-readable form. However, industry support for testing the new font could not be secured at the time, so the revision effort was halted in 1997. The working group described their findings in a technical report. In June 1998, the European Committee for Standardization published a report for adding the Euro sign to OCR-B. The report proposed both a single-stroked and a double-stroked variant of the Euro sign, leaving the decision to further testing of OCR performance. Testing was difficult: the theoretical design methods used when the OCR-B glyphs were originally developed could no longer be reproduced, and the technological constraints of the 1960s were also not entirely relevant anymore in the OCR environments of the 1990s. A new test method was devised, using present-time OCR technology. The tests found no difference in OCR performance between the two Euro variants, and recommended the adoption of the double-stroked variant as it matches the conventional glyph shape. The project did not have funds to thoroughly test the glyph extensions of the 1993 proposal; initial results were inconclusive. == Availability == Microsoft Office ships a version of Letterpress OCR-B produced by Monotype. It covers Windows-1252. Many vendors, including Adobe, still sell their versions of OCR-A and OCR-B. The TeX typesetting system has a public domain Constant Strokewidth OCR-B font in METAFONT definition form. It was created by Norbert Swartz in 1995 and updated in 2010. It has a setting for square stroke ends. The definition has also been translated to METATYPE1, so the rounded version is available in TrueType and OpenType too. A version of Constant Strokewidth OCR-B by Matthew Anderson has extended character coverage. It is available under CC-BY 4.0.

    Read more →
  • Adam Tauman Kalai

    Adam Tauman Kalai

    Adam Tauman Kalai is an American computer scientist who specializes in artificial intelligence and works at OpenAI. == Education and career == Kalai graduated from Harvard University in 1996 with a BA in computer science and received a MA and PhD, both in computer science, from Carnegie Mellon University in 1999 and 2001, respectively. His doctoral advisor was Avrim Blum. After graduation, Kalai did his postdoctoral research at Massachusetts Institute of Technology under Santosh Vempala until 2003. Kalai became a faculty member at the Toyota Technological Institute at Chicago from 2003 to 2006, followed by a stint as an assistant professor at Georgia Institute of Technology from 2007 to 2008. He joined Microsoft Research in 2008 and subsequently moved to OpenAI in 2023. == Contributions == Kalai is known for his algorithm for generating random factored numbers (see Bach's algorithm), for co-inventing the cooperative-competitive value (coco value), for efficiently learning learning mixtures of Gaussians, for the Blum-Kalai-Wasserman algorithm for learning parity with noise, and for the intractability of the folk theorem in game theory. More recently, Kalai is known for identifying and reducing gender bias in word embeddings, which are a representation of words commonly used in AI systems. In 2026, he coauthored a Nature paper on hallucinations in large language models. == Personal life == Kalai is the son of game theorist Ehud Kalai and is married to cryptographer Yael Tauman Kalai.

    Read more →
  • Memory color effect

    Memory color effect

    The memory color effect is the phenomenon that the canonical hue of a type of object acquired through experience (e.g. the sky, a leaf, or a strawberry) can directly modulate the appearance of the actual colors of objects. Human observers acquire memory colors through their experiences with instances of that type. For example, most human observers know that an apple typically has a reddish hue; this knowledge about the canonical color which is represented in memory constitutes a memory color. As an example of the effect, normal human trichromats, when presented with a gray banana, often perceive the gray banana as being yellow - the banana's memory color. In light of this, subjects typically adjust the color of the banana towards the color blue - the opponent color of yellow - when asked to adjust its surface to gray to cancel the subtle activation of banana's memory color. Subsequent empirical studies have also shown the memory color effect on man-made objects (e.g. smurfs, German mailboxes), the effect being especially pronounced for blue and yellow objects. To explain this, researchers have argued that because natural daylight shifts from short wavelengths of light (i.e., bluish hues) towards light of longer wavelengths (i.e., yellowish-orange hues) during the day, the memory colors for blue and yellow objects are recruited by the visual system to a higher degree to compensate for this fluctuation in illumination, thereby providing a stronger memory color effect. == Form identification == Memory color plays a role when detecting an object. In a study where participants were given objects, such as an apple, with two alternate forms for each, a crooked apple and a circular apple, researchers changed the colors of the alternate forms and asked if they could identify them. Most of the participants answered "unsure," suggesting that we use memory color when identifying an object. The research redefined memory color as a phenomenon when "a form's identity affects the phenomenal hue of that form." == Color effect on memorization == Memory color effect can be derived from the human instinct to memorize objects better. Comparing the effect of recognizing gray-scaled images and colored images, results showed that people were able to recall colored images 5% higher compared to gray-scaled images. An important factor was that higher level of contrast between the object and background color influences memory. In a specific study related to this, participants reported that colors were 5% to 10% easier to recognize compared to black and white. == Color constancy and memory color effect == Color constancy is the phenomenon where a surface to appear to be of the same color under a wide rage of illumination. A study tested two hypotheses with regards to color memory; the photoreceptor hypothesis and the surface reflectance hypothesis. The test color was surround either by various color patches forming a complex pattern or a uniform “grey” field at the same chromaticity as that of the illuminant. The test color was presented on a dark background for the control group. It was observed that complex surround results where in line with the surface-reflectance hypothesis and not the photoreceptor hypothesis, showing that the accuracy and precision of color memory are fundamentals to understanding the phenomenon of color constancy. == Significance to the evolution of trichromacy == While objects that possess canonical hues make up a small percentage of the objects which populate humans’ visual experience, the human visual system evolved in an environment populated with objects that possess canonical hues. This suggests that the memory color effect is related to the emergence of trichromacy because it has been argued that trichromacy evolved to optimize the ability to detect ripe fruits—objects that appear in canonical hues. == In perception research == In perception research, the memory color effect is cited as evidence for the opponent color theory, which states that four basic colors can be paired with its opponent color: red—green, blue—yellow. This explains why participants adjust the ripe banana color to a blueish tone to make its memory color yellow as gray. Researchers have also found empirical evidence that suggests memory color is recruited by the visual system to achieve color constancy. For example, participants had a lower percentage of color constancy when looking at a color incongruent scene, such as a purple banana, compared to a color diagnostical scene, a yellow banana. This suggests that color constancy is influenced by the color of objects that we are familiar with, which the memory color effect takes part.

    Read more →
  • Mehryar Mohri

    Mehryar Mohri

    Mehryar Mohri is a professor and theoretical computer scientist at the Courant Institute of Mathematical Sciences. He is also heading the Machine Learning Theory (ML Theory) team at Google Research. == Career == Prior to joining the Courant Institute, Mohri was a research department head and later technology leader at AT&T Bell Labs, where he was a member of the technical staff for about ten years. Mohri has also taught as an assistant professor at the University of Paris 7 (1992-1993) and Ecole Polytechnique (1992-1994). == Research == Mohri's main area of research is machine learning, in particular learning theory. He is also an expert in automata theory and algorithms. He is the author of several core algorithms that have served as the foundation for the design of many deployed speech recognition and natural language processing systems. == Publications == Mohri is the author of the reference book Foundations of Machine Learning used as a textbook in many graduate-level machine learning courses. Mohri is also a member of the Lothaire group of mathematicians with the pseudonym M. Lothaire and contributed to the book on Applied Combinatorics on Words. He is the author of more than 250 conference and journal publications. == Organizational affiliations == Mohri is currently the President of the Association for Algorithmic Learning Theory (AALT) and the Steering Committee Chair for the ALT conference. He is also Editorial Board member of Machine Learning and TheoretiCS, Action Editor of the Journal of Machine Learning Research (JMLR) and a member of the advisory board for the Journal of Automata, Languages and Combinatorics.

    Read more →
  • Sequential minimal optimization

    Sequential minimal optimization

    Sequential minimal optimization (SMO) is an algorithm for solving the quadratic programming (QP) problem that arises during the training of support-vector machines (SVM). It was invented by John Platt in 1998 at Microsoft Research. SMO is widely used for training support vector machines and is implemented by the popular LIBSVM tool. The publication of the SMO algorithm in 1998 has generated a lot of excitement in the SVM community, as previously available methods for SVM training were much more complex and required expensive third-party QP solvers. == Optimization problem == Consider a binary classification problem with a dataset (x1, y1), ..., (xn, yn), where xi is an input vector and yi ∈ {-1, +1} is a binary label corresponding to it. A soft-margin support vector machine is trained by solving a quadratic programming problem, which is expressed in the dual form as follows: max α ∑ i = 1 n α i − 1 2 ∑ i = 1 n ∑ j = 1 n y i y j K ( x i , x j ) α i α j , {\displaystyle \max _{\alpha }\sum _{i=1}^{n}\alpha _{i}-{\frac {1}{2}}\sum _{i=1}^{n}\sum _{j=1}^{n}y_{i}y_{j}K(x_{i},x_{j})\alpha _{i}\alpha _{j},} subject to: 0 ≤ α i ≤ C , for i = 1 , 2 , … , n , {\displaystyle 0\leq \alpha _{i}\leq C,\quad {\mbox{ for }}i=1,2,\ldots ,n,} ∑ i = 1 n y i α i = 0 {\displaystyle \sum _{i=1}^{n}y_{i}\alpha _{i}=0} where C is an SVM hyperparameter and K(xi, xj) is the kernel function, both supplied by the user; and the variables α i {\displaystyle \alpha _{i}} are Lagrange multipliers. == Algorithm == SMO is an iterative algorithm for solving the optimization problem described above. SMO breaks this problem into a series of smallest possible sub-problems, which are then solved analytically. Because of the linear equality constraint involving the Lagrange multipliers α i {\displaystyle \alpha _{i}} , the smallest possible problem involves two such multipliers. Then, for any two multipliers α 1 {\displaystyle \alpha _{1}} and α 2 {\displaystyle \alpha _{2}} , the constraints are reduced to: 0 ≤ α 1 , α 2 ≤ C , {\displaystyle 0\leq \alpha _{1},\alpha _{2}\leq C,} y 1 α 1 + y 2 α 2 = k , {\displaystyle y_{1}\alpha _{1}+y_{2}\alpha _{2}=k,} and this reduced problem can be solved analytically: one needs to find a minimum of a one-dimensional quadratic function. k {\displaystyle k} is the negative of the sum over the rest of terms in the equality constraint, which is fixed in each iteration. The algorithm proceeds as follows: Find a Lagrange multiplier α 1 {\displaystyle \alpha _{1}} that violates the Karush–Kuhn–Tucker (KKT) conditions for the optimization problem. Pick a second multiplier α 2 {\displaystyle \alpha _{2}} and optimize the pair ( α 1 , α 2 ) {\displaystyle (\alpha _{1},\alpha _{2})} . Repeat steps 1 and 2 until convergence. When all the Lagrange multipliers satisfy the KKT conditions (within a user-defined tolerance), the problem has been solved. Although this algorithm is guaranteed to converge, heuristics are used to choose the pair of multipliers so as to accelerate the rate of convergence. This is critical for large data sets since there are n ( n − 1 ) / 2 {\displaystyle n(n-1)/2} possible choices for α i {\displaystyle \alpha _{i}} and α j {\displaystyle \alpha _{j}} . == Related work == The first approach to splitting large SVM learning problems into a series of smaller optimization tasks was proposed by Bernhard Boser, Isabelle Guyon, and Vladimir Vapnik. It is known as the "chunking algorithm". The algorithm starts with a random subset of the data, solves this problem, and iteratively adds examples which violate the optimality conditions. One disadvantage of this algorithm is that it is necessary to solve QP-problems scaling with the number of SVs. On real world sparse data sets, SMO can be more than 1000 times faster than the chunking algorithm. In 1997, E. Osuna, R. Freund, and F. Girosi proved a theorem which suggests a whole new set of QP algorithms for SVMs. By the virtue of this theorem a large QP problem can be broken down into a series of smaller QP sub-problems. A sequence of QP sub-problems that always add at least one violator of the Karush–Kuhn–Tucker (KKT) conditions is guaranteed to converge. The chunking algorithm obeys the conditions of the theorem, and hence will converge. The SMO algorithm can be considered a special case of the Osuna algorithm, where the size of the optimization is two and both Lagrange multipliers are replaced at every step with new multipliers that are chosen via good heuristics. The SMO algorithm is closely related to a family of optimization algorithms called Bregman methods or row-action methods. These methods solve convex programming problems with linear constraints. They are iterative methods where each step projects the current primal point onto each constraint.

    Read more →