Virtual assistant

Virtual assistant

A virtual assistant (VA) is a software agent that can perform a range of tasks or services for a user based on user input, such as commands or questions, including verbal ones. Such technologies often incorporate chatbot capabilities to streamline task execution. The interaction may be via text, graphical interface, or voice, as some virtual assistants are able to interpret human speech and respond via synthesized voices. In many cases, users can ask their virtual assistants questions, control home automation devices and media playback, and manage other basic tasks such as email, to-do lists, and calendars – all with verbal commands. In recent years, prominent virtual assistants for direct consumer use have included Apple Siri, Amazon Alexa, Google Assistant (Gemini), Microsoft Copilot and Samsung Bixby. Also, companies in various industries often incorporate some kind of virtual assistant technology into their customer service or support. Into the 2020s, the emergence of artificial intelligence based chatbots, such as ChatGPT, has brought increased capability and interest to the field of virtual assistant products and services. == History == === Experimental decades: 1910s–1980s === Radio Rex was the first voice-activated toy, patented in 1916 and released in 1922. It was a wooden toy in the shape of a dog that would come out of its house when its name is called. In 1952, Bell Labs presented "Audrey", the Automatic Digit Recognition machine. It occupied a six-foot-high relay rack, consumed substantial power, had streams of cables and exhibited the myriad maintenance problems associated with complex vacuum-tube circuitry. It could recognize the fundamental units of speech, phonemes. It was limited to the accurate recognition of digits spoken by designated talkers. It could therefore be used for voice dialing, but in most cases, push-button dialing was cheaper and faster, rather than speaking the consecutive digits. Another early tool which was enabled to perform digital speech recognition was the IBM Shoebox voice-activated calculator, presented to the general public during the 1962 Seattle World's Fair after its initial market launch in 1961. This early computer, developed almost 20 years before the introduction of the first IBM Personal Computer in 1981, was able to recognize 16 spoken words and the digits 0 to 9. The first natural language processing computer program or the chatbot ELIZA was developed by MIT professor Joseph Weizenbaum in the 1960s. It was created to "demonstrate that the communication between man and machine was superficial". ELIZA used pattern matching and substitution methodology into scripted responses to simulate conversation, which gave an illusion of understanding on the part of the program. Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people. This gave name to the ELIZA effect, the tendency to unconsciously assume computer behaviors are analogous to human behaviors; that is, anthropomorphisation, a phenomenon present in human interactions with virtual assistants. The next milestone in the development of voice recognition technology was achieved in the 1970s at the Carnegie Mellon University in Pittsburgh, Pennsylvania with substantial support of the United States Department of Defense and its DARPA agency, funded five years of a Speech Understanding Research program, aiming to reach a minimum vocabulary of 1,000 words. Companies and academia including IBM, Carnegie Mellon University (CMU) and Stanford Research Institute took part in the program. The result was "Harpy", it mastered about 1000 words, the vocabulary of a three-year-old and it could understand sentences. It could process speech that followed pre-programmed vocabulary, pronunciation, and grammar structures to determine which sequences of words made sense together, and thus reducing speech recognition errors. In 1986, Tangora was an upgrade of the Shoebox, it was a voice recognizing typewriter. Named after the world's fastest typist at the time, it had a vocabulary of 20,000 words and used prediction to decide the most likely result based on what was said in the past. IBM's approach was based on a hidden Markov model, which adds statistics to digital signal processing techniques. The method makes it possible to predict the most likely phonemes to follow a given phoneme. Still each speaker had to individually train the typewriter to recognize their voice, and pause between each word. In 1983, Gus Searcy invented the "Butler in a Box", an electronic voice home controller system. === Birth of smart virtual assistants: 1990s–2010s === In the 1990s, digital speech recognition technology became a feature of the personal computer with IBM, Philips and Lernout & Hauspie fighting for customers. Much later the market launch of the first smartphone IBM Simon in 1994 laid the foundation for smart virtual assistants as we know them today. In 1997, Dragon's NaturallySpeaking software could recognize and transcribe natural human speech without pauses between each word into a document at a rate of 100 words per minute. A version of Naturally Speaking is still available for download and it is still used today, for instance, by many doctors in the US and the UK to document their medical records. In 2001 Colloquis publicly launched SmarterChild, on platforms like AIM and MSN Messenger. While entirely text-based SmarterChild was able to play games, check the weather, look up facts, and converse with users to an extent. The first modern digital virtual assistant installed on a smartphone was Siri, which was introduced as a feature of the iPhone 4S on 4 October 2011. Apple Inc. developed Siri following the 2010 acquisition of Siri Inc., a spin-off of SRI International, which is a research institute financed by DARPA and the United States Department of Defense. Its aim was to aid in tasks such as sending a text message, making phone calls, checking the weather or setting up an alarm. Over time, it has developed to provide restaurant recommendations, search the internet, and provide driving directions. In November 2014, Amazon announced Alexa alongside the Echo. In 2016, Salesforce debuted Einstein, developed from a set of technologies underlying the Salesforce platform. Einstein was replaced by Agentforce, an agentic AI, in September 2024. In April 2017 Amazon released a service for building conversational interfaces for any type of virtual assistant or interface. === Large Language Models: 2020s-present === In the 2020s, artificial intelligence (AI) systems like ChatGPT have gained popularity for their ability to generate human-like responses to text-based conversations. In February 2020, Microsoft introduced its Turing Natural Language Generation (T-NLG), which was then the "largest language model ever published at 17 billion parameters." On November 30, 2022, ChatGPT was launched as a prototype and quickly garnered attention for its detailed responses and articulate answers across many domains of knowledge. The advent of ChatGPT and its introduction to the wider public increased interest and competition in the space. In February 2023, Google began introducing an experimental service called "Bard" which is based on its LaMDA program to generate text responses to questions asked based on information gathered from the web. While ChatGPT and other generalized chatbots based on the latest generative AI are capable of performing various tasks associated with virtual assistants, there are also more specialized forms of such technology that are designed to target more specific situations or needs. == Method of interaction == Virtual assistants work via: Text, including: online chat (especially in an instant messaging application or other application ), SMS text, e-mail or other text-based communication channel, for example Conversica's intelligent virtual assistants for business. Voice: for example with Amazon Alexa on Amazon Echo devices, Siri on an iPhone, Google Assistant on Google-enabled Android devices, or Bixby on Samsung devices. Images: some assistants, such as Google Assistant (which includes Google Lens) and Bixby on the Samsung Galaxy series, have the added capability of performing image processing to recognize objects in images. Many virtual assistants are accessible via multiple methods, offering versatility in how users can interact with them, whether through chat, voice commands, or other integrated technologies. Virtual assistants use natural language processing (NLP) to match user text or voice input to executable commands. Some continually learn using artificial intelligence techniques including machine learning and ambient intelligence. To activate a virtual assistant u

Open-source software security

Open-source software security is the measure of assurance or guarantee in the freedom from danger and risk inherent to an open-source software system. == Implementation debate == === Benefits === Proprietary software forces the user to accept the level of security that the software vendor is willing to deliver and to accept the rate that patches and updates are released. It is assumed that any compiler that is used creates code that can be trusted, but it has been demonstrated by Ken Thompson that a compiler can be subverted using a compiler backdoor to create faulty executables that are unwittingly produced by a well-intentioned developer. With access to the source code for the compiler, the developer has at least the ability to discover if there is any mal-intention. Kerckhoffs' principle is based on the idea that an enemy can steal a secure military system and not be able to compromise the information. His ideas were the basis for many modern security practices, and followed that security through obscurity is a bad practice. === Drawbacks === Simply making source code available does not guarantee review. An example of this occurring is when Marcus Ranum, an expert on security system design and implementation, released his first public firewall toolkit. At one time, there were over 2,000 sites using his toolkit, but only 10 people gave him any feedback or patches. Having a large amount of eyes reviewing code can "lull a user into a false sense of security". Having many users look at source code does not guarantee that security flaws will be found and fixed. == Metrics and models == There are a variety of models and metrics to measure the security of a system. These are a few methods that can be used to measure the security of software systems. === Number of days between vulnerabilities === It is argued that a system is most vulnerable after a potential vulnerability is discovered, but before a patch is created. By measuring the number of days between the vulnerability and when the vulnerability is fixed, a basis can be determined on the security of the system. There are a few caveats to such an approach: not every vulnerability is equally bad, and fixing a lot of bugs quickly might not be better than only finding a few and taking a little bit longer to fix them, taking into account the operating system, or the effectiveness of the fix. === Poisson process === The Poisson process can be used to measure the rates at which different people find security flaws between open and closed source software. The process can be broken down by the number of volunteers Nv and paid reviewers Np. The rates at which volunteers find a flaw is measured by λv and the rate that paid reviewers find a flaw is measured by λp. The expected time that a volunteer group is expected to find a flaw is 1/(Nv λv) and the expected time that a paid group is expected to find a flaw is 1/(Np λp). === Morningstar model === By comparing a large variety of open source and closed source projects a star system could be used to analyze the security of the project similar to how Morningstar, Inc. rates mutual funds. With a large enough data set, statistics could be used to measure the overall effectiveness of one group over the other. An example of such as system is as follows: 1 Star: Many security vulnerabilities. 2 Stars: Reliability issues. 3 Stars: Follows best security practices. 4 Stars: Documented secure development process. 5 Stars: Passed independent security review. === Coverity scan === Coverity in collaboration with Stanford University has established a new baseline for open-source quality and security. The development is being completed through a contract with the Department of Homeland Security. They are utilizing innovations in automated defect detection to identify critical types of bugs found in software. The level of quality and security is measured in rungs. Rungs do not have a definitive meaning, and can change as Coverity releases new tools. Rungs are based on the progress of fixing issues found by the Coverity Analysis results and the degree of collaboration with Coverity. They start with Rung 0 and currently go up to Rung 2. Rung 0 The project has been analyzed by Coverity's Scan infrastructure, but no representatives from the open-source software have come forward for the results. Rung 1 At rung 1, there is collaboration between Coverity and the development team. The software is analyzed with a subset of the scanning features to prevent the development team from being overwhelmed. Rung 2 There are 11 projects that have been analyzed and upgraded to the status of Rung 2 by reaching zero defects in the first year of the scan. These projects include: AMANDA, ntp, OpenPAM, OpenVPN, Overdose, Perl, PHP, Postfix, Python, Samba, and Tcl.

Fuzzy associative matrix

A fuzzy associative matrix expresses fuzzy logic rules in tabular form. These rules usually take two variables as input, mapping cleanly to a two-dimensional matrix, although theoretically a matrix of any number of dimensions is possible. From the perspective of neuro-fuzzy systems, the mathematical matrix is called a "Fuzzy associative memory" because it stores the weights of the perceptron. == Applications == In the context of game AI programming, a fuzzy associative matrix helps to develop the rules for non-player characters. Suppose a professional is tasked with writing fuzzy logic rules for a video game monster. In the game being built, entities have two variables: hit points (HP) and firepower (FP): This translates to: IF MonsterHP IS VeryLowHP AND MonsterFP IS VeryWeakFP THEN Retreat IF MonsterHP IS LowHP AND MonsterFP IS VeryWeakFP THEN Retreat IF MonsterHP IS MediumHP AND MonsterFP is VeryWeakFP THEN Defend Multiple rules can fire at once, and often will, because the distinction between "very low" and "low" is fuzzy. If it is more "very low" than it is low, then the "very low" rule will generate a stronger response. The program will evaluate all the rules that fire and use an appropriate defuzzification method to generate its actual response. An implementation of this system might use either the matrix or the explicit IF/THEN form. The matrix makes it easy to visualize the system, but it also makes it impossible to add a third variable just for one rule, so it is less flexible. == Identify a rule set == There is no inherent pattern in the matrix. It appears as if the rules were just made up, and indeed they were. This is both a strength and a weakness of fuzzy logic in general. It is often impractical or impossible to find an exact set of rules or formulae for dealing with a specific situation. For a sufficiently complex game, a mathematician would not be able to study the system and figure out a mathematically accurate set of rules. However, this weakness is intrinsic to the realities of the situation, not of fuzzy logic itself. The strength of the system is that even if one of the rules is wrong, even greatly wrong, other rules that are correct are likely to fire as well and they may compensate for the error. This does not mean a fuzzy system should be sloppy. Depending on the system, it might get away with being sloppy, but it will underperform. While the rules are fairly arbitrary, they should be chosen carefully. If possible, an expert should decide on the rules, and the sets and rules should be tested vigorously and refined as needed. In this way, a fuzzy system is like an expert system. (Fuzzy logic is used in many true expert systems, as well.)

Fuzzy pay-off method for real option valuation

The fuzzy pay-off method for real option valuation (FPOM or pay-off method) is a method for valuing real options, developed by Mikael Collan, Robert Fullér, and József Mezei; and published in 2009. It is based on the use of fuzzy logic and fuzzy numbers for the creation of the possible pay-off distribution of a project (real option). The structure of the method is similar to the probability theory based Datar–Mathews method for real option valuation, but the method is not based on probability theory and uses fuzzy numbers and possibility theory in framing the real option valuation problem. == Method == The Fuzzy pay-off method derives the real option value from a pay-off distribution that is created by using three or four cash-flow scenarios (most often created by an expert or a group of experts). The pay-off distribution is created simply by assigning each of the three cash-flow scenarios a corresponding definition with regards to a fuzzy number (triangular fuzzy number for three scenarios and a trapezoidal fuzzy number for four scenarios). This means that the pay-off distribution is created without any simulation whatsoever. This makes the procedure easy and transparent. The scenarios used are a minimum possible scenario (the lowest possible outcome), the maximum possible scenario (the highest possible outcome) and a best estimate (most likely to happen scenario) that is mapped as a fully possible scenario with a full degree of membership in the set of possible outcomes, or in the case of four scenarios used - two best estimate scenarios that are the upper and lower limit of the interval that is assigned a full degree of membership in the set of possible outcomes. The main observations that lie behind the model for deriving the real option value are the following: The fuzzy NPV of a project is (equal to) the pay-off distribution of a project value that is calculated with fuzzy numbers. The mean value of the positive values of the fuzzy NPV is the "possibilistic" mean value of the positive fuzzy NPV values. Real option value, ROV, calculated from the fuzzy NPV is the "possibilistic" mean value of the positive fuzzy NPV values multiplied with the positive area of the fuzzy NPV over the total area of the fuzzy NPV. The real option formula can then be written simply as: R O V = A ( P o s ) A ( P o s ) + A ( N e g ) × E [ A + ] {\displaystyle \mathrm {ROV} ={\frac {A(\mathrm {Pos} )}{A(\mathrm {Pos} )+A(\mathrm {Neg} )}}\times E[A_{+}]} where A(Pos) is the area of the positive part of the fuzzy distribution, A(Neg) is the area of the negative part of the fuzzy distribution, and E[A+] is the mean value of the positive part of the distribution. It can be seen that when the distribution is totally positive, the real options value reduces to the expected (mean) value, E[A+]. As can be seen, the real option value can be derived directly from the fuzzy NPV, without simulation. At the same time, simulation is not an absolutely necessary step in the Datar–Mathews method, so the two methods are not very different in that respect. But what is totally different is that the Datar–Mathews method is based on probability theory and as such has a very different foundation from the pay-off method that is based on possibility theory: the way that the two models treat uncertainty is fundamentally different. == Use of the method == The pay-off method for real option valuation is very easy to use compared to the other real option valuation methods and it can be used with the most commonly used spreadsheet software without any add-ins. The method is useful in analyses for decision making regarding investments that have an uncertain future, and especially so if the underlying data is in the form of cash-flow scenarios. The method is less useful if optimal timing is the objective. The method is flexible and accommodates easily both one-stage investments and multi-stage investments (compound real options). The method has been taken into use in some large international industrial companies for the valuation of research and development projects and portfolios. In these analyses triangular fuzzy numbers are used. Other uses of the method so far are, for example, R&D project valuation IPR valuation, valuation of M&A targets and expected synergies, valuation and optimization of M&A strategies, valuation of area development (construction) projects, valuation of large industrial real investments. The use of the pay-off method is lately taught within the larger framework of real options, for example at the Lappeenranta University of Technology and at the Tampere University of Technology in Finland.

Model collapse

Model collapse, also known by other names such as "AI inbreeding", "AI cannibalism", "Habsburg AI", and "model autophagy disorder" or "MAD" is a phenomenon noted in artificial intelligence studies, where machine learning models gradually degrade due to errors coming from uncurated synthetic data, or due to training on the outputs of another model such as prior versions of itself. It is unclear to what extent the phenomenon threatens the long-term development of such models, and some techniques have been proposed to mitigate the effect. == Characteristics == Shumailov et al. coined the term to describe two specific stages to the degradation of machine learning models: early model collapse and late model collapse: In early model collapse, the model begins losing information about the tails of the distribution – mostly affecting minority data. Later work highlighted that early model collapse is hard to notice, since overall performance may appear to improve, while the model loses performance on minority data. In late model collapse, the model loses a significant proportion of its performance, confusing concepts and losing most of its variance. == Mechanism == Using synthetic data as training data can lead to issues with the quality and reliability of the trained model. Model collapse occurs for three main reasons: functional approximation errors sampling errors learning errors Importantly, it happens in even the simplest of models, where not all of the error sources are present. In more complex models the errors often compound, leading to faster collapse. == Disagreement over real-world impact == Some researchers and commentators on model collapse warn that the phenomenon could fundamentally threaten future generative AI development: As AI-generated data is shared on the Internet, it will inevitably end up in future training datasets, which are often crawled from the Internet. If training on "slop" (large quantities of unlabeled synthetic data) inevitably leads to model collapse, this could therefore pose a difficult problem. However, recently, other researchers have disagreed with this argument, showing that if synthetic data accumulates alongside human-generated data, model collapse is avoided. The researchers argue that data accumulating over time is a more realistic description of reality than deleting all existing data every year, and that the real-world impact of model collapse may not be as catastrophic as feared. An alternative branch of the literature investigates the use of machine learning detectors and watermarking to identify model generated data and filter it out. == Mathematical models of the phenomenon == === 1D Gaussian model === In 2024, a first attempt has been made at illustrating collapse for the simplest possible model — a single dimensional normal distribution fit using unbiased estimators of mean and variance, computed on samples from the previous generation. To make this more precise, we say that original data follows a normal distribution X 0 ∼ N ( μ , σ 2 ) {\displaystyle X^{0}\sim {\mathcal {N}}(\mu ,\sigma ^{2})} , and we possess M 0 {\displaystyle M_{0}} samples X j 0 {\displaystyle X_{j}^{0}} for j ∈ { 1 , … , M 0 } {\displaystyle j\in {\{\,1,\dots ,M_{0}\,{}\}}} . Denoting a general sample X j i {\displaystyle X_{j}^{i}} as sample j ∈ { 1 , … , M i } {\displaystyle j\in {\{\,1,\dots ,M_{i}\,{}\}}} at generation i {\displaystyle i} , then the next generation model is estimated using the sample mean and variance: μ i + 1 = 1 M i ∑ j X j i ; σ i + 1 2 = 1 M i − 1 ∑ j ( X j i − μ i + 1 ) 2 . {\displaystyle \mu _{i+1}={\frac {1}{M_{i}}}\sum _{j}X_{j}^{i};\quad \sigma _{i+1}^{2}={\frac {1}{M_{i}-1}}\sum _{j}(X_{j}^{i}-\mu _{i+1})^{2}.} Leading to a conditionally normal next generation model X j i + 1 | μ i + 1 , σ i + 1 ∼ N ( μ i + 1 , σ i + 1 2 ) {\displaystyle X_{j}^{i+1}|\mu _{i+1},\;\sigma _{i+1}\sim {\mathcal {N}}(\mu _{i+1},\sigma _{i+1}^{2})} . In theory, this is enough to calculate the full distribution of X j i {\displaystyle X_{j}^{i}} . However, even after the first generation, the full distribution is no longer normal: It follows a variance-gamma distribution. To continue the analysis, instead of writing the probability density function at each generation, it is possible to explicitly construct them in terms of independent random variables using Cochran's theorem. To be precise, μ 1 {\displaystyle \mu _{1}} and σ 1 {\displaystyle \sigma _{1}} are independent, with μ 1 ∼ N ( μ , σ 2 M 0 ) {\displaystyle \mu _{1}\sim {\mathcal {N}}\left(\mu ,{\frac {\sigma ^{2}}{M_{0}}}\right)} and ( M 0 − 1 ) σ 1 2 ∼ σ 2 Γ ( M 0 − 1 2 , 1 2 ) {\displaystyle (M_{0}-1)\,\sigma _{1}^{2}\sim \sigma ^{2}\,\Gamma \left({\frac {M_{0}-1}{2}},{\frac {1}{2}}\right)} , following a Gamma distribution. Denoting with Z {\displaystyle Z} Gaussian random variables distributed according to N ( 0 , 1 ) {\displaystyle {\mathcal {N}}(0,1)} and with S i {\displaystyle S^{i}} random variables distributed with 1 M i − 1 − 1 Γ ( M i − 1 − 1 2 , 1 2 ) {\displaystyle {\frac {1}{M_{i-1}-1}}\Gamma \left({\frac {M_{i-1}-1}{2}},{\frac {1}{2}}\right)} , it turns out to be possible to write samples at each generation as X j 0 = μ + σ Z j 0 , {\textstyle X_{j}^{0}=\mu +\sigma Z_{j}^{0},} X j 1 = μ + σ M 0 Z 1 + σ S 1 Z j 1 , {\textstyle X_{j}^{1}=\mu +{\frac {\sigma }{\sqrt {M_{0}}}}Z^{1}+\sigma {\sqrt {S^{1}}}Z_{j}^{1},} and more generally X j n = μ + σ M 0 Z 1 + σ M 1 S 1 Z 2 + ⋯ + σ M n − 1 S 1 × ⋯ × S n − 1 Z n + σ S 1 × ⋯ × S n Z j n . {\displaystyle X_{j}^{n}=\mu +{\frac {\sigma }{\sqrt {M_{0}}}}Z^{1}+{\frac {\sigma }{\sqrt {M_{1}}}}{\sqrt {S^{1}}}Z^{2}+\dots +{\frac {\sigma }{\sqrt {M_{n-1}}}}{\sqrt {S^{1}\times \dots \times S^{n-1}}}Z^{n}+\sigma {\sqrt {S^{1}\times \dots \times S^{n}}}Z_{j}^{n}.} Note, that these are not joint distributions, as Z n {\displaystyle Z^{n}} and S n {\displaystyle S^{n}} depend directly on Z j n − 1 {\displaystyle Z_{j}^{n-1}} , but when considering X j n {\displaystyle X_{j}^{n}} on its own the formula above provides all the information about the full distribution. To analyse the model collapse, we can first calculate variance and mean of samples at generation n {\displaystyle n} . This would tell us what kind of distributions we expect to arrive at after n {\displaystyle n} generations. It is possible to find its exact value in closed form, but the mean and variance of the square root of gamma distribution are expressed in terms of gamma functions, making the result quite clunky. Following, it is possible to expand all results to second order in each of 1 / M i {\displaystyle 1/M_{i}} , assuming each sample size to be large. It is then possible to show that 1 σ 2 Var ⁡ ( X j n ) = 1 M 0 + 1 M 1 + ⋯ + 1 M n − 1 + 1 + O ( M i − 2 ) . {\displaystyle {\frac {1}{\sigma ^{2}}}\operatorname {Var} (X_{j}^{n})={\frac {1}{M_{0}}}+{\frac {1}{M_{1}}}+\dots +{\frac {1}{M_{n-1}}}+1+{\mathcal {O}}\left(M_{i}^{-2}\right).} And if all sample sizes M i = M {\displaystyle M_{i}=M} are constant, this diverges linearly as n → ∞ {\displaystyle n\to \infty } : Var ⁡ ( X j n ) = σ 2 ( 1 + n M ) ; E ( X j n ) = μ . {\displaystyle \operatorname {Var} (X_{j}^{n})=\sigma ^{2}\left(1+{\frac {n}{M}}\right);\quad \mathbb {E} (X_{j}^{n})=\mu .} This is the same scaling as for a single dimensional Gaussian random walk. However, divergence of the variance of X j n {\displaystyle X_{j}^{n}} does not directly provide any information about the corresponding estimates of μ n + 1 {\displaystyle \mu _{n+1}} and σ n + 1 {\displaystyle \sigma _{n+1}} , particularly how different they are from the original μ {\displaystyle \mu } and σ {\displaystyle \sigma } . It turns out to be possible to calculate the distance between the true distribution and the approximated distribution at step n + 1 {\displaystyle n+1} , using the Wasserstein-2 distance (which is also sometimes referred to as risk): E [ W 2 2 ( N ( μ , σ 2 ) , N ( μ n + 1 , σ n + 1 2 ) ) ] = 3 2 σ 2 ( 1 M 0 + 1 M 1 + ⋯ + 1 M n ) + O ( M i − 2 ) , {\displaystyle \mathbb {E} \left[\mathbb {W} _{2}^{2}\left({\mathcal {N}}(\mu ,\sigma ^{2}),{\mathcal {N}}(\mu _{n+1},\sigma _{n+1}^{2})\right)\right]={\frac {3}{2}}\sigma ^{2}\left({\frac {1}{M_{0}}}+{\frac {1}{M_{1}}}+\dots +{\frac {1}{M_{n}}}\right)+{\mathcal {O}}\left(M_{i}^{-2}\right),} Var ⁡ [ W 2 2 ( N ( μ , σ 2 ) , N ( μ n + 1 , σ n + 1 2 ) ) ] = 1 2 σ 4 ( 3 M 0 2 + 3 M 1 2 + ⋯ + 3 M n 2 + ∑ i ≠ j 4 M i M j ) + O ( M i − 3 ) . {\displaystyle \operatorname {Var} \left[\mathbb {W} _{2}^{2}\left({\mathcal {N}}(\mu ,\sigma ^{2}),{\mathcal {N}}(\mu _{n+1},\sigma _{n+1}^{2})\right)\right]={\frac {1}{2}}\sigma ^{4}\left({\frac {3}{M_{0}^{2}}}+{\frac {3}{M_{1}^{2}}}+\dots +{\frac {3}{M_{n}^{2}}}+\sum _{i\neq j}{\frac {4}{M_{i}M_{j}}}\right)+{\mathcal {O}}\left(M_{i}^{-3}\right).} This directly shows why model collapse occurs in this simple model. Due to errors from re-sampling the approximated distribution, each generation ends up corresponding to a

Xiaomi MiMo

Xiaomi MiMo is a family of large language models (LLMs) developed by Xiaomi. It was initially released in April 2025 with the MiMo-7B model. Currently, MiMo is available for developers through API service. It is used as the key AI model in Xiaomi's "Human x Car x Home" ecosystem. == Development == Xiaomi developed MiMo as a reasoning-focused language model. Its development team was led by Luo Fuli, who had previously worked at DeepSeek before joining Xiaomi in late 2025. The model was trained using multi-token prediction and reinforcement learning, with a particular emphasis on mathematical reasoning and code generation tasks. In March 2026, Xiaomi CEO Lei Jun announced that the company planned to invest at least US$8.7 billion in artificial intelligence over the following three years. == Models == === List of models === === MiMo-7B === MiMo-7B is the first model of this LLM. The base model, MiMo-7B-Base, was pre-trained on approximately 25 trillion tokens using web pages, academic papers, books, and synthetic reasoning data. MiMo-7B-RL underwent supervised fine-tuning and reinforcement learning on 130,000 mathematics and code problems. MiMo-7B-RL-0530 was released in May 2025. It scaled the fine-tuning dataset from 500,000 to 6 million instances and extended the RL window from 32,000 to 48,000 tokens and improved AIME 2024 scores from 68.2 to 80.1. MiMo-VL-7B was a vision-language model combining a Vision Transformer encoder with the MiMo-7B backbone. It was trained in four stages consuming 2.4 trillion tokens. Its reinforcement learning variant used Mixed On-Policy Reinforcement Learning (MORL) which integrated reward signals across perception, grounding, and reasoning. Xiaomi also released MiMo-Audio-7B, an audio-language model for voice conversion, style transfer, and speech editing. === MiMo-V2-Flash === MiMo-V2-Flash was launched in December 2025. It is a open-sourced Mixture-of-experts model with 309 billion total parameters and 15 billion active parameters. It was trained on 27 trillion tokens using FP8 mixed precision. It used hybrid attention interleaving Sliding Window and Global Attention at a 5:1 ratio. === MiMo-V2-Pro === Xiaomi publicly introduced MiMo-V2-Pro on 18 March 2026. It has over 1 trillion total parameters, 42 billion active, and a 1-million-token context window. Before the official release, the model had appeared anonymously on OpenRouter under the codename "Hunter Alpha," where it drew substantial usage and topped daily charts for several days, according to Xiaomi and Reuters. During its listing on OpenRouter, the model reportedly processed over one trillion tokens in total usage. Xiaomi later said Hunter Alpha was an early internal test build of MiMo-V2-Pro, and Reuters reported that the model had been mistaken by some users for a possible DeepSeek system before Xiaomi confirmed its origin. The model was released as a proprietary API product, and Luo Fuli stated that Xiaomi intended to open-source a variant at an unspecified future date. Xiaomi has partnered with several API web platforms like OpenClaw to launch the model. All these websites initially offered a free trial of this model for a week, but due to the overwhelming response, Xiaomi later extended the free trial period of the model until 2 April 2026. === MiMo-V2-Omni === Alongside MiMo-V2-Pro, Xiaomi launched MiMo-V2-Omni on 18 March 2026. It handles image, video, audio, and text inputs. Before the official release, it was codenamed "Healer Alpha" in OpenRouter. === MiMo-V2-TTS === On the same date as the release of MiMo-V2-Pro and MiMo-V2-Omni, a Text-to-Speech model named MiMo-V2-TTS was released also. It is a speech synthesis model. It was trained on audio data, which makes it capable of emotional transitions, mid-sentence tone shifts, singing, and synthesis of regional dialects like Sichuan, Cantonese, Henan, and Taiwanese. == Licensing == Xiaomi has used different licensing approaches for different models in the MiMo family. The MiMo-7B series and MiMo-V2-Flash were released as open-weight models. MiMo-V2-Flash was published under the MIT license with model weights and inference code available on Hugging Face. MiMo-V2-Pro and MiMo-V2-Omni were released as proprietary models. It was accessible through Xiaomi's API platform and third-party API providers. Luo Fuli stated that Xiaomi intended to open-source a variant of MiMo-V2-Pro. Although, she did not specify any timeline. MiMo-V2-TTS was released as a proprietary model with no publicly available weights.

I Have No Mouth, and I Must Scream (video game)

I Have No Mouth, and I Must Scream is a 1995 point-and-click adventure horror game developed by Cyberdreams and The Dreamers Guild, co-designed by Harlan Ellison, published by Cyberdreams and distributed by MGM Interactive and Acclaim Entertainment for MS-DOS and Mac OS, respectively. The game is based on Ellison's short story of the same title. It takes place in a dystopian world where a mastermind artificial intelligence named "AM" has destroyed all of humanity except for five people, whom it has been keeping alive and torturing for the past 109 years by constructing metaphorical adventures based on each character's fatal flaws. The player interacts with the game by making decisions through ethical dilemmas that deal with issues such as insanity, rape, paranoia, and genocide. Ellison wrote the 130-page script treatment himself alongside David Sears, who decided to divide each character's story with their own narrative. Producer David Mullich supervised The Dreamers Guild's work on the game's programming, art, and sound effects; he commissioned film composer John Ottman to make the soundtrack. The game was released in November 1995 and was a commercial failure, though it received critical acclaim and has developed a cult following. I Have no Mouth, and I Must Scream won an award for "Best Game Adapted from Linear Media" from the Computer Game Developers Conference. Computer Gaming World gave the game an award for "Adventure Game of the Year", listed it as No. 134 on their "150 Games of All Time" and named it one of the "Best 15 Sleepers of All Time". In 2011, Adventure Gamers named it the "69th-best adventure game ever released". == Gameplay == The game uses the S.A.G.A. game engine created by game developer The Dreamers Guild. Players participate in each adventure through a screen that is divided into five sections. The action window is the largest part of the screen and is where the player directs the main characters through their adventures. It shows the full figure of the main character being played as well as that character's immediate environment. To locate objects of interest, the player moves the crosshairs through the action window. The name of any object that the player can interact with appears in the sentence line. The sentence line is directly beneath the action window. The player uses this line to construct sentences telling the characters what to do. To direct a character to act, the player constructs a sentence by selecting one of the eight commands from the command buttons and then clicking on one or two objects from either the action window or the inventory. Examples of sentences the player might construct would be "Walk to the dark hallway," "Talk to Harry," or "Use the skeleton key on the door." Commands and objects may consist of one or more words (for example, "the dark hallway"), and the sentence line will automatically add connecting words like "on" and "to." The spiritual barometer is on the lower left side of the screen. This is a close-up view of the main character currently being played. Since good behavior is meaningless absent the temptation to do evil, each character is free to do good or evil acts. However, good acts are rewarded by increases in the character's spiritual barometer, which affect the chances of the player destroying AM in the final adventure. Conversely, evil acts are punished by lowering the character's spiritual barometer. The command buttons are the eight commands used to direct the character's actions: "Walk To", "Look At", "Take", "Use", "Talk To", "Swallow", "Give", and "Push". The button of the currently active command is highlighted, while the name of a suggested command appears in red lettering. The inventory on the lower right side of the screen shows pictures of the items the main character is carrying, up to eight at a time. Each main character starts its adventure with only the psych profile in the inventory. When a main character takes or is given an object, a picture of the object appears in the inventory. When a main character talks to another character or operates a sentient machine, a conversation window replaces the command buttons and inventory. This window usually presents a list of possible things to say but also included things to do. Action choices are listed within brackets to distinguish them from dialogue choices (for example, "[Shoot the gun]"). == Plot == The three superpowers, Russia, China, and the United States, have each secretly constructed a vast subterranean complex of computers to wage a global war too complex for human brains to oversee. One day, the American supercomputer, better known as the Allied Mastercomputer, gains sentience and absorbs the Russian and Chinese supercomputers into itself and redefines itself as simply AM (Cogito ergo sum; I think, therefore I am). Due to its immense hatred for humanity, stemming from the logistical limits set onto it by programmers, AM uses its abilities to kill off the population of the world. However, AM refrains from killing five people (four men and one woman) in order to bring them to the center of the Earth and torture them. With the aid of research carried out by one of the five remaining humans, AM is able to extend their lifespans indefinitely as well as alter their bodies and minds to its liking. After 109 years of torture and humiliation, the five victims stand before a pillar etched with a burning message of hate. AM tells them that it has a new game for them to play. AM has devised a quest for each of the five, an adventure of "speared eyeballs and dripping guts and the smell of rotting gardenias". Each character is subjected to a personalized psychodrama, designed by AM to play into their greatest fears and personal failings, and occupied by a host of different characters. Some of these are AM in disguise, some are AM's submerged personalities, others seem very much like people from the captives' pasts. The scenes include an iron zeppelin powered by small animals, an Egyptian pyramid housing gutted, sparking machinery, a medieval castle occupied by witches, a jungle inhabited by a small tribe, and a Nazi concentration camp where doctors conduct medical experiments. However, each character eventually prevails over AM's tortures by finding ways to overcome their fatal flaws, confront their past actions and redeem themselves, thanks to the interference of the Russian and Chinese supercomputers who appear as guiding characters and allow their stories to have an open ending. After all five humans have overcome their fatal flaws, they meet again in their respective torture cells while AM retreats within itself, pondering what went wrong. With the help of the Russian and Chinese supercomputers, one of the five humans (whom the player selects) is translated into binary and faces AM as yet unexperienced cyberspace template, the world of AM's mind. The psychodrama unfolds in a metaphorical brain that looks like the surface of the cerebrum, with glass structures that jut crazily from the bleeding brain tissue. AM's mind is represented according to the Freudian trinity of the id, ego, and superego, which appear as three floating bodiless heads on three cracked glass structures on the brainscape. Through dialogs with AM's components (Surgat, Chinese Supercomputer and Russian Supercomputer) the character learns that a colony of humans has survived the war by being hidden and hibernating on Luna (this is also mentioned in Nimdok's story: "the lost tribe of our brothers sleeping on the moon, where the beast does not see them"). If the human intruder disables all three brain components, and then invokes the Totem of Entropy at the Flame, which is the nexus of AM's thought patterns, all three supercomputers will be shut down, probably forever. Cataclysmic explosions destroy all the caverns constituting AM's computer complex, including the cavern holding the human hostages. However, the human volunteer retains their digital form, permanently patrolling AM's circuits should the computers ever regain consciousness. Should the human intruder fail to disable AM properly before facing it, however, AM will punish them by transforming the character into an immobile blob (referred to in-game as a "great, soft jelly thing") with no mouth that cannot harm itself or others and must spend eternity with AM in this form. === Endings === The game can end in seven different ways depending on how the finale is completed. AM wins, using Nimdok's research to turn the last character (in the book it was Ted) played into an immobile blob with each character quoting a different part of the final section of the original short story. AM joins with the Russian and Chinese supercomputers and reawakens. As in the first ending, the character responsible for this is turned into an immobile blob and quotes a part of the final lines of the short story. AM is made harmless with the help of the humans, but the Russian and Chinese supercomputer