Top 10 AI Text-to-video Tools Compared (2026)

Top 10 AI Text-to-video Tools Compared (2026)

Trying to pick the best AI text-to-video tool? An AI text-to-video tool is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI text-to-video tool slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

LTX (text-to-video model)

LTX is a family of open source artificial intelligence video foundation models developed by Lightricks, and first released in November 2024. The latest models, LTX-2, create videos based on user prompts. They were preceded by LTX Video, which was released in 2024 as the company's first text-to-video model. LTX-2 is part of the LTX family of video generation models, which form the core technology, alongside LTX Studio, of the LTX ecosystem. == History == === Origins: LTX Video (2024–2025) === In November 2024 Lightricks publicly released its first text-to-video model, LTX Video. It was a 2-billion parameter model, available as open source. In May 2025 Lightricks launched LTXV-13b, a version with 13-billion parameters. Two months later, the model broke the 60 second barrier for generated video. === Release of LTX-2 (2025) === In October 2025 Lightricks announced its latest model, and renamed it LTX-2. The model was described as capable of generating synchronized audio and video at native 4K resolution and up to 50 frames per second (fps), using a variety of conditions and prompts, including text-to-video and image-to-video. Google highlighted the fact that LTX-2 was trained on its infrastructure, and saying it was "The first open source AI video generation model, powered by Google Cloud". Upon its release it was ranked in the top-3 models for image-to-video creation by Artificial Analysis, behind Kling 3.5 by Kling AI and Veo 3.1 by Google. Its text-to-image option was ranked 7th. In addition to its open-source release, Lightricks offers API access to LTX-2, allowing developers to generate videos from text and image prompts through a hosted service without running the model locally. === Open Source Release (2026) === In January 2026, Lightricks officially released the full open-source version of LTX-2, making the model’s complete codebase, weights, and associated tooling publicly available. In March 2026 the company released LTX-2.3, which was accompanied by a desktop video editor enabling the entire model to run locally on consumer hardware. == Technical features == === Advancements over LTX Video === LTX-2 builds upon the LTX Video architecture with several major improvements: Unified audio-video generation producing synchronized dialogue, ambience, and motion Native 4K rendering 50-fps output for cinematic motion Three operational modes (Fast, Pro, Ultra) More efficient diffusion pipelines enabling high fidelity on consumer GPUs === Core capabilities === Text-to-video generation Image-to-video generation Multimodal audiovisual synthesis High-resolution spatial and temporal coherence Configurable quality/performance settings Open-source distribution of weights and datasets == Reception == Initial reception to LTX-2 was broadly positive, with several technology and media outlets highlighting its open-source approach and multimodal capabilities. Open Source For You described LTX-2 as “one of the first AI video systems to combine 4K output, synchronized audio, and an open model release,” noting that it positioned Lightricks as a significant competitor to proprietary systems such as OpenAI's Sora and Google's Veo. IEA Green said that the model “could rewrite the AI filmmaking game,” emphasizing that its 50-fps rendering and unified audio-video generation made it suitable for professional studios and independent creators alike. AI News characterized LTX-2 as a “major step forward in the democratization of cinematic-quality video generation,” praising its consumer-grade hardware efficiency and multi-tier generation modes, while also noting ongoing challenges in long-form temporal stability. FinancialContent reported strong interest among creative agencies, attributing the attention to Lightricks’ decision to release model weights and datasets, which reviewers said enabled “a level of transparency not typically seen in commercial AI video models.” === Benchmarks and rankings === Upon release, LTX-2 ranked third for image-to-video creation in the Artificial Analysis benchmark, behind Kling 3.5 and Veo 3.1, while its text-to-video option ranked seventh. As of early 2026, it was the highest-ranked open-source model in the benchmark. === Limitations === Some early reviewers also pointed out quality limitations. The Ray3 technical review noted occasional inconsistencies in lip-sync and motion tracking during long scenes, though it stated these were “in line with the challenges faced by all current AI video diffusion models” and expected to improve with continued iteration. Like other diffusion-based video generators, LTX-2 can produce artifacts in complex multi-person scenes and may struggle with precise text rendering within generated video.

Gödel machine

A Gödel machine is a hypothetical self-improving computer program that solves problems in an optimal way. It uses a recursive self-improvement protocol in which it rewrites its own code when it can prove the new code provides a better strategy. The machine was invented by Jürgen Schmidhuber (first proposed in 2003), but is named after Kurt Gödel who inspired the mathematical theories. The Gödel machine is often discussed when dealing with issues of meta-learning, also known as "learning to learn." Applications include automating human design decisions and transfer of knowledge between multiple related tasks, and may lead to design of more robust and general learning architectures. Though theoretically possible, no full implementation has been created. The Gödel machine is often compared with Marcus Hutter's AIXI, another formal specification for an artificial general intelligence. Schmidhuber points out that the Gödel machine could start out by implementing AIXItl as its initial sub-program, and self-modify after it finds proof that another algorithm for its search code will be better. == Limitations == Traditional problems solved by a computer only require one input and provide some output. Computers of this sort had their initial algorithm hardwired. This does not take into account the dynamic natural environment, and thus was a goal for the Gödel machine to overcome. The Gödel machine has limitations of its own, however. According to Gödel's First Incompleteness Theorem, any formal system that encompasses arithmetic is either flawed or allows for statements that cannot be proved in the system. Hence even a Gödel machine with unlimited computational resources must ignore those self-improvements whose effectiveness it cannot prove. == Variables of interest == There are three variables that are particularly useful in the run time of the Gödel machine. At some time t {\displaystyle t} , the variable time {\displaystyle {\text{time}}} will have the binary equivalent of t {\displaystyle t} . This is incremented steadily throughout the run time of the machine. Any input meant for the Gödel machine from the natural environment is stored in variable x {\displaystyle x} . It is likely the case that x {\displaystyle x} will hold different values for different values of variable time {\displaystyle {\text{time}}} . The outputs of the Gödel machine are stored in variable y {\displaystyle y} , where y ( t ) {\displaystyle y(t)} would be the output bit-string at some time t {\displaystyle t} . At any given time t {\displaystyle t} , where ( 1 ≤ t ≤ T ) {\displaystyle (1\leq t\leq T)} , the goal is to maximize future success or utility. A typical utility function follows the pattern u ( s , E n v ) : S × E → R {\displaystyle u(s,\mathrm {Env} ):S\times E\rightarrow \mathbb {R} } : u ( s , E n v ) = E μ [ ∑ τ = time T r ( τ ) ∣ s , E n v ] {\displaystyle u(s,\mathrm {Env} )=E_{\mu }{\Bigg [}\sum _{\tau ={\text{time}}}^{T}r(\tau )\mid s,\mathrm {Env} {\Bigg ]}} where r ( t ) {\displaystyle r(t)} is a real-valued reward input (encoded within s ( t ) {\displaystyle s(t)} ) at time t {\displaystyle t} , E μ [ ⋅ ∣ ⋅ ] {\displaystyle E_{\mu }[\cdot \mid \cdot ]} denotes the conditional expectation operator with respect to some possibly unknown distribution μ {\displaystyle \mu } from a set M {\displaystyle M} of possible distributions ( M {\displaystyle M} reflects whatever is known about the possibly probabilistic reactions of the environment), and the above-mentioned time = time ⁡ ( s ) {\displaystyle {\text{time}}=\operatorname {time} (s)} is a function of state s {\displaystyle s} which uniquely identifies the current cycle. Note that we take into account the possibility of extending the expected lifespan through appropriate actions. == Instructions used by proof techniques == The nature of the six proof-modifying instructions below makes it impossible to insert an incorrect theorem into proof, thus trivializing proof verification. === get-axiom(n) === Appends the n-th axiom as a theorem to the current theorem sequence. Below is the initial axiom scheme: Hardware Axioms formally specify how components of the machine could change from one cycle to the next. Reward Axioms define the computational cost of hardware instruction and the physical cost of output actions. Related Axioms also define the lifetime of the Gödel machine as scalar quantities representing all rewards/costs. Environment Axioms restrict the way new inputs x are produced from the environment, based on previous sequences of inputs y. Uncertainty Axioms/String Manipulation Axioms are standard axioms for arithmetic, calculus, probability theory, and string manipulation that allow for the construction of proofs related to future variable values within the Gödel machine. Initial State Axioms contain information about how to reconstruct parts or all of the initial state. Utility Axioms describe the overall goal in the form of utility function u. === apply-rule(k, m, n) === Takes in the index k of an inference rule (such as Modus tollens, Modus ponens), and attempts to apply it to the two previously proved theorems m and n. The resulting theorem is then added to the proof. === delete-theorem(m) === Deletes the theorem stored at index m in the current proof. This helps to mitigate storage constraints caused by redundant and unnecessary theorems. Deleted theorems can no longer be referenced by the above apply-rule function. === set-switchprog(m, n) === Replaces switchprog S pm:n, provided it is a non-empty substring of S p. === check() === Verifies whether the goal of the proof search has been reached. A target theorem states that given the current axiomatized utility function u (Item 1f), the utility of a switch from p to the current switchprog would be higher than the utility of continuing the execution of p (which would keep searching for alternative switchprogs). === state2theorem(m, n) === Takes in two arguments, m and n, and attempts to convert the contents of Sm:n into a theorem. == Example applications == === Time-limited NP-hard optimization === The initial input to the Gödel machine is the representation of a connected graph with a large number of nodes linked by edges of various lengths. Within given time T it should find a cyclic path connecting all nodes. The only real-valued reward will occur at time T. It equals 1 divided by the length of the best path found so far (0 if none was found). There are no other inputs. The by-product of maximizing expected reward is to find the shortest path findable within the limited time, given the initial bias. === Fast theorem proving === Prove or disprove as quickly as possible that all even integers > 2 are the sum of two primes (Goldbach’s conjecture). The reward is 1/t, where t is the time required to produce and verify the first such proof. === Maximizing expected reward with bounded resources === A cognitive robot that needs at least 1 liter of gasoline per hour interacts with a partially unknown environment, trying to find hidden, limited gasoline depots to occasionally refuel its tank. It is rewarded in proportion to its lifetime, and dies after at most 100 years or as soon as its tank is empty or it falls off a cliff, and so on. The probabilistic environmental reactions are initially unknown but assumed to be sampled from the axiomatized Speed Prior, according to which hard-to-compute environmental reactions are unlikely. This permits a computable strategy for making near-optimal predictions. One by-product of maximizing expected reward is to maximize expected lifetime.

Self-supervised learning

Self-supervised learning (SSL) is a paradigm in machine learning where a model is trained on a task using the data itself to generate supervisory signals, rather than relying on externally-provided labels. In the context of neural networks, self-supervised learning aims to leverage inherent structures or relationships within the input data to create meaningful training signals. SSL tasks are designed so that solving them requires capturing essential features or relationships in the data. The input data is typically augmented or transformed in a way that creates pairs of related samples, where one sample serves as the input, and the other is used to formulate the supervisory signal. This augmentation can involve introducing noise, cropping, rotation, or other transformations. Self-supervised learning more closely imitates the way humans learn to classify objects. During SSL, the model learns in two steps. First, the task is solved based on an auxiliary or pretext classification task using pseudo-labels, which help to initialize the model parameters. Next, the actual task is performed with supervised or unsupervised learning. Self-supervised learning has produced promising results in recent years, and has found practical application in fields such as audio processing, and is being used by Facebook and others for speech recognition. == Pseudo-labels == Pseudo-labels are automatically generated labels that a model assigns to unlabeled data based on its own predictions. They are widely used in self-supervised and semi-supervised learning, where ground-truth annotations are limited or unavailable. By treating predicted labels as surrogate ground truth, learning algorithms can make use of large quantities of unlabeled data in the training process. Pseudo-labeling also plays an important role in systems that must adapt to concept drift, where the statistical properties of the data change over time. In these scenarios, the model may detect that an incoming instance deviates from previously learned behavior. The system then generates a classification result for that instance, and this predicted class is used as a pseudo-label for updating or retraining model components that are becoming outdated. This approach enables continuous adaptation in dynamic environments without requiring manual annotation. In many adaptive learning pipelines, pseudo-labels are chosen when the classifier produces sufficiently confident predictions, reducing the risk of propagating errors. These pseudo-labeled instances are then incorporated into training to refresh or evolve the model's understanding of emerging data patterns, particularly when existing components show signs of “aging” due to drift or distributional shifts. This strategy reduces reliance on manual labeling while helping maintain long-term model performance. == Types == === Autoassociative self-supervised learning === Autoassociative self-supervised learning is a specific category of self-supervised learning where a neural network is trained to reproduce or reconstruct its own input data. In other words, the model is tasked with learning a representation of the data that captures its essential features or structure, allowing it to regenerate the original input. The term "autoassociative" comes from the fact that the model is essentially associating the input data with itself. This is often achieved using autoencoders, which are a type of neural network architecture used for representation learning. Autoencoders consist of an encoder network that maps the input data to a lower-dimensional representation (latent space), and a decoder network that reconstructs the input from this representation. The training process involves presenting the model with input data and requiring it to reconstruct the same data as closely as possible. The loss function used during training typically penalizes the difference between the original input and the reconstructed output (e.g. mean squared error). By minimizing this reconstruction error, the autoencoder learns a meaningful representation of the data in its latent space. === Contrastive self-supervised learning === For a binary classification task, training data can be divided into positive examples and negative examples. Positive examples are those that match the target. For example, if training a classifier to identify birds, the positive training data would include images that contain birds. Negative examples would be images that do not. Contrastive self-supervised learning uses both positive and negative examples. The loss function in contrastive learning is used to minimize the distance between positive sample pairs, while maximizing the distance between negative sample pairs. An early example uses a pair of 1-dimensional convolutional neural networks to process a pair of images and maximize their agreement. Contrastive Language-Image Pre-training (CLIP) allows joint pretraining of a text encoder and an image encoder, such that a matching image-text pair have image encoding vector and text encoding vector that span a small angle (having a large cosine similarity). InfoNCE (Noise-Contrastive Estimation) is a method to optimize two models jointly, based on Noise Contrastive Estimation (NCE). Given a set X = { x 1 , … x N } {\displaystyle X=\left\{x_{1},\ldots x_{N}\right\}} of N {\displaystyle N} random samples containing one positive sample from p ( x t + k ∣ c t ) {\displaystyle p\left(x_{t+k}\mid c_{t}\right)} and N − 1 {\displaystyle N-1} negative samples from the 'proposal' distribution p ( x t + k ) {\displaystyle p\left(x_{t+k}\right)} , it minimizes the following loss function: L N = − E X [ log ⁡ f k ( x t + k , c t ) ∑ x j ∈ X f k ( x j , c t ) ] {\displaystyle {\mathcal {L}}_{\mathrm {N} }=-\mathbb {E} _{X}\left[\log {\frac {f_{k}\left(x_{t+k},c_{t}\right)}{\sum _{x_{j}\in X}f_{k}\left(x_{j},c_{t}\right)}}\right]} === Non-contrastive self-supervised learning === Non-contrastive self-supervised learning (NCSSL) uses only positive examples. Counterintuitively, NCSSL converges on a useful local minimum rather than reaching a trivial solution, with zero loss. For the example of binary classification, it would trivially learn to classify each example as positive. Effective NCSSL requires an extra predictor on the online side that does not back-propagate on the target side. === Joint-Embedding and Predictive Architectures === A major class of self-supervised learning moves beyond contrastive pairs, instead maximizing the agreement between views while preventing collapse through statistical constraints. Rooted in Deep Canonical Correlation Analysis (Deep CCA), this approach includes Joint-Embedding Architectures (JEA) like Barlow Twins and VICReg, which enforce covariance constraints to learn invariant representations without negative sampling. Deep Latent Variable Path Modelling (DLVPM) generalizes this to multimodal systems, using path models to enforce correlation and orthogonality across diverse data types. In 2022 Yann LeCun introduced Joint-Embedding Predictive Architectures (JEPA) as a step towards decision making, reasoning, and autonomous human intelligence in machines, including self-improvement through autonomous learning. Founded in representation learning, LeCun included the concept of a “world model” in JEPA which aims to enable machines to replicate human intellect by providing machines with a concept for the world in which they exist. Unlike autoencoders, JEPAs operate entirely in latent space, avoiding pixel-level noise to focus on semantic structure. Rather than just learning invariance, JEPAs learn by predicting masked latent representations from visible context. JEPA has been applied to domains such as image analysis, audio processing, and motion in images and video. == Comparison with other forms of machine learning == SSL belongs to supervised learning methods insofar as the goal is to generate a classified output from the input. At the same time, however, it does not require the explicit use of labeled input-output pairs. Instead, correlations, metadata embedded in the data, or domain knowledge present in the input are implicitly and autonomously extracted from the data. These supervisory signals, extracted from the data, can then be used for training. SSL is similar to unsupervised learning in that it does not require labels in the sample data. Unlike unsupervised learning, however, learning is not done using inherent data structures. Semi-supervised learning combines supervised and unsupervised learning, requiring only a small portion of the learning data be labeled. In transfer learning, a model designed for one task is reused on a different task. Training an autoencoder intrinsically constitutes a self-supervised process, because the output pattern needs to become an optimal reconstruction of the input pattern itself. However, in current jargon, the term 'self-supervised' often refers to tasks based on a pretext-task training setup

Label noise

Label noise refers to errors or inaccuracies in the class labels of data instances. This is a widespread issue in machine learning datasets, arising from human annotator mistakes, unclear labeling instructions, automated labeling methods, or adversarial attacks in supervised learning. Label noise can be roughly divided into random noise, where labels are flipped independently of input features, and systematic noise, where mislabeling is dependent on certain patterns or biases in the data. Label noise can be damaging to model performance, especially for complex models that may overfit to noisy labels rather than generalizable patterns. Many approaches have been proposed to deal with the effects of label noise, including robust loss functions, noise-tolerant algorithms, data cleaning methods, and semi-supervised learning approaches. To reduce the impact of wrong labels during training, techniques like label smoothing, sample reweighting and using trusted validation sets are used. The role of noise-robust training paradigms and curriculum learning strategies to improve resilience against mislabeled data is also explored in recent research.

WebPlus

Serif WebPlus was a website design program for Microsoft Windows, developed by the software company, Serif. It allows users to design, create and upload their website onto the internet without any knowledge of HTML or other web technologies. Much like Microsoft Word, WebPlus uses WYSIWYG drag and drop editing to add and position text, images and links as they would appear on the finished web page. Once a user has designed their site, WebPlus can preview the site in a web browser before uploading the site using the in-built FTP. The software comes with a variety of pre-designed sample websites containing Filler text like Lorem ipsum, which can be used as a template for quickly designing a site. It also provides drawing tools for creating and editing buttons and web graphics. == Free WebPlus Starter Edition == Previously Serif had made available feature limited Starter Editions of their software, based on older versions, which could be obtained and used free of charge. For WebPlus the final free edition was based on version X5 and this was released in September 2012. This continued to be available from Serif's server until it was withdrawn around March 2016. WebPlus was then only available as a paid-for version X8. == Program Withdrawal == In March 2016, Serif announced that WebPlus X8 would be the final version, and that there were no current plans to design an application to replace it. Sales of WebPlus X8 by Serif were ended around December 2016. In early 2018, Serif announced that Serif Web Resources, hosted on Serif servers and required to implement some advanced web-site functionality in WebPlus created sites, would no longer work after 31 August 2018. In 2018, Serif also shutdown the servers that generated the "Plus" software registration numbers on-line from the product version and the individual generated installation number. Serif revealed the alternative was to use a universal master registration number, which is 881887. This is known to work with post 2003 Serif "Plus" software (e.g. verified to work with PagePlus v5.02). However, later Serif "Plus" software still registers itself automatically if within a certain recent period of a previous Serif software registration on the same PC. == Supported platforms == WebPlus was developed for Microsoft Windows "Win32" graphical desktop interface and is fully compatible with Windows XP, Windows Vista (32/64bit), Windows 7 (32/64bit) and Windows 8. == Features == Web hosting to upload websites to the internet with the address www.sitename.webplus.net and email [email protected]. E-Commerce tool to create online stores with providers such as PayPal. Form wizard generates online forms to collect information from website visitors. Add blogs, forums, hit counters, online polls and content management systems to websites using Smart Objects. Google Maps tool embeds maps and optional navigation markers within a website. Site navigation bars adopt a website's structure providing a tool for navigating around the website. Photo gallery groups a collection of images together and displays them as an animated slideshow. Search engine optimization (SEO) tools optimise a websites search ranking with the likes of Google, Yahoo! and Bing. Collect website metrics such as page popularity and number of website hits using Google Analytics. WebPlus X5 introduced a button studio for creating button graphics. Restrict access to specific pages on a website with a secure member's area. WebPlus automatically converts images and graphics into a web targeted format, optimising them for fast download. Embed YouTube videos within a web page. Add animated effects to a website with Animated GIFs, Animated Marquees or by importing Flash videos. Stream news and information feeds to a website using RSS and podcasts. Automated Site Checker analyses and corrects potential problems with a website. AdSense tool incorporates Google AdSense advertisements into a website In-built FTP transfers files onto a web server, uploading a website to the internet. In-built Basic Photo Editor the PhotoLab can make automatic adjustments and "Quick Fix's" to photos. From X5, WebPlus offers image editing and filters, through its PhotoLab and also provides a dedicated background-removal tool in the form of Cutout Studio. Display images, Flash videos and web pages using animated Lightboxes. Filter Effects can be applied to the graphical objects, giving convincing, realistic effects such as glass, metallic, plastic and other 2D/3D filters. WebPlus also provides QuickShapes for creating button and web graphics. These predefined shapes can be quickly modified with sliders to adjust certain parameters, for example creating rounded rectangles, etc. Shapes include: rectangles, ellipses, stars, spirals, cogs, petals, etc.

Algorithmic bias

Algorithmic bias describes systematic and repeatable harmful tendency in a computerized sociotechnical system to create "unfair" outcomes, such as "privileging" one category over another in ways that may or may not be different from the intended function of the algorithm. Bias can emerge from many factors, including intentionally biased design decisions or the unintended or unanticipated use or decisions relating to the way data is coded, collected, selected or used to train the algorithm. For example, algorithmic bias has been observed in search engine results and social media platforms. This bias can have impacts ranging from privacy violations to reinforcing social biases of race, gender, sexuality, and ethnicity. The study of algorithmic bias is most concerned with algorithms that reflect "systematic and unfair" discrimination. This bias has only recently been addressed in legal frameworks, such as the European Union's General Data Protection Regulation (enforced in 2018) and the Artificial Intelligence Act (proposed in 2021 and adopted in 2024). As algorithms expand their ability to organize society, politics, institutions, and behavior, sociologists have become concerned with the ways in which unanticipated output and manipulation of data can impact the physical world. Because algorithms are often considered to be neutral and unbiased, they can inaccurately project greater authority than human expertise (in part due to the psychological phenomenon of automation bias), and in some cases, reliance on algorithms can displace human responsibility for their outcomes, without last mile thinking. Bias can enter into algorithmic systems as a result of pre-existing cultural, social, or institutional expectations; by how features and labels are chosen; because of technical limitations of their design; or by being used in unanticipated contexts or by audiences who are not considered in the software's initial design. Algorithmic bias has been cited in cases ranging from election outcomes to the spread of online hate speech. It has also arisen in criminal justice, healthcare, and hiring, compounding existing racial, socioeconomic, and gender biases. The relative inability of facial recognition technology to accurately identify darker-skinned faces has been linked to multiple wrongful arrests of black men, an issue stemming from imbalanced datasets. Problems in understanding, researching, and discovering algorithmic bias persist due to the proprietary nature of algorithms, which are typically treated as trade secrets. Even when full transparency is provided, the complexity of certain algorithms poses a barrier to understanding their functioning. Furthermore, algorithms may change, or respond to input or output in ways that cannot be anticipated or easily reproduced for analysis. In many cases, even within a single website or application, there is no single "algorithm" to examine, but a network of many interrelated programs and data inputs, even between users of the same service. A 2021 survey identified multiple forms of algorithmic bias, including historical, representation, and measurement biases, each of which can contribute to unfair outcomes. == Definitions == Algorithms are difficult to define, but may be generally understood as lists of instructions that determine how programs read, collect, process, and analyze data to generate a usable output. For a rigorous technical introduction, see Algorithms. Advances in computer hardware and software have led to an increased capability to process, store and transmit data. This has in turn made the design and adoption of technologies such as machine learning and artificial intelligence technically and commercially feasible. By analyzing and processing data, algorithms are the backbone of search engines, social media websites, recommendation engines, online retail, online advertising, and more. Contemporary social scientists are concerned with algorithmic processes embedded into hardware and software applications because of their political and social impact, and question the underlying assumptions of an algorithm's neutrality. The term algorithmic bias describes systematic and repeatable errors that create unfair outcomes, such as privileging one arbitrary group of users over others. For example, a credit score algorithm may deny a loan without being unfair, if it is consistently weighing relevant financial criteria. If the algorithm recommends loans to one group of users, but denies loans to another set of nearly identical users based on unrelated criteria, and if this behavior can be repeated across multiple occurrences, an algorithm can be described as biased. This bias may be intentional or unintentional (for example, it can come from biased data obtained from a worker that previously did the job the algorithm is going to do from now on). == Methods == Bias can be introduced to an algorithm in several ways. During the assemblage of a dataset, data may be collected, digitized, adapted, and entered into a database according to human-designed cataloging criteria. Next, programmers assign priorities, or hierarchies, for how a program assesses and sorts that data. This requires human decisions about how data is categorized, and which data is included or discarded. Some algorithms collect their own data based on human-selected criteria, which can also reflect the bias of human designers. Other algorithms may reinforce stereotypes and preferences as they process and display "relevant" data for human users, for example, by selecting information based on previous choices of a similar user or group of users. Beyond assembling and processing data, bias can emerge as a result of design. For example, algorithms that determine the allocation of resources or scrutiny (such as determining school placements) may inadvertently discriminate against a category when determining risk based on similar users (as in credit scores). Meanwhile, recommendation engines that work by associating users with similar users, or that make use of inferred marketing traits, might rely on inaccurate associations that reflect broad ethnic, gender, socio-economic, or racial stereotypes. Another example comes from determining criteria for what is included and excluded from results. These criteria could present unanticipated outcomes for search results, such as with flight-recommendation software that omits flights that do not follow the sponsoring airline's flight paths. Algorithms may also display an uncertainty bias, offering more confident assessments when larger data sets are available. This can skew algorithmic processes toward results that more closely correspond with larger samples, which may disregard data from underrepresented populations. == History == === Early critiques === The earliest computer programs were designed to mimic human reasoning and deductions, and were deemed to be functioning when they successfully and consistently reproduced that human logic. In his 1976 book Computer Power and Human Reason, artificial intelligence pioneer Joseph Weizenbaum suggested that bias could arise both from the data used in a program, but also from the way a program is coded. Weizenbaum wrote that programs are a sequence of rules created by humans for a computer to follow. By following those rules consistently, such programs "embody law", that is, enforce a specific way to solve problems. The rules a computer follows are based on the assumptions of a computer programmer for how these problems might be solved. That means the code could incorporate the programmer's imagination of how the world works, including their biases and expectations. While a computer program can incorporate bias in this way, Weizenbaum also noted that any data fed to a machine additionally reflects "human decision making processes" as data is being selected. Finally, he noted that machines might also transfer good information with unintended consequences if users are unclear about how to interpret the results. Weizenbaum warned against trusting decisions made by computer programs that a user doesn't understand, comparing such faith to a tourist who can find his way to a hotel room exclusively by turning left or right on a coin toss. Crucially, the tourist has no basis of understanding how or why he arrived at his destination, and a successful arrival does not mean the process is accurate or reliable. An early example of algorithmic bias resulted in as many as 60 women and ethnic minorities denied entry to St. George's Hospital Medical School per year from 1982 to 1986, based on implementation of a new computer-guidance assessment system that denied entry to women and men with "foreign-sounding names" based on historical trends in admissions. While many schools at the time employed similar biases in their selection process, St. George was most notable for automating said bias through the use of an algorithm, thus gaining the attention of people on a much