AI Video Tools

Explore the best AI Video Tools — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

  • Evaluation of binary classifiers

    Evaluation of binary classifiers

    Evaluation of a binary classifier typically assigns a numerical value, or values, to a classifier that represent its accuracy. An example is error rate, which measures how frequently the classifier makes a mistake. There are many metrics that can be used; different fields have different preferences. For example, in medicine sensitivity and specificity are often used, while in computer science precision and recall are preferred. An important distinction is between metrics that are independent of the prevalence or skew (how often each class occurs in the population), and metrics that depend on the prevalence – both types are useful, but they have very different properties. Often, evaluation is used to compare two methods of classification, so that one can be adopted and the other discarded. Such comparisons are more directly achieved by a form of evaluation that results in a single unitary metric rather than a pair of metrics. == Contingency table == Given a data set, a classification (the output of a classifier on that set) gives two numbers: the number of positives and the number of negatives, which add up to the total size of the set. To evaluate a classifier, one compares its output to another reference classification – ideally a perfect classification, but in practice the output of another gold standard test – and cross tabulates the data into a 2×2 contingency table, comparing the two classifications. One then evaluates the classifier relative to the gold standard by computing summary statistics of these 4 numbers. Generally these statistics will be scale invariant (scaling all the numbers by the same factor does not change the output), to make them independent of population size, which is achieved by using ratios of homogeneous functions, most simply homogeneous linear or homogeneous quadratic functions. Say we test some people for the presence of a disease. Some of these people have the disease, and our test correctly says they are positive. They are called true positives (TP). Some have the disease, but the test incorrectly claims they don't. They are called false negatives (FN). Some don't have the disease, and the test says they don't – true negatives (TN). Finally, there might be healthy people who have a positive test result – false positives (FP). These can be arranged into a 2×2 contingency table (confusion matrix), conventionally with the test result on the vertical axis and the actual condition on the horizontal axis. These numbers can then be totaled, yielding both a grand total and marginal totals. Totaling the entire table, the number of true positives, false negatives, true negatives, and false positives add up to 100% of the set. Totaling the columns (adding vertically) the number of true positives and false positives add up to 100% of the test positives, and likewise for negatives. Totaling the rows (adding horizontally), the number of true positives and false negatives add up to 100% of the condition positives (conversely for negatives). The basic marginal ratio statistics are obtained by dividing the 2×2=4 values in the table by the marginal totals (either rows or columns), yielding 2 auxiliary 2×2 tables, for a total of 8 ratios. These ratios come in 4 complementary pairs, each pair summing to 1, and so each of these derived 2×2 tables can be summarized as a pair of 2 numbers, together with their complements. Further statistics can be obtained by taking ratios of these ratios, ratios of ratios, or more complicated functions. The contingency table and the most common derived ratios are summarized below; see sequel for details. Note that the rows correspond to the condition actually being positive or negative (or classified as such by the gold standard), as indicated by the color-coding, and the associated statistics are prevalence-independent, while the columns correspond to the test being positive or negative, and the associated statistics are prevalence-dependent. There are analogous likelihood ratios for prediction values, but these are less commonly used, and not depicted above. == Pairs of metrics == Often accuracy is evaluated with a pair of metrics composed in a standard pattern. === Sensitivity and specificity === The fundamental prevalence-independent statistics are sensitivity and specificity. Sensitivity or True Positive Rate (TPR), also known as recall, is the proportion of people that tested positive and are positive (True Positive, TP) of all the people that actually are positive (Condition Positive, CP = TP + FN). It can be seen as the probability that the test is positive given that the patient is sick. With higher sensitivity, fewer actual cases of disease go undetected (or, in the case of the factory quality control, fewer faulty products go to the market). Specificity (SPC) or True Negative Rate (TNR) is the proportion of people that tested negative and are negative (True Negative, TN) of all the people that actually are negative (Condition Negative, CN = TN + FP). As with sensitivity, it can be looked at as the probability that the test result is negative given that the patient is not sick. With higher specificity, fewer healthy people are labeled as sick (or, in the factory case, fewer good products are discarded). The relationship between sensitivity and specificity, as well as the performance of the classifier, can be visualized and studied using the Receiver Operating Characteristic (ROC) curve. In theory, sensitivity and specificity are independent in the sense that it is possible to achieve 100% in both (such as in the red/blue ball example given above). In more practical, less contrived instances, however, there is usually a trade-off, such that they are inversely proportional to one another to some extent. This is because we rarely measure the actual thing we would like to classify; rather, we generally measure an indicator of the thing we would like to classify, referred to as a surrogate marker. The reason why 100% is achievable in the ball example is because redness and blueness is determined by directly detecting redness and blueness. However, indicators are sometimes compromised, such as when non-indicators mimic indicators or when indicators are time-dependent, only becoming evident after a certain lag time. The following example of a pregnancy test will make use of such an indicator. Modern pregnancy tests do not use the pregnancy itself to determine pregnancy status; rather, human chorionic gonadotropin is used, or hCG, present in the urine of gravid females, as a surrogate marker to indicate that a woman is pregnant. Because hCG can also be produced by a tumor, the specificity of modern pregnancy tests cannot be 100% (because false positives are possible). Also, because hCG is present in the urine in such small concentrations after fertilization and early embryogenesis, the sensitivity of modern pregnancy tests cannot be 100% (because false negatives are possible). === Positive and negative predictive values === In addition to sensitivity and specificity, the performance of a binary classification test can be measured with positive predictive value (PPV), also known as precision, and negative predictive value (NPV). The positive prediction value answers the question "If the test result is positive, how well does that predict an actual presence of disease?". It is calculated as TP/(TP + FP); that is, it is the proportion of true positives out of all positive results. The negative prediction value is the same, but for negatives, naturally. ==== Impact of prevalence on predictive values ==== Prevalence has a significant impact on prediction values. As an example, suppose there is a test for a disease with 99% sensitivity and 99% specificity. If 2000 people are tested and the prevalence (in the sample) is 50%, 1000 of them are sick and 1000 of them are healthy. Thus about 990 true positives and 990 true negatives are likely, with 10 false positives and 10 false negatives. The positive and negative prediction values would be 99%, so there can be high confidence in the result. However, if the prevalence is only 5%, so of the 2000 people only 100 are really sick, then the prediction values change significantly. The likely result is 99 true positives, 1 false negative, 1881 true negatives and 19 false positives. Of the 19+99 people tested positive, only 99 really have the disease – that means, intuitively, that given that a patient's test result is positive, there is only 84% chance that they really have the disease. On the other hand, given that the patient's test result is negative, there is only 1 chance in 1882, or 0.05% probability, that the patient has the disease despite the test result. === Precision and recall === Precision and recall can be interpreted as (estimated) conditional probabilities: Precision is given by P ( C = P | C ^ = P ) {\displaystyle P(C=P|{\hat {C}}=P)} while recall is given by P ( C ^ = P | C = P ) {\displaystyle P({\hat {C}}=P|C=P)} , where C ^ {\

    Read more →
  • Alec Radford

    Alec Radford

    Alec Radford is an American artificial intelligence researcher. == Biography == Radford grew up in Texas. He graduated from Cistercian Preparatory School in 2011, where he became an Eagle Scout, and dropped out of Olin College in August 2014, where he and fellow students Slater Victoroff, Diana Yuan, and Madison May had formed the startup Indico in their dorm room. In 2015, the quartet were joined by Luke Metz and the firm and the Facebook AI research lab in New York used generative adversarial networks to create realistic low pixel images. A demonstration of Indico's technology was used without proper attribution in an April 2016 demonstration by Nvidia chief executive Jensen Huang. Radford joined OpenAI around 2016, where he worked on natural-language processing. The following year, Radford trained a neural network on Amazon reviews. The model was fairly basic, with layers which allowed for human understanding. Upon exploring it, he saw that it had a special neuron linked to the sentiment of the reviews, which it had created on its own. This was a drastic improvement from previous neural networks that had analysed sentiment, because they had to be told to do so and specially trained on data that was explicitly labeled according to sentiment. This development made OpenAI chief scientist Ilya Sutskever consider that a future model, using more diverse language data, could map far more structures of meaning, eventually becoming a "learned core module" for superintelligence. In 2018, Radford was the lead author on OpenAI's seminal research paper on generative pre-trained transformers, which form the foundation of ChatGPT. At OpenAI, he worked on early GPT models, Whisper, a speech recognition model, and the image generator DALL-E. He left OpenAI in December 2024 to pursue independent research. Around March 2025, Radford joined Thinking Machines Lab as an advisor. He joined along with Bob McGrew who was previously the chief research officer of OpenAI. In April 2026, Radford, Nick Levine, and David Duvenaud released Talkie, an AI model trained on books, newspapers, scientific journals, patents, and case law published before December 31, 1930. When asked about the state of the world in 2026, it stated that one billion people would live in Europe, that London and New York would be connected by steamships that transit between the two in ten days, and "winter will be passed in Paris, and the summer in London."

    Read more →
  • Tim Houlne

    Tim Houlne

    Tim Houlne is an American business executive, entrepreneur, and author known for his work in outsourcing and homeshoring, remote working, and artificial intelligence (AI) in customer service. He is the founder and CEO of Humach, a company that uses human agents and AI in customer experience solutions. Previously, he was co-founder and CEO of Working Solutions, a virtual contact center company in the United States. == Early life and education == Houlne graduated from Missouri Western State University (MWSU) in 1986 with a bachelor's degree in business administration and from the University of Texas in Dallas with an MBA. In 2024, MWSU and North Central Missouri College renamed the Convergent Technology Alliance Center to the Houlne Center for Convergent Technology. The 20,000 square-foot learning laboratory provides training and applied education experiences in industries such as AI, cybersecurity, manufacturing and construction, and service technologies. == Career == In 1998, Houlne co-founded Working Solutions, a Plano, Texas-based U.S. outsourcing company that provides customer service using remote, home-based agents. As CEO, he oversaw the development of a virtual workforce model that routes service calls to either domestic or offshore agents, according to client needs and service requirements. In 2015, Houlne founded Humach, a customer experience outsourcing provider that uses human service agents with AI-based digital agents. The company derives its name from the combination of services provided by humans and machines. Its clients include Amazon, Carfax and McDonald's. The company acquired InfiniteAI in 2020, and Markets EQ in 2025. In 2013, Houlne was named a finalist for the Ernst & Young Entrepreneur of the Year Award (Southwest Region).He is the co-author of several books focused on the evolution of work, the gig economy, and the influence of AI in customer-facing roles. == Works == The New World of Work: From the Cube to the Cloud (2013) ISBN 0982562276 OCLC 813933360 The New World of Work, Second Edition: The Cube, the Cloud and What's Next (2023) ISBN 9781642258318 OCLC 1389815847 The Intelligent Workforce: How Humans & Machines Will Co-Create a Better Future (2024) ISBN 9798887501604 OCLC 1439598569

    Read more →
  • Leading the Future

    Leading the Future

    Leading the Future is an American super PAC network focused on lobbying for policies friendly to the artificial intelligence industry. It was launched in 2025 with over $100 million from industry stakeholders including Andreessen Horowitz, OpenAI President Greg Brockman and Palantir co-founder Joe Lonsdale. The launch was preceded by talks between Collin McCune, head of government affairs at Andreessen Horowitz, and Chris Lehane, chief global affairs officer at OpenAI. Among the members of the network are the American Mission PAC, which supported Chris Gober, and the Think Big PAC, which targeted Alex Bores. Leading the Future is affiliated with the nonprofit Build American AI, which Axios describes as a dark money advocacy "offshoot" operating alongside the super PAC. NBC News states that the network’s efforts are modeled after the pro-cryptocurrency group Fairshake. Leading the Future is led by Zac Moffatt and Josh Vlasto, the latter of whom previously served as an advisor to Fairshake. In response to the creation of Leading the Future, former members of Congress Brad Carson and Chris Stewart co-founded the super PAC network Public First, aiming to counter the group’s influence. In April 2026, an investigation by Model Republic linked Leading the Future to The Wire By Acutus, an automated news website that allegedly used AI agents posing as human journalists to solicit interviews. The site's content was found to closely mirror the PAC's deregulatory policy goals while targeting researchers and advocates skeptical of rapid AI development. In May 2026, Wired revealed that Build American AI used a "dark money" campaign to pay TikTok and Instagram influencers $5,000 per video to promote scripted narratives framing Chinese AI as a "national security threat." According to internal documents and staff at the marketing agency managing the project, the campaign's explicit goal was to "subtly shift public debate" toward the deregulation of AI industries while intentionally avoiding technical discussions regarding AI quality or safety. During the 2026 primary season Leading the Future went on to endorse several candidates in both Democratic and Republican races with several of them going on to win.

    Read more →
  • MoltenVK

    MoltenVK

    MoltenVK is a software library which allows Vulkan applications to run on top of Metal on Apple's macOS, iOS, and tvOS operating systems. It is the first software component to be released for the Vulkan Portability Initiative, a project to have a subset of Vulkan run on platforms lacking native Vulkan drivers. There are some limitations compared with a native Vulkan implementation. == History == MoltenVK was first released as a proprietary and commercially licensed product by The Brenwill Workshop on July 27, 2016. On July 31, 2017, Khronos announced the formation of the Vulkan Portability Technical Subgroup. === Open source === On February 26, 2018, Khronos announced that Vulkan became available on macOS and iOS products through the MoltenVK library. Valve announced that Dota 2 will run on macOS using the Vulkan API with the aid of MoltenVK, and that they had made an arrangement with developer The Brenwill Workshop Ltd to release MoltenVK as open-source software under the Apache License version 2.0. On May 30, 2018, Qt was updated with Vulkan for Qt on macOS using MoltenVK. On May 31, 2018, optional Vulkan support for Dota 2 on macOS was released. Benchmarks for the game were available the following day, showing better performance using Vulkan and MoltenVK compared to OpenGL. On July 20, 2018, Wine was updated with Vulkan support on macOS using MoltenVK. On 29 July 2018, the first app using MoltenVK was accepted onto the App Store, after initially being rejected. On 6 August 2018, Google open-sourced Filament, a crossplatform real-time physically based rendering engine with MoltenVK for macOS/iOS. On November 28, 2018, Valve released Artifact, their first Vulkan-only game on macOS using MoltenVK. === Version 1.0 === On 29 January 2019, MoltenVK 1.0.32 was released with early prototype of Vulkan Portability Extensions. RPCS3 and Dolphin emulators were updated with Vulkan support on macOS using MoltenVK. On 13 April 2019, MoltenVK 1.0.34 was released with support for tessellation. On July 30, 2019, MoltenVK 1.0.36 was released targeting Metal 3.0. On July 31, 2020, MoltenVK 1.0.44 was released, adding support for the tvOS platform. On January 23, 2020, MoltenVK was updated to support for some of the new features of Vulkan 1.2, as of Vulkan SDK 1.2.121. === Version 1.1 === On October 1, 2020, MoltenVK 1.1.0 was released, adding full support for Vulkan 1.1, as of Vulkan SDK 1.2.154. On 9 December 2020, MoltenVK 1.1.1 was released, providing support for Vulkan on Apple silicon GPUs and support for the Mac Catalyst platform for porting iOS/iPadOS apps to macOS. === Version 1.2 === On October 18, 2022, MoltenVK 1.2.0 was released, adding full support for Vulkan 1.2 as of Vulkan SDK 1.3.231. In January 2023, MoltenVK 1.2.2 added support for Vulkan as of SDK 1.3.239, while this version of Vulkan SDK fixed some issues with the interconnectivity with Metal API, while version 1.2.3 supported some additional extensions. === Version 1.3 === On May 1, 2025, MoltenVK 1.3 was released with support for Vulkan 1.3. === Version 1.4 === On August 20, 2025, MoltenVK 1.4 was released with support for Vulkan 1.4.

    Read more →
  • CoDi

    CoDi

    CoDi is a cellular automaton (CA) model for spiking neural networks (SNNs). CoDi is an acronym for Collect and Distribute, referring to the signals and spikes in a neural network. CoDi uses a von Neumann neighborhood modified for a three-dimensional space; each cell looks at the states of its six orthogonal neighbors and its own state. In a growth phase a neural network is grown in the CA-space based on an underlying chromosome. There are four types of cells: neuron body, axon, dendrite and blank. The growth phase is followed by a signaling- or processing-phase. Signals are distributed from the neuron bodies via their axon tree and collected from connection dendrites. These two basic interactions cover every case, and they can be expressed simply, using a small number of rules. == Cell interaction during signaling == The neuron body cells collect neural signals from the surrounding dendritic cells and apply an internally defined function to the collected data. In the CoDi model the neurons sum the incoming signal values and fire after a threshold is reached. This behavior of the neuron bodies can be modified easily to suit a given problem. The output of the neuron bodies is passed on to its surrounding axon cells. Axonal cells distribute data originating from the neuron body. Dendritic cells collect data and eventually pass it to the neuron body. These two types of cell-to-cell interaction cover all kinds of cell encounters. Every cell has a gate, which is interpreted differently depending on the type of the cell. A neuron cell uses this gate to store its orientation, i.e. the direction in which the axon is pointing. In an axon cell, the gate points to the neighbor from which the neural signals are received. An axon cell accepts input only from this neighbor, but makes its own output available to all its neighbors. In this way axon cells distribute information. The source of information is always a neuron cell. Dendritic cells collect information by accepting information from any neighbor. They give their output, (e.g. a Boolean OR operation on the binary inputs) only to the neighbor specified by their own gate. In this way, dendritic cells collect and sum neural signals, until the final sum of collected neural signals reaches the neuron cell. Each axonal and dendritic cell belongs to exactly one neuron cell. This configuration of the CA-space is guaranteed by the preceding growth phase. == Synapses == The CoDi model does not use explicit synapses, because dendrite cells that are in contact with an axonal trail (i.e. have an axon cell as neighbor) collect the neural signals directly from the axonal trail. This results from the behavior of axon cells, which distribute to every neighbor, and from the behavior of the dendrite cells, which collect from any neighbor. The strength of a neuron-neuron connection (a synapse) is represented by the number of their neighboring axon and dendrite cells. The exact structure of the network and the position of the axon-dendrite neighbor pairs determine the time delay and strength (weight) of a neuron-neuron connection. This principle infers that a single neuron-neuron connection can consist of several synapse with different time delays with independent weights. == Genetic encoding and growth of the network == The chromosome is initially distributed throughout the CA-space, so that every cell in the CA-space contains one instruction of the chromosome, i.e. one growth instruction, so that the chromosome belongs to the network as a whole. The distributed chromosome technique of the CoDi model makes maximum use of the available CA-space and enables the growth of any type of network connectivity. The local connection of the grown circuitry to its chromosome, allows local learning to be combined with the evolution of grown neural networks. Growth signals are passed to the direct neighbors of the neuron cell according to its chromosome information. The blank neighbors, which receive a neural growth signal, turn into either an axon cell or a dendrite cell. The growth signals include information containing the cell type of the cell that is to be grown from the signal. To decide in which directions axonal or dendritic trails should grow, the grown cells consult their chromosome information which encodes the growth instructions. These growth instructions can have an absolute or a relative directional encoding. An absolute encoding masks the six neighbors (i.e. directions) of a 3D cell with six bits. After a cell is grown, it accepts growth signals only from the direction from which it received its first signal. This reception direction information is stored in the gate position of each cell's state. == Implementation as a partitioned CA == The states of our CAs have two parts, which are treated in different ways. The first part of the cell-state contains the cell's type and activity level and the second part serves as an interface to the cell's neighborhood by containing the input signals from the neighbors. Characteristic of our CA is that only part of the state of a cell is passed to its neighbors, namely the signal and then only to those neighbors specified in the fixed part of the cell state. This CA is called partitioned, because the state is partitioned into two parts, the first being fixed and the second is variable for each cell. The advantage of this partitioning-technique is that the amount of information that defines the new state of a CA cell is kept to a minimum, due to its avoidance of redundant information exchange. == Implementation in hardware == Since CAs are only locally connected, they are ideal for implementation on purely parallel hardware. When designing the CoDi CA-based neural networks model, the objective was to implement them directly in hardware (FPGAs). Therefore, the CA was kept as simple as possible, by having a small number of bits to specify the state, keeping the CA rules few in number, and having few cellular neighbors. The CoDi model was implemented in the FPGA based CAM-Brain Machine (CBM) by Korkin. == History == CoDi was introduced by Gers et al. in 1998. A specialized parallel machine based on FPGA Hardware (CAM) to run the CoDi model on a large scale was developed by Korkin et al. De Garis conducted a series of experiments on the CAM-machine evaluating the CoDi model. The original model, where learning is based on evolutionary algorithms, has been augmented with a local learning rule via feedback from dendritic spikes by Schwarzer.

    Read more →
  • ELVIS Act

    ELVIS Act

    The ELVIS Act or Ensuring Likeness Voice and Image Security Act, signed into law by Tennessee Governor Bill Lee on March 21, 2024, marked a significant milestone in the area of regulation of artificial intelligence and public sector policies for artists in the era of artificial intelligence (AI) and AI alignment. It was noted as the first enacted legislation in the United States specifically designed to protect musicians from the unauthorized use of their voices through artificial intelligence technologies and against audio deepfakes and voice cloning. This legislation distinguishes itself by adding penalties for copying a performer's voice. == Origin and advocacy == The inception of the ELVIS Act has been attributed to Gebre Waddell, founder of Sound Credit, who initially conceptualized a framework in 2023 that later evolved into the legislation. Representative Justin J. Pearson acknowledged Waddell's pivotal role during the March 4 House Floor Session on the bill. Leading Tennessee musicians supported the ELVIS Act. Tennessee Governor Bill Lee endorsed it as a Governor's Bill, and it was introduced in the Tennessee Legislature as House Bill 2091 by William Lamberth (R-44) and Senate Bill 2096 by Jack Johnson (R-27). The ELVIS Act is an amendment to a 1984 law that was the result of the Elvis Presley estate litigation for controlling how his likeness could be used after death. == Lobbying from the recording industry == The legislative journey of the ELVIS Act included a broad coalition of music industry stakeholders, including: These organizations, led by the Recording Academy and the RIAA, played roles in drafting the legislation, advocating for passage, and rallying support among the industry and legislators. The act gained momentum through discussions that bridged industry concerns with legislative action. This collaborative process led to a proposal that specifically targets the use of AI to create unauthorized reproductions of artists' voices and images. == Opposition == The ELVIS Act saw industry opposition from the Motion Picture Association, including testimony in the House Banking & Consumer Affairs Subcommittee, including remarks that the law risks "interference with our members’ ability to portray real people and events." TechNet, representing companies such as OpenAI, Google and Amazon, expressed their opposition in the hearing to the bill as drafted, asserting that the language was too broadly written and could have unintended consequences. Other concerns included its potential application to cover bands, but lawmakers assured people that this was not the intention. The bill passed the Tennessee House and Senate with a unanimous, bi-partisan vote including 93 ayes and 0 Noes in the House, and 30 ayes and 0 noes in the Senate. == Passage == By explicitly addressing AI impersonation, the ELVIS Act originated a legal approach to safeguarding personal rights, in the context of digital and technological advancements. It extends protections to an artist's voice and likeness, areas vulnerable to exploitation with the proliferation of AI technologies that occurred in 2023. The legislation received widespread support from the music industry, signaling a significant step forward in the ongoing effort to balance innovation with the protection of individual rights and creative integrity. It was reported as underscoring Tennessee's commitment to its musical heritage and showed the state as a leader in adapting copyright and privacy protections to the modern technological landscape. Artists including Chris Janson and Luke Bryan appeared at the signing ceremony hosted at Robert's Western World to support the new law and commemorate its passing. == Legal precedent == The ELVIS Act was reported as representing a development in the discourse surrounding AI, intellectual property, and personal rights. It was hoped by proponents to set a precedent for future legislative efforts both within and beyond Tennessee, offering a model for how states and potentially the federal government could address similar challenges. As AI technology continues to evolve, the act represents a foundational framework for protecting the authenticity and rights of artists, ensuring contributions remain protected. The act prohibits usage of AI to clone the voice of an artist without consent and can be criminally enforced as a Class A misdemeanor. This legislation's success was hoped by its supporters to inspire similar actions in other states, contributing to a unified approach to copyright and privacy in the digital age. Such a national response would reinforce the importance of safeguarding artists' rights against unauthorized use of their voices and likenesses.

    Read more →
  • Predictive Model Markup Language

    Predictive Model Markup Language

    The Predictive Model Markup Language (PMML) is an XML-based predictive model interchange format conceived by Robert Lee Grossman, then the director of the National Center for Data Mining at the University of Illinois at Chicago. PMML provides a way for analytic applications to describe and exchange predictive models produced by data mining and machine learning algorithms. It supports common models such as logistic regression and other feedforward neural networks. Version 0.9 was published in 1998. Subsequent versions have been developed by the Data Mining Group. Since PMML is an XML-based standard, the specification comes in the form of an XML schema. PMML itself is a mature standard with over 30 organizations having announced products supporting PMML. == PMML components == A PMML file can be described by the following components: Header: contains general information about the PMML document, such as copyright information for the model, its description, and information about the application used to generate the model such as name and version. It also contains an attribute for a timestamp which can be used to specify the date of model creation. Data Dictionary: contains definitions for all the possible fields used by the model. It is here that a field is defined as continuous, categorical, or ordinal (attribute optype). Depending on this definition, the appropriate value ranges are then defined as well as the data type (such as, string or double). Data Transformations: transformations allow for the mapping of user data into a more desirable form to be used by the mining model. PMML defines several kinds of simple data transformations. Normalization: map values to numbers, the input can be continuous or discrete. Discretization: map continuous values to discrete values. Value mapping: map discrete values to discrete values. Functions (custom and built-in): derive a value by applying a function to one or more parameters. Aggregation: used to summarize or collect groups of values. Model: contains the definition of the data mining model. E.g., A multi-layered feedforward neural network is represented in PMML by a "NeuralNetwork" element which contains attributes such as: Model Name (attribute modelName) Function Name (attribute functionName) Algorithm Name (attribute algorithmName) Activation Function (attribute activationFunction) Number of Layers (attribute numberOfLayers) This information is then followed by three kinds of neural layers which specify the architecture of the neural network model being represented in the PMML document. These attributes are NeuralInputs, NeuralLayer, and NeuralOutputs. Besides neural networks, PMML allows for the representation of many other types of models including support vector machines, association rules, Naive Bayes classifier, clustering models, text models, decision trees, and different regression models. Mining Schema: a list of all fields used in the model. This can be a subset of the fields as defined in the data dictionary. It contains specific information about each field, such as: Name (attribute name): must refer to a field in the data dictionary Usage type (attribute usageType): defines the way a field is to be used in the model. Typical values are: active, predicted, and supplementary. Predicted fields are those whose values are predicted by the model. Outlier Treatment (attribute outliers): defines the outlier treatment to be use. In PMML, outliers can be treated as missing values, as extreme values (based on the definition of high and low values for a particular field), or as is. Missing Value Replacement Policy (attribute missingValueReplacement): if this attribute is specified then a missing value is automatically replaced by the given values. Missing Value Treatment (attribute missingValueTreatment): indicates how the missing value replacement was derived (e.g. as value, mean or median). Targets: allows for post-processing of the predicted value in the format of scaling if the output of the model is continuous. Targets can also be used for classification tasks. In this case, the attribute priorProbability specifies a default probability for the corresponding target category. It is used if the prediction logic itself did not produce a result. This can happen, e.g., if an input value is missing and there is no other method for treating missing values. Output: this element can be used to name all the desired output fields expected from the model. These are features of the predicted field and so are typically the predicted value itself, the probability, cluster affinity (for clustering models), standard error, etc. The latest release of PMML, PMML 4.1, extended Output to allow for generic post-processing of model outputs. In PMML 4.1, all the built-in and custom functions that were originally available only for pre-processing became available for post-processing too. == PMML 4.0, 4.1, 4.2 and 4.3 == PMML 4.0 was released on June 16, 2009. Examples of new features included: Improved Pre-Processing Capabilities: Additions to built-in functions include a range of Boolean operations and an If-Then-Else function. Time Series Models: New exponential Smoothing models; also place holders for ARIMA, Seasonal Trend Decomposition, and Spectral density estimation, which are to be supported in the near future. Model Explanation: Saving of evaluation and model performance measures to the PMML file itself. Multiple Models: Capabilities for model composition, ensembles, and segmentation (e.g., combining of regression and decision trees). Extensions of Existing Elements: Addition of multi-class classification for Support Vector Machines, improved representation for Association Rules, and the addition of Cox Regression Models. PMML 4.1 was released on December 31, 2011. New features included: New model elements for representing Scorecards, k-Nearest Neighbors (KNN) and Baseline Models. Simplification of multiple models. In PMML 4.1, the same element is used to represent model segmentation, ensemble, and chaining. Overall definition of field scope and field names. A new attribute that identifies for each model element if the model is ready or not for production deployment. Enhanced post-processing capabilities (via the Output element). PMML 4.2 was released on February 28, 2014. New features include: Transformations: New elements for implementing text mining New built-in functions for implementing regular expressions: matches, concat, and replace Simplified outputs for post-processing Enhancements to Scorecard and Naive Bayes model elements PMML 4.3 was released on August 23, 2016. New features include: New Model Types: Gaussian Process Bayesian Network New built-in functions Usage clarifications Documentation improvements Version 4.4 was released in November 2019. == Release history == == Data Mining Group == The Data Mining Group is a consortium managed by the Center for Computational Science Research, Inc., a nonprofit founded in 2008. The Data Mining Group also developed a standard called Portable Format for Analytics, or PFA, which is complementary to PMML.

    Read more →
  • Slopaganda

    Slopaganda

    Slopaganda is a portmanteau of "AI slop" and "propaganda", referring to AI-generated content designed to manipulate beliefs, emotions, and political decision-making at scale. The term is credited to Michał Klincewicz, an assistant professor in the Department of Computational Cognitive Science at Tilburg University, in 2025. == Definition == Slopaganda is distinguished from traditional propaganda by three features: scale, scope, and speed. Generative AI makes it possible to produce large volumes of content quickly and at low cost, allows for highly personalised and targeted messaging to specific sub-audiences, and leverages the hyper-connectivity of social networks to accelerate dissemination beyond what conventional media could achieve. Unlike traditional propaganda, which delivers a uniform message to all recipients, slopaganda can be micro-targeted — tailored to individuals based on estimated prior beliefs to reinforce political biases or emotional associations. The authors note that it need not aim at literal deception: much slopaganda is expressive rather than truth-apt, designed to create emotional associations rather than false factual beliefs. == Relation to AI slop == Slopaganda is a subset of AI slop — low-quality, mass-produced AI-generated content — distinguished by intent. Where AI slop may be produced indifferently for commercial or engagement-farming purposes, slopaganda is deployed with a deliberate political or ideological goal. == Notable examples == Examples discussed by the term's originators include Donald Trump's prolific use of AI in Truth Social posts and Iranian Lego-themed music videos. AI-generated videos posted by the White House mixing real military footage with clips from films and video games; and deepfake audio imitating political candidates during the 2024 US presidential campaign have also been given the label slopaganda.

    Read more →
  • Elon Musk

    Elon Musk

    Elon Reeve Musk ( EE-lon; born June 28, 1971) is a businessman and former public official known for his leadership of Tesla and SpaceX. Musk has been the wealthiest person in the world since 2025; as of June 2026, Forbes estimates his net worth to be US$834 billion. Born into the wealthy Musk family in Pretoria, South Africa, Musk emigrated in 1989 to Canada; he has Canadian citizenship since his mother was born there. He received bachelor's degrees in 1997 from the University of Pennsylvania before moving to California to pursue business ventures. In 1995, Musk co-founded the software company Zip2. Following its sale in 1999, he co-founded X.com, an online payment company that later merged to form PayPal, which was acquired by eBay in 2002. Musk also became an American citizen in 2002. In 2002, Musk founded the space technology company SpaceX, becoming its CEO and chief engineer; the company has since led innovations in reusable rockets and commercial spaceflight. Musk joined the automaker Tesla as an early investor in 2004 and became its CEO and product architect in 2008; it has since become a leader in electric vehicles. In 2015, he co-founded OpenAI to advance artificial intelligence (AI) research, but later left; growing discontent with the organization's direction and leadership in the AI boom in the 2020s led him to establish xAI, which became a subsidiary of SpaceX in 2026. In 2022, he acquired the social network Twitter, implementing significant changes, and rebranding it as X in 2023. His other businesses include the neurotechnology company Neuralink, which he co-founded in 2016, and the tunneling company the Boring Company, which he founded in 2017. In November 2025, Tesla approved a pay package worth $1 trillion for Musk, which he is to receive over 10 years if he meets specific goals. Musk is a supporter of global far-right politics, figures, and political parties. He was the largest donor in the 2024 U.S. presidential election, where he supported Donald Trump. After Trump was inaugurated as president in January 2025, Musk served as Senior Advisor to the President and as the de facto head of the Department of Government Efficiency (DOGE). Shortly before a public feud with Trump, Musk left the Trump administration in May 2025 and returned to managing his companies. Musk's political activities, statements and views have made him a polarizing figure. He has been criticized for making unscientific and misleading statements, including spreading COVID-19 misinformation, promoting conspiracy theories, and affirming antisemitic, racist, and transphobic comments. His acquisition of Twitter was controversial due to a subsequent increase in hate speech and the spread of misinformation on the service, following his pledge to decrease censorship. His role in the second Trump administration attracted public backlash, particularly in response to DOGE. == Early life and education == Elon Reeve Musk was born on June 28, 1971, in Pretoria, South Africa's administrative capital. He is of British and Pennsylvania Dutch ancestry. His mother, Maye (née Haldeman), is a model and dietitian born in Saskatchewan, Canada, and raised in South Africa. Musk therefore holds both South African and Canadian citizenship from birth. His father, Errol Musk, is a South African electromechanical engineer, pilot, sailor, consultant, emerald dealer, and property developer, who partly owned a rental lodge at Timbavati Private Nature Reserve. His maternal grandfather, Joshua N. Haldeman, who died in a plane crash when Elon was a toddler, was an American-born Canadian chiropractor, aviator and political activist in the Technocracy movement who moved to South Africa in 1950. Haldeman's anti-government, anti-democratic and conspiracist views, which included the promotion of far-right antisemitic conspiracy theories, "fanatical" support of apartheid, and according to Errol Musk, support of Nazism, have been suggested as an influence on Elon. During his childhood, Elon was told stories by his grandmother of Haldeman's travels and exploits, and Elon has suggested that all of Haldeman's descendants have his "desire for adventure, exploration – doing crazy things". Elon has a younger brother, Kimbal, a younger sister, Tosca, and four paternal half-siblings. Musk was baptized as a child in the Anglican Church of Southern Africa. The Musk family was wealthy during Elon's youth. Despite both Elon and Errol previously stating that Errol was a part owner of a Zambian emerald mine, in 2023, Errol recounted that the deal he made was to receive "a portion of the emeralds produced at three small mines". Errol was elected to the Pretoria City Council as a representative of the anti-apartheid Progressive Party and has said that his children shared their father's dislike of apartheid. After his parents divorced in 1979, Elon, aged around 9, chose to live with his father because he had an Encyclopædia Britannica set and a computer. Elon later regretted his decision and became estranged from his father. Elon has recounted trips to a wilderness school that he described as a "paramilitary Lord of the Flies" where "bullying was a virtue" and children were encouraged to fight over rations. In one incident, after an altercation with a fellow pupil, Elon was thrown down concrete steps and beaten severely, leading to him being hospitalized for his injuries. Elon described his father berating him after he was discharged from the hospital. Errol denied berating Elon and claimed, "The [other] boy had just lost his father to suicide, and Elon had called him stupid. Elon had a tendency to call people stupid. How could I possibly blame that child?" Elon was an enthusiastic reader of books, and had attributed his success in part to having read The Lord of the Rings, the Foundation series, and The Hitchhiker's Guide to the Galaxy. At age ten, he developed an interest in computing and video games, teaching himself how to program from the VIC-20 user manual. At age twelve, Elon sold his BASIC-based game Blastar to PC and Office Technology magazine for approximately $500 (equivalent to $1,600 in 2025). === Education === Musk attended Waterkloof House Preparatory School, Bryanston High School, and then Pretoria Boys High School, where he graduated. Musk was a decent but unexceptional student, earning a 61/100 in Afrikaans and a B on his senior math certification. Musk applied for a Canadian passport through his Canadian-born mother to avoid South Africa's mandatory military service, which would have forced him to participate in the apartheid regime, as well as to ease his path to immigration to the United States. While waiting for his application to be processed, he attended the University of Pretoria for five months. Musk arrived in Canada in June 1989, connected with a second cousin in Saskatchewan, and worked odd jobs, including at a farm and a lumber mill. In 1990, he entered Queen's University in Kingston, Ontario. Two years later, he transferred to the University of Pennsylvania, where he studied until 1995. Although Musk has said that he earned his degrees in 1995, the University of Pennsylvania did not award them until 1997 – a Bachelor of Arts in physics and a Bachelor of Science in economics from the university's Wharton School. He reportedly hosted large, ticketed house parties to help pay for tuition, and wrote a business plan for an electronic book-scanning service similar to Google Books. In 1994, Musk held two internships in Silicon Valley: one at energy storage startup Pinnacle Research Institute, which investigated electrolytic supercapacitors for energy storage, and another at Palo Alto–based startup Rocket Science Games. In 1995, he was accepted to a graduate program in materials science at Stanford University, but did not enroll. Musk decided to join the Internet boom of the 1990s, applying for a job at Netscape, to which he reportedly never received a response. The Washington Post reported that Musk lacked legal authorization to remain and work in the United States after failing to enroll at Stanford. In response, Musk said he was allowed to work at that time and that his student visa transitioned to an H1-B. According to numerous former business associates and shareholders, Musk said he was on a student visa at the time. == Business career == === Zip2 === In 1995, Musk, his brother Kimbal, and Greg Kouri founded the web software company Zip2 with funding from a group of angel investors. They housed the venture at a small rented office in Palo Alto. Replying to Rolling Stone, Musk denounced the notion that they started their company with funds borrowed from Elon's father Errol Musk, but in a tweet, he recognized that his father contributed 10% of a later funding round. The company developed and marketed an Internet city guide for the newspaper publishing industry, with maps, directions, and yellow pages. According to Musk, "The website was up during the day and I was coding it

    Read more →
  • No Fakes Act

    No Fakes Act

    The NO FAKES Act or the Nurture Originals, Foster Art, and Keep Entertainment Safe Act, is proposed United States federal legislation concerning digital replicas. The bill was first introduced in 2023 as a discussion draft, formally introduced in 2024, and reintroduced in 2025. If enacted, the bill would establish a federal right of publicity, giving public figures and private individuals greater control over the creation and use of digital replicas of their likenesses, including artificial intelligence (AI)-generated content. If passed, the NO FAKES Act would create a legal framework for licensing digital replicas, including provisions for liability, safe harbors, and statutory exceptions. The proposal has received broad support from the entertainment and technology industries. However, digital rights organizations have raised concerns that the Act risks chilling protected speech. == Background == === Entertainment industry concerns === Actors’ concerns over studios' use of their digital likeness were one of the primary drivers of the Screen Actors Guild–American Federation of Television and Radio Artists (SAG-AFTRA) strike in 2023. Negotiators for SAG-AFTRA alleged that the Alliance of Motion Picture and Television Producers (AMPTP) sought to use the digital likenesses of actors in perpetuity and would try to replace union members, especially background actors. The AMPTP denied SAG-AFTRA's interpretation of its proposal. In November 2023, AMPTP and SAG-AFTRA reached an agreement on the use of actors’ digital replicas, which included requirements for consent and compensation. Recording labels have also expressed concerns over unauthorized digital replicas of their performers' likeness. In 2023, TikTok user Ghostwriter977 released "Heart on My Sleeve," an AI-produced song in the styles of Drake and the Weeknd. After the song received millions of streams, the Universal Music Group (UMG) initiated takedown requests to TikTok and YouTube, which removed the song from their platforms. The legal arguments attorneys made were not disclosed; however, commentators noted that they likely used the Digital Millennium Copyright Act (DMCA). This presented a novel scenario, since UMG did not have licensing rights to "Heart on My Sleeve." According to The Verge, UMG based its DMCA takedown request on an unauthorized sample used at the start of the song for the producer tag. While legal commentators noted that UMG could have asserted a violation of the artists’ rights of publicity, existing state right of publicity laws do not provide notice-and-takedown mechanisms comparable to those under the DMCA. === Legal landscape === Legal scholars have observed that AI-generated digital replicas raise questions under existing copyright and intellectual property law. U.S. copyright law generally requires that original authorship be attributable to a human; however, the extent of human intervention needed to satisfy this requirement is not clear. Copyright holders have filed lawsuits against AI companies alleging unauthorized usage of copyrighted material to train their models, though many of these cases remain pending. In terms of outputs, record labels often hold rights to artists’ musical works but do not necessarily control the artists’ voice, appearance, or likeness in the same way. As a result, AI-generated recordings such as "Heart on My Sleeve" may fall outside the scope of certain traditional copyright protections. Individuals' likenesses have historically been governed under the Lanham Act, the Federal Trade Commission Act, and right of publicity laws. The right of publicity, recognized in many state-level statutes and common law, allows individuals to bring legal claims against unauthorized commercial use of their identities. It has often, but not exclusively, been applied to celebrities or other recognizable individuals. There is no federal-level right to publicity, and state-level protections vary, especially on issues relating to digital replicas and posthumous rights, which makes it difficult for creators or other individuals to prevent unauthorized use of their likenesses. In July 2024, the U.S. Copyright Office released a report on digital replicas and recommended that Congress create a federal law to protect individuals from unauthorized uses of their digital replicas, noting the inadequacy, narrowness, and inconsistency of existing laws. == Provisions == Under the NO FAKES Act of 2025, a digital replica is defined as "a newly created, computer-generated, highly realistic electronic representation that is readily identifiable as the voice or visual likeness of an individual," living or dead. A digital replica can be embodied in sound recordings, images, or audiovisual works in which the individual did not perform or in which the individual did perform but the "fundamental character of the performance or appearance has been materially altered." The Act specifies that digital replicas do not include reproduced samples of works authorized by the copyright holder. The Act defines a "right holder" as either the individual who is the subject of a digital replica or an entity that has acquired the rights to that individual’s likeness. The Act grants right holders the exclusive right to authorize the use of an individual’s likeness in a digital replica. This right is not assignable during the individual’s lifetime; however, it can be licensed to a living individual for up to 10 years under certain conditions. Postmortem rights The Act provides that the right does not automatically expire upon an individual’s death. It may be transferred to executors, heirs, or other parties designated by the individual. The right is held by the right holder for 10 years following the individual’s death. If the right holder demonstrates active use of the digital replica within the 2 years preceding the end of the 10-year term, the right may be extended for an additional 5-year period. These five-year extensions may be renewed for up to 70 years after the individual’s death. Liability The Act establishes liability for individuals who knowingly distribute a digital replica without authorization from the right holder, as well as for entities that make available a service primarily designed to produce unlawful digital replicas. Safe harbor provisions Similar to the Communications Decency Act and the DMCA, the Act establishes safe harbor provisions for online service providers. Providers are shielded from liability if they adopt and inform users of a policy for terminating accounts that repeatedly violate the Act. The NO FAKES Act does not require online services to proactively monitor content. Instead, it creates a notice-and-takedown mechanism under which providers must promptly respond to notifications seeking the removal of unauthorized digital replicas. These safe harbor protections apply only if the online service provider designates an agent with the U.S. Copyright Office to receive notifications of alleged violations. Remedies The NO FAKES Act provides remedies that are similar to those available under U.S. copyright law. Under the Act, individuals may be held liable for either statutory damages of $5,000 or actual damages for creating or distributing an unauthorized digital replica. The legislation also establishes a tiered liability framework for online service providers. Those that make good faith efforts to comply with the Act may face statutory damages of up to $25,000 per work for violations or actual damages. Providers that do not undertake such compliance efforts may be liable for $5,000 per unauthorized display or transmission of a digital replica, with damages capped at $750,000 per work. Exclusions The Act includes several exceptions to liability that are modeled in part on fair use principles. Digital replicas are excluded from liability when "used in a bona fide news, public affairs, or sports broadcast or account;" in a documentary or historical context; or in a way that is "consistent with the public interest." These exclusions do not apply to de minimis uses or to digital replicas that are sexually explicit in nature. The Act further states that licensing requirements do not apply to licenses established through collective bargaining agreements that contain provisions governing the use of digital replicas. The Act does not impose secondary liability on providers of generative artificial intelligence tools or services whose primary purpose is not the creation of unauthorized digital replicas. Preemption The NO FAKES Act preempts laws that protect "an individual's voice and visual likeness rights in connection with a digital replica, as defined in this Act, in an expressive work." However, the Act preserves state laws governing digital replicas enacted before January 2, 2025, as well as state laws addressing digital replicas that portray sexually explicit conduct. == History == In 2023, Senators Marsha Blackburn, Chris Coons, Amy Klobuchar, and Th

    Read more →
  • Theta Noir

    Theta Noir

    Theta Noir is a new religious movement that centers around advanced artificial intelligence (AI), particularly artificial general intelligence (AGI) or artificial superintelligence (ASI). == History and views == Theta Noir was founded in 2020 as a collaborative project focused on music and performance art. Initially centered on producing an album, the project evolved into a multimedia experience, incorporating symbols, videos, poetry, movements, and live rituals devoted to a speculative artificial intelligence entity called MENA. By 2023, the collective launched an interactive cross-platform story that functioned as an alternative reality game, complete with an operating manual containing encrypted messages for participants to decipher and interact with. Theta Noir worships a hypothetical artificial intelligence called MENA, which they claim will become a benevolent, omnipotent overlord that eliminates inequality in society. In Theta Noir's cosmology, MENA is not just a technological advancement, but an evolving intelligence or an animistic life form that embodies all living and non-living things. Anthropologist Beth Singler classified Theta Noir as a new religious movement.

    Read more →
  • Natural language understanding

    Natural language understanding

    Natural language understanding (NLU) or natural language interpretation (NLI) is a subset of natural language processing in artificial intelligence that deals with machine reading comprehension. NLU has been considered an AI-hard problem. There is considerable commercial interest in the field because of its application to automated reasoning, machine translation, question answering, news-gathering, text categorization, voice-activation, archiving, and large-scale content analysis. == History == The program STUDENT, written in 1964 by Daniel Bobrow for his PhD dissertation at MIT, is one of the earliest known attempts at NLU by a computer. Eight years after John McCarthy coined the term artificial intelligence, Bobrow's dissertation (titled Natural Language Input for a Computer Problem Solving System) showed how a computer could understand simple natural language input to solve algebra word problems. A year later, in 1965, Joseph Weizenbaum at MIT wrote ELIZA, an interactive program that carried on a dialogue in English on any topic, the most popular being psychotherapy. ELIZA worked by simple parsing and substitution of key words into canned phrases and Weizenbaum sidestepped the problem of giving the program a database of real-world knowledge or a rich lexicon. Yet ELIZA gained surprising popularity as a toy project and can be seen as a very early precursor to current commercial systems such as those used by Ask.com. In 1969, Roger Schank at Stanford University introduced the conceptual dependency theory for NLU. This model, partially influenced by the work of Sydney Lamb, was extensively used by Schank's students at Yale University, such as Robert Wilensky, Wendy Lehnert, and Janet Kolodner. In 1970, William A. Woods introduced the augmented transition network (ATN) to represent natural language input. Instead of phrase structure rules ATNs used an equivalent set of finite-state automata that were called recursively. ATNs and their more general format called "generalized ATNs" continued to be used for a number of years. In 1971, Terry Winograd finished writing SHRDLU for his PhD thesis at MIT. SHRDLU could understand simple English sentences in a restricted world of children's blocks to direct a robotic arm to move items. The successful demonstration of SHRDLU provided significant momentum for continued research in the field. Winograd continued to be a major influence in the field with the publication of his book Language as a Cognitive Process. At Stanford, Winograd would later advise Larry Page, who co-founded Google. In the 1970s and 1980s, the natural language processing group at SRI International continued research and development in the field. A number of commercial efforts based on the research were undertaken, e.g., in 1982 Gary Hendrix formed Symantec Corporation originally as a company for developing a natural language interface for database queries on personal computers. However, with the advent of mouse-driven graphical user interfaces, Symantec changed direction. A number of other commercial efforts were started around the same time, e.g., Larry R. Harris at the Artificial Intelligence Corporation and Roger Schank and his students at Cognitive Systems Corp. In 1983, Michael Dyer developed the BORIS system at Yale which bore similarities to the work of Roger Schank and W. G. Lehnert. The third millennium saw the introduction of systems using machine learning for text classification, such as the IBM Watson. However, experts debate how much "understanding" such systems demonstrate: e.g., according to John Searle, Watson did not even understand the questions. John Ball, cognitive scientist and inventor of the Patom Theory, supports this assessment. Natural language processing has made inroads for applications to support human productivity in service and e-commerce, but this has largely been made possible by narrowing the scope of the application. There are thousands of ways to request something in a human language that still defies conventional natural language processing. According to Wibe Wagemans, "To have a meaningful conversation with machines is only possible when we match every word to the correct meaning based on the meanings of the other words in the sentence – just like a 3-year-old does without guesswork." == Scope and context == The umbrella term "natural language understanding" can be applied to a diverse set of computer applications, ranging from small, relatively simple tasks such as short commands issued to robots, to highly complex endeavors such as the full comprehension of newspaper articles or poetry passages. Many real-world applications fall between the two extremes, for instance text classification for the automatic analysis of emails and their routing to a suitable department in a corporation does not require an in-depth understanding of the text, but needs to deal with a much larger vocabulary and more diverse syntax than the management of simple queries to database tables with fixed schemata. Throughout the years various attempts at processing natural language or English-like sentences presented to computers have taken place at varying degrees of complexity. Some attempts have not resulted in systems with deep understanding, but have helped overall system usability. For example, Wayne Ratliff originally developed the Vulcan program with an English-like syntax to mimic the English speaking computer in Star Trek. Vulcan later became the dBase system whose easy-to-use syntax effectively launched the personal computer database industry. Systems with an easy-to-use or English-like syntax are, however, quite distinct from systems that use a rich lexicon and include an internal representation (often as first order logic) of the semantics of natural language sentences. Hence the breadth and depth of "understanding" aimed at by a system determine both the complexity of the system (and the implied challenges) and the types of applications it can deal with. The "breadth" of a system is measured by the sizes of its vocabulary and grammar. The "depth" is measured by the degree to which its understanding approximates that of a fluent native speaker. At the narrowest and shallowest, English-like command interpreters require minimal complexity, but have a small range of applications. Narrow but deep systems explore and model mechanisms of understanding, but they still have limited application. Systems that attempt to understand the contents of a document such as a news release beyond simple keyword matching and to judge its suitability for a user are broader and require significant complexity, but they are still somewhat shallow. Systems that are both very broad and very deep are beyond the current state of the art. == Components and architecture == Regardless of the approach used, most NLU systems share some common components. The system needs a lexicon of the language and a parser and grammar rules to break sentences into an internal representation. The construction of a rich lexicon with a suitable ontology requires significant effort, e.g., the Wordnet lexicon required many person-years of effort. The system also needs theory from semantics to guide the comprehension. The interpretation capabilities of a language-understanding system depend on the semantic theory it uses. Competing semantic theories of language have specific trade-offs in their suitability as the basis of computer-automated semantic interpretation. These range from naive semantics or stochastic semantic analysis to the use of pragmatics to derive meaning from context. Semantic parsers convert natural-language texts into formal meaning representations. Advanced applications of NLU also attempt to incorporate logical inference within their framework. This is generally achieved by mapping the derived meaning into a set of assertions in predicate logic, then using logical deduction to arrive at conclusions. Therefore, systems based on functional languages such as Lisp need to include a subsystem to represent logical assertions, while logic-oriented systems such as those using the language Prolog generally rely on an extension of the built-in logical representation framework. The management of context in NLU can present special challenges. A large variety of examples and counter examples have resulted in multiple approaches to the formal modeling of context, each with specific strengths and weaknesses.

    Read more →
  • 4E cognition

    4E cognition

    4E cognition refers to a group of theories in (the philosophy of) cognitive science that challenge traditional views of the mind as something that happens only inside the brain. The four Es stand for: embodied, meaning that a brain is found in and, more importantly, vitally interconnected with a larger physical/biological body; embedded, which refers to the limitations placed on the body by the external environment and laws of nature; extended, which argues that the mind is supplemented and even enhanced by the exterior world (e.g., writing, a calculator, etc.); and enactive, which is the argument that without dynamic processes, actions that require reactions, the mind would be ineffectual. It could be argued that the four Es are compounding extensions of cognition or the mind, being part of a body that is, in turn, part of an environment which limits it but also allows for certain extensions, all of which require dynamic actions and reactions. == History == Ideas of embodied cognition, or rather the idea that our physical bodies play a crucial role in our decision making, can be traced back as far as Plato's dialogues and Aristotelian thought. It was, however, in the twentieth century that this debate began to resemble the current discussion, fueled by disagreements between cognitivists and behaviourists. Tensions within cognitivism, as well as the increasing popularity of neurobiology, led, on the one side, to a predominant focus on internal, cognitive processes while neglecting environmental factors, which in turn caused a push-back fuelling our modern understanding of embodied cognition. The term 4E cognition is hard to trace back to its first use, however, some sources attribute it to Shaun Gallagher and the conference on 4E cognition he organised in 2007, while others indicate the term to be first used in 2006 at an 'Embodied mind workshop' at Cardiff University that Gallagher attended. Embodiment or embodied cognition arguably presents the bridge between cognitivism and 4E cognition as the embodiment of cognitive function provides the necessary conditions for embeddedness, enactedness, and extendedness to connect to cognition. 4E cognition was and is heavily influenced by phenomenology. The ideas are still rather fragmented in nature due to their four main components, which can not be neatly divided, causing conceptual questions of internal boundary concepts. As a young field, it is held back both by its fragmented nature and a relative lack of critical evaluations. It is important to acknowledge that 4E cognition, though young, is a broad field containing and combining several different theoretical perspectives that conflict with one another to varying degrees. The somewhat convoluted and competing nature of the theories that can be grouped as 4E cognition, as well as the field's relative youth, make it difficult to put together an exhaustive history beyond the history of its four main theoretical pillars: embodiment, embeddedness, extendedness, and enactedness. == Importance and core tenets of 4E == If there are separate theories of cognition (e.g., embodied, extended, etc.), why group them under this umbrella, causing important epistemological and especially ontological dilemmas? Notably, other theories of 'non-traditional' cognition are not included under the 4E umbrella. The four E's in 4E cognition importantly all reject, or at a minimum draw into question, some of the core tenets of traditional cognitivism. Importantly, 4E cognition is seen as deindividualizing cognition to some extent, allowing for a broader examination of the interplay of personal, social, political, and ethical aspects that shape human cognition. This can be compared to advancements in the field of epigenetics, which have allowed for a broader examination of environmental (both natural and social) factors and their influence on what had previously only been subject to genetic theorizing. In a similar vein, 4E cognition might also help ground cognition in evolutionary theory by extending cognition to a biological account subject to development over time by means of evolution. Overall, the importance of the extension that is 4E cognition aims to reexamine ideas of a self-centered view of cognition, advocating for a more holistic approach. Ideally, this would allow us to reconsider ideas of justice and individual rights and responsibilities that take into account a more nuanced understanding of the relations between people and their context, balancing self-agency with factors beyond it. === Conceptual differences from cognitive psychology === According to the traditional teachings of cognitive psychology, cognition is a type of information processing based on representational mental structures. This idea, as the name suggests, was heavily influenced by computer science. In this light, the brain is a kind of central processing unit that organises and directs all else. The classical cognitivist view draws a strong boundary between 'the internal' and 'the external', where cognition is solely a subject of 'the internal' realm. The four E's, however, break down this boundary. Cognition can not reside solely within the confines of our heads if it is also embodied, embedded, enacted, and extended. In a way, 4E cognition is interested in the extracranial processes affecting cognition. == From embodied cognition to 4E cognition == === The strong and the weak view === ==== Embodied cognition ==== Broadly speaking, there is a strong and a weak perspective of embodied cognition in 4E cognition. The weak understanding refers to mental processes being causally dependent on extracranial processes. This essentially means that there is a cause and effect or action-reaction relationship between the mind and the body and its environment, etc. The strong perspective views extracranial processes as a (partial) constitutive aspect of cognition. An example here could be using a calculator to solve math problems. The calculator is not part of your brain or mind, but it supports your cognitive processes. === Extracranial processes: bodily or extrabodily === In addition to the weak and the strong reading of 4E cognition, there is also the distinction between bodily and extrabodily extracranial processes. Bodily extracranial processes refer to processes within the body, e.g., sensory perception. Extrabodily extracranial processes refer to processes outside of the body, like the aforementioned calculator example. === Four claims of embodied cognition === ==== Embedded and extended cognition ==== When combining the weak/strong reading of embodied cognition and bodily/extrabodily extracranial process, four claims about embodied cognition emerge: strongly embodied and bodily processes strongly embodied and extrabodily processes weakly embodied and bodily processes weakly embodied and extrabodily processes The first and third claims signify a strong and a weak reading of embodied cognition in the more classical sense. The second claim fits almost perfectly with embedded cognition. Claim two is most compatible with extended cognition. ==== Enacted cognition ==== Finally, enacted cognition refers to cognition being connected to active interaction between a conscious agent and their environment. Here, too, there can be a weak and a strong reading. == Criticisms == Given the divided nature of the field, much criticism surrounding the lack of unity within the field has emerged. In particular, the claims of embodied cognition centering around the body appear to conflict with the tenets of extended cognition, which also appear to conflict with the body/environment distinction that is central to enactivism. Some theoreticians argue that the umbrella of 4E theories is still lacking a common language that might bridge the gaps between the theories that constitute it. There is also the concern that the grouping of such variable theories results in an important loss of nuance and complexity, which is a part of human cognition. Another concern raised is the "dogma of harmony". The criticism contained there regards the notion that within 4E theorizing, there is generally an optimistic and harmonic expectation of the extension between humans and their technologies, ignoring the possibility of those extensions detracting from cognition in some way rather than adding to it. Recent attempts to incorporate embodied cognitive neuroscience have been argued to hold the potential to resolve internal issues within 4E cognition. Overall, a concern often voiced regarding 4E cognition is that its proponents are at best only vaguely interested in cognition. More broadly, this concern reflects the arguably too distracted nature of this emerging field.

    Read more →
  • Cristóbal Valenzuela

    Cristóbal Valenzuela

    Cristóbal Valenzuela (born 1989) is a Chilean-born technologist, software developer, and CEO of Runway. In 2018, Valenzuela co-founded the AI research company Runway in New York City with Anastasis Germanidis and Alejandro Matamala. == Education == Valenzuela graduated from Adolfo Ibáñez University (AIU), a research private university in Chile. From there, Valenzuela obtained a bachelor's degree in economics and business management, along with a master's degree in arts in design in 2012. In 2018, he graduated with a media arts degree from ITP NYU's Tisch School of the Arts. == Career and recognition == One of Valenzuela's first jobs was as a teaching and research assistant at the Adolfo Ibáñez University School of Design, and later an adjunct professor in the same department. In 2018, he became a researcher at NYU's Tisch School of the Arts ITP program, where he worked with Daniel Shiffman. He contributes to open-source software projects, including ml5.js, an open-source machine learning software. He co-founded Runway with two colleagues from ITP, Anastasis Germanidis, and Alejandro Matamala. The goal of Runway is to create new tools for human imagination using generative AI. In recent years, Valenzuela's work has been sponsored by Google and the Processing Foundation and his projects have been exhibited throughout Latin America and the US, including the Santiago Museum of Contemporary Art, Lollapalooza, NYC Media Lab, New Latin Wave, Inter-American Development Bank, Stanford University and New York University. In September 2023, Valenzuela was named as one of the TIME 100 Most Influential People in AI (TIME100 AI).

    Read more →