AI Data Analyst Zalando

AI Data Analyst Zalando — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Clips (software)

    Clips (software)

    Clips is a discontinued mobile video editing software application created by Apple Inc. It was released onto the iOS App Store on April 6, 2017, for free. Initially, it was only available on 64-bit devices running iOS 10.3 or later; as of version 3.1.3, it requires iOS 16.0 or later. Apple describes it as an app for "making and sharing fun videos with text, effects, graphics, and more.". Its final release was on May 9, 2024 before was removed from the App Store on October 10, 2025. == Features == After launching of the app, the user sees the view of the front-facing camera. The app allows the user to create a new clip by tapping on a red record button, or use photos or videos from the device's photo library. Once a clip is recorded, it can be added to a project timeline shown at the bottom of the screen. The user can share their project on social media platforms. The user can also add filters and effects to the project. "Live Titles" (available in several styles) can also be created by dictating to the device.

    Read more →
  • ChatScript

    ChatScript

    ChatScript is a combination Natural Language engine and dialog management system designed initially for creating chatbots, but is currently also used for various forms of NL processing. It is written in C++. The engine is an open source project at SourceForge. and GitHub. ChatScript was written by Bruce Wilcox and originally released in 2011, after Suzette (written in ChatScript) won the 2010 Loebner Prize, fooling one of four human judges. == Features == In general ChatScript aims to author extremely concisely, since the limiting scalability of hand-authored chatbots is how much/fast one can write the script. Because ChatScript is designed for interactive conversation, it automatically maintains user state across volleys. A volley is any number of sentences the user inputs at once and the chatbots response. The basic element of scripting is the rule. A rule consists of a type, a label (optional), a pattern, and an output. There are three types of rules. Gambits are something a chatbot might say when it has control of the conversation. Rejoinders are rules that respond to a user remark tied to what the chatbot just said. Responders are rules that respond to arbitrary user input which is not necessarily tied to what the chatbot just said. Patterns describe conditions under which a rule may fire. Patterns range from extremely simplistic to deeply complex (analogous to Regex but aimed for NL). Heavy use is typically made of concept sets, which are lists of words sharing a meaning. ChatScript contains some 2000 predefined concepts and scripters can easily write their own. Output of a rule intermixes literal words to be sent to the user along with common C-style programming code. Rules are bundled into collections called topics. Topics can have keywords, which allows the engine to automatically search the topic for relevant rules based on user input. == Example code == Words starting with ~ are concept sets. For example, ~fruit is the list of all known fruits. The simple pattern (~fruit) reacts if any fruit is mentioned immediately after the chatbot asks for favorite food. The slightly more complex pattern for the rule labelled WHATMUSIC requires all the words what, music, you and any word or phrase meaning to like, but they may occur in any order. Responders come in three types. ?: rules react to user questions. s: rules react to user statements. u: rules react to either. ChatScript code supports standard if-else, loops, user-defined functions and calls, and variable assignment and access. == Data == Some data in ChatScript is transient, meaning it will disappear at the end of the current volley. Other data is permanent, lasting forever until explicitly killed off. Data can be local to a single user or shared across all users at the bot level. Internally all data is represented as text and is automatically converted to a numeric form as needed. === Variables === User variables come in several kinds. Variables purely local to a topic or function are transient. Global variables can be declared as transient or permanent. A variable is generally declared merely by using it, and its type depends on its prefix ($, $$, $_). === Facts === In addition to variables, ChatScript supports facts – triples of data, which can also be transient or permanent. Functions can query for facts having particular values of some of the fields, making them act like an in-memory database. Fact retrieval is very quick and efficient the number of available in-memory facts is largely constrained to the available memory of the machine running the ChatScript engine. Facts can represent record structures and are how ChatScript represents JSON internally. Tables of information can be defined to generate appropriate facts. The above table links people to what they invented (1 per line) with Einstein getting a list of things he did. == External communication == ChatScript embeds the Curl library and can directly read and write facts in JSON to a website. == Server == A ChatScript engine can run in local or server mode. == Pos-tagging, parsing, and ontology == ChatScript comes with a copy of English WordNet embedded within, including its ontology, and creates and extends its own ontology via concept declarations. It has an English language pos-tagger and parser and supports integration with TreeTagger for pos-tagging a number of other languages (TreeTagger commercial license required). == Databases == In addition to an internal fact database, ChatScript supports PostgreSQL, MySQL, MSSQL and MongoDB both for access by scripts, but also as a central filesystem if desired so ChatScript can be scaled horizontally. A common use case is to use a centralized database to host the user files and multiple servers to scale the ChatScript engine. == JavaScript == ChatScript also embeds DukTape, ECMAScript E5/E5.1 compatibility, with some semantics updated from ES2015+. == Spelling Correction == ChatScript has built-in automatic spell checking, which can be augmented in script as both simple word replacements or context sensitive changes. With appropriate simple rules you can change perfect legal words into other words or delete them. E.g., if you have a concept of ~electronic_goods and don't want an input of Radio Shack (a store name) to be detected as an electronic good, you can get the input to change to Radio_Shack (a single word), or allow the words to remain but block the detection of the concept. This is particularly useful when combined with speech-to-text code that is imperfect, but you are familiar with common failings of it and can compensate for them in script. == Control flow == A chatbot's control flow is managed by the control script. This is merely another ordinary topic of rules, that invokes API functions of the engine. Thus control is fully configurable by the scripter (and functions exist to allow introspection into the engine). There are pre-processing control flow and post-processing control flow options available, for special processing.

    Read more →
  • Natural Language Toolkit

    Natural Language Toolkit

    The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English written in the Python programming language. It supports classification, tokenization, stemming, tagging, parsing, and semantic reasoning functionalities. It was developed by Steven Bird and Edward Loper in the Department of Computer and Information Science at the University of Pennsylvania. NLTK includes graphical demonstrations and sample data. It is accompanied by a book that explains the underlying concepts behind the language processing tasks supported by the toolkit, plus a cookbook. NLTK is intended to support research and teaching in NLP or closely related areas, including empirical linguistics, cognitive science, artificial intelligence, information retrieval, and machine learning. NLTK has been used successfully as a teaching tool, as an individual study tool, and as a platform for prototyping and building research systems. == Library highlights == Discourse representation Lexical analysis: Word and text tokenizer n-gram and collocations Part-of-speech tagger Tree model and Text chunker for capturing Named-entity recognition

    Read more →
  • Jaggaer

    Jaggaer

    JAGGAER, formerly SciQuest, is a provider of cloud-based business automation technology for Business Spend Management. Its headquarters is in Durham, North Carolina. == Company history == SciQuest was established in 1995 as a B2B eCommerce exchange.The company went public with an IPO in 1999. In 2001, SciQuest transitioned from a B2B exchange company into eProcurement software and supplier enablement platforms. SciQuest was taken private in 2004 and continued to move into eProcurement, inventory management and accounts payable automation. SciQuest completed an IPO in September 2010, raising approximately $57 million. SciQuest, and its 510 person workforce, was taken private in June 2016 as part of a $509 million acquisition by Accel-KKR, a private equity firm headquartered in Menlo Park, CA. In 2017 SciQuest was rebranded as JAGGAER and announced increased focus on offering a complete, integrated source-to-pay suite. Along with the name change, the company expanded its market focus to manufacturing, healthcare, consumer packaged goods, retail, education, life sciences, logistics and the public sector. JAGGAER acquired the European direct materials procurement specialist Pool4Tool in June 2017 giving it end-to-end direct as well as indirect materials procurement coverage. JAGGAER acquired spend management company BravoSolution in 2017, and entered into a joint venture with United Arab Emirates-based Tejari. In February 2019 JAGGAER launched JAGGAER One, which unifies its full product suite on a single platform. In 2019 the UK-based private equity firm Cinven acquired a majority holding in the company. Jim Bureau was subsequently named JAGGAER's Chief Executive Officer. Bureau left the firm in March 2023, and Andy Hovancik was announced as the company's CEO in June. In 2024, JAGGAER was acquired by Vista Equity Partners, a private equity firm specializing in enterprise software investments. == Current positioning == As of April 2025, JAGGAER positions itself as "an enterprise procurement and supplier collaboration SaaS provider." Its core technology platform, which is called JAGGAER One, serves "direct and indirect procurement with specializations in Higher Education, Discrete and Process Manufacturing, and Public Sector." == Product Categories == The JAGGAER One platform supports the following products: Spend Analytics Category Management Supplier Management Sourcing Contracts eProcurement Invoicing Inventory Management Supply Chain Collaboration Quality Management == Acquisitions == SciQuest acquired the following companies: AECsoft - January 2011. Provider of supplier management and sourcing technology. Upside Software, Inc. - August 2012. Provider of contract lifecycle management (CLM) solutions. Spend Radar, LLC - October 2012, Provider of spend analysis software. CombineNet - September 2013, Provider of advanced sourcing software JAGGAER acquired the following companies: POOL4TOOL - June 2017, Provider of direct sourcing and supply chain management software BravoSolution - December 2017, Provider of global platform spend management solutions

    Read more →
  • AI effect

    AI effect

    The AI effect is a phenomenon in which advances in artificial intelligence lead to a redefinition of what is considered intelligence, such that capabilities achieved by AI systems are no longer regarded as examples of "real" intelligence. The concept has been used to describe both a cognitive tendency and a sociotechnical pattern, in which successful AI techniques are reclassified as routine computation or absorbed into other domains. Historian Pamela McCorduck described this as a recurring feature of AI research, noting in her 2004 book Machines Who Think that once a problem is solved, it is no longer considered evidence of intelligence. Researcher Rodney Brooks similarly observed in 2002 that once systems are understood, they are often regarded as "just computation". == Definition == The AI effect refers to a shift in how intelligence is defined as machines acquire new capabilities. Tasks such as playing chess, recognizing speech, or interpreting images were historically considered indicators of intelligence, but after successful automation they are often reclassified as routine computation. McCorduck described this as an "odd paradox", in which successful AI systems are assimilated into other domains, leaving AI researchers to focus on unsolved problems. The phenomenon is often interpreted as an instance of moving the goalposts. A commonly cited formulation is Tesler's theorem, often expressed as "AI is whatever hasn't been done yet". When problems are not fully formalised, they may be described using models involving human computation, such as human-assisted Turing machines. == Historical examples == === Game playing === Early AI systems capable of playing games such as checkers and chess were initially regarded as demonstrations of machine intelligence. As these systems improved and became better understood, their achievements were often reinterpreted as examples of computation rather than intelligence. The victory of IBM's Deep Blue over Garry Kasparov in 1997 is a frequently cited example. Critics argued that the system relied on brute-force methods rather than genuine understanding. === Pattern recognition === Technologies such as optical character recognition and speech recognition were once considered core problems in artificial intelligence. As these systems became reliable and widely deployed, they were increasingly treated as standard engineering solutions. === Integration into applications === Many techniques originally developed within AI research have been incorporated into broader technological systems, including marketing, automation, and software applications. Michael Swaine reported in 2007 that AI advances are often presented as developments in other fields. Marvin Minsky observed that successful AI innovations often evolve into separate disciplines. Nick Bostrom noted in 2006 that widely adopted technologies are often no longer labeled as AI. == Contemporary discussion == The AI effect continues to be discussed in the context of recent advances in machine learning, particularly large language models and other generative AI systems. As these systems have become more widely used, some researchers and commentators have noted that their capabilities are frequently described as statistical or mechanical once understood, rather than as intelligence. A 2016 survey of artificial intelligence also noted that AI systems are increasingly embedded in everyday applications, reinforcing earlier observations that successful AI technologies tend to become normalized and no longer identified as AI. At the same time, the widespread commercial use of artificial intelligence has led to greater visibility of the field, contrasting with earlier periods in which AI techniques were often present but unacknowledged. == Interpretations == === Cognitive bias === Some authors describe the AI effect as a cognitive bias in which expectations of intelligence shift as machines achieve new capabilities. === Sociotechnical perspective === Another interpretation emphasizes how technologies are reclassified over time as they become widespread and commercially successful. === Philosophical debate === Some philosophers argue that reclassification reflects genuine conceptual distinctions rather than bias. == Historical context == During periods such as the AI winter, researchers sometimes avoided the term "artificial intelligence" due to negative perceptions. In the 21st century, however, the term "AI" has become widely used in public discourse and marketing. == Broader implications == The AI effect has been linked to broader questions about human uniqueness and the nature of intelligence. Michael Kearns suggested that people may seek to preserve a special role for humans. Similar patterns have been observed in studies of animal cognition. Herbert A. Simon noted that artificial intelligence can provoke strong emotional reactions.

    Read more →
  • Inverse depth parametrization

    Inverse depth parametrization

    In computer vision, the inverse depth parametrization is a parametrization used in methods for 3D reconstruction from multiple images such as simultaneous localization and mapping (SLAM). Given a point p {\displaystyle \mathbf {p} } in 3D space observed by a monocular pinhole camera from multiple views, the inverse depth parametrization of the point's position is a 6D vector that encodes the optical centre of the camera c 0 {\displaystyle \mathbf {c} _{0}} when in first observed the point, and the position of the point along the ray passing through p {\displaystyle \mathbf {p} } and c 0 {\displaystyle \mathbf {c} _{0}} . Inverse depth parametrization generally improves numerical stability and allows to represent points with zero parallax. Moreover, the error associated to the observation of the point's position can be modelled with a Gaussian distribution when expressed in inverse depth. This is an important property required to apply methods, such as Kalman filters, that assume normality of the measurement error distribution. The major drawback is the larger memory consumption, since the dimensionality of the point's representation is doubled. == Definition == Given 3D point p = ( x , y , z ) {\displaystyle \mathbf {p} =(x,y,z)} with world coordinates in a reference frame ( e 1 , e 2 , e 3 ) {\displaystyle (e_{1},e_{2},e_{3})} , observed from different views, the inverse depth parametrization y {\displaystyle \mathbf {y} } of p {\displaystyle \mathbf {p} } is given by: y = ( x 0 , y 0 , z 0 , θ , ϕ , ρ ) {\displaystyle \mathbf {y} =(x_{0},y_{0},z_{0},\theta ,\phi ,\rho )} where the first five components encode the camera pose in the first observation of the point, being c 0 = ( x 0 , y 0 , z 0 ) {\displaystyle \mathbf {c_{0}} =(x_{0},y_{0},z_{0})} the optical centre, ϕ {\displaystyle \phi } the azimuth, θ {\displaystyle \theta } the elevation angle, and ρ = 1 ‖ p − c 0 ‖ {\displaystyle \rho ={\frac {1}{\left\Vert \mathbf {p} -\mathbf {c} _{0}\right\Vert }}} the inverse depth of p {\displaystyle p} at the first observation.

    Read more →
  • Umoove

    Umoove

    Umoove is a high tech startup company that has developed and patented a software-only face and eye tracking technology. The idea was first conceived as an attempt to aid people with disabilities but has since evolved. The only compatibility qualification for tablet computers and smartphones to run Umoove software is a front-facing camera. Umoove headquarters are in Israel on Jerusalem’s Har Hotzvim. Umoove has 15 employees and received two million dollars in financing in 2012. The company's original founders invested around $800,000 to start the business in 2010. In 2013 Umoove was named one of the top three most promising Israeli start ups by Newsgeeks magazine. The company also participated in the 2013 LeWeb conference in Paris, France, where innovative technology startups are showcased. == Technology == The technology uses information extracted from previous frames, such as the angle of the user's head to predict where to look for facial targets in the next frame. This anticipation minimizes the amount of computation needed to scan each image. Umoove accounts for variances in environment, lighting conditions and user hand shake/movement. The technology is designed to provide a consistent experience, whether you're in a brightly lit area or a darkened basement, and to work fluidly between them by adapting its processing when it detects color and brightness shifts. It uses an active stabilization technique to filter out natural body movements from an unstable camera in order to minimize false-positive motion detection. Running the Umoove software on a Samsung Galaxy S3 is said to take up only 2% CPU. Umoove works exclusively with software and there is no hardware add-on necessary. It can be run on any smartphone or tablet computer that has a front-facing camera. Umoove claims that even a low-quality camera on an old device will run their software flawlessly. == Umoove Experience == In January 2014 Umoove released its first game onto the app store. The Umoove Experience game lets users control where they are 'flying' in the game through simple gestures and motions with their head. The avatar will basically go toward wherever the user looks. The game was created to showcase the technology for game developers but that did not stop some from criticizing its simplicity. Umoove also announced that they raised another one million dollars and that they are opening offices in Silicon Valley, California. In February 2014, Umoove announced that their face-tracking software development kit is available for Android developers as well as iOS. == Reviews == The Umoove Experience garnered mostly positive reviews from bloggers and mainstream media with some predicting that it could be the future of mobile gaming. Mashable wrote that Umoove's technology could be the emergence of gesture recognition technology in the mobile space, similar to Kinect with console gaming and what Leap Motion has done with desktop computers. Some, however, remain skeptical. CNET, for example, did not give the game a positive review and called the eye tracking technology 'freaky but cool'. They also noted that pioneering technologies have been known to fall short of expectations, citing Apple Inc’s Siri as an example. The technology blog GigaOM said that the Umoove Experience is ’awesome’ and technology evangelist Robert Scoble has called Umoove "brilliant". == uHealth == In January 2015, Umoove released uHealth, a mobile application that uses eye tracking game-like exercise to challenge the user's ability to be attentive, continuously focus, follow commands and avoid distractions. The app is designed in the form of two games, one to improve attention and another that hones focus. uHealth is a training tool, not a diagnostic. Umoove has stated that they want to use their technology for diagnosing neurological disorders but this will depend on clinical tests and FDA approval. The company cites the direct relationship between eye movements and brain activity as well as various vision-based therapies have been backed by many scientific studies conducted over the past decades. uHealth is the first time this type of therapy is delivered right to the end user through a simple download. == Collaboration rumors == In March 2013 there were rumors on the internet that Umoove would be the functioning software embedded into the Samsung Galaxy S4, which was due to launch that month. This rumor was perpetrated by, among others, New York Times, Techcrunch and Yahoo. Once Samsung launched without the Umoove technology rumors about a potential collaboration with Apple Inc hit the web. It has been said that due to the fact that Apple Inc is losing market share and stock value to Samsung they will be more aggressive and eye tracking is a logical place to make that move.

    Read more →
  • Open information extraction

    Open information extraction

    In natural language processing, open information extraction (OIE) is the task of generating a structured, machine-readable representation of the information in text, usually in the form of triples or n-ary propositions. == Overview == A proposition can be understood as truth-bearer, a textual expression of a potential fact (e.g., "Dante wrote the Divine Comedy"), represented in an amenable structure for computers [e.g., ("Dante", "wrote", "Divine Comedy")]. An OIE extraction normally consists of a relation and a set of arguments. For instance, ("Dante", "passed away in" "Ravenna") is a proposition formed by the relation "passed away in" and the arguments "Dante" and "Ravenna". The first argument is usually referred as the subject while the second is considered to be the object. The extraction is said to be a textual representation of a potential fact because its elements are not linked to a knowledge base. Furthermore, the factual nature of the proposition has not yet been established. In the above example, transforming the extraction into a full fledged fact would first require linking, if possible, the relation and the arguments to a knowledge base. Second, the truth of the extraction would need to be determined. In computer science transforming OIE extractions into ontological facts is known as relation extraction. In fact, OIE can be seen as the first step to a wide range of deeper text understanding tasks such as relation extraction, knowledge-base construction, question answering, semantic role labeling. The extracted propositions can also be directly used for end-user applications such as structured search (e.g., retrieve all propositions with "Dante" as subject). OIE was first introduced by TextRunner developed at the University of Washington Turing Center headed by Oren Etzioni. Other methods introduced later such as Reverb, OLLIE, ClausIE or CSD helped to shape the OIE task by characterizing some of its aspects. At a high level, all of these approaches make use of a set of patterns to generate the extractions. Depending on the particular approach, these patterns are either hand-crafted or learned. == OIE systems and contributions == Reverb suggested the necessity to produce meaningful relations to more accurately capture the information in the input text. For instance, given the sentence "Faust made a pact with the devil", it would be erroneous to just produce the extraction ("Faust", "made", "a pact") since it would not be adequately informative. A more precise extraction would be ("Faust", "made a pact with", "the devil"). Reverb also argued against the generation of overspecific relations. OLLIE stressed two important aspects for OIE. First, it pointed to the lack of factuality of the propositions. For instance, in a sentence like "If John studies hard, he will pass the exam", it would be inaccurate to consider ("John", "will pass", "the exam") as a fact. Additionally, the authors indicated that an OIE system should be able to extract non-verb mediated relations, which account for significant portion of the information expressed in natural language text. For instance, in the sentence "Obama, the former US president, was born in Hawaii", an OIE system should be able to recognize a proposition ("Obama", "is", "former US president"). ClausIE introduced the connection between grammatical clauses, propositions, and OIE extractions. The authors stated that as each grammatical clause expresses a proposition, each verb mediated proposition can be identified by solely recognizing the set of clauses expressed in each sentence. This implies that to correctly recognize the set of propositions in an input sentence, it is necessary to understand its grammatical structure. The authors studied the case in the English language that only admits seven clause types, meaning that the identification of each proposition only requires defining seven grammatical patterns. The finding also established a separation between the recognition of the propositions and its materialization. In a first step, the proposition can be identified without any consideration of its final form, in a domain-independent and unsupervised way, mostly based on linguistic principles. In a second step, the information can be represented according to the requirements of the underlying application, without conditioning the identification phase. Consider the sentence "Albert Einstein was born in Ulm and died in Princeton". The first step will recognize the two propositions ("Albert Einstein", "was born", "in Ulm") and ("Albert Einstein", "died", "in Princeton"). Once the information has been correctly identified, the propositions can take the particular form required by the underlying application [e.g., ("Albert Einstein", "was born in", "Ulm") and ("Albert Einstein", "died in", "Princeton")]. CSD introduced the idea of minimality in OIE. It considers that computers can make better use of the extractions if they are expressed in a compact way. This is especially important in sentences with subordinate clauses. In these cases, CSD suggests the generation of nested extractions. For example, consider the sentence "The Embassy said that 6,700 Americans were in Pakistan". CSD generates two extractions [i] ("6,700 Americans", "were", "in Pakistan") and [ii] ("The Embassy", "said", "that [i]"). This is usually known as reification.

    Read more →
  • Relational data mining

    Relational data mining

    Relational data mining is the data mining technique for relational databases. Unlike traditional data mining algorithms, which look for patterns in a single table (propositional patterns), relational data mining algorithms look for patterns among multiple tables (relational patterns). For most types of propositional patterns, there are corresponding relational patterns. For example, there are relational classification rules (relational classification), relational regression tree, and relational association rules. There are several approaches to relational data mining: Inductive Logic Programming (ILP) Statistical Relational Learning (SRL) Graph Mining Propositionalization Multi-view learning == Algorithms == Multi-Relation Association Rules: Multi-Relation Association Rules (MRAR) is a new class of association rules which in contrast to primitive, simple and even multi-relational association rules (that are usually extracted from multi-relational databases), each rule item consists of one entity but several relations. These relations indicate indirect relationship between the entities. Consider the following MRAR where the first item consists of three relations live in, nearby and humid: “Those who live in a place which is near by a city with humid climate type and also are younger than 20 -> their health condition is good”. Such association rules are extractable from RDBMS data or semantic web data. == Software == Safarii: a Data Mining environment for analysing large relational databases based on a multi-relational data mining engine. Dataconda: a software, free for research and teaching purposes, that helps mining relational databases without the use of SQL. == Datasets == Relational dataset repository: a collection of publicly available relational datasets.

    Read more →
  • Eigenmoments

    Eigenmoments

    EigenMoments is a set of orthogonal, noise robust, invariant to rotation, scaling and translation and distribution sensitive moments. Their application can be found in signal processing and computer vision as descriptors of the signal or image. The descriptors can later be used for classification purposes. It is obtained by performing orthogonalization, via eigen analysis on geometric moments. == Framework summary == EigenMoments are computed by performing eigen analysis on the moment space of an image by maximizing signal-to-noise ratio in the feature space in form of Rayleigh quotient. This approach has several benefits in Image processing applications: Dependency of moments in the moment space on the distribution of the images being transformed, ensures decorrelation of the final feature space after eigen analysis on the moment space. The ability of EigenMoments to take into account distribution of the image makes it more versatile and adaptable for different genres. Generated moment kernels are orthogonal and therefore analysis on the moment space becomes easier. Transformation with orthogonal moment kernels into moment space is analogous to projection of the image onto a number of orthogonal axes. Nosiy components can be removed. This makes EigenMoments robust for classification applications. Optimal information compaction can be obtained and therefore a few number of moments are needed to characterize the images. == Problem formulation == Assume that a signal vector s ∈ R n {\displaystyle s\in {\mathcal {R}}^{n}} is taken from a certain distribution having correlation C ∈ R n × n {\displaystyle C\in {\mathcal {R}}^{n\times n}} , i.e. C = E [ s s T ] {\displaystyle C=E[ss^{T}]} where E[.] denotes expected value. Dimension of signal space, n, is often too large to be useful for practical application such as pattern classification, we need to transform the signal space into a space with lower dimensionality. This is performed by a two-step linear transformation: q = W T X T s , {\displaystyle q=W^{T}X^{T}s,} where q = [ q 1 , . . . , q n ] T ∈ R k {\displaystyle q=[q_{1},...,q_{n}]^{T}\in {\mathcal {R}}^{k}} is the transformed signal, X = [ x 1 , . . . , x n ] T ∈ R n × m {\displaystyle X=[x_{1},...,x_{n}]^{T}\in {\mathcal {R}}^{n\times m}} a fixed transformation matrix which transforms the signal into the moment space, and W = [ w 1 , . . . , w n ] T ∈ R m × k {\displaystyle W=[w_{1},...,w_{n}]^{T}\in {\mathcal {R}}^{m\times k}} the transformation matrix which we are going to determine by maximizing the SNR of the feature space resided by q {\displaystyle q} . For the case of Geometric Moments, X would be the monomials. If m = k = n {\displaystyle m=k=n} , a full rank transformation would result, however usually we have m ≤ n {\displaystyle m\leq n} and k ≤ m {\displaystyle k\leq m} . This is specially the case when n {\displaystyle n} is of high dimensions. Finding W {\displaystyle W} that maximizes the SNR of the feature space: S N R t r a n s f o r m = w T X T C X w w T X T N X w , {\displaystyle SNR_{transform}={\frac {w^{T}X^{T}CXw}{w^{T}X^{T}NXw}},} where N is the correlation matrix of the noise signal. The problem can thus be formulated as w 1 , . . . , w k = a r g m a x w w T X T C X w w T X T N X w {\displaystyle {w_{1},...,w_{k}}=argmax_{w}{\frac {w^{T}X^{T}CXw}{w^{T}X^{T}NXw}}} subject to constraints: w i T X T N X w j = δ i j , {\displaystyle w_{i}^{T}X^{T}NXw_{j}=\delta _{ij},} where δ i j {\displaystyle \delta _{ij}} is the Kronecker delta. It can be observed that this maximization is Rayleigh quotient by letting A = X T C X {\displaystyle A=X^{T}CX} and B = X T N X {\displaystyle B=X^{T}NX} and therefore can be written as: w 1 , . . . , w k = a r g m a x x w T A w w T B w {\displaystyle {w_{1},...,w_{k}}={\underset {x}{\operatorname {arg\,max} }}{\frac {w^{T}Aw}{w^{T}Bw}}} , w i T B w j = δ i j {\displaystyle w_{i}^{T}Bw_{j}=\delta _{ij}} === Rayleigh quotient === Optimization of Rayleigh quotient has the form: max w R ( w ) = max w w T A w w T B w {\displaystyle \max _{w}R(w)=\max _{w}{\frac {w^{T}Aw}{w^{T}Bw}}} and A {\displaystyle A} and B {\displaystyle B} , both are symmetric and B {\displaystyle B} is positive definite and therefore invertible. Scaling w {\displaystyle w} does not change the value of the object function and hence and additional scalar constraint w T B w = 1 {\displaystyle w^{T}Bw=1} can be imposed on w {\displaystyle w} and no solution would be lost when the objective function is optimized. This constraint optimization problem can be solved using Lagrangian multiplier: max w w T A w {\displaystyle \max _{w}{w^{T}Aw}} subject to w T B w = 1 {\displaystyle {w^{T}Bw}=1} max w L ( w ) = max w ( w T A w − λ w T B w ) {\displaystyle \max _{w}{\mathcal {L}}(w)=\max _{w}(w{T}Aw-\lambda w^{T}Bw)} equating first derivative to zero and we will have: A w = λ B w {\displaystyle Aw=\lambda Bw} which is an instance of Generalized Eigenvalue Problem (GEP). The GEP has the form: A w = λ B w {\displaystyle Aw=\lambda Bw} for any pair ( w , λ ) {\displaystyle (w,\lambda )} that is a solution to above equation, w {\displaystyle w} is called a generalized eigenvector and λ {\displaystyle \lambda } is called a generalized eigenvalue. Finding w {\displaystyle w} and λ {\displaystyle \lambda } that satisfies this equations would produce the result which optimizes Rayleigh quotient. One way of maximizing Rayleigh quotient is through solving the Generalized Eigen Problem. Dimension reduction can be performed by simply choosing the first components w i {\displaystyle w_{i}} , i = 1 , . . . , k {\displaystyle i=1,...,k} , with the highest values for R ( w ) {\displaystyle R(w)} out of the m {\displaystyle m} components, and discard the rest. Interpretation of this transformation is rotating and scaling the moment space, transforming it into a feature space with maximized SNR and therefore, the first k {\displaystyle k} components are the components with highest k {\displaystyle k} SNR values. The other method to look at this solution is to use the concept of simultaneous diagonalization instead of Generalized Eigen Problem. === Simultaneous diagonalization === Let A = X T C X {\displaystyle A=X^{T}CX} and B = X T N X {\displaystyle B=X^{T}NX} as mentioned earlier. We can write W {\displaystyle W} as two separate transformation matrices: W = W 1 W 2 . {\displaystyle W=W_{1}W_{2}.} W 1 {\displaystyle W_{1}} can be found by first diagonalize B: P T B P = D B {\displaystyle P^{T}BP=D_{B}} . Where D B {\displaystyle D_{B}} is a diagonal matrix sorted in increasing order. Since B {\displaystyle B} is positive definite, thus D B > 0 {\displaystyle D_{B}>0} . We can discard those eigenvalues that large and retain those close to 0, since this means the energy of the noise is close to 0 in this space, at this stage it is also possible to discard those eigenvectors that have large eigenvalues. Let P ^ {\displaystyle {\hat {P}}} be the first k {\displaystyle k} columns of P {\displaystyle P} , now P T ^ B P ^ = D B ^ {\displaystyle {\hat {P^{T}}}B{\hat {P}}={\hat {D_{B}}}} where D B ^ {\displaystyle {\hat {D_{B}}}} is the k × k {\displaystyle k\times k} principal submatrix of D B {\displaystyle D_{B}} . Let W 1 = P ^ D B ^ − 1 / 2 {\displaystyle W_{1}={\hat {P}}{\hat {D_{B}}}^{-1/2}} and hence: W 1 T B W 1 = ( P ^ D B ^ − 1 / 2 ) T B ( P ^ D B ^ − 1 / 2 ) = I {\displaystyle W_{1}^{T}BW_{1}=({\hat {P}}{\hat {D_{B}}}^{-1/2})^{T}B({\hat {P}}{\hat {D_{B}}}^{-1/2})=I} . W 1 {\displaystyle W_{1}} whiten B {\displaystyle B} and reduces the dimensionality from m {\displaystyle m} to k {\displaystyle k} . The transformed space resided by q ′ = W 1 T X T s {\displaystyle q'=W_{1}^{T}X^{T}s} is called the noise space. Then, we diagonalize W 1 T A W 1 {\displaystyle W_{1}^{T}AW_{1}} : W 2 T W 1 T A W 1 W 2 = D A {\displaystyle W_{2}^{T}W_{1}^{T}AW_{1}W_{2}=D_{A}} , where W 2 T W 2 = I {\displaystyle W_{2}^{T}W_{2}=I} . D A {\displaystyle D_{A}} is the matrix with eigenvalues of W 1 T A W 1 {\displaystyle W_{1}^{T}AW_{1}} on its diagonal. We may retain all the eigenvalues and their corresponding eigenvectors since most of the noise are already discarded in previous step. Finally the transformation is given by: W = W 1 W 2 {\displaystyle W=W_{1}W_{2}} where W {\displaystyle W} diagonalizes both the numerator and denominator of the SNR, W T A W = D A {\displaystyle W^{T}AW=D_{A}} , W T B W = I {\displaystyle W^{T}BW=I} and the transformation of signal s {\displaystyle s} is defined as q = W T X T s = W 2 T W 1 T X T s {\displaystyle q=W^{T}X^{T}s=W_{2}^{T}W_{1}^{T}X^{T}s} . === Information loss === To find the information loss when we discard some of the eigenvalues and eigenvectors we can perform following analysis: η = 1 − t r a c e ( W 1 T A W 1 ) t r a c e ( D B − 1 / 2 P T A P D B − 1 / 2 ) = 1 − t r a c e ( D B ^ − 1 / 2 P ^ T A P ^ D B ^ − 1 / 2 ) t r a c e ( D B − 1 / 2 P T A P D B − 1 / 2 ) {\displaystyle {\begin{array}{lll}\eta &=&

    Read more →
  • Semantic interpretation

    Semantic interpretation

    Semantic interpretation is an important component in dialog systems. It is related to natural language understanding, but mostly it refers to the last stage of understanding. The goal of interpretation is binding the user utterance to concept, or something the system can understand. Typically it is creating a database query based on user utterance.

    Read more →
  • Deep Learning Indaba

    Deep Learning Indaba

    The Deep Learning Indaba is an annual conference and educational event that aims to strengthen machine learning and artificial intelligence (AI) capacity across Africa. Launched in 2017, it brings together students, researchers, industry practitioners, and policymakers from across the African continent. == History == The Deep Learning Indaba began in 2017 at the University of the Witwatersrand with over 300 participants from 23 African countries, offering tutorials in advanced AI topics and featuring notable speakers like Nando de Freitas. In 2018, it expanded to 650 delegates at Stellenbosch University, introducing parallel sessions to encourage collaboration. The 2019 edition in Nairobi, Kenya, reflected further growth, with increasing sponsorship and support from major tech companies like Google and Microsoft. === Deep Learning IndabaX ===

    Read more →
  • You.com

    You.com

    You.com is an artificial intelligence search startup that has pivoted away from consumer search engine operations toward business-focused AI tools and APIs. The company was founded in 2020 by Richard Socher, the former chief scientist at Salesforce, and Bryan McCann, a former NLP researcher at Salesforce. == History == Following its 2020 founding, You.com opened its public beta on November 9, 2021, and received $20 million in funding led by Salesforce founder and CEO Marc Benioff. Other investors include Breyer Capital, Sound Ventures, and Day One Ventures. The domain You.com was initially purchased in 1996 by Benioff. Benioff invested in You.com and transferred ownership of the You.com domain name to the company. In July 2022, You.com announced its $25 million Series A funding round led by Radical Ventures with participation from Time Ventures, Breyer Capital, Norwest Venture Partners and Day One Ventures. In September 2024, You.com raised $50 million in Series B funding led by Georgian. In September 2025, You.com raised $100 million in Series C funding led by Cox Enterprises at a $1.5 billion valuation, achieving unicorn status. == Business model == You.com generates revenue primarily through enterprise sales of search APIs and AI tools. The platform provides web search capabilities that can be integrated into enterprise applications and AI agents. == Features == On December 23, 2022, You.com was the first search engine to launch an LLM chatbot with live web results alongside its responses. Initially known as YouChat, the chatbot was primarily based on the GPT-3.5 large language model and could answer questions, suggest ideas, translate text, summarize articles, compose emails, and write code snippets, while staying up-to-date with current events and citing sources. Several further versions of YouChat were released. The second version, called YouChat 2.0, was released on February 7, 2023, incorporated improved conversational AI and community-built applications by blending a large language model named C-A-L (Chat, Apps, and Links). This update enabled YouChat to provide results in various formats, such as charts, photos, videos, tables, graphs, text or code, so users can find answers without leaving the search results page. YouChat 3.0, unveiled on May 4, 2023, combined chat functionality with results from Reddit, TikTok, Stack Overflow and Wikipedia. === YouPro === On June 21, 2023, You.com introduced YouPro, a paid subscription. Both free and paid versions provide access to large language models connected to the internet with citation capabilities. === ARI === In February 2025, You.com launched ARI (Advanced Research and Insights), a deep research agent that scans over 400 sources simultaneously to produce research reports with verified citations and interactive graphs, charts, and visualizations. The platform targets regulated industries where comprehensive source verification is critical, with customers including healthcare publishers and advisory firms. == Reception == You.com was named one of TIME's Best Inventions of 2022. You.com's ARI (Advanced Research & Insights) feature was named one of TIME's Best Inventions of 2025.

    Read more →
  • Rule-based machine translation

    Rule-based machine translation

    Rule-based machine translation (RBMT) is a classical approach of machine translation systems based on linguistic information about source and target languages. Such information is retrieved from (unilingual, bilingual or multilingual) dictionaries and grammars covering the main semantic, morphological, and syntactic regularities of each language. Having input sentences, an RBMT system generates output sentences on the basis of analysis of both the source and the target languages involved. RBMT has been progressively superseded by more efficient methods, particularly neural machine translation. == History == The first RBMT systems were developed in the early 1970s. The most important steps of this evolution were the emergence of the following RBMT systems: Systran Japanese MT systems Today, other common RBMT systems include: Apertium GramTrans == Types of RBMT == There are three different types of rule-based machine translation systems: Direct Systems (Dictionary Based Machine Translation) map input to output with basic rules. Transfer RBMT Systems (Transfer Based Machine Translation) employ morphological and syntactical analysis. Interlingual RBMT Systems (Interlingua) use an abstract meaning. RBMT systems can also be characterized as the systems opposite to Example-based Systems of Machine Translation (Example Based Machine Translation), whereas Hybrid Machine Translations Systems make use of many principles derived from RBMT. == Basic principles == The main approach of RBMT systems is based on linking the structure of the given input sentence with the structure of the demanded output sentence, necessarily preserving their unique meaning. The following example can illustrate the general frame of RBMT: A girl eats an apple. Source Language = English; Demanded Target Language = German Minimally, to get a German translation of this English sentence one needs: A dictionary that will map each English word to an appropriate German word. Rules representing regular English sentence structure. Rules representing regular German sentence structure. And finally, we need rules according to which one can relate these two structures together. Accordingly, we can state the following stages of translation: 1st: getting basic part-of-speech information of each source word: a = indef.article; girl = noun; eats = verb; an = indef.article; apple = noun 2nd: getting syntactic information about the verb "to eat": NP-eat-NP; here: eat – Present Simple, 3rd Person Singular, Active Voice 3rd: parsing the source sentence: (NP an apple) = the object of eat Often only partial parsing is sufficient to get to the syntactic structure of the source sentence and to map it onto the structure of the target sentence. 4th: translate English words into German a (category = indef.article) => ein (category = indef.article) girl (category = noun) => Mädchen (category = noun) eat (category = verb) => essen (category = verb) an (category = indef. article) => ein (category = indef.article) apple (category = noun) => Apfel (category = noun) 5th: Mapping dictionary entries into appropriate inflected forms (final generation): A girl eats an apple. => Ein Mädchen isst einen Apfel. == Ontologies == An ontology is a formal representation of knowledge that includes the concepts (such as objects, processes etc.) in a domain and some relations between them. If the stored information is of linguistic nature, one can speak of a lexicon. In NLP, ontologies can be used as a source of knowledge for machine translation systems. With access to a large knowledge base, rule-based systems can be enabled to resolve many (especially lexical) ambiguities on their own. In the following classic examples, as humans, we are able to interpret the prepositional phrase according to the context because we use our world knowledge, stored in our lexicons:I saw a man/star/molecule with a microscope/telescope/binoculars.Since the syntax does not change, a traditional rule-based machine translation system may not be able to differentiate between the meanings. With a large enough ontology as a source of knowledge however, the possible interpretations of ambiguous words in a specific context can be reduced. === Building ontologies === The ontology generated for the PANGLOSS knowledge-based machine translation system in 1993 may serve as an example of how an ontology for NLP purposes can be compiled: A large-scale ontology is necessary to help parsing in the active modules of the machine translation system. In the PANGLOSS example, about 50,000 nodes were intended to be subsumed under the smaller, manually-built upper (abstract) region of the ontology. Because of its size, it had to be created automatically. The goal was to merge the two resources LDOCE online and WordNet to combine the benefits of both: concise definitions from Longman, and semantic relations allowing for semi-automatic taxonomization to the ontology from WordNet. A definition match algorithm was created to automatically merge the correct meanings of ambiguous words between the two online resources, based on the words that the definitions of those meanings have in common in LDOCE and WordNet. Using a similarity matrix, the algorithm delivered matches between meanings including a confidence factor. This algorithm alone, however, did not match all meanings correctly on its own. A second hierarchy match algorithm was therefore created which uses the taxonomic hierarchies found in WordNet (deep hierarchies) and partially in LDOCE (flat hierarchies). This works by first matching unambiguous meanings, then limiting the search space to only the respective ancestors and descendants of those matched meanings. Thus, the algorithm matched locally unambiguous meanings (for instance, while the word seal as such is ambiguous, there is only one meaning of seal in the animal subhierarchy). Both algorithms complemented each other and helped constructing a large-scale ontology for the machine translation system. The WordNet hierarchies, coupled with the matching definitions of LDOCE, were subordinated to the ontology's upper region. As a result, the PANGLOSS MT system was able to make use of this knowledge base, mainly in its generation element. == Components == The RBMT system contains: a SL morphological analyser - analyses a source language word and provides the morphological information; a SL parser - is a syntax analyser which analyses source language sentences; a translator - used to translate a source language word into the target language; a TL morphological generator - works as a generator of appropriate target language words for the given grammatica information; a TL parser - works as a composer of suitable target language sentences; Several dictionaries - more specifically a minimum of three dictionaries: a SL dictionary - needed by the source language morphological analyser for morphological analysis, a bilingual dictionary - used by the translator to translate source language words into target language words, a TL dictionary - needed by the target language morphological generator to generate target language words. The RBMT system makes use of the following: a Source Grammar for the input language which builds syntactic constructions from input sentences; a Source Lexicon which captures all of the allowable vocabulary in the domain; Source Mapping Rules which indicate how syntactic heads and grammatical functions in the source language are mapped onto domain concepts and semantic roles in the interlingua; a Domain Model/Ontology which defines the classes of domain concepts and restricts the fillers of semantic roles for each class; Target Mapping Rules which indicate how domain concepts and semantic roles in the interlingua are mapped onto syntactic heads and grammatical functions in the target language; a Target Lexicon which contains appropriate target lexemes for each domain concept; a Target Grammar for the target language which realizes target syntactic constructions as linearized output sentences. == Advantages == No bilingual texts are required. This makes it possible to create translation systems for languages that have no texts in common, or even no digitized data whatsoever. Domain independent. Rules are usually written in a domain independent manner, so the vast majority of rules will "just work" in every domain, and only a few specific cases per domain may need rules written for them. No quality ceiling. Every error can be corrected with a targeted rule, even if the trigger case is extremely rare. This is in contrast to statistical systems where infrequent forms will be washed away by default. Total control. Because all rules are hand-written, you can easily debug a rule-based system to see exactly where a given error enters the system, and why. Reusability. Because RBMT systems are generally built from a strong source language analysis that is fed to a transfer step and target language generator, the source language analysis and targe

    Read more →
  • Multi-agent reinforcement learning

    Multi-agent reinforcement learning

    Multi-agent reinforcement learning (MARL) is a sub-field of reinforcement learning. It focuses on studying the behavior of multiple learning agents that coexist in a shared environment. Each agent is motivated by its own rewards, and does actions to advance its own interests; in some environments these interests are opposed to the interests of other agents, resulting in complex group dynamics. Multi-agent reinforcement learning is closely related to game theory and especially repeated games, as well as multi-agent systems. Its study combines the pursuit of finding ideal algorithms that maximize rewards with a more sociological set of concepts. While research in single-agent reinforcement learning is concerned with finding the algorithm that gets the biggest number of points for one agent, research in multi-agent reinforcement learning evaluates and quantifies social metrics, such as cooperation, reciprocity, equity, social influence, language and discrimination. == Definition == Similarly to single-agent reinforcement learning, multi-agent reinforcement learning is modeled as some form of a Markov decision process (MDP). Fix a set of agents I = { 1 , . . . , N } {\displaystyle I=\{1,...,N\}} . We then define: A set S {\displaystyle S} of environment states. One set A i {\displaystyle {\mathcal {A}}_{i}} of actions for each of the agents i ∈ I = { 1 , … , N } {\displaystyle i\in I=\{1,\dots ,N\}} . P a → ( s , s ′ ) = Pr ( s t + 1 = s ′ ∣ s t = s , a → t = a → ) {\displaystyle P_{\vec {a}}(s,s')=\Pr(s_{t+1}=s'\mid s_{t}=s,{\vec {a}}_{t}={\vec {a}})} is the probability of transition (at time t {\displaystyle t} ) from state s {\displaystyle s} to state s ′ {\displaystyle s'} under joint action a → {\displaystyle {\vec {a}}} . R → a → ( s , s ′ ) {\displaystyle {\vec {R}}_{\vec {a}}(s,s')} is the immediate joint reward after the transition from s {\displaystyle s} to s ′ {\displaystyle s'} with joint action a → {\displaystyle {\vec {a}}} . In settings with perfect information, such as the games of chess and Go, the MDP would be fully observable. In settings with imperfect information, especially in real-world applications like self-driving cars, each agent would access an observation that only has part of the information about the current state. In the partially observable setting, the core model is the partially observable stochastic game in the general case, and the decentralized POMDP in the cooperative case. == Cooperation vs. competition == When multiple agents are acting in a shared environment their interests might be aligned or misaligned. MARL allows exploring all the different alignments and how they affect the agents' behavior: In pure competition settings, the agents' rewards are exactly opposite to each other, and therefore they are playing against each other. Pure cooperation settings are the other extreme, in which agents get the exact same rewards, and therefore they are playing with each other. Mixed-sum settings cover all the games that combine elements of both cooperation and competition. === Pure competition settings === When two agents are playing a zero-sum game, they are in pure competition with each other. Many traditional games such as chess and Go fall under this category, as do two-player variants of video games like StarCraft. Because each agent can only win at the expense of the other agent, many complexities are stripped away. There is no prospect of communication or social dilemmas, as neither agent is incentivized to take actions that benefit its opponent. The Deep Blue and AlphaGo projects demonstrate how to optimize the performance of agents in pure competition settings. One complexity that is not stripped away in pure competition settings is autocurricula. As the agents' policy is improved using self-play, multiple layers of learning may occur. === Pure cooperation settings === MARL is used to explore how separate agents with identical interests can communicate and work together. Pure cooperation settings are explored in recreational cooperative games such as Overcooked, as well as real-world scenarios in robotics. In pure cooperation settings all the agents get identical rewards, which means that social dilemmas do not occur. In pure cooperation settings, oftentimes there are an arbitrary number of coordination strategies, and agents converge to specific "conventions" when coordinating with each other. The notion of conventions has been studied in language and also alluded to in more general multi-agent collaborative tasks. === Mixed-sum settings === Most real-world scenarios involving multiple agents have elements of both cooperation and competition. For example, when multiple self-driving cars are planning their respective paths, each of them has interests that are diverging but not exclusive: Each car is minimizing the amount of time it's taking to reach its destination, but all cars have the shared interest of avoiding a traffic collision. Zero-sum settings with three or more agents often exhibit similar properties to mixed-sum settings, since each pair of agents might have a non-zero utility sum between them. Mixed-sum settings can be explored using classic matrix games such as prisoner's dilemma, more complex sequential social dilemmas, and recreational games such as Among Us, Diplomacy and StarCraft II. Mixed-sum settings can give rise to communication and social dilemmas. == Social dilemmas == As in game theory, much of the research in MARL revolves around social dilemmas, such as prisoner's dilemma, chicken and stag hunt. While game theory research might focus on Nash equilibria and what an ideal policy for an agent would be, MARL research focuses on how the agents would learn these ideal policies using a trial-and-error process. The reinforcement learning algorithms that are used to train the agents are maximizing the agent's own reward; the conflict between the needs of the agents and the needs of the group is a subject of active research. Various techniques have been explored in order to induce cooperation in agents: Modifying the environment rules, adding intrinsic rewards, and more. === Sequential social dilemmas === Social dilemmas like prisoner's dilemma, chicken and stag hunt are "matrix games". Each agent takes only one action from a choice of two possible actions, and a simple 2x2 matrix is used to describe the reward that each agent will get, given the actions that each agent took. In humans and other living creatures, social dilemmas tend to be more complex. Agents take multiple actions over time, and the distinction between cooperating and defecting is not as clear cut as in matrix games. The concept of a sequential social dilemma (SSD) was introduced in 2017 as an attempt to model that complexity. There is ongoing research into defining different kinds of SSDs and showing cooperative behavior in the agents that act in them. == Autocurricula == An autocurriculum (plural: autocurricula) is a reinforcement learning concept that's salient in multi-agent experiments. As agents improve their performance, they change their environment; this change in the environment affects themselves and the other agents. The feedback loop results in several distinct phases of learning, each depending on the previous one. The stacked layers of learning are called an autocurriculum. Autocurricula are especially apparent in adversarial settings, where each group of agents is racing to counter the current strategy of the opposing group. The Hide and Seek game is an accessible example of an autocurriculum occurring in an adversarial setting. In this experiment, a team of seekers is competing against a team of hiders. Whenever one of the teams learns a new strategy, the opposing team adapts its strategy to give the best possible counter. When the hiders learn to use boxes to build a shelter, the seekers respond by learning to use a ramp to break into that shelter. The hiders respond by locking the ramps, making them unavailable for the seekers to use. The seekers then respond by "box surfing", exploiting a glitch in the game to penetrate the shelter. Each "level" of learning is an emergent phenomenon, with the previous level as its premise. This results in a stack of behaviors, each dependent on its predecessor. Autocurricula in reinforcement learning experiments are compared to the stages of the evolution of life on Earth and the development of human culture. A major stage in evolution happened 2-3 billion years ago, when photosynthesizing life forms started to produce massive amounts of oxygen, changing the balance of gases in the atmosphere. In the next stages of evolution, oxygen-breathing life forms evolved, eventually leading up to land mammals and human beings. These later stages could only happen after the photosynthesis stage made oxygen widely available. Similarly, human culture could not have gone through the Industrial Revolution in the 18th century without the resources and insights gaine

    Read more →