AI Detector No Character Limit


  • Statistical relational learning

    Statistical relational learning (SRL) is a subdiscipline of artificial intelligence and machine learning that is concerned with domain models that exhibit both uncertainty (which can be dealt with using statistical methods) and complex, relational structure. Typically, the knowledge representation formalisms developed in SRL use (a subset of) first-order logic to describe relational properties of a domain in a general manner (universal quantification) and draw upon probabilistic graphical models (such as Bayesian networks or Markov networks) to model the uncertainty; some also build upon the methods of inductive logic programming. Significant contributions to the field have been made since the late 1990s. As is evident from the characterization above, the field is not strictly limited to learning aspects; it is equally concerned with reasoning (specifically probabilistic inference) and knowledge representation. Therefore, alternative terms that reflect the main foci of the field include statistical relational learning and reasoning (emphasizing the importance of reasoning) and first-order probabilistic languages (emphasizing the key properties of the languages with which models are represented). Another term that is sometimes used in the literature is relational machine learning (RML).

    == Canonical tasks ==

    A number of canonical tasks are associated with statistical relational learning, the most common ones being:

      • collective classification, i.e. the (simultaneous) prediction of the class of several objects given the objects' attributes and their relations
      • link prediction, i.e. predicting whether or not two or more objects are related
      • link-based clustering, i.e. the grouping of similar objects, where similarity is determined according to the links of an object, and the related task of collaborative filtering, i.e. the filtering for information that is relevant to an entity (where a piece of information is considered relevant to an entity if it is known to be relevant to a similar entity)
      • social network modelling
      • object identification/entity resolution/record linkage, i.e. the identification of equivalent entries in two or more separate databases/datasets

    == Representation formalisms ==

    One of the fundamental design goals of the representation formalisms developed in SRL is to abstract away from concrete entities and to represent instead general principles that are intended to be universally applicable. Since there are countless ways in which such principles can be represented, many representation formalisms have been proposed in recent years. In the following, some of the more common ones are listed in alphabetical order:

      • Bayesian logic program
      • BLOG model
      • Markov logic networks
      • Multi-entity Bayesian network
      • Probabilistic logic programs
      • Probabilistic relational model – a Probabilistic Relational Model (PRM) is the counterpart of a Bayesian network in statistical relational learning.
      • Probabilistic soft logic
      • Recursive random field
      • Relational Bayesian network
      • Relational dependency network
      • Relational Markov network
      • Relational Kalman filtering
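To make the link prediction task from the canonical tasks concrete, here is a minimal sketch that ranks currently unlinked pairs in a tiny, made-up "knows" graph by the Jaccard similarity of their neighbour sets. This is a simple structural heuristic chosen for brevity, not one of the SRL formalisms listed above; the graph `EDGES` and the helpers `neighbours`, `jaccard_score` and `rank_candidate_links` are hypothetical names introduced only for illustration.

```python
from itertools import combinations

# Hypothetical toy relation: undirected "knows" links between people.
EDGES = {
    ("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
    ("bob", "dan"), ("cat", "dan"), ("dan", "eve"),
}


def neighbours(edges):
    """Build an adjacency map {node: set of linked nodes} from undirected edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def jaccard_score(adj, u, v):
    """Shared neighbours divided by all neighbours of u and v (0.0 if none)."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def rank_candidate_links(edges):
    """Score every unlinked pair and return (score, u, v) tuples, best first."""
    adj = neighbours(edges)
    linked = {frozenset(e) for e in edges}
    scores = [
        (jaccard_score(adj, u, v), u, v)
        for u, v in combinations(sorted(adj), 2)
        if frozenset((u, v)) not in linked
    ]
    return sorted(scores, key=lambda t: (-t[0], t[1], t[2]))


if __name__ == "__main__":
    for score, u, v in rank_candidate_links(EDGES):
        print(f"{u} - {v}: {score:.2f}")
```

A statistical relational model would instead learn such regularities (for example "people with many shared acquaintances tend to know each other") together with their uncertainty; the heuristic above only hard-codes one of them.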

  • Computational theory of mind

    In philosophy of mind, the computational theory of mind (CTM), also known as computationalism, is a family of views that hold that the human mind is an information processing system and that cognition and consciousness together are a form of computation. It is closely related to functionalism, a broader theory that defines mental states by what they do rather than what they are made of.

    == History ==

    Warren McCulloch and Walter Pitts (1943) were the first to suggest that neural activity is computational. They argued that neural computations explain cognition. A version of the theory was put forward by Peter Putnam and Robert W. Fuller in 1964. The theory was proposed in its modern form by Hilary Putnam in 1960 and 1961, aided by his then PhD student, philosopher and cognitive scientist Jerry Fodor, who continued the research as a post-doc in the 1960s, 1970s, and 1980s. It was later criticized by Putnam himself, John Searle, and others.

    == Classical computational theory of mind ==

    The CTM holds that the human mind is a computational system that is realized (i.e., physically implemented) by neural activity in the brain. The theory can be elaborated in many ways and varies largely based on how the term computation is understood. In classical computational theory of mind (CCTM), computation is modeled in terms of Turing machines, which manipulate symbols according to a rule, in combination with the internal state of the machine. A Turing machine is an abstract machine with unlimited time and storage. CCTM does not pretend that the mind looks like a Turing machine, but instead uses Turing machines as a formalism. Alan Turing argued that any symbolic algorithm executed by a human brain can in theory be replicated on a Turing machine.

    The critical aspect of such a computational model is that it allows one to abstract away from particular physical details of the machine that is implementing the computation. For example, the appropriate computation could be implemented either by silicon chips or biological neural networks, so long as there is a series of outputs based on manipulations of inputs and internal states, performed according to a rule.

    Computational theories of mind are often said to require mental representation because 'input' into a computation comes in the form of symbols or representations of other objects. A computer cannot compute an actual object but must interpret and represent the object in some form and then compute the representation. Unlike CTM, the representational theory of mind shifts the focus to the symbols being manipulated. This approach better accounts for systematicity and productivity. In Fodor's view, the mind is a computational system that processes the language of thought.

    == Variants ==

    Connectionist computationalism models the mind as a neural network. Steven Pinker and Alan Prince distinguish two types of connectionists: eliminative and implementationist. Eliminative connectionists generally reject classical CTMs and the idea of a structured, symbolic mind, whereas implementationists view neural networks and Turing machines as two potentially complementary levels of analysis. It is indeed possible in theory to implement a neural network in a Turing machine, or a Turing machine in a neural network.

    Building from the tradition of McCulloch and Pitts, the computational theory of cognition (CTC) states that neural computations explain cognition. The computational theory of mind asserts that not only cognition, but also phenomenal consciousness or qualia, are computational. That is to say, CTM entails CTC. While phenomenal consciousness could fulfill some other functional role, the computational theory of cognition leaves open the possibility that some aspects of the mind could be non-computational. CTC, therefore, provides an important explanatory framework for understanding neural networks, while avoiding counter-arguments that center around phenomenal consciousness.

    == "Computer metaphor" ==

    Computational theory of mind is not the same as the computer metaphor, which compares the mind to a modern-day digital computer. While the computer metaphor draws an analogy between the mind as software and the brain as hardware, CTM is the claim that the mind is literally a computational system. "Computational system" is not intended to mean a modern-day electronic computer.

    == Pancomputationalism ==

    CTM raises a question that remains a subject of debate: what does it take for a physical system (such as a mind, or an artificial computer) to perform computations? A very straightforward account is based on a simple mapping between abstract mathematical computations and physical systems: a system performs computation C if and only if there is a mapping between a sequence of states individuated by C and a sequence of states individuated by a physical description of the system.

    Putnam (1988) and Searle (1992) argue that this simple mapping account (SMA) trivializes the empirical import of computational descriptions. As Putnam put it, "everything is a Probabilistic Automaton under some Description". Even rocks, walls, and buckets of water, contrary to appearances, are computing systems. Gualtiero Piccinini identifies different versions of pancomputationalism. Searle wrote:

      "the wall behind my back is right now implementing the WordStar program, because there is some pattern of molecule movements that is isomorphic with the formal structure of WordStar. But if the wall is implementing WordStar, if it is a big enough wall it is implementing any program, including any program implemented in the brain."

    In response to the trivialization criticism, and to restrict SMA, philosophers of mind have offered different accounts of computational systems. These typically include the causal account, the semantic account, the syntactic account, and the mechanistic account. Instead of a semantic restriction, the syntactic account imposes a syntactic restriction. The mechanistic account was first introduced by Gualtiero Piccinini in 2007.

    == Criticism ==

    A range of arguments have been proposed against physicalist conceptions used in computational theories of mind. An early, though indirect, criticism of the computational theory of mind comes from philosopher John Searle. In his thought experiment known as the Chinese room, Searle attempts to refute the claims that artificially intelligent agents can be said to have intentionality and understanding and that these systems, because they can be said to be minds themselves, are sufficient for the study of the human mind. Searle asks us to imagine that there is a man in a room with no way of communicating with anyone or anything outside of the room except for a piece of paper with symbols written on it that is passed under the door. With the paper, the man is to use a series of provided rule books to return paper containing different symbols. Unknown to the man in the room, these symbols are of a Chinese language, and this process generates a conversation that a Chinese speaker outside of the room can actually understand. Searle contends that the man in the room does not understand the Chinese conversation. This was originally written as a repudiation of the idea that computers work like minds.

    Objections like Searle's might be called insufficiency objections. They claim that computational theories of mind fail because computation is insufficient to account for some capacity of the mind. Arguments from qualia, such as Frank Jackson's knowledge argument, can be understood as objections to computational theories of mind in this way, though they take aim at physicalist conceptions of the mind in general, and not computational theories specifically.

    Objections have also been put forth that are directly tailored for computational theories of mind. Jerry Fodor himself argues that the mind is still a very long way from having been explained by the computational theory of mind. The main reason for this shortcoming is that most cognition is abductive and global, hence sensitive to all possibly relevant background beliefs to (dis)confirm a belief. This creates, among other problems, the frame problem for the computational theory, because the relevance of a belief is not one of its local, syntactic properties but context-dependent.

    Putnam himself (see in particular Representation and Reality and the first part of Renewing Philosophy) became a prominent critic of computationalism for a variety of reasons, including ones related to Searle's Chinese room arguments, questions of world-word reference relations, and thoughts about the mind-body problem. Regarding functionalism in particular, Putnam has claimed, along lines similar to but more general than Searle's arguments, that the question of whether the human mind can implement computational states is not relevant to the question of the nature of mind, because "every ordinary open system realizes every abstract finite automaton." Computationalists have responded by aiming to develop criteri
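The classical account above models computation as a Turing machine that rewrites symbols according to a rule table and its internal state. As a rough illustration of that formalism only (not of any claim about minds), the following sketch simulates a one-tape machine. The function `run_turing_machine`, the rule table `INVERT`, the state names and the `max_steps` cut-off are all assumptions made for this example.

```python
def run_turing_machine(rules, tape, state="start", halt="halt",
                       blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "L" or "R". The tape is stored sparsely so it can grow
    in both directions. max_steps is a practical cut-off, not part of the
    abstract machine, which has unlimited time and storage.
    """
    cells = dict(enumerate(tape))
    head = 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1}[move]
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Hypothetical rule table: flip every bit, then halt at the first blank.
INVERT = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

if __name__ == "__main__":
    print(run_turing_machine(INVERT, "1011"))
```

The rule table says nothing about what physically stores the tape or the state, which is the sense in which the formalism abstracts away from the implementing machine.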

  • Parents & Kids Safe AI Coalition

    The Parents & Kids Safe AI Coalition is a political action committee that advocates for child-safety regulation of artificial intelligence. As of April 2026, the group is funded solely by the artificial intelligence company OpenAI, which pledged $10 million to the effort.

    == History ==

    In October 2025, California Gov. Gavin Newsom vetoed Assembly Bill 1064. Sponsored by Common Sense Media, the bill would have introduced stronger child safety protections for AI chatbots. The following month, Common Sense Media founder Jim Steyer filed a ballot initiative intended to restore the "guardrails" lost in the veto. In response, OpenAI introduced a competing initiative.

    In January 2026, Common Sense Media and OpenAI announced that they would be working together on a compromise ballot initiative, the Parents & Kids Safe AI Act. Reporting indicated that initial outreach emails to child safety organizations failed to disclose OpenAI's involvement. Several advocacy groups signed an open letter claiming the initiative would shield AI companies from liability and undermine age verification, among other concerns. After Common Sense Media met with opposing groups in February, the ballot initiative was put on hold and the organizations involved sought to negotiate with the Legislature instead. The Parents & Kids Safe AI Coalition was founded to support this effort.

    In March 2026, the group reached out to some of the same groups contacted earlier, asking them to endorse its list of policy priorities. Again, some organizations reported being unaware of OpenAI's level of involvement. At least two groups withdrew from the coalition after learning about the financial ties. The priorities themselves were described as "vague but fairly uncontroversial" by The San Francisco Standard.

  • Vivid knowledge

    Vivid knowledge refers to a specific kind of knowledge representation. The idea of a vivid knowledge base is that an interpretation can be read off it almost directly: the knowledge base itself determines the interpretation. Thus, any query to such a knowledge base can be reduced to a database-like query.

    == Propositional knowledge base ==

    A propositional knowledge base KB is vivid iff KB is a complete and consistent set of literals (over some vocabulary). Such a knowledge base has the property that it has exactly one interpretation, i.e. the interpretation is unique. A check for entailment of a sentence can simply be broken down into checks of its literals, each of which can be answered by a simple database-like lookup in KB.

    == First-order knowledge base ==

    A first-order knowledge base KB is vivid iff for some finite set of positive function-free ground literals KB+,

      KB = KB+ ∪ Negations ∪ DomainClosure ∪ UniqueNames,

    whereby

      Negations ≔ { ¬p | p is atomic and KB+ ⊭ p },
      DomainClosure ≔ { ∀x: (x = c1) ∨ (x = c2) ∨ ..., where the ci are all the constants in KB+ },
      UniqueNames ≔ { (ci ≠ cj) | ci, cj are distinct constants }.

    All interpretations of a vivid first-order knowledge base are isomorphic.
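The propositional case can be sketched in a few lines. The example below assumes a knowledge base given as a set of `(atom, truth_value)` literals and sentences given as nested tuples such as `("and", "rain", ("not", "cold"))`; this encoding and the helper names `is_vivid` and `entails` are illustrative choices, not taken from any particular system.

```python
def is_vivid(literals, vocabulary):
    """True iff the literal set is consistent and complete over the vocabulary.

    literals is a set of (atom, truth_value) pairs. It is consistent when no
    atom appears with both truth values, and complete when every atom of the
    vocabulary appears.
    """
    atoms = [atom for atom, _ in literals]
    consistent = len(atoms) == len(set(atoms))
    complete = set(atoms) == set(vocabulary)
    return consistent and complete


def entails(literals, sentence):
    """Decide KB |= sentence for a vivid KB by evaluating in its unique model.

    Because a vivid KB has exactly one interpretation, entailment reduces to
    looking up each atom, much like a database query.
    """
    model = dict(literals)

    def holds(s):
        if isinstance(s, str):  # an atom: plain lookup
            return model[s]
        op, *args = s
        if op == "not":
            return not holds(args[0])
        if op == "and":
            return all(holds(a) for a in args)
        if op == "or":
            return any(holds(a) for a in args)
        raise ValueError(f"unknown connective: {op!r}")

    return holds(sentence)


if __name__ == "__main__":
    kb = {("rain", True), ("cold", False)}
    print(is_vivid(kb, {"rain", "cold"}))
    print(entails(kb, ("and", "rain", ("not", "cold"))))
```

No search over alternative models is needed, which is the practical appeal of vividness.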

  • ActivityPub

    ActivityPub is a protocol and open standard for decentralized social networking. It provides a client-to-server (C2S) API for creating and modifying content, as well as a federated server-to-server (S2S) protocol for delivering notifications and content to other servers. ActivityPub is the defining standard of the Fediverse, a decentralized social network spanning various social interaction models and content types, which consists of independently managed instances of software such as Mastodon, Pixelfed and PeerTube, among others.

    ActivityPub is considered to be an update to the ActivityPump protocol used in pump.io, and the official W3C repository for ActivityPub is identified as a fork of ActivityPump. The creation of a new standard for decentralized social networking was prompted by the complexity of OStatus, the most commonly used protocol at the time. OStatus was built using a multitude of technologies (such as Atom, Salmon, WebSub and WebFinger), a product of the infrastructure used in GNU social (the originator and largest user of the OStatus protocol), which made it difficult to implement the protocol in new software. OStatus was also designed to work only with microblogging services, with little flexibility in the types of data that it could hold.

    The standard was first published by the World Wide Web Consortium (W3C) as a W3C Recommendation in January 2018 by the Social Web Working Group (SocialWG), a working group chartered to build the protocols and vocabularies needed to create a standard for social functionality. Shortly after, further development was moved to the Social Web Community Group (SocialCG), the successor to the SocialWG.

    == Design ==

    ActivityPub uses the ActivityStreams 2.0 format for building its content, which itself uses JSON-LD. The three main data types used in ActivityPub are Objects, Activities and Actors. Objects are the most common data type, and can be images, videos, or more abstract items such as locations or events. Activities are actions that create and modify objects; for example, a Create activity creates an object. Actors represent an individual, a group, an application or a service, and are the owners of objects.

    Every actor type contains an inbox stream and an outbox stream, which respectively receive and send activities for a user. In order to publish data (for example, liking an article), a user creates an activity that declares that they liked an Article object and publishes it to their outbox, where it is then delivered by the ActivityPub server via a POST request to the inboxes listed in the activity's to, bto, cc and bcc fields. The receiving servers then account for the newly received activity and update the article by adding the like action to it.

    === Example data ===

    [Examples omitted: an actor object that represents a user account, an activity that likes an article object, and an article object.]

    == Project status ==

    The SocialCG previously organized a yearly free conference called ActivityPub Conf about the future of ActivityPub. Triages are held regularly to review issues pertaining to the ActivityPub and ActivityStreams 2.0 specifications as part of the SocialCG. In 2023, Germany's Sovereign Tech Fund donated €152,000 to socialweb.coop with the goal of building a new suite for testing various ActivityPub implementations and their compliance with the specification.

    === Adoption ===

    The initial wave of adoption for ActivityPub (circa 2016–2018) came from software that was already using OStatus as its federation protocol, such as Mastodon, GNU social and Pleroma. Following the acquisition of Twitter by Elon Musk in 2022, many groups of users that were critical of the acquisition migrated to Mastodon, bringing new attention to the ActivityPub protocol. Various major social media platforms and corporations have since pledged to implement ActivityPub support, including Tumblr, Flipboard and Meta Platforms' Threads. Threads introduced crossposting to ActivityPub in 2024 for users outside of the European Economic Area; however, full two-way compatibility remains incomplete as of 2025.

    == Criticism ==

    === Accidental denial-of-service attacks ===

    Poorly optimized ActivityPub implementations can cause unintentional distributed denial-of-service (DDoS) attacks on other websites and servers, due to the decentralized nature of the network. An example is Mastodon's implementation of OpenGraph link previews, wherein every instance that receives a post containing a link with OpenGraph metadata downloads the associated data, such as a thumbnail, within a very short timeframe, which can slow down or crash servers as a result of the sudden burst of requests.

    === Account migration ===

    ActivityPub has been criticized for not natively supporting moving accounts from one server to another, forcing implementations to build their own solutions. While there has been work on building a standardized system for migrating accounts using the Move activity via the Fediverse Enhancement Proposal organization, the current proposal only allows for basic follower migration, with all other data remaining linked to the original account.

    === Missing content and data ===

    ActivityPub implementations have been criticized for missing replies and parts of reply threads from remote posts, and for presenting outdated statistics (e.g. likes and reposts) about remote posts. However, this is not a problem with the ActivityPub protocol itself, but with implementations not refreshing their content for updated data when needed.

    == Software using ActivityPub ==

    === Future implementations ===

      • Flarum, an internet forum software
      • Forgejo, a Git forge and development platform

    === Uncertain future implementations ===

      • GitLab, a Git forge and development platform, which previously had an open issue discussing the topic; the issue was later closed because the development team moved its focus to other areas.
      • Tumblr, a microblogging platform. Despite previous statements from Automattic CEO Matt Mullenweg, ActivityPub integration has been delayed indefinitely. The integration would have been implemented with its WordPress migration, as the first-party plugin for interoperability would have been used for federation.
      • Flickr, an image and video hosting site.
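Returning to the publishing flow described in the Design section, the sketch below builds a Like activity in ActivityStreams 2.0 JSON and works out who it would be delivered to. It performs no network requests. The actor and article URLs are made-up placeholders, and `make_like` and `prepare_delivery` are hypothetical helper names. The sketch also treats each addressed ID as a delivery target directly, whereas a real server would first resolve each recipient to its inbox URL. Stripping `bto` and `bcc` before sending follows the delivery rules of the ActivityPub specification.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
ADDRESSING_FIELDS = ("to", "bto", "cc", "bcc")


def make_like(actor_id, object_id, to=(), cc=(), bcc=()):
    """Build a Like activity declaring that actor_id liked object_id."""
    activity = {
        "@context": AS_CONTEXT,
        "type": "Like",
        "actor": actor_id,
        "object": object_id,
    }
    for field, value in (("to", to), ("cc", cc), ("bcc", bcc)):
        if value:
            activity[field] = list(value)
    return activity


def prepare_delivery(activity):
    """Return (recipients, json_body) for server-to-server delivery.

    Recipients are collected from to, bto, cc and bcc without duplicates.
    The blind fields bto and bcc are removed from the body that is sent.
    """
    recipients = []
    for field in ADDRESSING_FIELDS:
        for target in activity.get(field, []):
            if target not in recipients:
                recipients.append(target)
    public = {k: v for k, v in activity.items() if k not in ("bto", "bcc")}
    return recipients, json.dumps(public)


if __name__ == "__main__":
    like = make_like(
        "https://social.example/users/alice",      # placeholder actor ID
        "https://blog.example/articles/1",         # placeholder Article ID
        to=["https://blog.example/users/bob"],
        bcc=["https://other.example/users/carol"],
    )
    recipients, body = prepare_delivery(like)
    print(recipients)
    print(body)
```

Each entry in `recipients` would then receive the body in an HTTP POST to its inbox; authentication of that request is outside the scope of this sketch.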

  • Computational Intelligence (journal)

    Computational Intelligence is a peer-reviewed scientific journal covering research on artificial intelligence and computer science. It publishes novel research as well as innovative applications on a broad range of experimental and theoretical topics in AI. With a broad scope, the journal covers machine learning, knowledge mining, web intelligence, AI language, and philosophical implications. The journal was established in 1985 and is published by Wiley-Blackwell. The current editor-in-chief is Diane Inkpen. The quality of the journal as an academic publishing venue is evaluated according to public citation impact metrics. In 2022, the journal's Scopus CiteScore was 5.3, while Clarivate's Web of Science gave it a Journal Citation Indicator of 0.39 and a Journal Impact Factor of 2.8.

  • Qualification problem

    In philosophy and AI (especially knowledge-based systems), the qualification problem is concerned with the impossibility of listing all the preconditions required for a real-world action to have its intended effect. It might be posed as how to deal with the things that prevent me from achieving my intended result. It is strongly connected to, and opposite the ramification side of, the frame problem. John McCarthy gives the following motivating example, in which it is impossible to enumerate all the circumstances that may prevent a robot from performing its ordinary function:

      "[T]he successful use of a boat to cross a river requires, if the boat is a rowboat, that the oars and rowlocks be present and unbroken, and that they fit each other. Many other qualifications can be added, making the rules for using a rowboat almost impossible to apply, and yet anyone will still be able to think of additional requirements not yet stated."

  • TD-Gammon

    TD-Gammon is a computer backgammon program developed in the 1990s by Gerald Tesauro at IBM's Thomas J. Watson Research Center. Its name comes from the fact that it is an artificial neural net trained by a form of temporal-difference learning, specifically TD-Lambda. It explored strategies that humans had not pursued and led to advances in the theory of correct backgammon play.

    In 1993, TD-Gammon (version 2.1) was trained with 1.5 million games of self-play, and achieved a level of play just slightly below that of the top human backgammon players of the time. In 1998, during a 100-game series, it was defeated by the world champion by a mere margin of 8 points. Its unconventional assessment of some opening strategies had been accepted and adopted by expert players.

    TD-Gammon is commonly cited as an early success of reinforcement learning and neural networks, and was cited in, for example, papers for deep Q-learning and AlphaGo.

    == Algorithm for play and learning ==

    During play, TD-Gammon examines on each turn all possible legal moves and all their possible responses (lookahead search), feeds each resulting board position into its evaluation function, and chooses the move that leads to the board position that got the highest score. In this respect, TD-Gammon is no different than almost any other computer board-game program. TD-Gammon's innovation was in how it learned its evaluation function.

    TD-Gammon's learning algorithm consists of updating the weights in its neural net after each turn to reduce the difference between its evaluation of previous turns' board positions and its evaluation of the present turn's board position, hence "temporal-difference learning". The score of any board position is a set of four numbers reflecting the program's estimate of the likelihood of each possible game result: White wins normally, Black wins normally, White wins a gammon, Black wins a gammon. For the final board position of the game, the algorithm compares with the actual result of the game rather than its own evaluation of the board position.

    The core of TD-Gammon is a neural network with 3 layers.

      • The input layer has two types of neurons. One type codes for the board position. They are non-negative integers ranging from 0 to 15, indicating the number of White or Black checkers at each board location. There are 99 input neurons for each, totaling 198 neurons. Another type codes for hand-crafted features previously used in Neurogammon. These features encoded standard concepts used by human experts, such as "advanced anchor," "blockade strength," "home board strength" and the probability of a "blot" (single checker) being hit.
      • The hidden layer contains hidden neurons. Later versions had more of these.
      • The output layer contains 4 neurons, representing the network's estimates of the probabilities (the "equity") of the outcomes that the current board would lead to. The 4 neurons code for: White normal win, White gammon win, Black normal win, Black gammon win. A backgammon win is so rare that Tesauro opted to not represent it.

    After each turn, the learning algorithm updates each weight in the neural net according to the following rule:

      $$w_{t+1} - w_t = \alpha \, (Y_{t+1} - Y_t) \sum_{k=1}^{t} \lambda^{t-k} \nabla_w Y_k$$

    where:

      • $w_{t+1} - w_t$ is the change in the weight from its value on the previous turn,
      • $Y_{t+1} - Y_t$ is the difference between the current and previous turn's board evaluations,
      • $\alpha$ is a learning-rate parameter,
      • $\lambda$ is a parameter that controls how strongly the present difference in evaluations feeds back to the estimates from earlier turns,
      • $\nabla_w Y_k$ is the gradient of the network output with respect to the weights.

    It was found that picking small $\lambda$ offered performance roughly equally good, and large $\lambda$ degraded performance. Because of this, after 1992, TD-Gammon was trained with $\lambda = 0$, degenerating into standard TD-learning. This saved compute by a factor of 2.

    == Development history ==

    Version 1.0 used simple 1-ply search: every next move is scored by the neural net, and the highest-scoring move is selected.

    Versions 2.0 and 2.1 used 2-ply search:

      • Make a 1-ply analysis to remove unlikely moves ("forward pruning").
      • Make a 2-ply minimax analysis for only the likely moves. Pick the best move, probability-weighted by each of the opponent's 21 possible dice rolls (weighting non-doubles twice as much as doubles).

    Versions 3.0 and 3.1 used 3-ply search, using $21^2 = 441$ possible dice rolls instead of 21.

    The last version, 3.1, was trained specifically for an exhibition match against Malcolm Davis at the 1998 AAAI Hall of Champions. It lost at -8 points, mainly due to one blunder, where TD-Gammon opted to double and got gammoned at -32 points.

    == Experiments and stages of training ==

    Unlike previous neural-net backgammon programs such as Neurogammon (also written by Tesauro), where an expert trained the program by supplying the "correct" evaluation of each position, TD-Gammon was at first programmed "knowledge-free". In early experimentation, using only a raw board encoding with no human-designed features, TD-Gammon reached a level of play comparable to Neurogammon: that of an intermediate-level human backgammon player.

    Even though TD-Gammon discovered insightful features on its own, Tesauro wondered if its play could be improved by using hand-designed features like Neurogammon's. Indeed, the self-training TD-Gammon with expert-designed features soon surpassed all previous computer backgammon programs. It stopped improving after about 1,500,000 games (self-play) using a three-layered neural network, with 198 input units encoding expert-designed features, 80 hidden units, and one output unit representing predicted probability of winning.

    == Advances in backgammon theory ==

    TD-Gammon's exclusive training through self-play (rather than imitation learning) enabled it to explore strategies that humans previously had not considered or had ruled out erroneously. Its success with unorthodox strategies had a significant impact on the backgammon community.

    In late 1991, Bill Robertie, Paul Magriel, and Malcolm Davis were invited to play against TD-Gammon (version 1.0). A total of 51 games were played, with TD-Gammon losing at -0.25 ppg. Robertie found TD-Gammon to be at the level of a competent advanced player, and better than any previous backgammon program.

    Robertie subsequently wrote about the use of TD-Gammon for backgammon study. For example, on the opening play, the conventional wisdom was that given a roll of 2-1, 4-1, or 5-1, White should move a single checker from point 6 to point 5. Known as "slotting", this technique trades the risk of a hit for the opportunity to develop an aggressive position. TD-Gammon found that the more conservative play of splitting 24-23 was superior. Tournament players began experimenting with TD-Gammon's move, and found success. Within a few years, slotting had disappeared from tournament play, replaced by splitting, though in 2006 it made a reappearance for 2-1.

    Backgammon expert Kit Woolsey found that TD-Gammon's positional judgement, especially its weighing of risk against safety, was superior to his own or any human's.

    TD-Gammon's excellent positional play was undercut by occasional poor endgame play. The endgame requires a more analytical approach, sometimes with extensive lookahead. TD-Gammon's limitation to two-ply lookahead put a ceiling on what it could achieve in this part of the game. TD-Gammon's strengths and weaknesses were the opposite of symbolic artificial intelligence programs and most computer software in general: it was good at matters that require an intuitive "feel" but bad at systematic analysis.

    It was also poor at doubling strategies. This is likely due to the fact that the neural network was trained without the doubling cube, with the doubling added by feeding the neural network's cubeless equity estimates into theoretically-based heuristic formulae. This was particularly the case in the 1998 exhibition match, where it played 100 games against Malcolm Davis. A single doubling blunder lost the match.

    TD-Gammon was never commercialized or released to the public in some other form, but it inspired commercial backgammon programs based on neural networks, such as JellyFish (1994) and Snowie (1998).
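The TD-Lambda weight update from the "Algorithm for play and learning" section can be sketched compactly if the neural network is swapped for a linear evaluator $Y = w \cdot x$ with a single output, so that the gradient $\nabla_w Y_k$ is simply the feature vector $x_k$. That substitution, the function name `td_lambda_episode` and the toy feature vectors are assumptions made for illustration; this is not Tesauro's implementation. The running sum $\sum_k \lambda^{t-k} \nabla_w Y_k$ is kept incrementally as an eligibility trace.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def td_lambda_episode(weights, features, outcome, alpha=0.1, lam=0.7):
    """Apply the TD-Lambda update over one game and return the new weights.

    features is the list of feature vectors x_1 .. x_T for successive board
    positions, and outcome is the actual game result, which replaces the
    evaluator's own estimate after the final position.
    Assumes a linear evaluator Y = w . x, so the gradient of Y_k is x_k.
    """
    w = list(weights)
    trace = [0.0] * len(w)  # holds sum_k lam^(t-k) * grad Y_k
    for t, x in enumerate(features):
        # Decay the old trace and add the gradient for the current position.
        trace = [lam * e + xi for e, xi in zip(trace, x)]
        y_t = dot(w, x)
        if t + 1 < len(features):
            y_next = dot(w, features[t + 1])
        else:
            y_next = outcome
        delta = y_next - y_t  # temporal difference Y_{t+1} - Y_t
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w


if __name__ == "__main__":
    positions = [[1.0, 0.0], [0.0, 1.0]]  # two toy positions
    print(td_lambda_episode([0.0, 0.0], positions, 1.0, alpha=0.5, lam=1.0))
    print(td_lambda_episode([0.0, 0.0], positions, 1.0, alpha=0.5, lam=0.0))
```

With `lam=1.0` the final result is credited to both positions, while with `lam=0.0` only the last position's weight moves, which corresponds to the $\lambda = 0$ case described above.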

  • Multi-armed bandit

    Multi-armed bandit

    In probability theory and machine learning, the multi-armed bandit problem (sometimes called the K- or N-armed bandit problem) is named from imagining a gambler at a row of slot machines (sometimes known as "one-armed bandits"), who has to decide which machines to play, how many times to play each machine and in which order to play them, and whether to continue with the current machine or try a different machine. More generally, it is a problem in which a decision maker iteratively selects one of multiple fixed choices (i.e., arms or actions) when the properties of each choice are only partially known at the time of allocation, and may become better understood as time passes. A fundamental aspect of bandit problems is that choosing an arm does not affect the properties of the arm or other arms. Instances of the multi-armed bandit problem include the task of iteratively allocating a fixed, limited set of resources between competing (alternative) choices in a way that minimizes the regret. A notable alternative setup for the multi-armed bandit problem includes the "best arm identification (BAI)" problem where the goal is instead to identify the best choice by the end of a finite number of rounds. The multi-armed bandit problem is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff dilemma. In contrast to general reinforcement learning, the selected actions in bandit problems do not affect the reward distribution of the arms. The multi-armed bandit problem also falls into the broad category of stochastic scheduling. In the problem, each machine provides a random reward from a probability distribution specific to that machine, that is not known a priori. The objective of the gambler is to maximize the sum of rewards earned through a sequence of lever pulls. 
The crucial tradeoff the gambler faces at each trial is between "exploitation" of the machine that has the highest expected payoff and "exploration" to get more information about the expected payoffs of the other machines. The trade-off between exploration and exploitation is also faced in machine learning. In practice, multi-armed bandits have been used to model problems such as managing research projects in a large organization, like a science foundation or a pharmaceutical company. In early versions of the problem, the gambler begins with no initial knowledge about the machines. Herbert Robbins in 1952, realizing the importance of the problem, constructed convergent population selection strategies in "Some Aspects of the Sequential Design of Experiments". A theorem, the Gittins index, first published by John C. Gittins, gives an optimal policy for maximizing the expected discounted reward. == Empirical motivation == The multi-armed bandit problem models an agent that simultaneously attempts to acquire new knowledge (called "exploration") and optimize its decisions based on existing knowledge (called "exploitation"). The agent attempts to balance these competing tasks in order to maximize its total value over the period of time considered. There are many practical applications of the bandit model, for example: clinical trials investigating the effects of different experimental treatments while minimizing patient losses; adaptive routing efforts for minimizing delays in a network; and financial portfolio design. In these practical examples, the problem requires balancing reward maximization based on the knowledge already acquired with attempting new actions to further increase knowledge. This is known as the exploitation vs. exploration tradeoff in machine learning.
The model has also been used to control dynamic allocation of resources to different projects, answering the question of which project to work on, given uncertainty about the difficulty and payoff of each possibility. Originally considered by Allied scientists in World War II, it proved so intractable that, according to Peter Whittle, the problem was proposed to be dropped over Germany so that German scientists could also waste their time on it. The version of the problem now commonly analyzed was formulated by Herbert Robbins in 1952. == The multi-armed bandit model == The multi-armed bandit (short: bandit or MAB) can be seen as a set of real distributions $B=\{R_{1},\dots ,R_{K}\}$, each distribution being associated with the rewards delivered by one of the $K\in \mathbb{N}^{+}$ levers. Let $\mu_{1},\dots ,\mu_{K}$ be the mean values associated with these reward distributions. The gambler iteratively plays one lever per round and observes the associated reward. The objective is to maximize the sum of the collected rewards. The horizon $H$ is the number of rounds that remain to be played. The bandit problem is formally equivalent to a one-state Markov decision process. The regret $\rho$ after $T$ rounds is defined as the expected difference between the reward sum associated with an optimal strategy and the sum of the collected rewards: $\rho =T\mu^{*}-\sum_{t=1}^{T}{\widehat{r}}_{t}$, where $\mu^{*}$ is the maximal reward mean, $\mu^{*}=\max_{k}\{\mu_{k}\}$, and ${\widehat{r}}_{t}$ is the reward in round $t$. A zero-regret strategy is a strategy whose average regret per round $\rho /T$ tends to zero with probability 1 when the number of played rounds tends to infinity.
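The regret definition can be made concrete with a small simulation. The sketch below plays a Bernoulli bandit with an epsilon-greedy rule and then evaluates the regret as T times the best mean minus the collected rewards. Epsilon-greedy, the function name `run_epsilon_greedy`, the arm means and the value of `epsilon` are all illustrative assumptions, not something taken from the text; with a constant `epsilon` the average regret per round does not tend to zero, so this is not a zero-regret strategy.

```python
import random


def run_epsilon_greedy(means, rounds, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy and return (regret, pulls).

    means   -- true success probabilities mu_1..mu_K (hidden from the player)
    rounds  -- number of rounds T
    epsilon -- probability of picking a uniformly random arm (assumed value)
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k          # how often each arm was played
    estimates = [0.0] * k    # running mean of observed rewards per arm
    total_reward = 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                            # explore
        else:
            arm = max(range(k), key=lambda a: estimates[a])   # exploit
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        # Incremental update of the sample mean for the played arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    # Regret: T * mu_star minus the sum of collected rewards.
    regret = rounds * max(means) - total_reward
    return regret, pulls


regret, pulls = run_epsilon_greedy([0.2, 0.5, 0.8], rounds=5000)
print(f"regret = {regret:.1f}, average per round = {regret / 5000:.4f}")
print("pulls per arm:", pulls)
```

Because the rewards are random, the realized regret of a single run can even be slightly negative; averaging over many seeds gives an estimate of the expected regret.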
Intuitively, zero-regret strategies are guaranteed to converge to a (not necessarily unique) optimal strategy if enough rounds are played. == Variations == A common formulation is the Binary multi-armed bandit or Bernoulli multi-armed bandit, which issues a reward of one with probability $p$, and otherwise a reward of zero. Another formulation of the multi-armed bandit has each arm representing an independent Markov machine. Each time a particular arm is played, the state of that machine advances to a new one, chosen according to the Markov state evolution probabilities. There is a reward depending on the current state of the machine. In a generalization called the "restless bandit problem", the states of non-played arms can also evolve over time. There has also been discussion of systems where the number of choices (about which arm to play) increases over time. Computer science researchers have studied multi-armed bandits under worst-case assumptions, obtaining algorithms to minimize regret in both finite and infinite (asymptotic) time horizons for both stochastic and non-stochastic arm payoffs. === Best arm identification === An important variation of the classical regret minimization problem in multi-armed bandits is best arm identification (BAI), also known as pure exploration. This problem is crucial in various applications, including clinical trials, adaptive routing, recommendation systems, and A/B testing. In BAI, the objective is to identify the arm having the highest expected reward.
An algorithm in this setting is characterized by a sampling rule, a decision rule, and a stopping rule, described as follows: Sampling rule: $(a_{t})_{t\geq 1}$ is a sequence of actions at each time step. Stopping rule: $\tau$ is a (random) stopping time which suggests when to stop collecting samples. Decision rule: ${\hat{a}}_{\tau}$ is a guess on the best arm based on the data collected up to time $\tau$. There are two predominant settings in BAI: Fixed budget setting: Given a time horizon $T\geq 1$, the objective is to identify the arm with the highest expected reward $a^{\star}\in \arg\max_{k}\mu_{k}$ minimizing probability of error $\delta$. Fixed confidence setting: Given a confidence level $\delta \in (0,1)$, the objective is to identify the arm with the highest expected reward $a^{\star}\in \arg\max_{k}\mu_{k}$ with the least possible amount of trials and with probability of error $\mathbb{P}({\hat{a}}_{\tau}\neq a^{\star})\leq \delta$. For example using a decision rule, we could use $m_{1}$ where $m$ is the machine no.1 (you can use a different variable respectively) and $1$ is the amount for each time an attempt is made at pulling the lever, where $\int \sum m_{1},m_{2},(...)=M$, identify $M$ as the sum of each attempts $m_{1}+m_{2}$, (...) as needed, and from there you can get a ratio, sum or mean as quantitative probability and sample your formulation for each slots. You can also do ∫ ∑ k ∝ i N − (
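The three rules of the fixed budget setting described above can be sketched with the simplest possible choices. In this sketch the sampling rule is round-robin, the stopping rule is "stop when the budget is spent", and the decision rule picks the arm with the highest empirical mean. The function name `best_arm_fixed_budget`, the Bernoulli rewards and the uniform allocation are assumptions made for illustration; practical BAI algorithms allocate samples adaptively.

```python
import random


def best_arm_fixed_budget(means, budget, seed=0):
    """Guess the best arm of a Bernoulli bandit after `budget` pulls.

    means  -- true success probabilities (hidden from the algorithm)
    budget -- total number of pulls T allowed
    """
    k = len(means)
    if budget < k:
        raise ValueError("budget must allow at least one pull per arm")

    rng = random.Random(seed)
    sums = [0.0] * k
    counts = [0] * k

    for t in range(budget):      # stopping rule: stop once the budget is spent
        arm = t % k              # sampling rule: round-robin over the arms
        reward = 1.0 if rng.random() < means[arm] else 0.0
        sums[arm] += reward
        counts[arm] += 1

    averages = [s / c for s, c in zip(sums, counts)]
    # Decision rule: return the arm with the highest empirical mean.
    guess = max(range(k), key=lambda a: averages[a])
    return guess, averages


guess, averages = best_arm_fixed_budget([0.2, 0.5, 0.8], budget=3000)
print("guessed best arm:", guess)
print("empirical means:", [round(a, 3) for a in averages])
```

The probability of error in this setting is the chance that `guess` differs from the true best arm; it shrinks as the budget grows or as the gap between the best and second-best means widens.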

  • Connectionism

    Connectionism

    Connectionism is an approach to the study of human mental processes and cognition that utilizes mathematical models known as connectionist networks or artificial neural networks. Connectionism has had many "waves" since its beginnings. The first wave appeared in 1943 with Warren Sturgis McCulloch and Walter Pitts, both focusing on comprehending neural circuitry through a formal and mathematical approach, and Frank Rosenblatt, who published the 1958 paper "The Perceptron: A Probabilistic Model For Information Storage and Organization in the Brain" in Psychological Review while working at the Cornell Aeronautical Laboratory. The first wave ended with the 1969 book Perceptrons about limitations of the original perceptron idea, written by Marvin Minsky and Seymour Papert, which contributed to discouraging major funding agencies in the US from investing in connectionist research. With a few noteworthy deviations, most connectionist research entered a period of inactivity until the mid-1980s. The term connectionist model was reintroduced in a 1982 paper in the journal Cognitive Science by Jerome Feldman and Dana Ballard. The second wave blossomed in the late 1980s, following a 1987 book Parallel Distributed Processing by James L. McClelland, David E. Rumelhart, et al., which introduced a couple of improvements to the simple perceptron idea, such as intermediate processors (now known as "hidden layers") alongside input and output units, and used a sigmoid activation function instead of the old "all-or-nothing" function. Their work built upon that of John Hopfield, who was a key figure investigating the mathematical characteristics of sigmoid activation functions. From the late 1980s to the mid-1990s, connectionism took on an almost revolutionary tone when Schneider, Terence Horgan and Tienson posed the question of whether connectionism represented a fundamental shift in psychology and so-called "good old-fashioned AI", or GOFAI.
Some advantages of the second wave connectionist approach included its applicability to a broad array of functions, structural approximation to biological neurons, low requirements for innate structure, and capacity for graceful degradation. Its disadvantages included the difficulty in deciphering how ANNs process information or account for the compositionality of mental representations, and a resultant difficulty explaining phenomena at a higher level. The current (third) wave has been marked by advances in deep learning, which have made possible the creation of large language models. The success of deep-learning networks in the past decade has greatly increased the popularity of this approach, but the complexity and scale of such networks has brought with them increased interpretability problems. == Basic principle == The central connectionist principle is that mental phenomena can be described by interconnected networks of simple and often uniform units. The form of the connections and the units can vary from model to model. For example, units in the network could represent neurons and the connections could represent synapses, as in the human brain. This principle has been seen as an alternative to GOFAI and the classical theories of mind based on symbolic computation, but the extent to which the two approaches are compatible has been the subject of much debate since their inception. === Activation function === Internal states of any network change over time due to neurons sending a signal to a succeeding layer of neurons in the case of a feedforward network, or to a previous layer in the case of a recurrent network. Discovery of non-linear activation functions has enabled the second wave of connectionism. === Memory and learning === Neural networks follow two basic principles: Any mental state can be described as a n-dimensional vector of numeric activation values over neural units in a network. 
Memory and learning are created by modifying the 'weights' of the connections between neural units, generally represented as an n×m matrix. The weights are adjusted according to some learning rule or algorithm, such as Hebbian learning. Most of the variety among the models comes from: Interpretation of units: Units can be interpreted as neurons or groups of neurons. Definition of activation: Activation can be defined in a variety of ways. For example, in a Boltzmann machine, the activation is interpreted as the probability of generating an action potential spike, and is determined via a logistic function on the sum of the inputs to a unit. Learning algorithm: Different networks modify their connections differently. In general, any mathematically defined change in connection weights over time is referred to as the "learning algorithm". === Biological realism === Connectionist work in general does not need to be biologically realistic. One area where connectionist models are thought to be biologically implausible is with respect to error-propagation networks that are needed to support learning, but error propagation can explain some of the biologically-generated electrical activity seen at the scalp in event-related potentials such as the N400 and P600, and this provides some biological support for one of the key assumptions of connectionist learning procedures. Many recurrent connectionist models also incorporate dynamical systems theory. Many researchers, such as the connectionist Paul Smolensky, have argued that connectionist models will evolve toward fully continuous, high-dimensional, non-linear, dynamic systems approaches. == Precursors == Precursors of the connectionist principles can be traced to early work in psychology, such as that of William James. Psychological theories based on knowledge about the human brain were fashionable in the late 19th century. As early as 1869, the neurologist John Hughlings Jackson argued for multi-level, distributed systems. 
Following from this lead, Herbert Spencer's Principles of Psychology, 3rd edition (1872), and Sigmund Freud's Project for a Scientific Psychology (composed 1895) propounded connectionist or proto-connectionist theories. These tended to be speculative theories. But by the early 20th century, Edward Thorndike was writing about human learning in terms that posited a connectionist-type network. Hopfield networks had precursors in the Ising model due to Wilhelm Lenz (1920) and Ernst Ising (1925), though the Ising model conceived by them did not involve time. Monte Carlo simulations of the Ising model required the advent of computers in the 1950s. == The first wave == The first wave began in 1943 with Warren Sturgis McCulloch and Walter Pitts, both focusing on comprehending neural circuitry through a formal and mathematical approach. McCulloch and Pitts showed how neural systems could implement first-order logic: their classic paper "A Logical Calculus of Ideas Immanent in Nervous Activity" (1943) is important in this development. They were influenced by the work of Nicolas Rashevsky in the 1930s and symbolic logic in the style of Principia Mathematica. Hebb contributed greatly to speculations about neural functioning, and proposed a learning principle, Hebbian learning. Lashley argued for distributed representations as a result of his failure to find anything like a localized engram in years of lesion experiments. Friedrich Hayek independently conceived the model, first in a brief unpublished manuscript in 1920, then expanded into a book in 1952. The Perceptron machines were proposed and built by Frank Rosenblatt, who published the 1958 paper “The Perceptron: A Probabilistic Model For Information Storage and Organization in the Brain” in Psychological Review, while working at the Cornell Aeronautical Laboratory. He cited Hebb, Hayek, Uttley, and Ashby as main influences.
Another form of connectionist model was the relational network framework developed by the linguist Sydney Lamb in the 1960s. The research group led by Widrow empirically searched for methods to train two-layered ADALINE networks (MADALINE), with limited success. A method to train multilayered perceptrons with arbitrary levels of trainable weights was published by Alexey Grigorevich Ivakhnenko and Valentin Lapa in 1965, called the Group Method of Data Handling. This method employs incremental layer-by-layer training based on regression analysis, where useless units in hidden layers are pruned with the help of a validation set. The first multilayered perceptron trained by stochastic gradient descent was published in 1967 by Shun'ichi Amari. In computer experiments conducted by Amari's student Saito, a five-layer MLP with two modifiable layers learned useful internal representations to classify non-linearly separable pattern classes. In 1972, Shun'ichi Amari produced an early example of a self-organizing network. == The neural network winter == There was some conflict among artificial intelligence researchers as to what neural networks are useful for. Around the late 1960s, there was a widespread lull in research a

  • The Emperor's New Mind

    The Emperor's New Mind

    The Emperor's New Mind: Concerning Computers, Minds and The Laws of Physics is a 1989 book by the mathematical physicist Roger Penrose that posits a quantum mind theory. Penrose argues that human consciousness is non-algorithmic, and thus is not capable of being modeled by a conventional Turing machine, which includes a digital computer. Penrose hypothesizes that quantum mechanics plays an essential role in the understanding of human consciousness. The collapse of the quantum wavefunction is seen as playing an important role in brain function. Most of the book is spent reviewing, for the scientifically-minded lay-reader, a plethora of interrelated subjects such as Newtonian physics, special and general relativity, the philosophy and limitations of mathematics, quantum physics, cosmology, and the nature of time. Penrose intermittently describes how each of these bears on his developing theme: that consciousness is not "algorithmic". Only the later portions of the book address the thesis directly. == Overview == Penrose states that his ideas on the nature of consciousness are speculative, and his thesis is considered erroneous by some experts in the fields of philosophy, computer science, and robotics. The Emperor's New Mind attacks the claims of artificial intelligence using the physics of computing: Penrose notes that the present home of computing lies more in the tangible world of classical mechanics than in the imponderable realm of quantum mechanics. The modern computer is a deterministic system that for the most part simply executes algorithms. Penrose shows that, by reconfiguring the boundaries of a billiard table, one might make a computer in which the billiard balls act as message carriers and their interactions act as logical decisions. The billiard-ball computer was first designed some years ago by Edward Fredkin and Tommaso Toffoli of the Massachusetts Institute of Technology. 
== Reception == Following the publication of the book, Penrose began to collaborate with Stuart Hameroff on a biological analog to quantum computation involving microtubules, which became the foundation for his subsequent book, Shadows of the Mind: A Search for the Missing Science of Consciousness. Penrose won the Science Book Prize in 1990 for The Emperor's New Mind. According to an article in the American Journal of Physics, Penrose incorrectly claims a barrier far away from a localized particle can affect the particle.

  • StepFun

    StepFun

    Shanghai Jieyue Xingchen Intelligent Technology Co., Ltd, known as StepFun, is an artificial intelligence (AI) company based in Shanghai, China. It has been dubbed one of China's "AI Tiger" companies by investors. == Background == StepFun was founded in April 2023 by former Microsoft employees. Investors include Tencent, Qiming Venture Partners and Shanghai State-owned Capital Investment. In July 2025 at the World Artificial Intelligence Conference, StepFun announced the "Model-Chip Ecosystem Innovation Alliance", which consisted of Chinese developers of large language models (LLMs) and AI chip manufacturers. This included companies such as Huawei, Biren Technology, Moore Threads and Enflame. A second alliance, named the "Shanghai General Chamber of Commerce AI Committee", was also established; it included StepFun, SenseTime, MiniMax, MetaX and Iluvatar CoreX. On 25 February 2026, it was reported that StepFun was seeking an initial public offering on the Hong Kong Stock Exchange. StepFun focuses on multimodal models, which are designed to understand multiple types of input data such as text, video and audio. == Products == In July 2024 at the World Artificial Intelligence Conference, StepFun officially launched Step-2, a trillion-parameter LLM, along with the Step-1.5V multimodal model and the Step-1X image generation model. In February 2025, StepFun and Geely jointly announced the open-sourcing of two multimodal large models to global developers. They were Step-Video-T2V and Step-Audio. In July 2025, StepFun released Step 3. The Model-Chip Ecosystem Innovation Alliance aimed to optimize Step 3 for domestic chips. In April 2025, Step-R1-V-Mini was released. It is a multimodal reasoning model designed for visual interpretation and image understanding. In February 2026, Step-3.5-Flash, a mixture-of-experts model with 196 billion parameters and 11 billion active parameters, was released under the free and open-source Apache 2.0 license.
It supports tool use and a 256k token context window. == Models ==

  • Albert One

    Albert One

    Albert One is an artificial intelligence chatbot created by Robby Garner and designed to mimic the way humans make conversations using a multi-faceted approach in natural language programming. == History == In both 1998 and 1999, Albert One won the Loebner Prize Contest, a competition between chatterbots. Some parts of Albert were deployed on the internet beginning in 1995, to gather information about what kinds of things people would say to a chatterbot. Another element of Albert One involved the building of a large database of human statements, and associated replies. This portion of the project was tested at the 1994-1997 Loebner Prize contests. Albert was the first of Robby Garner's multifaceted bots. The Albert One system was composed of several subsystems. Among those were a version of Eliza, the therapist, Elivs, another Eliza-like bot, and several other helper applications working together in a hierarchical arrangement. As a continuation of the stimulus-response library, various other database queries and assertions were tested to arrive at each of Albert's responses. Robby went on to develop networked examples of this kind of hierarchical "glue" at The Turing Hub.

  • Open Knowledge Base Connectivity

    Open Knowledge Base Connectivity

    Open Knowledge Base Connectivity (OKBC) is a protocol and an API for accessing knowledge in knowledge representation systems such as ontology repositories and object–relational databases. It is somewhat complementary to the Knowledge Interchange Format that serves as a general representation language for knowledge. It is developed by SRI International's Artificial Intelligence Center for DARPA's High Performance Knowledge Base program (HPKB).

  • AlphaGo Zero

    AlphaGo Zero

    AlphaGo Zero is a version of DeepMind's Go software AlphaGo. AlphaGo's team published an article in Nature in October 2017 introducing AlphaGo Zero, a version created without using data from human games, and stronger than any previous version. By playing games against itself, AlphaGo Zero: surpassed the strength of AlphaGo Lee in three days by winning 100 games to 0; reached the level of AlphaGo Master in 21 days; and exceeded all previous versions in 40 days. Training artificial intelligence (AI) without datasets derived from human experts has significant implications for the development of AI with superhuman skills, as expert data is "often expensive, unreliable, or simply unavailable." Demis Hassabis, the co-founder and CEO of DeepMind, said that AlphaGo Zero was so powerful because it was "no longer constrained by the limits of human knowledge". Furthermore, AlphaGo Zero performed better than standard deep reinforcement learning models (such as Deep Q-Network implementations) due to its integration of Monte Carlo tree search. David Silver, one of the first authors of DeepMind's papers published in Nature on AlphaGo, said that it is possible to have generalized AI algorithms by removing the need to learn from humans. Google later developed AlphaZero, a generalized version of AlphaGo Zero that could play chess and shōgi in addition to Go. In December 2017, AlphaZero beat the 3-day version of AlphaGo Zero by winning 60 games to 40, and with 8 hours of training it outperformed AlphaGo Lee on an Elo scale. AlphaZero also defeated a top chess program (Stockfish) and a top shōgi program (Elmo). == Architecture == The network in AlphaGo Zero is a ResNet with two heads. The stem of the network takes as input a 17×19×19 tensor representation of the Go board. 8 channels are the positions of the current player's stones from the last eight time steps. (1 if there is a stone, 0 otherwise. If a time step falls before the beginning of the game, then 0 in all positions.)
8 channels are the positions of the other player's stones from the last eight time steps. 1 channel is all 1 if black is to move, and 0 otherwise. The body is a ResNet with either 20 or 40 residual blocks and 256 channels. There are two heads, a policy head and a value head. The policy head outputs a logit array of size $19\times 19+1$, representing the logit of making a move at each of the points, plus the logit of passing. The value head outputs a number in the range $(-1,+1)$, representing the expected score for the current player: -1 represents the current player losing, and +1 winning. == Training == AlphaGo Zero's neural network was trained using TensorFlow, with 64 GPU workers and 19 CPU parameter servers. Only four TPUs were used for inference. The neural network initially knew nothing about Go beyond the rules. Unlike earlier versions of AlphaGo, Zero only perceived the board's stones, rather than having some rare human-programmed edge cases to help recognize unusual Go board positions. The AI engaged in reinforcement learning, playing against itself until it could anticipate its own moves and how those moves would affect the game's outcome. In the first three days AlphaGo Zero played 4.9 million games against itself in quick succession. It appeared to develop the skills required to beat top humans within just a few days, whereas the earlier AlphaGo took months of training to achieve the same level. According to Epoch.ai, training cost 3e23 FLOPs. For comparison, the researchers also trained a version of AlphaGo Zero using human games, AlphaGo Master, and found that it learned more quickly, but actually performed more poorly in the long run. DeepMind submitted its initial findings in a paper to Nature in April 2017, which was then published in October 2017. == Hardware cost == The hardware cost for a single AlphaGo Zero system in 2017, including the four TPUs, has been quoted as around $25 million.
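As a rough illustration of the input encoding described in the Architecture section, the sketch below builds the 17 binary planes from a stone history using plain Python lists. The function name `encode_position`, the representation of a board as a set of `(row, col)` coordinates, and the channel order (current player's eight planes, then the opponent's eight, then the colour plane) are assumptions that simply follow the order in which the text lists the channels; they are not taken from DeepMind's code.

```python
BOARD = 19     # board side length
HISTORY = 8    # number of past time steps kept per player


def encode_position(own_history, opp_history, black_to_move):
    """Return 17 planes of size 19x19 as nested lists of 0/1.

    own_history / opp_history -- lists of sets of (row, col) stone
    coordinates, most recent position first; missing steps (before the
    start of the game) become all-zero planes.
    """
    planes = []
    for history in (own_history, opp_history):
        for step in range(HISTORY):
            plane = [[0] * BOARD for _ in range(BOARD)]
            if step < len(history):
                for row, col in history[step]:
                    plane[row][col] = 1
            planes.append(plane)

    # Final plane: all ones if black is to move, all zeros otherwise.
    fill = 1 if black_to_move else 0
    planes.append([[fill] * BOARD for _ in range(BOARD)])
    return planes


# Size of the policy head output: one logit per point plus one for passing.
POLICY_SIZE = BOARD * BOARD + 1

planes = encode_position([{(3, 3)}], [{(15, 15)}], black_to_move=False)
print(len(planes), len(planes[0]), len(planes[0][0]))
print("policy logits:", POLICY_SIZE)
```

A real implementation would hold these planes in a tensor library and feed them to the residual network; nested lists are used here only to keep the sketch dependency-free.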
== Applications == According to Hassabis, AlphaGo's algorithms are likely to be of the most benefit to domains that require an intelligent search through an enormous space of possibilities, such as protein folding (see AlphaFold) or accurately simulating chemical reactions. AlphaGo's techniques are probably less useful in domains that are difficult to simulate, such as learning how to drive a car. DeepMind stated in October 2017 that it had already started active work on attempting to use AlphaGo Zero technology for protein folding, and stated it would soon publish new findings. == Reception == AlphaGo Zero was widely regarded as a significant advance, even when compared with its groundbreaking predecessor, AlphaGo. Oren Etzioni of the Allen Institute for Artificial Intelligence called AlphaGo Zero "a very impressive technical result" in "both their ability to do it—and their ability to train the system in 40 days, on four TPUs". The Guardian called it a "major breakthrough for artificial intelligence", citing Eleni Vasilaki of Sheffield University and Tom Mitchell of Carnegie Mellon University, who called it an impressive feat and an “outstanding engineering accomplishment" respectively. Mark Pesce of the University of Sydney called AlphaGo Zero "a big technological advance" taking us into "undiscovered territory". Gary Marcus, a psychologist at New York University, has cautioned that for all we know, AlphaGo may contain "implicit knowledge that the programmers have about how to construct machines to play problems like Go" and will need to be tested in other domains before being sure that its base architecture is effective at much more than playing Go. In contrast, DeepMind is "confident that this approach is generalisable to a large number of domains". In response to the reports, South Korean Go professional Lee Sedol said, "The previous version of AlphaGo wasn’t perfect, and I believe that’s why AlphaGo Zero was made." 
On the potential for AlphaGo's development, Lee said he will have to wait and see but also said it will affect young Go players. Mok Jin-seok, who directs the South Korean national Go team, said the Go world has already been imitating the playing styles of previous versions of AlphaGo and creating new ideas from them, and he is hopeful that new ideas will come out from AlphaGo Zero. Mok also added that general trends in the Go world are now being influenced by AlphaGo's playing style. "At first, it was hard to understand and I almost felt like I was playing against an alien. However, having had a great amount of experience, I’ve become used to it," Mok said. "We are now past the point where we debate the gap between the capability of AlphaGo and humans. It’s now between computers." Mok has reportedly already begun analyzing the playing style of AlphaGo Zero along with players from the national team. "Though having watched only a few matches, we received the impression that AlphaGo Zero plays more like a human than its predecessors," Mok said. Chinese Go professional Ke Jie commented on the remarkable accomplishments of the new program: "A pure self-learning AlphaGo is the strongest. Humans seem redundant in front of its self-improvement." == Comparison with predecessors == == AlphaZero == On 5 December 2017, the DeepMind team released a preprint on arXiv introducing AlphaZero, a program using a generalized version of AlphaGo Zero's approach, which achieved within 24 hours a superhuman level of play in chess, shogi, and Go, defeating the world-champion programs Stockfish, Elmo, and the 3-day version of AlphaGo Zero in each case. AlphaZero (AZ) is a more generalized variant of the AlphaGo Zero (AGZ) algorithm, and is able to play shogi and chess as well as Go. Differences between AZ and AGZ include: AZ has hard-coded rules for setting search hyperparameters. The neural network is now updated continually.
Chess (unlike Go) can end in a tie; therefore AZ can take into account the possibility of a tie game. An open source program, Leela Zero, based on the ideas from the AlphaGo papers is available. It uses a GPU instead of the TPUs recent versions of AlphaGo rely on.
