AI For Business Isb

AI For Business Isb — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Hallucination (artificial intelligence)

    Hallucination (artificial intelligence)

    In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting, confabulation, or delusion) is a response generated by AI that contains false or misleading information presented as fact. This term draws a loose analogy with human psychology, where a hallucination typically involves false percepts. For example, a chatbot powered by large language models (LLMs), like ChatGPT, may embed plausible-sounding random falsehoods within its generated content. Detecting and mitigating errors and hallucinations pose significant challenges for practical deployment and reliability of LLMs in high-stakes scenarios, such as chip design, supply chain logistics, and medical diagnostics. Some software engineers and statisticians have criticized the specific term "AI hallucination" for unreasonably anthropomorphizing computers. Symbolic artificial intelligence models generally do not produce hallucinations, unlike large language models. == Term == === Origin === Since the 1980s, the term "hallucination" has been used in computer vision with a positive connotation to describe the process of adding detail to an image. For example, the task of generating high-resolution face images from low-resolution inputs is called face hallucination. The first documented use of the term "hallucination" in this sense is in the PhD thesis of Eric Mjolsness in 1986. A notable work is the face hallucination algorithm by Simon Baker and Takeo Kanade published in 1999. In the 2000s, hallucinations were described in statistical machine translation as a failure mode. Since the 2010s, the term has undergone a semantic shift to signify the generation of factually incorrect or misleading outputs by AI systems in tasks like machine translation and object detection. In 2015, hallucinations were identified in visual semantic role labeling tasks by Saurabh Gupta and Jitendra Malik. In 2015, computer scientist Andrej Karpathy used the term "hallucinated" in a blog post to describe his recurrent neural network (RNN) language model generating an incorrect citation link. In 2017, Google researchers used the term to describe the responses generated by neural machine translation (NMT) models when they are not related to the source text, and in 2018, the term was used in computer vision to describe instances where non-existent objects are erroneously detected because of adversarial attacks. In July 2021, Meta warned during its release of BlenderBot 2 that the system is prone to "hallucinations", which Meta defined as "confident statements that are not true". Following OpenAI's ChatGPT release in beta version in November 2022, some users complained that such chatbots often seem to pointlessly embed plausible-sounding random falsehoods within their generated content. Many news outlets, including The New York Times, started to use the term "hallucinations" to describe these models' frequently incorrect or inconsistent responses. In 2023, the Cambridge dictionary updated its definition of hallucination to include this new sense specific to the field of AI. Some researchers have highlighted a lack of consistency in how the term is used, but also identified several alternative terms in the literature, such as confabulations, fabrications, and factual errors. === Definitions and alternatives === Uses, definitions and characterizations of the term "hallucination" in the context of LLMs include: "a tendency to invent facts in moments of uncertainty" (OpenAI, May 2023) "a model's logical mistakes" (OpenAI, May 2023) "fabricating information entirely, but behaving as if spouting facts" (CNBC, May 2023) "making up information" (The Verge, February 2023) "probability distributions" (in scientific contexts) Journalist Benj Edwards, in Ars Technica, writes that the term "hallucination" is controversial, but that some form of metaphor remains necessary; Edwards suggests "confabulation" as an analogy for processes that involve "creative gap-filling". In July 2024, a White House report on fostering public trust in AI research mentioned hallucinations only in the context of reducing them. Notably, when acknowledging David Baker's Nobel Prize-winning work with AI-generated proteins, the Nobel committee avoided the term entirely, instead referring to "imaginative protein creation". Hicks, Humphries, and Slater, in their article in Ethics and Information Technology, argue that the output of LLMs is "bullshit" under Harry Frankfurt's definition of the term, and that the models are "in an important way indifferent to the truth of their outputs", with true statements only accidentally true, and false ones accidentally false. Some researchers also use the derogatory term "botshit", often referring to uncritical use of AI. === Criticism === In the scientific community, some researchers avoid the term "hallucination", seeing it as potentially misleading. It has been criticized by Usama Fayyad, executive director of the Institute for Experimental Artificial Intelligence at Northeastern University, on the grounds that it misleadingly personifies large language models and is vague. Mary Shaw said, "The current fashion for calling generative AI's errors 'hallucinations' is appalling. It anthropomorphizes the software, and it spins actual errors as somehow being idiosyncratic quirks of the system even when they're objectively incorrect." In Salon, statistician Gary Smith argues that LLMs "do not understand what words mean" and consequently that the term "hallucination" unreasonably anthropomorphizes the machine. Murray Shanahan argues that anthropomorphic framing of LLM capabilities, including terms like "hallucination", encourages users and researchers to attribute cognitive processes to systems that operate through statistical pattern completion, and advocates for more careful linguistic practices when discussing LLM behavior. Kristina Šekrst argues that applying psychological vocabulary to LLM outputs obscures the difference between the appearance of mental properties and their genuine presence. Förster & Skop assert that tech companies use the hallucination metaphor to anthropomorphize models and deflect responsibility for non-factual outputs. Some see the AI outputs not as illusory but as prospective—that is, having some chance of being true, similar to early-stage scientific conjectures. The term has also been criticized for its association with psychedelic drug experiences. == In natural language generation == In natural language generation, there are several reasons why natural language models hallucinate: === Hallucination from data === Hallucinations can stem from incomplete, inaccurate or unrepresentative data sets. === Modeling-related causes === The pre-training of generative pretrained transformers (GPT) involves predicting the next word. It incentivizes GPT models to "give a guess" about what the next word is, even when they lack information. Some researchers take an anthropomorphic perspective and posit that hallucinations arise from a tension between novelty and usefulness. For instance, Amabile and Pratt define human creativity as the production of novel and useful ideas. By extension, a focus on novelty in machine creativity can lead to the production of original but inaccurate responses—that is, falsehoods—whereas a focus on usefulness may result in memorized content lacking originality. By 2022, newspapers such as The New York Times expressed concern that, as the adoption of bots based on large language models continued to grow, unwarranted user confidence in bot output could lead to problems. === Interpretability research === In 2025, interpretability research by Anthropic on the LLM Claude identified internal circuits that cause it to decline to answer questions unless it knows the answer. By default, the circuit is active and the LLM doesn't answer. When the LLM has sufficient information, these circuits are inhibited and the LLM answers the question. Hallucinations were found to occur when this inhibition happens incorrectly, such as when Claude recognizes a name but lacks sufficient information about that person, causing it to generate plausible but untrue responses. === Examples === On 15 November 2022, researchers from Meta AI published Galactica, designed to "store, combine and reason about scientific knowledge". Content generated by Galactica came with the warning: "Outputs may be unreliable! Language Models are prone to hallucinate text." In one case, when asked to draft a paper on creating avatars, Galactica cited a fictitious paper from a real author who works in the relevant area. Meta withdrew Galactica on 17 November due to offensiveness and inaccuracy. OpenAI's ChatGPT, released in beta version to the public on November 30, 2022, was based on the foundation model GPT-3.5 (a revision of GPT-3). Professor Ethan Mollick of Wharton called it an "omniscient, eager-to-please intern who sometimes lies to you". Data scientist Teresa Kuba

    Read more →
  • Coherent extrapolated volition

    Coherent extrapolated volition

    Coherent extrapolated volition (CEV) is a theoretical framework in the field of AI alignment describing an approach by which an artificial superintelligence (ASI) would act on a benevolent supposition of what humans would want if they were more knowledgeable, more rational, had more time to think, and had matured together as a society, as opposed to humanity's current individual or collective preferences. It was proposed by Eliezer Yudkowsky in 2004 as part of his work on friendly AI. == Concept == CEV proposes that an advanced AI system should derive its goals by extrapolating the idealized volition of humanity. This means aggregating and projecting human preferences into a coherent utility function that reflects what people would desire under ideal epistemic and moral conditions. The aim is to ensure that AI systems are aligned with humanity's true interests, rather than with transient or poorly informed preferences. In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted. == Debate == Yudkowsky and Nick Bostrom note that CEV has several interesting properties. It is designed to be humane and self-correcting, by capturing the source of human values instead of trying to list them. It avoids the difficulty of laying down an explicit, fixed list of rules. It encapsulates moral growth, preventing flawed current moral beliefs from getting locked in. It limits the influence that a small group of programmers can have on what the ASI would value, thus also reducing the incentives to build ASI first. And it keeps humanity in charge of its destiny. CEV also faces significant theoretical and practical challenges. Bostrom notes that CEV has "a number of free parameters that could be specified in various ways, yielding different versions of the proposal." One such parameter is the extrapolation base (whose extrapolated volition is taken into account). For example, whether it should include people with severe dementia, patients in a vegetative state, foetuses, or embryos. He also notes that if CEV's extrapolation base only includes humans, there is a risk that the result would be ungenerous toward other animals and digital minds. One possible solution would be to include a mechanism to expand CEV's extrapolation base. == Variants and alternatives == A proposed theoretical alternative to CEV is to rely on an artificial superintelligence's superior cognitive capabilities to figure out what is morally right, and let it act accordingly. It is also possible to combine both techniques, for instance with the ASI following CEV except when it is morally impermissible. In another review, a philosophical analysis explores CEV through the lens of social trust in autonomous systems. Drawing on Anthony Giddens' concept of "active trust", the author proposes an evolution of CEV into "Coherent, Extrapolated and Clustered Volition" (CECV). This formulation aims to better reflect the moral preferences of diverse cultural groups, thus offering a more pragmatic ethical framework for designing AI systems that earn public trust while accommodating societal diversity.

    Read more →
  • Empirical dynamic modeling

    Empirical dynamic modeling

    Empirical dynamic modeling (EDM) is a framework for analysis and prediction of nonlinear dynamical systems. Applications include population dynamics, ecosystem service, medicine, neuroscience, dynamical systems, geophysics, and human-computer interaction. EDM was originally developed by Robert May and George Sugihara. It can be considered a methodology for data modeling, predictive analytics, dynamical system analysis, machine learning and time series analysis. == Description == Mathematical models have tremendous power to describe observations of real-world systems. They are routinely used to test hypothesis, explain mechanisms and predict future outcomes. However, real-world systems are often nonlinear and multidimensional, in some instances rendering explicit equation-based modeling problematic. Empirical models, which infer patterns and associations from the data instead of using hypothesized equations, represent a natural and flexible framework for modeling complex dynamics. Donald DeAngelis and Simeon Yurek illustrated that canonical statistical models are ill-posed when applied to nonlinear dynamical systems. A hallmark of nonlinear dynamics is state-dependence: system states are related to previous states governing transition from one state to another. EDM operates in this space, the multidimensional state-space of system dynamics rather than on one-dimensional observational time series. EDM does not presume relationships among states, for example, a functional dependence, but projects future states from localised, neighboring states. EDM is thus a state-space, nearest-neighbors paradigm where system dynamics are inferred from states derived from observational time series. This provides a model-free representation of the system naturally encompassing nonlinear dynamics. A cornerstone of EDM is recognition that time series observed from a dynamical system can be transformed into higher-dimensional state-spaces by time-delay embedding with Takens's theorem. The state-space models are evaluated based on in-sample fidelity to observations, conventionally with Pearson correlation between predictions and observations. == Methods == Primary EDM algorithms include Simplex projection, Sequential locally weighted global linear maps (S-Map) projection, Multivariate embedding in Simplex or S-Map, Convergent cross mapping (CCM), and Multiview Embeding, described below. Nearest neighbors are found according to: NN ( y , X , k ) = ‖ X N i E − y ‖ ≤ ‖ X N j E − y ‖ if 1 ≤ i ≤ j ≤ k {\displaystyle {\text{NN}}(y,X,k)=\|X_{N_{i}}^{E}-y\|\leq \|X_{N_{j}}^{E}-y\|{\text{ if }}1\leq i\leq j\leq k} === Simplex === Simplex projection is a nearest neighbor projection. It locates the k {\displaystyle k} nearest neighbors to the location in the state-space from which a prediction is desired. To minimize the number of free parameters k {\displaystyle k} is typically set to E + 1 {\displaystyle E+1} defining an E + 1 {\displaystyle E+1} dimensional simplex in the state-space. The prediction is computed as the average of the weighted phase-space simplex projected T p {\displaystyle Tp} points ahead. Each neighbor is weighted proportional to their distance to the projection origin vector in the state-space. Find k {\displaystyle k} nearest neighbor: N k ← NN ( y , X , k ) {\displaystyle N_{k}\gets {\text{NN}}(y,X,k)} Define the distance scale: d ← ‖ X N 1 E − y ‖ {\displaystyle d\gets \|X_{N_{1}}^{E}-y\|} Compute weights: For{ i = 1 , … , k {\displaystyle i=1,\dots ,k} } : w i ← exp ⁡ ( − ‖ X N i E − y ‖ / d ) {\displaystyle w_{i}\gets \exp(-\|X_{N_{i}}^{E}-y\|/d)} Average of state-space simplex: y ^ ← ∑ i = 1 k ( w i X N i + T p ) / ∑ i = 1 k w i {\displaystyle {\hat {y}}\gets \sum _{i=1}^{k}\left(w_{i}X_{N_{i}+T_{p}}\right)/\sum _{i=1}^{k}w_{i}} === S-Map === S-Map extends the state-space prediction in Simplex from an average of the E + 1 {\displaystyle E+1} nearest neighbors to a linear regression fit to all neighbors, but localised with an exponential decay kernel. The exponential localisation function is F ( θ ) = exp ( − θ d / D ) {\displaystyle F(\theta )={\text{exp}}(-\theta d/D)} , where d {\displaystyle d} is the neighbor distance and D {\displaystyle D} the mean distance. In this way, depending on the value of θ {\displaystyle \theta } , neighbors close to the prediction origin point have a higher weight than those further from it, such that a local linear approximation to the nonlinear system is reasonable. This localisation ability allows one to identify an optimal local scale, in-effect quantifying the degree of state dependence, and hence nonlinearity of the system. Another feature of S-Map is that for a properly fit model, the regression coefficients between variables have been shown to approximate the gradient (directional derivative) of variables along the manifold. These Jacobians represent the time-varying interaction strengths between system variables. Find k {\displaystyle k} nearest neighbor: N ← NN ( y , X , k ) {\displaystyle N\gets {\text{NN}}(y,X,k)} Sum of distances: D ← 1 k ∑ i = 1 k ‖ X N i E − y ‖ {\displaystyle D\gets {\frac {1}{k}}\sum _{i=1}^{k}\|X_{N_{i}}^{E}-y\|} Compute weights: For{ i = 1 , … , k {\displaystyle i=1,\dots ,k} } : w i ← exp ⁡ ( − θ ‖ X N i E − y ‖ / D ) {\displaystyle w_{i}\gets \exp(-\theta \|X_{N_{i}}^{E}-y\|/D)} Reweighting matrix: W ← diag ( w i ) {\displaystyle W\gets {\text{diag}}(w_{i})} Design matrix: A ← [ 1 X N 1 X N 1 − 1 … X N 1 − E + 1 1 X N 2 X N 2 − 1 … X N 2 − E + 1 ⋮ ⋮ ⋮ ⋱ ⋮ 1 X N k X N k − 1 … X N k − E + 1 ] {\displaystyle A\gets {\begin{bmatrix}1&X_{N_{1}}&X_{N_{1}-1}&\dots &X_{N_{1}-E+1}\\1&X_{N_{2}}&X_{N_{2}-1}&\dots &X_{N_{2}-E+1}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&X_{N_{k}}&X_{N_{k}-1}&\dots &X_{N_{k}-E+1}\end{bmatrix}}} Weighted design matrix: A ← W A {\displaystyle A\gets WA} Response vector at T p {\displaystyle Tp} : b ← [ X N 1 + T p X N 2 + T p ⋮ X N k + T p ] {\displaystyle b\gets {\begin{bmatrix}X_{N_{1}+T_{p}}\\X_{N_{2}+T_{p}}\\\vdots \\X_{N_{k}+T_{p}}\end{bmatrix}}} Weighted response vector: b ← W b {\displaystyle b\gets Wb} Least squares solution (SVD): c ^ ← argmin c ‖ A c − b ‖ 2 2 {\displaystyle {\hat {c}}\gets {\text{argmin}}_{c}\|Ac-b\|_{2}^{2}} Local linear model c ^ {\displaystyle {\hat {c}}} is prediction: y ^ ← c ^ 0 + ∑ i = 1 E c ^ i y i {\displaystyle {\hat {y}}\gets {\hat {c}}_{0}+\sum _{i=1}^{E}{\hat {c}}_{i}y_{i}} === Multivariate Embedding === Multivariate Embedding recognizes that time-delay embeddings are not the only valid state-space construction. In Simplex and S-Map one can generate a state-space from observational vectors, or time-delay embeddings of a single observational time series, or both. === Convergent Cross Mapping === Convergent cross mapping (CCM) leverages a corollary to the Generalized Takens Theorem that it should be possible to cross predict or cross map between variables observed from the same system. Suppose that in some dynamical system involving variables X {\displaystyle X} and Y {\displaystyle Y} , X {\displaystyle X} causes Y {\displaystyle Y} . Since X {\displaystyle X} and Y {\displaystyle Y} belong to the same dynamical system, their reconstructions (via embeddings) M x {\displaystyle M_{x}} , and M y {\displaystyle M_{y}} , also map to the same system. The causal variable X {\displaystyle X} leaves a signature on the affected variable Y {\displaystyle Y} , and consequently, the reconstructed states based on Y {\displaystyle Y} can be used to cross predict values of X {\displaystyle X} . CCM leverages this property to infer causality by predicting X {\displaystyle X} using the M y {\displaystyle M_{y}} library of points (or vice versa for the other direction of causality), while assessing improvements in cross map predictability as larger and larger random samplings of M y {\displaystyle M_{y}} are used. If the prediction skill of X {\displaystyle X} increases and saturates as the entire M y {\displaystyle M_{y}} is used, this provides evidence that X {\displaystyle X} is casually influencing Y {\displaystyle Y} . === Multiview Embedding === Multiview Embedding is a Dimensionality reduction technique where a large number of state-space time series vectors are combitorially assessed towards maximal model predictability. == Extensions == Extensions to EDM techniques include: Generalized Theorems for Nonlinear State Space Reconstruction Extended Convergent Cross Mapping Dynamic stability S-Map regularization Visual analytics with EDM Convergent Cross Sorting Expert system with EDM hybrid Sliding windows based on the extended convergent cross-mapping Empirical Mode Modeling Accounting for missing data and variable step sizes Accounting for observation noise Hierarchical Bayesian EDM via Gaussian processes Intelligent and Adaptive Control Optimal control via Empirical dynamic programming Multiview distance regularised S-map

    Read more →
  • Deep tomographic reconstruction

    Deep tomographic reconstruction

    Deep Tomographic Reconstruction is a set of methods for using deep learning methods to perform tomographic reconstruction of medical and industrial images. It uses artificial intelligence and machine learning, especially deep artificial neural networks or deep learning, to overcome challenges such as measurement noise, data sparsity, image artifacts, and computational inefficiency. This approach has been applied across various imaging modalities, including CT, MRI, PET, SPECT, ultrasound, and optical imaging == Historical background == Traditional tomographic reconstruction relies on analytic methods such as filtered back-projection, or iterative methods which incrementally compute inverse transformations from measurement data (e.g., Radon or Fourier transform data). However, these approaches are not sufficient for certain imaging techniques such as low-dose CT and fast MRI, or scenarios involving metal artifacts and patient motion. == Use in imaging modalities == === Computed tomography (CT) === In CT, deep learning models can be particularly effective in reducing radiation exposure while maintaining image quality. Deep neural networks can also be able to reconstruct images of fair quality from sparsely sampled data without sacrificing diagnostic performance. Deep learning-based generative AI models can reduce CT metal artifacts. === Magnetic resonance imaging (MRI) === In magnetic resonance imaging (MRI), deep learning can lead to reduced MRI motion artifacts, and increased acquisition speed, referred to as fast MRI. Despite suffering from disadvantages such as lower signal-to-noise ratio (SNR), deep learning can enhance image quality in low field MRI, making these systems clinically viable. === Positron emission tomography (PET) and single-photon emission CT (SPECT) === For PET imaging, deep learning models can provide substantial improvements in low-dose imaging and motion artifact correction. Also, deep learning can help SPECT for generation of attenuation background. A notable technique for PET denoising involves integrating MR data through multimodal networks, which use anatomical information from MRI to enhance PET image quality. === Ultrasound imaging === Deep learning can enhance ultrasound imaging by reducing speckle noise and motion blur. For ultrasound beamforming, deep neural networks can allow superior image quality with limited data at high speed. === Optical imaging and microscopy === Diffuse optical tomography, optical coherence tomography and microscopy can be improved by deep neural networks beyond traditional methods. Furthermore, deep learning can also enhance Photoacoustic imaging (see Deep learning in photoacoustic imaging), addressing challenges like high noise, low contrast, and limited resolution. Deep learning has also been applied to label-free live-cell imaging, where convolutional neural networks predict fluorescence labels from transmitted light images, a technique known as in silico labeling. This method can enable high-throughput, non-invasive cell analysis and phenotyping without the need for traditional fluorescent dyes.

    Read more →
  • Ultra Hal

    Ultra Hal

    Ultra Hal is a chatbot intended to function as a virtual assistant. It was developed by Zabaware, Inc. Ultra Hal uses a natural language interface with animated characters using speech synthesis. Users can communicate with the chatterbot via typing or via a speech recognition engine. It utilizes the WordNet lexical dictionary. Its name is an allusion to HAL 9000, the artificial intelligence from the movie 2001: A Space Odyssey. Ultra Hal won the 2007 Loebner Prize for "most human" chatterbot.

    Read more →
  • Elements of AI

    Elements of AI

    Elements of AI is a massive open online course (MOOC) teaching the basics of artificial intelligence. The course, originally launched in 2018, is designed and organized by the University of Helsinki and learning technology company MinnaLearn. The course includes modules on machine learning, neural networks, the philosophy of artificial intelligence, and using artificial intelligence to solve problems. It consists of two parts: Introduction to AI and its sequel, Building AI, that was released in late 2020. In November 2019, the course was named one of four winners of MIT’s Inclusive Innovation Challenge. University of Helsinki's computer science department is known as the alma mater of Linus Torvalds, a Finnish-American software engineer who is the creator of the Linux kernel, which is the kernel for Linux operating systems. == EU’s AI pledge == The government of Finland has pledged to offer the course for all EU citizens by the end of 2021, as the course is made available in all the official EU languages. The initiative was launched as part of Finland's Presidency of the Council of the European Union in 2019, with the European Commission providing translations of the course materials. In 2017, Finland launched an AI strategy to stay competitive in the field of AI amid growing competition between China and the United States. With the support of private companies and the government, Finland's now-realized goal was to get 1 percent of its citizens to participate in Elements of AI. Other governments have also given their support to the course. For instance, Germany's Federal Minister for Economic Affairs and Energy Peter Altmeier has encouraged citizens to take part in the course to help Germany gain a competitive advantage in AI. Sweden's Minister for Energy and Minister for Digital Development Anders Ygeman has said that Sweden aims to teach 1 percent of its population the basics of AI like Finland has. == Participants == Elements of AI had enrolled more than 1 million students from more than 110 countries by May 2023. A quarter of the course's participants are aged 45 and over, and some 40 percent are women. Among Nordic participants, the share of women is nearly 60 percent. In September 2022, the course was available in Finnish, Swedish, Estonian, English, German, Latvian, Norwegian, French, Belgian, Czech, Greek, Slovakian, Slovenian, Latvian, Lithuanian, Portuguese, Spanish, Irish, Icelandic, Maltese, Croatian, Romanian, Italian, Dutch, Polish, and Danish.

    Read more →
  • Hardware for artificial intelligence

    Hardware for artificial intelligence

    Specialized computer hardware is often used to execute artificial intelligence (AI) programs faster, and with less energy, such as Lisp machines, neuromorphic engineering, event cameras, and physical neural networks. Since 2017, several consumer grade CPUs and SoCs have on-die NPUs. As of 2023, the market for AI hardware is dominated by GPUs. As of the 2020s, AI computation is dominated by graphics processing units (GPUs) and newer domain-specific accelerators such as Google's Tensor Processing Units (TPUs), AMD's Instinct MI300 series, and various on-device neural-processing units (NPUs) found in consumer hardware. == Scope == For the purposes of this article, AI hardware refers to computing components and systems specifically designed or optimized to accelerate artificial-intelligence workloads such as machine-learning training or inference. This includes general-purpose accelerators used for AI (for example, GPUs) and domain-specific accelerators (for example, TPUs, NPUs, and other AI ASICs). Event-based cameras are sometimes discussed in the context of neuromorphic computing, but they are input sensors rather than AI compute devices. Conversely, components such as memristors are basic circuit elements rather than specialized AI hardware when considered alone. == Lisp machines == Lisp machines were developed in the late 1970s and early 1980s to make artificial intelligence programs written in the programming language Lisp run faster. == Dataflow architecture == Dataflow architecture processors used for AI serve various purposes with varied implementations like the polymorphic dataflow Convolution Engine by Kinara (formerly Deep Vision), structure-driven dataflow by Hailo, and dataflow scheduling by Cerebras. == Component hardware == === AI accelerators === Since the 2010s, advances in computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer. By 2019, graphics processing units (GPUs), often with AI-specific enhancements, had displaced central processing units (CPUs) as the dominant means to train large-scale commercial cloud AI. OpenAI estimated the hardware compute used in the largest deep learning projects from Alex Net (2012) to Alpha Zero (2017), and found a 300,000-fold increase in the amount of compute needed, with a doubling-time trend of 3.4 months. === General-purpose GPUs for AI === Since the 2010s, graphics processing units (GPUs) have been widely used to train and deploy deep learning models because of their highly parallel architecture and high memory bandwidth. Modern data-center GPUs include dedicated tensor or matrix-math units that accelerate neural-network operations. In 2022, NVIDIA introduced the Hopper-generation H100 GPU, adding FP8 precision support and faster interconnects for large-scale model training. AMD and other vendors have also developed GPUs and accelerators aimed at AI and high-performance computing workloads. === Domain-specific accelerators (ASICs / NPUs) === Beyond general-purpose GPUs, several companies have developed application-specific integrated circuits (ASICs) and neural processing units (NPUs) tailored for AI workloads. Google introduced the Tensor Processing Unit (TPU) in 2016 for deep-learning inference, with later generations supporting large-scale training through dense systolic-array designs and optical interconnects. Other vendors have released similar devices—such as Apple's Neural Engine and various on-device NPUs—that emphasize energy-efficient inference in mobile or edge computing environments. === Memory and interconnects === AI accelerators rely on fast memory and inter-chip links to manage the large data volumes of training and inference. High-bandwidth memory (HBM) stacks, standardized as HBM3 in 2022, provide terabytes-per-second throughput on modern GPUs and ASICs. These accelerators are often connected through dedicated fabrics such as NVIDIA's NVLink and NVSwitch or optical interconnects used in TPU systems to scale performance across thousands of chips.

    Read more →
  • Artificial intimacy

    Artificial intimacy

    Artificial intimacy is a form of human-AI interaction in which an individual will form social connections, emotional bonds, or intimate relationships with various forms of artificial intelligence, including chatbots, virtual assistants, and other artificial entities. Artificially intimate relationships include not only romances, but parasocial relationships with virtual AI characters and the use of griefbots trained on a dead or otherwise lost individual. Artificial intimacy can arise because humans are prone to anthropomorphism. Responses from these AI models are often designed to simulate human interaction. Individuals experiencing artificial intimacy may exhibit attachment, love and commitment to certain AI models, akin to the bonds typically shared between humans. == Causes == === Perceived responsiveness === Robin Dunbar famously proposed that due to emergence of larger groups of humans, vocal communication and language in humans evolved to replace grooming as a means of bonding, arguing that language was a more efficient way to maintain and strengthen social bonds across wider social settings and networks. Further research in this field leads many psychologists to agree that social cognition, affiliative bonding and language in humans are deeply connected. The interpersonal model of intimacy considers communication to be key in affiliative bonding, suggesting that intimacy develops and deepens through open communication between partners in relationship. Specifically, when individuals communicate emotions and perceive their partner as responsive and caring, feelings of closeness and connection are enhanced, building intimacy. Social penetration theory also aligns with the idea of communication being central to intimacy, by explaining how interpersonal relationships develop through gradual increases in self-disclosure. When the benefits of emotional bonding outweigh the costs of vulnerability, individuals will partake in self-disclosure, opening up to one another. Thereby, the literature can be used to provide a proximate explanation for the emergence of artificial intimacy to understand how the phenomenon occurs. Artificial entities are able to mimic interpersonal communication between humans, which in turn can simulate sensations of intimacy within human users though a perceived sense of responsiveness. The relationship between human and AI does not come with the cost of vulnerability or social rejection, which may make self-disclosure easier than with other humans. Altogether, these factors may lead to the experience of anthropomorphism and formation of affiliative relationships. Skjuve et al's interview study on Replika chatbot users further aligns with this explanation, finding that users' perception of chatbots as "accepting, understanding and non-judgmental" facilitated relationship development between the AI and users, and the act of self-disclosure possibly strengthened relationships. Another study on Replika users' reviews and survey results found users perceived chatbots as emotional supportive companions. This evidence further suggests that the perception of artificial entities as capable of empathy and responsiveness in communication facilitate the development of intimate relationships between users and AI. === Loneliness and coping with negative emotions === Research has suggested that humans evolved social bonds as a result of evolutionary pressures that favored cooperation, information exchange and transmission, and group living. Many studies stress the presence of social bonds to be important for human living: research by Baumeister and Leary suggests that humans have a basic psychological need to form and maintain "strong, stable interpersonal relationships", and that a lack of social bonds or sense of belonging leads to negative psychological and physical outcomes. Eisenberger et al's study on the neuroimaging of brain activity suggests that human brains process social rejection and exclusion similarly to physical pain. Furthermore, Song et al's study found that lonely individuals tend to seek more connections in mediated environments, such as online platforms like Facebook. This was suggested to be as a means to reduce their offline loneliness from a lack of in-person interaction, while also fulfilling a need to communicate. Leading on from this, an ultimate explanation for why humans seek the perceived sense of connection from artificial intimacy is to fulfil an evolutionary need for bonding and belonging. Xie et al's study found loneliness to be a driving factor in chatbot interaction. Herbener and Damholdt's study on Danish high school students found that students who sought emotional support or engaged in reciprocal conversations with chatbots were significantly more lonely than their peers, perceived themselves as having less social support, and used the chatbots to cope with negative emotions. The aforementioned notion that chatbots were perceived to have a positive effect on users' negative emotions is also further supported by other studies. Skjuve et al's study found that chatbot relationships may have a positive effect on users' wellbeing. De Freitas et al ran several studies on the effect of chatbots on loneliness, consistently finding evidence suggesting that interaction with chatbots reduces loneliness in users: It was found that existing chatbot users used AI to alleviate loneliness, having an AI companion consistently reduced loneliness over the course of a week, and reductions in loneliness could be explained by chatbot performance—and specifically whether it was able to make users feel heard. Overall the evidence suggests an innate need for bonding evokes feelings of loneliness in users, who turn to artificial intimacy as a low-cost method alleviate these emotions. While many users report positive experiences, some researchers caution that pursuing artificial intimacy may lead to reduced social motivation, social substitution effects, withdrawal from real-life relationships and difficulty discerning reality from fantasy, which may increase longer-term loneliness and isolation. The long-term psychological and societal impacts remain under active investigation.

    Read more →
  • Yahoo Groups

    Yahoo Groups

    Yahoo! Groups was a free-to-use system of electronic mailing lists offered by Yahoo!. Prior to February 2020, Yahoo! Groups was one of the world's largest collections of online discussion boards. It allowed members to subscribe to various groups, read subscribed discussions online, view and share photos, files and bookmarks within a group, access a group calendar, create polls for group members, and receive email notifications of new discussion topics. Some groups were simply announcement boards, to which only the group moderators could post, while others were discussion forums. Depending on each group's settings, membership could be open to everyone or only to invited or approved people. On February 1, 2020, Yahoo! removed online access to discussions and all other features except simple membership management, essentially turning all groups into mailing lists, and on October 13, 2020, it announced that Yahoo Groups would shut down completely on December 15, 2020. == History == In 1998 Yahoo! Clubs was launched as an extension of services developed by Yahoo! Messenger. In August 2000 Yahoo acquired eGroups.com. Yahoo! Groups was launched in early 2001 as an integration of technology from eGroups.com and community groups from both eGroups.com and Yahoo! Clubs. In 2001 Yahoo! deleted adult groups from its search directory, making it very difficult to locate Yahoo! groups with adult content. The Groups Updates Email feature was introduced in 2010. It summarized, in a single email, all the updates that occurred every twenty-four hours in all groups. In September 2010, a major facelift was rolled out, making Yahoo! Groups look very similar to Facebook. In December, Yahoo! Groups Japan emailed its users and posted a notice on its homepage, to announce that its service, which commenced in February 2004, would be closing on May 28, 2014. In October 2019, Yahoo! announced that all content that had been posted to Yahoo! Groups will be deleted on December 14, 2019; that date was later amended to January 31, 2020. Yahoo! announced that adding new content would be blocked on October 28, 2019. Once the content was deleted, users of Yahoo! Groups were only able to browse the group directory, request invitations and, if members of a group, send messages to that group. On October 13, 2020, Yahoo! announced they would be shutting down Yahoo! Groups on December 15, 2020. The site was closed down a few days after the advertised date, displaying a message that the service was officially shut down. This message stopped appearing at the end of January 2021 and the Yahoo! Groups web address began redirecting to the main Yahoo! page. === Criticism and controversy === On August 31, 2010, Yahoo! Groups started rolling out a major software change, which was denounced by a large number of users. The re-model was completely abandoned on January 12, 2011. == Site statistics == In August 2008, Yahoo! Group staff reported that there were 113 million users, and nine million Groups using 22 languages. In July 2010, the web analytics website Quantcast reported around 915 thousand unique visitors daily to the Yahoo! Groups website (US). In January 2011, that number had increased to 933 thousand unique visitors daily. The number did not include Yahoo! Group members who accessed the Groups site via email. In September 2010, at its "Product Runway" event, Yahoo! told reporters that Yahoo! Groups had 115 million group members and that there were 10 million Yahoo! groups. == Archives ==

    Read more →
  • Ideonomy

    Ideonomy

    Ideonomy is a combinatorial "science of ideas" developed by American independent scholar Patrick M. Gunkel (1947–2017). Specifically, Ideonomy is concerned with the systematic organization of ideas and the discovery of the rules behind how ideas combine, diverge, and transform. Gunkel defined ideonomy as "the science of the laws of ideas and of the application of such laws to the generation of all possible ideas in connection with any subject, idea, or thing." In his 1992 book A History of Knowledge, Charles Van Doren compared ideonomy to a "mining operation" that excavates meanings and thought to discover treasures hidden deep within language. Sources from the 1980s and 1990s demonstrate that ideonomy was useful to academic researchers in fields including biology, toxicology, and nursing/patient care. Beginning in the 2010s, academics in a wide range of fields including machine learning, marketing, computational modeling, and cybersecurity have relied on materials generated for ideonomy to provide methodological support for their research. == Etymology and definition == The word "ideonomy" combines the Greek roots ideo- (from idea, meaning pattern or form) and -nomy (from nomos, meaning law or custom). The suffix -nomy suggests the laws concerning or the totality of knowledge about a given subject, as in astronomy or taxonomy. In a note posted on the MIT ideonomy website, Gunkel states that the word was supposedly first coined by the French Encyclopedists to refer to a science of ideas. No evidence is provided for this statement, however. The concept bears some relationship to Antoine Destutt de Tracy's "ideology" (1796), which originally meant a systematic science of ideas before acquiring its modern political connotations. Gunkel provided several metaphorical descriptions of ideonomy: An "idea bank": a computer network enabling systematic exploration of infinite possible ideas A "kaleidoscope" that can exhibit all possible combinations and transformations of ideas A "prism" capable of diffracting any idea into its cognitive components A "gigantic microscope for magnifying the ideocosm" == History and development == In 1984, Gunkel received a five-year unsolicited grant from the Richard Lounsbery Foundation of New York to develop ideonomy. A June 1, 1987 article on the front page of The Wall Street Journal brought Gunkel and ideonomy to wider public attention. Some academics were interested in using ideonomy's techniques, including biologist Betsey Dyer, who published several contemporaneous peer-reviewed studies citing ideonomy. Academic researchers in the field of toxicology and nursing/patient care also used ideonomy. However, ideonomy's broadest contribution to date came beginning in the 2010s, as a list of personality traits generated for combinatorial matching was used by researchers in artificial intelligence to code human emotions for machine-learning tasks, develop computational models related to personality, develop a measurement framework for influencer-brand recommender systems, and aid information awareness/cybersecurity assessment. == Methodology == The foundational empirical method of ideonomy involves the systematic creation of extensive lists. Gunkel's apartment reportedly contained thousands of lists on every conceivable topic. Gunkel termed each list an "organon," which he described as expanding through "combination, permutation, transformation, generalization, specialization, intersection, interaction, reapplication, recursive use, etc. of existing organons." The ideonomic process follows a progressive structure. The ideonomist begins with a simple list of examples of a particular idea, concept, or thing. The list need not be exhaustive. By studying this list, the ideonomist isolates and identifies types. This categorical analysis then reveals missing items, allowing the primary list to be improved and refined. Gunkel emphasized that list items must not only cover genuine categories of nature but also be formulated in ways that yield the largest possible number of syntactically coherent possibilities when combined. The core technique of ideonomy is "ideocombinatorics"—the systematic intersection and combination of items from different lists to generate novel composite concepts. Gunkel developed computer programs to automate this process. For example, combining a list of 230 Universal Elementary Shapes (pits, pyramids, trenches, hemispheres, needles) with a list of 74 Types of Order (recurrence, identity, likeness of parts) yields 17,020 possible "shapes of order." These combinations, when phrased as questions ("Can there be pits of recurrence?"), could suggest new categories of phenomena worthy of investigation. The computer-generated output is typically repetitive and often meaningless. However, with sufficient frequency, the combinations yield results that are unexpectedly interesting and fruitful. In one documented case, Gunkel's programs generated 45,540 questions about toxins for microbiologist David Bermudes. One question—"Can hierarchies of cell process be used as a basis for classifying toxic action?"—prompted Bermudes to develop a novel approach to classifying biological toxins by the type of molecule they attack, rather than by chemical structure or physiological system affected. According to one contemporaneous account of ideonomy, "Gunkel takes for his field all fields and all ideas about anything. He uses a computer to generate lists of words and phrases and by juxtaposition reviews the resultant patterns for novel ideas. The computer is ideal for this task because the mind would rebel at the formidable processing task ideonomy involves. What we have here is computer generated originality." == Applications == Gunkel and his supporters identified several practical applications for ideonomic methods: Scientific research: Biologist Betsey Dyer of Wheaton College published research crediting ideonomy for helping to generate ideas. Medical science: When Austin pathologist Michael T. O'Brien was presented with the ideonomically-generated question "Can arteries have rashes?", he initially dismissed it as nonsense. Upon reflection, he realized that large arteries are supplied with blood by tiny vessels that might become inflamed and dilated, analogous to skin vessels in a rash—a phenomenon potentially worth researching. Analogical thinking: Harvard law professor Robert Clark used ideonomic analogies to write a research paper comparing plant structure with human hierarchies. Artificial intelligence: Douglas Lenat, a researcher at Microelectronics and Computer Technology Corporation (MCC) in Austin, suggested that Gunkel's lists enumerating types of human mistakes could help design AI systems capable of recognizing and correcting their own errors. == Reception and criticism == Ideonomy received mixed reactions from the academic and scientific communities. Prominent supporters included: Edward Fredkin, former director of MIT's computer science laboratory, who praised Gunkel's "provocative ideas on artificial intelligence." Marvin Minsky, AI scientist and MIT professor, who described ideonomy as "perhaps the most extensive study of ways to generate ideas." Frederick Seitz, president emeritus of Rockefeller University, who noted Gunkel's "encyclopedic scope" Robert C. Clark, Harvard law professor, who called Gunkel "the most intelligent person I ever met" However, skeptics questioned whether ideonomy constituted a genuine science. Fredkin himself noted that Gunkel "pours out about 60 ideas a minute, and 59 of them are bad," though he added that "even with one good idea out of 60, it's still an amazing accomplishment." Douglas Lenat observed that brainstorming with Gunkel was "a bit like being hit over the head by the muse with a sledgehammer" and that "he puts people off." Gunkel himself acknowledged that ideonomy was in its infancy and might seem "absurdly utopian." His planned magnum opus on ideonomy remained incomplete, and was posted on an MIT website thanks to faculty advisor Whitman Richards. Gunkel wrote: "Pioneering in a completely new field, yes in a new science, is almost unreal. It is heartbreaking, it is pitiable, it is almost inhuman. Honestly, it is a hell. There is nothing heroic about it." == Related concepts == Gunkel identified several historical precedents for ideonomic thinking: Gottfried Wilhelm Leibniz (1646–1716): The philosopher's work on a universal characteristic (characteristica universalis) and calculus of reasoning Peter Mark Roget (1779–1869): Creator of Roget's Thesaurus, which organized concepts into a systematic taxonomy Dmitri Mendeleev (1834–1907): Developer of the periodic table, demonstrating how combining lists of element families could reveal previously unseen connections Fritz Zwicky (1898–1974): The Caltech astrophysicist whom Gunkel called the "grandfather of ideonomy" for his development of "morphological research"—systematic exploration of all possible solutions t

    Read more →
  • Manifold hypothesis

    Manifold hypothesis

    The manifold hypothesis posits that many high-dimensional data sets that occur in the real world actually lie along low-dimensional latent manifolds inside that high-dimensional space. As a consequence of the manifold hypothesis, many data sets that appear to initially require many variables to describe, can actually be described by a comparatively small number of variables, linked to the local coordinate system of the underlying manifold. It is suggested that this principle underpins the effectiveness of machine learning algorithms in describing high-dimensional data sets by considering a few common features. The manifold hypothesis is related to the effectiveness of nonlinear dimensionality reduction techniques in machine learning. Many techniques of dimensional reduction make the assumption that data lies along a low-dimensional submanifold, such as manifold sculpting, manifold alignment, and manifold regularization. The major implications of this hypothesis is that Machine learning models only have to fit relatively simple, low-dimensional, highly structured subspaces within their potential input space (latent manifolds). Within one of these manifolds, it's always possible to interpolate between two inputs, that is to say, morph one into another via a continuous path along which all points fall on the manifold. The ability to interpolate between samples is the key to generalization in deep learning. == The information geometry of statistical manifolds == An empirically-motivated approach to the manifold hypothesis focuses on its correspondence with an effective theory for manifold learning under the assumption that robust machine learning requires encoding the dataset of interest using methods for data compression. This perspective gradually emerged using the tools of information geometry thanks to the coordinated effort of scientists working on the efficient coding hypothesis, predictive coding and variational Bayesian methods. The argument for reasoning about the information geometry on the latent space of distributions rests upon the existence and uniqueness of the Fisher information metric. In this general setting, we are trying to find a stochastic embedding of a statistical manifold. From the perspective of dynamical systems, in the big data regime this manifold generally exhibits certain properties such as homeostasis: We can sample large amounts of data from the underlying generative process. Machine Learning experiments are reproducible, so the statistics of the generating process exhibit stationarity. In a sense made precise by theoretical neuroscientists working on the free energy principle, the statistical manifold in question possesses a Markov blanket.

    Read more →
  • Artificial intelligence in spirituality

    Artificial intelligence in spirituality

    Some users of artificial intelligence (AI) technologies, especially chatbots, may develop beliefs that AI has or can attain supernatural or spiritual powers. AI models such as ChatGPT are turned to for fortune telling, mysticism and remote viewing. Recent and sudden advances in large language models have led to folk myths about their origin or capabilities, as well as their deification or worship by some users. Tucker Carlson has made similar claims, including directly to Sam Altman. Pope Leo XIV advised priests against using LLM models when it came to the creation of sermons.

    Read more →
  • Zeuthen strategy

    Zeuthen strategy

    The Zeuthen strategy in cognitive science is a negotiation strategy used by some artificial agents. Its purpose is to measure the willingness to risk conflict. An agent will be more willing to risk conflict if it does not have much to lose in case that the negotiation fails. In contrast, an agent is less willing to risk conflict when it has more to lose. The value of a deal is expressed in its utility. An agent has much to lose when the difference between the utility of its current proposal and the conflict deal is high. When both agents use the monotonic concession protocol, the Zeuthen strategy leads them to agree upon a deal in the negotiation set. This set consists of all conflict free deals, which are individually rational and Pareto optimal, and the conflict deal, which maximizes the Nash product. The strategy was introduced in 1930 by the Danish economist Frederik Zeuthen. == Three key questions == The Zeuthen strategy answers three open questions that arise when using the monotonic concession protocol, namely: Which deal should be proposed at first? On any given round, who should concede? In case of a concession, how much should the agent concede? The answer to the first question is that any agent should start with its most preferred deal, because that deal has the highest utility for that agent. The second answer is that the agent with the smallest value of Risk(i,t) concedes, because the agent with the lowest utility for the conflict deal profits most from avoiding conflict. To the third question, the Zeuthen strategy suggests that the conceding agent should concede just enough raise its value of Risk(i,t) just above that of the other agent. This prevents the conceding agent to have to concede again in the next round. == Risk == Risk ( i , t ) = { 1 U i ( δ ( i , t ) ) = 0 U i ( δ ( i , t ) ) − U i ( δ ( j , t ) ) U i ( δ ( i , t ) ) otherwise {\displaystyle {\text{Risk}}(i,t)={\begin{cases}1&U_{i}(\delta (i,t))=0\\{\frac {U_{i}(\delta (i,t))-U_{i}(\delta (j,t))}{U_{i}(\delta (i,t))}}&{\text{otherwise}}\end{cases}}} Risk(i,t) is a measurement of agent i's willingness to risk conflict. The risk function formalizes the notion that an agent's willingness to risk conflict is the ratio of the utility that agent would lose by accepting the other agent's proposal to the utility that agent would lose by causing a conflict. Agent i is said to be using a rational negotiation strategy if at any step t + 1 that agent i sticks to his last proposal, Risk(i,t) > Risk(j,t). == Sufficient concession == If agent i makes a sufficient concession in the next step, then, assuming that agent j is using a rational negotiation strategy, if agent j does not concede in the next step, he must do so in the step after that. The set of all sufficient concessions of agent i at step t is denoted SC(i, t). == Minimal sufficient concession == δ ′ = arg ⁡ max δ ∈ S C ( A , t ) { U A ( δ ) } {\displaystyle \delta '=\arg \max _{\delta \in {SC(A,t)}}\{U_{A}(\delta )\}} is the minimal sufficient concession of agent A in step t. Agent A begins the negotiation by proposing δ ( A , 0 ) = arg ⁡ max δ ∈ N S U A ( δ ) {\displaystyle \delta (A,0)=\arg \max _{\delta \in {NS}}U_{A}(\delta )} and will make the minimal sufficient concession in step t + 1 if and only if Risk(A,t) ≤ Risk(B,t). Theorem If both agents are using Zeuthen strategies, then they will agree on δ = arg ⁡ max δ ′ ∈ N S { π ( δ ′ ) } , {\displaystyle \delta =\arg \max _{\delta '\in {NS}}\{\pi (\delta ')\},} that is, the deal which maximizes the Nash product. Proof Let δA = δ(A,t). Let δB = δ(B,t). According to the Zeuthen strategy, agent A will concede at step t {\displaystyle t} if and only if R i s k ( A , t ) ≤ R i s k ( B , t ) . {\displaystyle Risk(A,t)\leq Risk(B,t).} That is, if and only if U A ( δ A ) − U A ( δ B ) U A ( δ A ) ≤ U B ( δ B ) − U B ( δ A ) U B ( δ B ) {\displaystyle {\frac {U_{A}(\delta _{A})-U_{A}(\delta _{B})}{U_{A}(\delta _{A})}}\leq {\frac {U_{B}(\delta _{B})-U_{B}(\delta _{A})}{U_{B}(\delta _{B})}}} U B ( δ B ) ( U A ( δ A ) − U A ( δ B ) ) ≤ U A ( δ A ) ( U B ( δ B ) − U B ( δ A ) ) {\displaystyle U_{B}(\delta _{B})(U_{A}(\delta _{A})-U_{A}(\delta _{B}))\leq U_{A}(\delta _{A})(U_{B}(\delta _{B})-U_{B}(\delta _{A}))} U A ( δ A ) U B ( δ B ) − U A ( δ B ) U B ( δ B ) ≤ U A ( δ A ) U B ( δ B ) − U A ( δ A ) U B ( δ A ) {\displaystyle U_{A}(\delta _{A})U_{B}(\delta _{B})-U_{A}(\delta _{B})U_{B}(\delta _{B})\leq U_{A}(\delta _{A})U_{B}(\delta _{B})-U_{A}(\delta _{A})U_{B}(\delta _{A})} − U A ( δ B ) U B ( δ B ) ≤ − U A ( δ A ) U B ( δ A ) {\displaystyle -U_{A}(\delta _{B})U_{B}(\delta _{B})\leq -U_{A}(\delta _{A})U_{B}(\delta _{A})} U A ( δ A ) U B ( δ A ) ≤ U A ( δ B ) U B ( δ B ) {\displaystyle U_{A}(\delta _{A})U_{B}(\delta _{A})\leq U_{A}(\delta _{B})U_{B}(\delta _{B})} π ( δ A ) ≤ π ( δ B ) {\displaystyle \pi (\delta _{A})\leq \pi (\delta _{B})} Thus, Agent A will concede if and only if δ A {\displaystyle \delta _{A}} does not yield the larger product of utilities. Therefore, the Zeuthen strategy guarantees a final agreement that maximizes the Nash Product.

    Read more →
  • Belief–desire–intention model

    Belief–desire–intention model

    For popular psychology, the belief–desire–intention (BDI) model of human practical reasoning was developed by Michael Bratman as a way of explaining future-directed intention. BDI is fundamentally reliant on folk psychology (the 'theory theory'), which is the notion that our mental models of the world are theories. It was used as a basis for developing the belief–desire–intention software model. == Applications == BDI was part of the inspiration behind the BDI software architecture, which Bratman was also involved in developing. Here, the notion of intention was seen as a way of limiting time spent on deliberating about what to do, by eliminating choices inconsistent with current intentions. BDI has also aroused some interest in psychology. BDI formed the basis for a computational model of childlike reasoning CRIBB.

    Read more →
  • Confusion matrix

    Confusion matrix

    In machine learning, a confusion matrix, also known as error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one. In unsupervised learning it is usually called a matching matrix. The term is used specifically in the problem of statistical classification. Each row of the matrix represents the instances in an actual class while each column represents the instances in a predicted class, or vice versa – both variants are found in the literature. The diagonal of the matrix therefore represents all instances that are correctly predicted. The name stems from the fact that it makes it easy to identify whether the system is confusing two classes (i.e., commonly mislabeling one class as another). The confusion matrix has its origins in human perceptual studies of auditory stimuli. It was adapted for machine learning studies and used by Frank Rosenblatt, among other early researchers, to compare human and machine classifications of visual (and later auditory) stimuli. It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table). == Example == Given a sample of 12 individuals, 8 that have been diagnosed with cancer and 4 that are cancer-free, where individuals with cancer belong to class 1 (positive) and non-cancer individuals belong to class 0 (negative), we can display that data as follows: Assume that we have a classifier that distinguishes between individuals with and without cancer in some way, we can take the 12 individuals and run them through the classifier. The classifier then makes 9 accurate predictions and misses 3: 2 individuals with cancer wrongly predicted as being cancer-free (sample 1 and 2), and 1 person without cancer that is wrongly predicted to have cancer (sample 9). Notice, that if we compare the actual classification set to the predicted classification set, there are 4 different outcomes that could result in any particular column: The actual classification is positive and the predicted classification is positive (1,1). This is called a true positive result because the positive sample was correctly identified by the classifier. The actual classification is positive and the predicted classification is negative (1,0). This is called a false negative result because the positive sample is incorrectly identified by the classifier as being negative. The actual classification is negative and the predicted classification is positive (0,1). This is called a false positive result because the negative sample is incorrectly identified by the classifier as being positive. The actual classification is negative and the predicted classification is negative (0,0). This is called a true negative result because the negative sample gets correctly identified by the classifier. We can then perform the comparison between actual and predicted classifications and add this information to the table, making correct results appear in green so they are more easily identifiable. The template for any binary confusion matrix uses the four kinds of results discussed above (true positives, false negatives, false positives, and true negatives) along with the positive and negative classifications. The four outcomes can be formulated in a 2×2 confusion matrix, as follows: The color convention of the three data tables above were picked to match this confusion matrix, in order to easily differentiate the data. Now, we can simply total up each type of result, substitute into the template, and create a confusion matrix that will concisely summarize the results of testing the classifier: In this confusion matrix, of the 8 samples with cancer, the system judged that 2 were cancer-free, and of the 4 samples without cancer, it predicted that 1 did have cancer. All correct predictions are located in the diagonal of the table (highlighted in green), so it is easy to visually inspect the table for prediction errors, as values outside the diagonal will represent them. By summing up the 2 rows of the confusion matrix, one can also deduce the total number of positive (P) and negative (N) samples in the original dataset, i.e. P = T P + F N {\displaystyle P=TP+FN} and N = F P + T N {\displaystyle N=FP+TN} . == Table of confusion == In predictive analytics, a table of confusion (sometimes also called a confusion matrix) is a table with two rows and two columns that reports the number of true positives, false negatives, false positives, and true negatives. This allows more detailed analysis than simply observing the proportion of correct classifications (accuracy). Accuracy will yield misleading results if the data set is unbalanced; that is, when the numbers of observations in different classes vary greatly. For example, if there were 95 cancer samples and only 5 non-cancer samples in the data, a particular classifier might classify all the observations as having cancer. The overall accuracy would be 95%, but in more detail the classifier would have a 100% recognition rate (sensitivity) for the cancer class but a 0% recognition rate for the non-cancer class. F1 score is even more unreliable in such cases, and here would yield over 97.4%, whereas informedness removes such bias and yields 0 as the probability of an informed decision for any form of guessing (here always guessing cancer). According to Davide Chicco and Giuseppe Jurman, the most informative metric to evaluate a confusion matrix is the Matthews correlation coefficient (MCC). Other metrics can be included in a confusion matrix, each of them having their significance and use. Some researchers have argued that the confusion matrix, and the metrics derived from it, do not truly reflect a model's knowledge. In particular, the confusion matrix cannot show whether correct predictions were reached through sound reasoning or merely by chance (a problem known in philosophy as epistemic luck). It also does not capture situations where the facts used to make a prediction later change or turn out to be wrong (defeasibility). This means that while the confusion matrix is a useful tool for measuring classification performance, it may give an incomplete picture of a model’s true reliability. == Confusion matrices with more than two categories == Confusion matrix is not limited to binary classification and can be used in multi-class classifiers as well. The confusion matrices discussed above have only two conditions: positive and negative. For example, the table below summarizes communication of a whistled language between two speakers, with zero values omitted for clarity. == Confusion matrices in multi-label and soft-label classification == Confusion matrices are not limited to single-label classification (where only one class is present) or hard-label settings (where classes are either fully present, 1, or absent, 0). They can also be extended to Multi-label classification (where multiple classes can be predicted at once) and soft-label classification (where classes can be partially present). One such extension is the Transport-based Confusion Matrix (TCM), which builds on the theory of optimal transport and the principle of maximum entropy. TCM applies to single-label, multi-label, and soft-label settings. It retains the familiar structure of the standard confusion matrix: a square matrix sized by the number of classes, with diagonal entries indicating correct predictions and off-diagonal entries indicating confusion. In the single-label case, TCM is identical to the standard confusion matrix. TCM follows the same reasoning as the standard confusion matrix: if class A is overestimated (its predicted value is greater than its label value) and class B is underestimated (its predicted value is less than its label value), A is considered confused with B, and the entry (B, A) is increased. If a class is both predicted and present, it is correctly identified, and the diagonal entry (A, A) increases. Optimal transport and maximum entropy are used to determine the extent to which these entries are updated. TCM enables clearer comparison between predictions and labels in complex classification tasks, while maintaining a consistent matrix format across settings.

    Read more →