AI Data Analyst Zalando

AI Data Analyst Zalando — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Outline of deep learning

    Outline of deep learning

    The following outline is provided as an overview of, and topical guide to, deep learning: Deep learning is a subfield of machine learning and artificial intelligence based on artificial neural networks with multiple processing layers. It emphasizes representation learning and is widely used in areas such as computer vision, natural language processing, speech recognition, recommender systems, robotics, and generative artificial intelligence. == Ways to categorize deep learning == A field of study A branch of artificial intelligence A subfield of machine learning A subfield of computer science A form of representation learning A class of methods based on artificial neural networks An approach used in computational statistics == History == === Precursors === Cybernetics Perceptron Connectionism Neocognitron Backpropagation === Milestones === LeNet Long short-term memory Deep belief network AlexNet Sequence to sequence learning Generative adversarial network Residual neural network Transformer BERT Generative pre-trained transformer Diffusion model === Related histories === History of artificial intelligence History of machine learning Timeline of machine learning == Core concepts == == Learning settings == Supervised learning Unsupervised learning Self-supervised learning Semi-supervised learning Reinforcement learning Transfer learning Multitask learning Multimodal learning Online machine learning Continual learning == Common tasks == Image classification Object detection Image segmentation Automatic speech recognition Neural machine translation Question answering Automatic summarization Text-to-image model Protein structure prediction == Architectures == === Feedforward and convolutional architectures === Feedforward neural network Multilayer perceptron Convolutional neural network Radial basis function network Residual neural network U-Net === Recurrent and sequence architectures === Recurrent neural network Long short-term memory Gated recurrent unit Sequence 
to sequence learning Recursive neural network === Representation-learning architectures === Autoencoder Denoising autoencoder Sparse autoencoder Variational autoencoder Restricted Boltzmann machine Deep belief network === Attention and transformer architectures === Attention (machine learning) Transformer BERT Generative pre-trained transformer Vision transformer === Generative and probabilistic architectures === Autoregressive model Diffusion model Energy-based model Generative adversarial network Mixture of experts === Graph and memory architectures === Graph neural network Graph convolutional network Siamese network Neural Turing machine Memory network Echo state network Capsule neural network == Neural network components and techniques == Artificial neuron Activation function Rectified linear unit Sigmoid function Softmax function Embedding Convolution Pooling layer Attention Batch normalization Layer normalization Residual connections == Training and optimization == Backpropagation Gradient descent Stochastic gradient descent Adam optimization Learning rate Loss function Cross-entropy Mean squared error Regularization Dropout Early stopping Batch normalization Data augmentation Transfer learning Knowledge distillation Ensemble learning Curriculum learning == Datasets and benchmarks == CIFAR-10 ImageNet MNIST database Common Objects in Context (COCO) General Language Understanding Evaluation (GLUE) benchmark LibriSpeech SQuAD == Applications == === Computer vision === Computer vision Facial recognition system Image classification Image segmentation Medical imaging Object detection Optical character recognition === Natural language processing === Automatic summarization Chatbot Information retrieval Large language model Natural language processing Neural machine translation Question answering Sentiment analysis === Speech and audio === Automatic speech recognition Music information retrieval Speaker recognition Speech synthesis === Science and medicine === 
Bioinformatics Computational biology Drug discovery Medical diagnosis Protein structure prediction === Robotics and control === Autonomous car Computer game bot Control theory Robotics === Recommendation, search, and forecasting === Anomaly detection Forecasting Fraud detection Recommender system Search engine === Generative artificial intelligence === Deepfake Generative artificial intelligence Large language model Speech synthesis Text-to-image model === Computer graphics and video games === Deep Learning Anti-Aliasing (DLAA) Deep Learning Super Sampling (DLSS) == Hardware == AMD Instinct AMD XDNA Application-specific integrated circuit Deep learning processor, Neural processing unit (NPU), or Neural Engine Field-programmable gate array General-purpose computing on graphics processing units (GPGPU) Graphics processing unit NVIDIA Deep Learning Accelerator (NVDLA) Tensor processing unit Vision processing unit Wafer-scale integration === Supporting software platforms === CUDA Metal ROCm == Software == === Open-source frameworks and libraries === === Neural network software === EDLUT Emergent Encog JOONE Neuroph NeuroSolutions OpenNN Peltarion Synapse SNNS === Platforms, tools, and deployment === Amazon SageMaker Google Colab Hugging Face Kaggle Kubeflow MLflow ONNX OpenVINO TensorFlow Hub == Algorithms for deep learning and neural networks == Backpropagation Conjugate gradient method Generalized Hebbian algorithm Gradient descent Levenberg–Marquardt algorithm Perceptron Quasi-Newton method Wake-sleep algorithm == Methods and related topics == === Representation and metric learning === Contrastive learning Embedding Feature learning Manifold learning Metric learning === Generative modeling === Autoregressive model Diffusion model Generative adversarial network Generative model Variational inference === Efficient and scalable deep learning === Knowledge distillation Low-rank approximation Mixture of experts Quantization Sparsity === Reliability, safety, and 
interpretability === Adversarial machine learning AI alignment Algorithmic bias Catastrophic forgetting Differential privacy Explainable artificial intelligence Federated learning Hallucination (artificial intelligence) == Conferences and workshops == Annual Meeting of the Association for Computational Linguistics Conference on Computer Vision and Pattern Recognition Conference on Neural Information Processing Systems International Conference on Computer Vision International Conference on Learning Representations International Conference on Machine Learning == Organizations == === Research laboratories and institutions === Allen Institute for AI Alberta Machine Intelligence Institute European Laboratory for Learning and Intelligent Systems Google DeepMind Meta AI Mila Microsoft Research Vector Institute === Companies === Anthropic Cerebras Cohere DeepSeek Mistral AI OpenAI Stability AI xAI == Publications == === Books === Deep Learning – Ian Goodfellow and Yoshua Bengio Neural Networks and Deep Learning – Michael Nielsen Perceptrons – Marvin Minsky and Seymour Papert === Journals === IEEE Transactions on Neural Networks and Learning Systems Neural Networks Neural Computation == Influential persons ==

  • Privacy Lost

    Privacy Lost

    Privacy Lost is a 2023 short science fiction film directed by Peter Stoel and Robert Berger. It follows a family using augmented reality (AR) and artificial intelligence (AI) devices capable of reading emotional states, raising questions about privacy and manipulation. == Premise == Privacy Lost follows a family using AR glasses that capture and interpret emotions in real time. As the parents argue in a restaurant, their emotional states and even hidden feelings become visible through these glasses. An AI-driven waiter adapts its appearance for each family member, employing emotional data to influence their decisions. == Cast == Brian Kant as Waiter Michael Krass as Husband Estelle Levinson as Waitress Thor van der Linden as Scotty Carlijn van Ramshorst as Wife == Production == Filming took place at HeadQ Productions, a virtual studio located in Amsterdam. The creators sought to depict a near-future scenario in which real-time emotion analysis becomes part of daily interactions. The film was screened at the Augmented World Expo (AWE), where it was noted for its thematic focus on AI-driven manipulation and emotional tracking. The depiction of AR glasses and AI characters integrates modern visual effects to show how devices might analyze emotional responses in real time. It also depicts how AI-driven interactions could influence consumer decisions, pointing to concerns over potential misuse. == Themes == Privacy Lost focuses on the intersection of advanced AI capabilities and AR environments, showing how real-time emotional analysis can be leveraged for targeted persuasion. The film aims to highlight the social and ethical implications of emerging AR and AI technologies, underlining how establishing clear regulatory frameworks for them is necessary to protect individual privacy, govern the storage of emotion-based data, and prevent manipulative practices. 
Critics describe the film’s theme as dystopian and note that such a reality is unlikely to occur in the near future. However, despite the exaggerated scenario, the film emphasizes the importance of a responsible approach by developers toward emerging technologies.

  • Global Artificial Intelligence Summit & Awards

    Global Artificial Intelligence Summit & Awards

    The Global Artificial Intelligence Summit & Awards (GAISA) is an international conference on artificial intelligence organized annually by AICRA. Since its inception in 2019, GAISA has been held at a different location each year. The 5th edition of GAISA is scheduled for April 11-12, 2024, at Bharat Mandapam. GAISA 2025 features a lineup of speakers that includes experts, researchers, and executives from global technology companies, with expertise in areas such as machine learning, robotics, and ethical AI. Their backgrounds span academia, industry, and entrepreneurship, and their talks address how AI is reshaping sectors such as healthcare, finance, and transportation. Sessions cover the future of AI, its societal impact, and the potential of emerging technologies for solving complex global challenges. Speakers include Shri Nitin Gadkari, Rao Inderjit Singh, Piyush Goyal, Admiral R Hari Kumar PVSM, AVSM, ADC, Samir V Kamat, Narayan Tatu Rane, Prof. K. Vijay Raghavan and many others.

    == History ==

    The conference was first held in 2019 at Vigyan Bhawan, New Delhi, by AICRA, with the objective of discussing and exploring artificial intelligence across sectors.

  • Estimation of distribution algorithm

    Estimation of distribution algorithm

    Estimation of distribution algorithms (EDAs), sometimes called probabilistic model-building genetic algorithms (PMBGAs), are stochastic optimization methods that guide the search for the optimum by building and sampling explicit probabilistic models of promising candidate solutions. Optimization is viewed as a series of incremental updates of a probabilistic model, starting with a model encoding an uninformative prior over admissible solutions and ending with a model that generates only the global optima.

    EDAs belong to the class of evolutionary algorithms. The main difference between EDAs and most conventional evolutionary algorithms is that evolutionary algorithms generate new candidate solutions using an implicit distribution defined by one or more variation operators, whereas EDAs use an explicit probability distribution encoded by a Bayesian network, a multivariate normal distribution, or another model class. Like other evolutionary algorithms, EDAs can be used to solve optimization problems defined over a number of representations, from vectors to LISP-style S-expressions, and the quality of candidate solutions is often evaluated using one or more objective functions.

    The general procedure of an EDA is outlined in the following:

        t := 0
        initialize model M(0) to represent uniform distribution over admissible solutions
        while (termination criteria not met) do
            P := generate N>0 candidate solutions by sampling M(t)
            F := evaluate all candidate solutions in P
            M(t + 1) := adjust_model(P, F, M(t))
            t := t + 1

    Using explicit probabilistic models in optimization allowed EDAs to feasibly solve optimization problems that were notoriously difficult for most conventional evolutionary algorithms and traditional optimization techniques, such as problems with high levels of epistasis.
    Another advantage of EDAs is that these algorithms provide an optimization practitioner with a series of probabilistic models that reveal a lot of information about the problem being solved. This information can in turn be used to design problem-specific neighborhood operators for local search, to bias future runs of EDAs on a similar problem, or to create an efficient computational model of the problem. For example, if the population is represented by bit strings of length 4, the EDA can represent the population of promising solutions using a single vector of four probabilities $(p_1, p_2, p_3, p_4)$, where each component of $p$ defines the probability of that position being a 1. Using this probability vector it is possible to create an arbitrary number of candidate solutions.

    == Estimation of distribution algorithms (EDAs) ==

    This section describes the models built by some well-known EDAs of different levels of complexity. Throughout, we assume a population $P(t)$ at generation $t$, a selection operator $S$, a model-building operator $\alpha$ and a sampling operator $\beta$.

    == Univariate factorizations ==

    The simplest EDAs assume that decision variables are independent, i.e. $p(X_1, X_2) = p(X_1) \cdot p(X_2)$. Therefore, univariate EDAs rely only on univariate statistics, and multivariate distributions must be factorized as the product of $N$ univariate probability distributions,

    $$D_{\text{Univariate}} := p(X_1, \dots, X_N) = \prod_{i=1}^{N} p(X_i).$$

    Such factorizations are used in many different EDAs; some of them are described next.
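    The general procedure and the probability-vector example above can be made concrete with a short sketch. It is a minimal illustration under stated assumptions, not a reference implementation: the function names (`sample`, `adjust_model`, `run_eda`), the truncation selection, the population sizes, and the OneMax objective (count of ones) are all choices made here for illustration, and `adjust_model` simply re-estimates each marginal from the selected solutions, which is only one possible model update.

```python
import random


def sample(p, n, rng):
    """Draw n bit strings; bit i is 1 with probability p[i]."""
    return [[1 if rng.random() < pi else 0 for pi in p] for _ in range(n)]


def adjust_model(selected):
    """Re-estimate each marginal as the fraction of ones among the selected solutions."""
    lam = len(selected)
    n_bits = len(selected[0])
    return [sum(x[i] for x in selected) / lam for i in range(n_bits)]


def run_eda(fitness, n_bits=4, n_samples=50, n_selected=10, generations=30, seed=0):
    """Generic EDA loop: sample the model, evaluate, select, rebuild the model."""
    rng = random.Random(seed)
    p = [0.5] * n_bits  # M(0): uniform distribution over bit strings
    for _ in range(generations):
        population = sample(p, n_samples, rng)      # P := sample M(t)
        population.sort(key=fitness, reverse=True)  # F := evaluate, then rank
        p = adjust_model(population[:n_selected])   # M(t+1) from the best solutions
    return p


# Assumed toy objective (OneMax): the fitness of a bit string is its number of ones.
model = run_eda(sum)
```

    The returned `model` is the final probability vector; each entry can be read as the learned probability that the corresponding bit is 1.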
    === Univariate marginal distribution algorithm (UMDA) ===

    The UMDA is a simple EDA that uses an operator $\alpha_{\text{UMDA}}$ to estimate marginal probabilities from a selected population $S(P(t))$. Assuming $S(P(t))$ contains $\lambda$ elements, $\alpha_{\text{UMDA}}$ produces the probabilities

    $$p_{t+1}(X_i) = \frac{1}{\lambda} \sum_{x \in S(P(t))} x_i, \quad \forall i \in \{1, 2, \dots, N\}.$$

    Every UMDA step can be described as

    $$D(t+1) = \alpha_{\text{UMDA}} \circ S \circ \beta_{\lambda}(D(t)).$$

    === Population-based incremental learning (PBIL) ===

    PBIL represents the population implicitly by its model, from which it samples new solutions and updates the model. At each generation, $\mu$ individuals are sampled and $\lambda \leq \mu$ are selected. These individuals are then used to update the model as follows:

    $$p_{t+1}(X_i) = (1 - \gamma)\, p_t(X_i) + \frac{\gamma}{\lambda} \sum_{x \in S(P(t))} x_i, \quad \forall i \in \{1, 2, \dots, N\},$$

    where $\gamma \in (0, 1]$ is a parameter defining the learning rate; a small value means that the previous model $p_t(X_i)$ is only slightly modified by the newly sampled solutions. PBIL can be described as

    $$D(t+1) = \alpha_{\text{PBIL}} \circ S \circ \beta_{\mu}(D(t)).$$

    === Compact genetic algorithm (cGA) ===

    The cGA also relies on the implicit populations defined by univariate distributions.
    At each generation $t$, two individuals $x, y$ are sampled, $P(t) = \beta_2(D(t))$. The population $P(t)$ is then sorted in decreasing order of fitness, $S_{\text{Sort}(f)}(P(t))$, with $u$ being the best and $v$ the worst solution. The cGA estimates univariate probabilities as follows:

    $$p_{t+1}(X_i) = p_t(X_i) + \gamma (u_i - v_i), \quad \forall i \in \{1, 2, \dots, N\},$$

    where $\gamma \in (0, 1]$ is a constant defining the learning rate, usually set to $\gamma = 1/N$. The cGA can be defined as

    $$D(t+1) = \alpha_{\text{CGA}} \circ S_{\text{Sort}(f)} \circ \beta_2(D(t)).$$

    == Bivariate factorizations ==

    Although univariate models can be computed efficiently, in many cases they are not representative enough to provide better performance than genetic algorithms (GAs). To overcome this drawback, the use of bivariate factorizations was proposed in the EDA community, in which dependencies between pairs of variables can be modeled. A bivariate factorization can be defined as follows, where $\pi_i$ contains a possible variable on which $X_i$ depends, i.e. $|\pi_i| = 1$:

    $$D_{\text{Bivariate}} := p(X_1, \dots, X_N) = \prod_{i=1}^{N} p(X_i \mid \pi_i).$$

    Bivariate and multivariate distributions are usually represented as probabilistic graphical models (PGMs; graphs), in which edges denote statistical dependencies (or conditional probabilities) and vertices denote variables. To learn the structure of a PGM from data, linkage learning is employed.
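    As a worked illustration of the univariate update rules given earlier in this section, the sketch below writes the PBIL and cGA updates as plain functions on a probability vector. The function names and argument layout are hypothetical, and the clamping of the cGA result to the interval [0, 1] is an added safeguard assumed here; it is not part of the update formula as stated.

```python
def pbil_update(p, selected, gamma):
    """PBIL: blend the old model with the marginal frequencies of the selected solutions."""
    lam = len(selected)
    return [
        (1 - gamma) * p[i] + (gamma / lam) * sum(x[i] for x in selected)
        for i in range(len(p))
    ]


def cga_update(p, u, v, gamma):
    """cGA: shift each probability toward the better solution u and away from the worse v.

    Clamping to [0, 1] is an assumption of this sketch, not part of the stated formula.
    """
    return [
        min(1.0, max(0.0, p[i] + gamma * (u[i] - v[i])))
        for i in range(len(p))
    ]


# Two selected bit strings pull the first marginal up; the second stays put.
new_p = pbil_update([0.5, 0.5], [[1, 0], [1, 1]], gamma=0.2)

# Only positions where the winner and loser differ are changed.
shifted = cga_update([0.5, 0.5, 0.5], u=[1, 0, 1], v=[0, 0, 1], gamma=0.25)
```

    With `gamma = 1.0`, `pbil_update` reduces to the plain marginal-frequency estimate used by the UMDA.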
    === Mutual information maximizing input clustering (MIMIC) ===

    MIMIC factorizes the joint probability distribution in a chain-like model representing successive dependencies between variables. It finds a permutation of the decision variables, $r : i \mapsto j$, such that $x_{r(1)}, x_{r(2)}, \dots, x_{r(N)}$ minimizes the Kullback–Leibler divergence in relation to the true probability distribution, i.e. $\pi_{r(i+1)} = \{X_{r(i)}\}$. MIMIC models a distribution

    $$p_{t+1}(X_1, \dots, X_N) = p_t(X_{r(N)}) \prod_{i=1}^{N-1} p_t(X_{r(i)} \mid X_{r(i+1)}).$$

    New solutions are sampled from the leftmost to the rightmost variable: the first is generated independently and the others according to conditional probabilities. Since the estimated distribution must be recomputed each generation, MIMIC uses concrete populations in the following way:

    $$P(t+1) = \beta_{\mu} \circ \alpha_{\text{MIMIC}} \circ S(P(t)).$$

    === Bivariate marginal distribution algorithm (BMDA) ===

    The BMDA factorizes the joint probability distribution in bivariate distributions. First, a randomly chosen variable is added as a node in a graph; then the variable most dependent on one of those already in the graph is chosen among those not yet in the graph; this procedure is repeated until no remain

  • Luminance HDR

    Luminance HDR

    Luminance HDR, formerly Qtpfsgui, is graphics software used for the creation and manipulation of high-dynamic-range images. Released under the terms of the GPL, it is available for Linux, Microsoft Windows, and Mac OS X (Intel only). Luminance HDR supports several high-dynamic-range (HDR) as well as low-dynamic-range (LDR) file formats.

    == Functionality ==

    HDR photography requires several narrow-range digital images taken with different exposures. Luminance HDR combines these images and calculates a high-contrast image. In order to view this image on a regular computer monitor, Luminance HDR can convert it into a displayable LDR image format using a variety of methods, such as tone mapping. Currently fifteen different tone mapping operators (algorithms) are available, each with its own tunable parameters. Different image processing techniques can be applied to the generated HDR images, such as resizing, cropping, rotating and a number of projective transformations. The software also provides batch processing functionality for creating HDR images and for tone mapping them in a non-interactive way. A module for copying Exif data among sets of images is also provided. For users who prefer the command line, a non-graphical interface is also available on all supported platforms.

    A common problem with HDR photography is that images need to be aligned exactly. If the subject is static, this can be achieved using a tripod or a stable surface on which the camera is placed. If the image data does not align exactly, an automatic alignment can be performed using a tool provided by the Hugin project. If this automation does not provide the desired result, the user may improve it manually.

    == Supported formats ==

    HDR images are images with a high dynamic range; with Luminance HDR they can be both created and edited.
    The following HDR graphic formats are supported: OpenEXR; Radiance HDR; Tag Image File Format (TIFF) in 16-bit, 32-bit (float) and LogLuv variants; Raw; and PFS native. Luminance HDR can create an HDR image from several LDR images and tone-map an HDR image into an LDR image. The following LDR formats are supported: JPG, PNG, Portable Pixmap (PPM), Portable Bitmap (PBM), and TIFF (8-bit).

  • DialogOS

    DialogOS

    DialogOS is a graphical programming environment for designing computer systems that can converse with the user by voice. Dialogs are clicked together in a flowchart. DialogOS includes bindings to control Lego Mindstorms robots by voice and bindings to SQL databases, as well as a generic plugin architecture to integrate with other types of backends. DialogOS is used in computer science courses in schools and universities to teach programming and to introduce beginners to the basic principles of human–computer interaction and dialog design. It has also been used in research systems. DialogOS was initially developed commercially by CLT Sprachtechnologie GmbH until its liquidation in 2017. The rights were then acquired by Saarland University and the software was released as open source.

    == Bindings to Lego Mindstorms NXT ==

    DialogOS can control the Lego Mindstorms NXT series. It uses sensor nodes to obtain values from the following sensors: noise sensor, ultrasonic sensor, touch sensor, and luminosity sensor.

  • SQLf

    SQLf

    SQLf is an extension of SQL that applies fuzzy set theory to express flexible (fuzzy) queries over traditional (or "regular") relational databases. Among the extensions proposed to SQL to date, it is the most complete, because it allows the use of diverse fuzzy elements in all constructs of the SQL language. SQLf is the only known proposal for a flexible query system that allows linguistic quantification over sets of rows in queries, achieved by extending SQL's nesting and partitioning structures with fuzzy quantifiers. It also allows quantifiers to be used to qualify the number of search criteria satisfied by single rows. Several mechanisms have been proposed for query evaluation, the most important being the one based on the derivation principle. This consists of deriving classic queries that produce, given a threshold t, a t-cut of the result of the fuzzy query, so that the additional processing cost of using a fuzzy language is reduced.

    == Basic block ==

    The fundamental querying structure of SQLf is the multi-relational block. This structure is based on the three basic operations of relational algebra (projection, Cartesian product and selection) and on the application of fuzzy set concepts. The result of a SQLf query is a fuzzy set of rows, that is, a fuzzy relation instead of a regular relation. A basic block in SQLf consists of a SELECT clause, a FROM clause and an optional WHERE clause. The semantics of this query structure are as follows.

    The SELECT clause corresponds to the projection. It specifies the attributes of the relations (or attribute expressions) that will be selected. The resulting table is a fuzzy set and is given in decreasing order of satisfaction degree. The SELECT clause also specifies a calibration that is intended to restrict the set of rows retrieved. There are two kinds of calibration: quantitative and qualitative.
    In quantitative calibration the user specifies the number of results to be retrieved, so that the query retrieves the rows with the highest membership degrees up to the required number of answers. In qualitative calibration the user specifies a minimum level of satisfaction that every retrieved row must reach.

    The FROM clause corresponds to the Cartesian product. The query is evaluated over the Cartesian product of the relations specified in this clause.

    The WHERE clause corresponds to the selection. It specifies the condition for which the satisfaction degree will be calculated. Rows that do not satisfy the condition at all are rejected. This condition is a fuzzy predicate that may involve any attribute of the relations.

    The following is an example of a SELECT query that returns a list of hotels that are cheap. The query retrieves all rows from the Hotels table that satisfy the fuzzy predicate cheap defined by the fuzzy set μ=(∞, ∞, 25, 30). The result is sorted in descending order by the membership degree of the query.
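    The evaluation semantics described above (graded selection, rejection of rows with zero satisfaction, descending order, and the two calibrations) can be sketched outside SQLf itself. The Python below is not SQLf syntax and not the derivation-based evaluation mechanism; it is a plain illustration under assumptions. In particular, it assumes the fuzzy set for cheap is a trapezoid with full membership up to a price of 25 that falls linearly to zero at 30, and the hotel rows, the function names, and the `top_k` and `threshold` parameters are all hypothetical.

```python
def cheap(price, full=25.0, zero=30.0):
    """Assumed membership function: 1 up to `full`, linear down to 0 at `zero`."""
    if price <= full:
        return 1.0
    if price >= zero:
        return 0.0
    return (zero - price) / (zero - full)


def fuzzy_select(rows, predicate, top_k=None, threshold=None):
    """Grade rows with a fuzzy predicate and return (row, degree) pairs, best first.

    top_k     -- quantitative calibration: keep at most this many rows
    threshold -- qualitative calibration: keep rows with degree >= threshold
    """
    graded = [(row, predicate(row)) for row in rows]
    graded = [(row, deg) for row, deg in graded if deg > 0.0]  # reject degree 0
    if threshold is not None:
        graded = [(row, deg) for row, deg in graded if deg >= threshold]
    graded.sort(key=lambda pair: pair[1], reverse=True)
    return graded if top_k is None else graded[:top_k]


# Hypothetical (name, price) rows standing in for a Hotels table.
hotels = [("Alpha", 40.0), ("Beta", 27.5), ("Gamma", 20.0), ("Delta", 29.0)]

ranked = fuzzy_select(hotels, lambda h: cheap(h[1]))
best_two = fuzzy_select(hotels, lambda h: cheap(h[1]), top_k=2)
at_least_half = fuzzy_select(hotels, lambda h: cheap(h[1]), threshold=0.5)
```

    Here `ranked` lists the hotels from most to least cheap and omits the one priced at 40, whose membership degree is zero.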

  • Gene expression programming

    Gene expression programming

    Gene expression programming (GEP) in computer programming is an evolutionary algorithm that creates computer programs or models. These computer programs are complex tree structures that learn and adapt by changing their sizes, shapes, and composition, much like a living organism. And like living organisms, the computer programs of GEP are also encoded in simple linear chromosomes of fixed length. Thus, GEP is a genotype–phenotype system, benefiting from a simple genome to keep and transmit the genetic information and a complex phenotype to explore the environment and adapt to it. == Background == Evolutionary algorithms use populations of individuals, select individuals according to fitness, and introduce genetic variation using one or more genetic operators. Their use in artificial computational systems dates back to the 1950s where they were used to solve optimization problems (e.g. Box 1957 and Friedman 1959). But it was with the introduction of evolution strategies by Rechenberg in 1965 that evolutionary algorithms gained popularity. A good overview text on evolutionary algorithms is the book "An Introduction to Genetic Algorithms" by Mitchell (1996). Gene expression programming belongs to the family of evolutionary algorithms and is closely related to genetic algorithms and genetic programming. From genetic algorithms it inherited the linear chromosomes of fixed length; and from genetic programming it inherited the expressive parse trees of varied sizes and shapes. In gene expression programming the linear chromosomes work as the genotype and the parse trees as the phenotype, creating a genotype/phenotype system. This genotype/phenotype system is multigenic, thus encoding multiple parse trees in each chromosome. This means that the computer programs created by GEP are composed of multiple parse trees. Because these parse trees are the result of gene expression, in GEP they are called expression trees. Masood Nekoei, et al. 
utilized this expression programming style in ABC optimization to develop ABCEP, a method reported to outperform other evolutionary algorithms.

    == Encoding: the genotype ==

    The genome of gene expression programming consists of a linear, symbolic string or chromosome of fixed length composed of one or more genes of equal size. These genes, despite their fixed length, code for expression trees of different sizes and shapes. An example of a chromosome with two genes, each of size 9, is the string (position zero indicates the start of each gene):

        012345678012345678
        L+a-baccd**cLabacd

    where "L" represents the natural logarithm function and "a", "b", "c", and "d" represent the variables and constants used in a problem.

    == Expression trees: the phenotype ==

    As shown above, the genes of gene expression programming all have the same size. However, these fixed-length strings code for expression trees of different sizes. This means that the size of the coding regions varies from gene to gene, allowing for adaptation and evolution to occur smoothly. For example, the mathematical expression

    $$\sqrt{(a-b)(c+d)}$$

    can also be represented as an expression tree:

    [Figure: expression tree]

    where "Q" represents the square root function. This kind of expression tree consists of the phenotypic expression of GEP genes, whereas the genes are linear strings encoding these complex structures. For this particular example, the linear string corresponds to:

        01234567
        Q*-+abcd

    which is the straightforward reading of the expression tree from top to bottom and from left to right. These linear strings are called k-expressions (from Karva notation). Going from k-expressions to expression trees is also very simple. For example, the following k-expression:

        01234567890
        Q*b**+baQba

    is composed of two different terminals (the variables "a" and "b"), two different functions of two arguments ("*" and "+"), and a function of one argument ("Q").
    Its expression gives:

    [Figure: expression tree]

    == K-expressions and genes ==

    The k-expressions of gene expression programming correspond to the region of genes that gets expressed. This means that there might be sequences in the genes that are not expressed, which is indeed true for most genes. The reason for these noncoding regions is to provide a buffer of terminals so that all k-expressions encoded in GEP genes always correspond to valid programs or expressions. The genes of gene expression programming are therefore composed of two different domains, a head and a tail, each with different properties and functions. The head is used mainly to encode the functions and variables chosen to solve the problem at hand, whereas the tail, while also used to encode the variables, essentially provides a reservoir of terminals to ensure that all programs are error-free. For GEP genes the length of the tail is given by the formula

    $$t = h(n_{\max} - 1) + 1$$

    where $h$ is the head's length and $n_{\max}$ is the maximum arity. For example, for a gene created using the set of functions F = {Q, +, −, ∗, /} and the set of terminals T = {a, b}, $n_{\max} = 2$. If we choose a head length of 15, then t = 15(2 − 1) + 1 = 16, which gives a gene length g of 15 + 16 = 31. The randomly generated string below is an example of one such gene:

        0123456789012345678901234567890
        *b+a-aQab+//+b+babbabbbababbaaa

    It encodes the expression tree:

    [Figure: expression tree]

    which, in this case, uses only 8 of the 31 elements that constitute the gene. It's not hard to see that, despite their fixed length, each gene has the potential to code for expression trees of different sizes and shapes, with the simplest composed of only one node (when the first element of a gene is a terminal) and the largest composed of as many nodes as there are elements in the gene (when all the elements in the head are functions with maximum arity).
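    The tail-length formula and the idea of a coding region can be checked with a small sketch. It assumes the arities implied by the text (Q takes one argument; +, -, * and / take two; every other symbol is a terminal) and writes multiplication as `*`; the function names and the `ARITY` table are hypothetical helpers introduced here, not part of any GEP library.

```python
# Assumed arities: Q (square root) is unary, the arithmetic operators are binary.
ARITY = {"Q": 1, "+": 2, "-": 2, "*": 2, "/": 2}


def tail_length(head_length, max_arity):
    """Tail length t = h * (n_max - 1) + 1."""
    return head_length * (max_arity - 1) + 1


def coding_length(gene, arity=ARITY):
    """Number of leading symbols of a gene that are expressed (its k-expression).

    Reads the gene left to right while tracking how many argument slots are
    still open; the k-expression ends when no open slot remains.
    """
    open_slots = 1  # the root of the expression tree
    for used, symbol in enumerate(gene, start=1):
        open_slots += arity.get(symbol, 0) - 1  # fill one slot, open `arity` new ones
        if open_slots == 0:
            return used
    raise ValueError("gene ends before every argument slot is filled")


head = 15
tail = tail_length(head, max_arity=2)
gene = "*b+a-aQab+//+b+babbabbbababbaaa"
expressed = coding_length(gene)
```

    For the 31-symbol gene above, `expressed` counts the symbols that form the expression tree, and a gene whose first symbol is a terminal has a coding length of 1.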
It's also not hard to see that it is trivial to implement all kinds of genetic modification (mutation, inversion, insertion, recombination, and so on) with the guarantee that all resulting offspring encode correct, error-free programs. == Multigenic chromosomes == The chromosomes of gene expression programming are usually composed of more than one gene of equal length. Each gene codes for a sub-expression tree (sub-ET) or sub-program. Then the sub-ETs can interact with one another in different ways, forming a more complex program. The figure shows an example of a program composed of three sub-ETs. In the final program the sub-ETs could be linked by addition or some other function, as there are no restrictions to the kind of linking function one might choose. Some examples of more complex linkers include taking the average, the median, the midrange, thresholding their sum to make a binomial classification, applying the sigmoid function to compute a probability, and so on. These linking functions are usually chosen a priori for each problem, but they can also be evolved elegantly and efficiently by the cellular system of gene expression programming. == Cells and code reuse == In gene expression programming, homeotic genes control the interactions of the different sub-ETs or modules of the main program. The expression of such genes results in different main programs or cells, that is, they determine which genes are expressed in each cell and how the sub-ETs of each cell interact with one another. In other words, homeotic genes determine which sub-ETs are called upon and how often in which main program or cell and what kind of connections they establish with one another. === Homeotic genes and the cellular system === Homeotic genes have exactly the same kind of structural organization as normal genes and they are built using an identical process. 
They also contain a head domain and a tail domain, with the difference that the heads contain now linking functions and a special kind of terminals – genic terminals – that represent the normal genes. The expression of the normal genes results as usual in different sub-ETs, which in the cellular system are called ADFs (automatically defined functions). As for the tails, they contain only genic terminals, that is, derived features generated on the fly by the algorithm. For example, the chromosome in the figure has three normal genes and one homeotic gene and encodes a main program that invokes three different functions a total of four times, linking them in a particular way. From this example it is clear that the cellular system not only allows the unconstrained evolution of linking functions but also code reuse. And it shouldn't be hard to implement recursion in this system. === Multiple main programs and multicellular systems === Multicellular systems are composed of more than one homeotic gene. Each homeotic gene in this system puts together a different combination of sub-expression trees or ADFs, creating multiple cells or main programs. For example, the program shown in the figure was created using a cellular system with two cells and three normal genes. The applications of these multicellular systems are mu

  • Adversarial machine learning


    Adversarial machine learning is the study of the attacks on machine learning algorithms, and of the defenses against such attacks. Machine learning techniques are mostly designed to work on specific problem sets, under the assumption that the training and test data are generated from the same statistical distribution (IID). However, this assumption is often violated in practical high-stake applications, where users may intentionally supply fabricated data that violates the statistical assumption. Most common attacks in adversarial machine learning include evasion attacks, data poisoning attacks, Byzantine attacks and model extraction. == History == At the MIT Spam Conference in January 2004, John Graham-Cumming showed that a machine-learning spam filter could be used to defeat another machine-learning spam filter by automatically learning which words to add to a spam email to get the email classified as not spam. In 2004, Nilesh Dalvi and others noted that linear classifiers used in spam filters could be defeated by simple "evasion attacks" as spammers inserted "good words" into their spam emails. (Around 2007, some spammers added random noise to fuzz words within "image spam" in order to defeat OCR-based filters.) In 2006, Marco Barreno and others published "Can Machine Learning Be Secure?", outlining a broad taxonomy of attacks. As late as 2013 many researchers continued to hope that non-linear classifiers (such as support vector machines and neural networks) might be robust to adversaries, until Battista Biggio and others demonstrated the first gradient-based attacks on such machine-learning models (2012–2013). In 2012, deep neural networks began to dominate computer vision problems; starting in 2014, Christian Szegedy and others demonstrated that deep neural networks could be fooled by adversaries, again using a gradient-based attack to craft adversarial perturbations. 
Further work would show that adversarial attacks are harder to produce in uncontrolled environments, due to the different environmental constraints that cancel out the effect of noise. For example, any small rotation or slight illumination on an adversarial image can destroy the adversariality. In addition, researchers such as Google Brain's Nick Frosst point out that it is much easier to make self-driving cars miss stop signs by physically removing the sign itself, rather than creating adversarial examples. Frosst also believes that the adversarial machine learning community incorrectly assumes models trained on a certain data distribution will also perform well on a completely different data distribution. He suggests that a new approach to machine learning should be explored, and is currently working on a unique neural network that has characteristics more similar to human perception than state-of-the-art approaches. While adversarial machine learning continues to be heavily rooted in academia, large tech companies such as Google, Microsoft, and IBM have begun curating documentation and open source code bases to allow others to concretely assess the robustness of machine learning models and minimize the risk of adversarial attacks. === Examples === Examples include attacks in spam filtering, where spam messages are obfuscated through the misspelling of "bad" words or the insertion of "good" words; attacks in computer security, such as obfuscating malware code within network packets or modifying the characteristics of a network flow to mislead intrusion detection; attacks in biometric recognition where fake biometric traits may be exploited to impersonate a legitimate user; or to compromise users' template galleries that adapt to updated traits over time. Researchers showed that by changing only one-pixel it was possible to fool deep learning algorithms. 
Others 3-D printed a toy turtle with a texture engineered to make Google's object detection AI classify it as a rifle regardless of the angle from which the turtle was viewed. Creating the turtle required only low-cost commercially available 3-D printing technology. A machine-tweaked image of a dog was shown to look like a cat to both computers and humans. A 2019 study reported that humans can guess how machines will classify adversarial images. Researchers discovered methods for perturbing the appearance of a stop sign such that an autonomous vehicle classified it as a merge or speed limit sign. A data poisoning filter called Nightshade was released in 2023 by researchers at the University of Chicago. It was created for use by visual artists to put on their artwork to corrupt the data set of text-to-image models, which usually scrape their data from the internet without the consent of the image creator. McAfee attacked Tesla's former Mobileye system, fooling it into driving 50 mph over the speed limit, simply by adding a two-inch strip of black tape to a speed limit sign. Adversarial patterns on glasses or clothing designed to deceive facial-recognition systems or license-plate readers, have led to a niche industry of "stealth streetwear". An adversarial attack on a neural network can allow an attacker to inject algorithms into the target system. Researchers can also create adversarial audio inputs to disguise commands to intelligent assistants in benign-seeming audio; a parallel literature explores human perception of such stimuli. Clustering algorithms are used in security applications. Malware and computer virus analysis aims to identify malware families, and to generate specific detection signatures. In the context of malware detection, researchers have proposed methods for adversarial malware generation that automatically craft binaries to evade learning-based detectors while preserving malicious functionality. 
Optimization-based attacks such as GAMMA use genetic algorithms to inject benign content (for example, padding or new PE sections) into Windows executables, framing evasion as a constrained optimization problem that balances misclassification success with the size of the injected payload and showing transferability to commercial antivirus products. Complementary work uses generative adversarial networks (GANs) to learn feature-space perturbations that cause malware to be classified as benign; Mal-LSGAN, for instance, replaces the standard GAN loss with a least-squares objective and modified activation functions to improve training stability and produce adversarial malware examples that substantially reduce true positive rates across multiple detectors. == Challenges in applying machine learning to security == Researchers have observed that the constraints under which machine-learning techniques function in the security domain are different from those of common benchmark domains. Security data may change over time, include mislabeled samples, or reflect adversarial behavior, which complicates evaluation and reproducibility. === Data collection issues === Security datasets vary across formats, including binaries, network traces, and log files. Studies have reported that the process of converting these sources into features can introduce bias or inconsistencies. In addition, time-based leakage can occur when related malware samples are not properly separated across training and testing splits, which may lead to overly optimistic results. === Labeling and ground truth challenges === Malware labels are often unstable because different antivirus engines may classify the same sample in conflicting ways. Ceschin et al. note that families may be renamed or reorganized over time, causing further discrepancies in ground truth and reducing the reliability of benchmarks. 
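One common way to reduce the time-based leakage mentioned under data collection issues is to split by date rather than at random, so that every test sample is strictly newer than every training sample. The sketch below is a minimal illustration of that idea, not a method taken from the cited studies; the `first_seen` field, the cutoff date, and the toy records are all hypothetical.

```python
from datetime import date


def temporal_split(samples, cutoff):
    """Split samples by first-seen date so the test set is strictly later.

    `samples` is a list of dicts with a `first_seen` date; the field name
    and the cutoff are hypothetical choices made for this sketch.
    """
    train = [s for s in samples if s["first_seen"] < cutoff]
    test = [s for s in samples if s["first_seen"] >= cutoff]
    return train, test


# Hypothetical toy dataset: hashes, families and dates are made up.
samples = [
    {"sha": "a1", "family": "famA", "first_seen": date(2021, 3, 1)},
    {"sha": "b2", "family": "famA", "first_seen": date(2021, 9, 15)},
    {"sha": "c3", "family": "famB", "first_seen": date(2022, 2, 10)},
    {"sha": "d4", "family": "famA", "first_seen": date(2022, 6, 30)},
]

train, test = temporal_split(samples, cutoff=date(2022, 1, 1))

# Every training sample predates every test sample, so no later variant
# of a family can leak into training.
print(len(train), len(test))  # 2 2
```

A random split over the same records could place the June 2022 variant of `famA` in training and the March 2021 variant in testing, which is the kind of optimistic evaluation the text warns about.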
=== Concept drift === Because malware creators continuously adapt their techniques, the statistical properties of malicious samples also change. This form of concept drift has been widely documented and may reduce model performance unless systems are updated regularly or incorporate mechanisms for incremental learning. === Feature robustness === Researchers differentiate between features that can be easily manipulated and those that are more resistant to modification. For example, simple static attributes, such as header fields, may be altered by attackers, while structural features, such as control-flow graphs, are generally more stable but computationally expensive to extract. === Class imbalance === In realistic deployment environments, the proportion of malicious samples can be extremely low, ranging from 0.01% to 2% of total data. This unbalanced distribution causes models to develop a bias towards the majority class, achieving high accuracy but failing to identify malicious samples. Prior approaches to this problem have included both data-level solutions and sequence-specific models. Methods like n-gram and Long Short-Term Memory (LSTM) networks can model sequential data, but their performance has been shown to decline significantly when malware samples are realistically proportioned in the training set, demonstrating the limitations in

  • Recraft


    Recraft is a generative artificial intelligence program and service developed by the London-based startup Recraft, Inc. The company also offers Recraft Studio, a web-based workspace that lets users create and edit images, vectors, and mockups using various text-to-image models. Like models such as Midjourney and DALL-E, the Recraft model generates digital images from natural language prompts, and is specifically tailored for creative workflows, with features that emphasize brand consistency, text fidelity, and layout control. == History and background == Recraft, Inc. was founded in 2022 by machine learning scientist Anna Veronika Dorogush, best known for co-creating the CatBoost machine learning library at Yandex. The company emerged from stealth on May 31, 2023, with a public release of its vector graphics generation capability on Product Hunt. On January 17, 2024, TechCrunch profiled Recraft’s foundational model for graphic design, noting its emphasis on addressing copyright and ethical concerns associated with AI-generated imagery. On October 28, 2024, TechCrunch reported that Recraft's third major model, V3, had topped a crowdsourced benchmark, surpassing Midjourney and OpenAI's DALL-E in overall image quality. On May 5, 2025, Recraft announced a $30 million Series B funding round led by Accel, reporting more than four million registered users at the time of the announcement. == Models == Recraft has developed multiple generations of its text-to-image models since 2022. Each generation reflects improvements in fidelity, controllability, and support for both raster and vector outputs. The models are proprietary and accessible through the Recraft API, Recraft Studio. Recraft models are also hosted as an image generation API on fal, Replicate, Prodia, and others. === Recraft V2 === Recraft V2 was released in March 2024 and was the company’s first model trained from scratch. 
It contained roughly 20 billion parameters and introduced native vector image generation, brand-color conditioning, and improved stylistic consistency for icons and illustrations. === Recraft V3 === Recraft V3 was released in October 2024 and achieved first place on the Artificial Analysis benchmark hosted on Hugging Face. The model introduced advances in photorealism, improved rendering of multi-word text, and increased responsiveness to detailed descriptive prompts. It also added the “Artistic” parameter, which allowed users to adjust stylistic intensity within generated images. === Recraft V4 === Recraft V4 was released in February 2026. According to Recraft, V4 is a “ground-up rebuild” aimed at improving prompt accuracy and output quality for design workflows, with the company emphasizing “design taste” and art-directed results. Recraft states that V4 is available in two versions: V4 for faster iteration and V4 Pro for higher-resolution, print-ready assets; the API documentation describes V4 as 1-megapixel output and V4 Pro as 4-megapixel output, with vector variants available for each. === Features === Vectorization: Recraft’s models can generate and convert images into native vector formats, producing scalable graphics composed of editable paths rather than fixed pixels. Style reference: The models support the use of reference images to guide stylistic characteristics such as color palette, line quality, composition, or visual tone. Style mixing: Recraft models can combine multiple stylistic inputs within a single generation. By blending attributes from different references or stylistic instructions, the system produces images that reflect hybrid visual characteristics while maintaining internal consistency. Inpainting editing: The models support localized image modification through inpainting, enabling users to regenerate selected regions of an image while preserving surrounding content. 
=== Model capabilities === Recraft’s models generate raster and vector images from natural-language prompts and are designed to interpret detailed descriptions with attention to composition, style, and text placement. The models support controlled stylistic variation through preset or reference-based guidance and can maintain coherent line, color, or layout structure across multiple outputs. They produce scalable vector graphics alongside high-resolution raster images, and include features for localized image modification through inpainting or outpainting operations. === Technology === Recraft has not publicly disclosed the detailed technical architecture of its models. However, third-party reviews and benchmarks have noted that its performance resembles diffusion models such as Midjourney and Stable Diffusion. The model is designed for creative workflows requiring visual consistency and flexible output formats. Reviewers have noted its ability to generate legible multi-line text, produce high-resolution imagery at various canvas sizes, and to maintain alignment with user-defined brand palettes and design themes. Though not open-source, Recraft's models are accessible through a web interface and commercial API. Advanced features such as style settings and positioning control differentiate it from general-purpose text-to-image models. == Recraft Studio == Recraft Studio is a web-based workspace for generating and editing images using Recraft’s image models and selected external models. The infinite canvas interface provides access to a range of creation and refinement tools within a single environment. Raster and vector generation with styles: Recraft Studio supports the generation of both raster and vector images. Users can apply predefined or reference-based styles during generation, allowing for visual consistency across multiple outputs. 
Mockups: The studio includes mockup tools that allow generated designs to be placed onto predefined surfaces or templates for visualization and presentation purposes. Vectorization: Recraft Studio provides vectorization tools that convert raster images into editable vector graphics, enabling further modification of shapes, colors, and layout. Image upscaling: The workspace includes image upscaling functionality for increasing resolution while preserving visual detail. Editing tools and natural-language editing: Recraft Studio offers a set of editing tools for modifying images within the canvas, including localized adjustments and natural-language–based editing commands that allow users to describe changes using text. === Supported models === Recraft Studio provides access to Recraft’s proprietary image models as well as other external frontier image models such as Nano Banana, GPT 4-o, Imagen, Flux, and others. == Business model == Recraft develops proprietary image models that are accessible through Recraft Studio and the Recraft API. Recraft Studio operates on a freemium model, offering a free tier with limited daily credits and paid subscriptions for access to additional features. The API follows a credit-based system in which units are purchased separately for programmatic image generation. A team plan supports collaborative use, and the API enables organizations and developers to integrate Recraft’s image generation and editing capabilities into their own systems and workflows.

  • Fuzzy logic


    Fuzzy logic is a form of many-valued logic in which the truth value of variables may be any real number between 0 and 1. It is employed to handle the concept of partial truth, where the truth value may range between completely true and completely false. By contrast, in Boolean logic, the truth values of variables may only be the integer values 0 or 1. The term fuzzy logic was introduced with the 1965 proposal of fuzzy set theory by mathematician Lotfi Zadeh. Basic fuzzy logic had, however, been studied since the 1920s, as infinite-valued logic—notably by Łukasiewicz and Tarski. The works of Zadeh and Joseph Goguen in the 1960s and 1970s went further by considering issues such as linguistic variables and lattices. Fuzzy logic is based on the observation that people make decisions based on imprecise and non-numerical information. Fuzzy models or fuzzy sets are mathematical means of representing vagueness and imprecise information (hence the term fuzzy). These models have the capability of recognising, representing, manipulating, interpreting, and using data and information that are vague and lack certainty. Fuzzy logic has been applied to many fields, from control theory to artificial intelligence. == Overview == Classical logic only permits conclusions that are either true or false. However, there are also propositions with variable answers, which one might find when asking a group of people to identify a color. In such instances, the truth appears as the result of reasoning from inexact or partial knowledge in which the sampled answers are mapped on a spectrum. Both degrees of truth and probabilities range between 0 and 1 and hence may seem identical at first, but fuzzy logic uses degrees of truth as a mathematical model of vagueness, while probability is a mathematical model of ignorance. === Applying truth values === A basic application might characterize various sub-ranges of a continuous variable. 
For instance, a temperature measurement for anti-lock brakes might have several separate membership functions defining particular temperature ranges needed to control the brakes properly. Each function maps the same temperature value to a truth value in the 0 to 1 range. These truth values can then be used to determine how the brakes should be controlled. Fuzzy set theory provides a means for representing uncertainty. === Linguistic variables === In fuzzy logic applications, non-numeric values are often used to facilitate the expression of rules and facts. A linguistic variable such as age may accept values such as young and its antonym old. Because natural languages do not always contain enough value terms to express a fuzzy value scale, it is common practice to modify linguistic values with adjectives or adverbs. For example, we can use the hedges rather and somewhat to construct the additional values rather old or somewhat young. == Fuzzy systems == === Mamdani === The most well-known system is the Mamdani rule-based one. It uses the following rules: Fuzzify all input values into fuzzy membership functions. Execute all applicable rules in the rulebase to compute the fuzzy output functions. De-fuzzify the fuzzy output functions to get "crisp" output values. ==== Fuzzification ==== Fuzzification is the process of assigning the numerical input of a system to fuzzy sets with some degree of membership. This degree of membership may be anywhere within the interval [0,1]. If it is 0 then the value does not belong to the given fuzzy set, and if it is 1 then the value completely belongs within the fuzzy set. Any value between 0 and 1 represents the degree of uncertainty that the value belongs in the set. These fuzzy sets are typically described by words, and so by assigning the system input to fuzzy sets, we can reason with it in a linguistically natural manner. 
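Fuzzification as described above can be sketched with simple piecewise-linear membership functions. The breakpoints (10, 20 and 30 degrees), the set names, and the helper functions are assumptions chosen only for illustration.

```python
def rising(x, a, b):
    """Membership that is 0 below a, 1 above b, and linear in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def falling(x, a, b):
    """Mirror image of rising: 1 below a, 0 above b."""
    return 1.0 - rising(x, a, b)


def triangle(x, a, peak, b):
    """Triangular membership: 0 at a and b, 1 at the peak."""
    return min(rising(x, a, peak), falling(x, peak, b))


def fuzzify(temperature):
    """Map one crisp temperature to a degree of membership in each set."""
    return {
        "cold": falling(temperature, 10, 20),
        "warm": triangle(temperature, 10, 20, 30),
        "hot": rising(temperature, 20, 30),
    }


memberships = fuzzify(12)
print(memberships)  # cold is about 0.8, warm about 0.2, hot is 0.0
```

A single crisp input thus yields one truth value per fuzzy set, and those values are what the rules operate on.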
For example, in the image below, the meanings of the expressions cold, warm, and hot are represented by functions mapping a temperature scale. A point on that scale has three "truth values", one for each of the three functions. The vertical line in the image represents a particular temperature that the three arrows (truth values) gauge. Since the red arrow points to zero, this temperature may be interpreted as "not hot"; i.e. this temperature has zero membership in the fuzzy set "hot". The orange arrow (pointing at 0.2) may describe it as "slightly warm" and the blue arrow (pointing at 0.8) "fairly cold". Therefore, this temperature has 0.2 membership in the fuzzy set "warm" and 0.8 membership in the fuzzy set "cold". The degree of membership assigned for each fuzzy set is the result of fuzzification.

Fuzzy sets are often defined as triangle or trapezoid-shaped curves, as each value will have a slope where the value is increasing, a peak where the value is equal to 1 (which can have a length of 0 or greater) and a slope where the value is decreasing. They can also be defined using a sigmoid function. One common case is the standard logistic function, defined as

$S(x) = \frac{1}{1 + e^{-x}}$

which has the following symmetry property:

$S(x) + S(-x) = 1$

From this it follows that

$(S(x) + S(-x)) \cdot (S(y) + S(-y)) \cdot (S(z) + S(-z)) = 1$

==== Fuzzy logic operators ====

Fuzzy logic works with membership values in a way that mimics Boolean logic. To this end, replacements for the basic operators ("gates") AND, OR, NOT must be available. There are several ways to accomplish this. A common replacement is called the Zadeh operators: x AND y = min(x, y), x OR y = max(x, y), and NOT x = 1 − x. For TRUE/1 and FALSE/0, the fuzzy expressions produce the same result as the Boolean expressions. There are also other operators, more linguistic in nature, called hedges that can be applied.
These are generally adverbs such as very or somewhat, which modify the meaning of a set using a mathematical formula. However, an arbitrary choice table does not always define a fuzzy logic function. In the paper (Zaitsev et al.), a criterion has been formulated to recognize whether a given choice table defines a fuzzy logic function, and a simple algorithm of fuzzy logic function synthesis has been proposed based on the introduced concepts of constituents of minimum and maximum. A fuzzy logic function represents a disjunction of constituents of minimum, where a constituent of minimum is a conjunction of variables of the current area greater than or equal to the function value in this area (to the right of the function value in the inequality, including the function value).

Another set of AND/OR operators is based on multiplication, where x AND y = x·y and NOT x = 1 − x, and hence x OR y = NOT(AND(NOT x, NOT y)) = 1 − (1 − x)(1 − y). Given any two of AND/OR/NOT, it is possible to derive the third. The generalization of AND is an instance of a t-norm.

==== IF-THEN rules ====

IF-THEN rules map input or computed truth values to desired output truth values. Consider, for instance, a rule of the form IF temperature IS hot THEN fan_speed IS high. Given a certain temperature, the fuzzy variable hot has a certain truth value, which is copied to the high variable. Should an output variable occur in several THEN parts, the values from the respective IF parts are combined using the OR operator.

==== Defuzzification ====

The goal is to get a continuous variable from fuzzy truth values. This would be easy if the output truth values were exactly those obtained from fuzzification of a given number. Since, however, all output truth values are computed independently, in most cases they do not represent such a set of numbers. One then has to decide on a number that best matches the "intention" encoded in the truth values. For example, for several truth values of fan_speed, an actual speed must be found that best fits the computed truth values of the variables 'slow', 'moderate' and so on. There is no single algorithm for this purpose.
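Returning to the operators discussed earlier in this section, the two families (Zadeh and multiplication-based) can be sketched as follows. The sketch assumes the standard definitions: min, max and 1 − x for the Zadeh operators, and x·y for the product AND, with the product OR derived from AND and NOT. The function names are illustrative.

```python
def fuzzy_not(x):
    """Complement of a truth value in [0, 1]."""
    return 1.0 - x


def zadeh_and(x, y):
    return min(x, y)


def zadeh_or(x, y):
    return max(x, y)


def product_and(x, y):
    return x * y


def product_or(x, y):
    """OR derived from AND and NOT: NOT(AND(NOT x, NOT y))."""
    return fuzzy_not(product_and(fuzzy_not(x), fuzzy_not(y)))


# On crisp inputs (0 or 1) both families agree with Boolean logic.
for x in (0.0, 1.0):
    for y in (0.0, 1.0):
        assert zadeh_and(x, y) == product_and(x, y) == float(bool(x) and bool(y))
        assert zadeh_or(x, y) == product_or(x, y) == float(bool(x) or bool(y))

# On intermediate truth values the two families differ.
print(zadeh_and(0.2, 0.5), zadeh_or(0.2, 0.5))
print(product_and(0.2, 0.5), product_or(0.2, 0.5))
```

When an output variable appears in several THEN parts, the same OR function (for example `zadeh_or`) would be used to combine the truth values coming from the respective IF parts.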
A common algorithm is:

1. For each truth value, cut the membership function at this value.
2. Combine the resulting curves using the OR operator.
3. Find the center-of-weight of the area under the curve.
4. The x position of this center is then the final output.

=== Takagi–Sugeno–Kang (TSK) ===

The Takagi–Sugeno or Takagi–Sugeno–Kang (TSK) system was introduced by Tomohiro Takagi and Michio Sugeno for fuzzy identification of systems and applications to modeling and control. Sugeno and Kang later developed methods for structure identification of such fuzzy models from input-output data. The TSK system is similar to Mamdani, but the defuzzification process is included in the execution of the fuzzy rules. These are also adapted, so that instead the consequent of the rule is represented through a polynomial function, usually constant in a zero-order model or linear in a first-order model. An example of a rule with a constant output would be: In this case, the output will be equal to the constant of the consequent (e.g. 2). In most scenarios we would have an entire rule base, with 2 or more rules. If this is the case, the output of the entire rule base will be the average of the consequent of each rule i (Y

  • Wayve


    Wayve Technologies Ltd is a British autonomous driving technology company focused on developing self-driving vehicle systems through end-to-end deep learning. Founded in 2017 by researchers from the University of Cambridge, Wayve’s approach eschews detailed 3D maps and hand-coded rules, in favor of a self-learning “AI driver” that learns from camera data and driving experience. The London-headquartered startup has garnered significant attention and funding for its visually-based method. == History == Wayve was founded in Cambridge, England, on August 21, 2017, by Amar Shah and Alex Kendall, two machine learning PhD students at the University of Cambridge. Shah initially served as CEO while Kendall was CTO, and the pair set out to develop an unconventional self-driving car system using machine learning at every layer of the driving task. In May 2018, Wayve emerged from stealth mode with backing from early-stage investors. At this time the company had around 10 employees, and its advisory investors included Uber’s Chief Scientist, Zoubin Ghahramani, who shared Wayve’s vision of a learning-centric driving AI. In 2019, Wayve achieved a milestone by training a car to drive autonomously on public roads it had never seen before, using only cameras, a basic GPS map, and end-to-end deep learning control. The company moved its base to London and secured a $20 million Series A funding round in November 2019. This investment enabled Wayve to launch a pilot fleet of autonomous electric vehicles in central London for real-world testing. During these trials, Wayve’s cars (such as retrofitted Jaguar I-Pace SUVs) began navigating the complex, narrow streets of London to prove the system’s ability to adapt to challenging urban scenarios. In 2020, co-founder Amar Shah departed the company, and Alex Kendall assumed the role of CEO. 
The startup joined the Microsoft for Startups: Autonomous Driving program in 2020, leveraging Microsoft Azure’s cloud computing for training its machine learning models at scale. It also committed to testing exclusively on electric vehicles, and a goal to reduce carbon emissions. In 2021, Wayve entered pilot programs with major UK retailers. It launched a 12-month autonomous delivery trial with supermarket chain Asda, and received a £10 million ($13.6 million) investment from online grocer Ocado Group as part of a partnership to develop self-driving grocery delivery vans. Ocado’s backing gave Wayve access to a fleet of delivery vans for data collection and testing on busy London routes (with human safety drivers present) to train its AI in urban traffic. In 2022, after a successful Series B funding round, the company extended road testing beyond the UK to other regions, and, by 2023, in multiple countries. The company had begun operating in the United States and in continental Europe, in preparation for larger commercial deployments. In 2023, Wayve announced a collaboration with Nissan to integrate Wayve’s AI-driven software into its ProPilot ADAS system, slated to launch in fiscal year 2027. Wayve received strategic investment from Uber, in 2024, to jointly develop autonomous ride-hailing services. The two companies plan to trial a fully driverless robotaxi service in London, supported by a UK government program to accelerate commercial self-driving pilots to as early as 2026. To demonstrate the scalability of its technology, Wayve conducted an “AI-500” roadshow project, driving in dozens of cities across Asia, Europe, and North America using the same AI model. By mid-2025, it had completed autonomous driving demos in 90 cities without prior HD mapping. In April 2025, Wayve opened its first Asian research hub in Japan, with investment by SoftBank, to improve its model’s generalization using local driving data. 
That year, the company conducted driving tests in over 500 cities in Europe, North America and Japan without city-specific programming. In February 2026, Nissan, Uber and Wayve announced their collaboration on robotaxi development, with the aim of launching a pilot programme in Tokyo by late 2026. Wayve also formed a strategic alliance with Mercedes-Benz and Stellantis on personal vehicle and robotaxi applications. == Financing and investors == Wayve has been backed by a mix of venture capital (VC) firms, corporate investors, and individuals. Its initial seed funding came from funds such as Compound (NYC) and Firstminute Capital (London), as well as Cambridge-based angel investors, in 2018. Academic Pieter Abbeel and Uber’s chief scientist, Zoubin Ghahramani, were early backers. In November 2019, Wayve raised a $20 million Series A led by Eclipse Ventures, with participation from Balderton Capital and other prior investors. The Series A financing was used to fund the company’s first autonomous trials in London, and marked the first time a European self-driving car startup had secured a U.S. VC as lead investor. In October 2021, Ocado Group invested £10 million (approximately $13.6 million) in Wayve as a strategic partner in autonomous grocery delivery. This brought Wayve’s total funding to around $60 million at that time. The Series B round followed in January 2022, when Wayve announced $200 million in new funding led by Eclipse Ventures, with participation from D1 Capital Partners, Moore Strategic Ventures, and Linse Capital. Balderton, Microsoft and Virgin Group joined as strategic backers. Baillie Gifford and Compound also participated; Ocado increased its stake as a strategic investor; and Meta AI head Yann LeCun and Richard Branson also became investors. Wayve’s Series C round closed in May 2024 at $1.05 billion, led by Japan’s SoftBank Group. 
The funding round was the largest-ever for a UK AI company, and included new investor Nvidia and returning investors Microsoft and Eclipse Ventures, among others. Uber also joined as a strategic partner and stakeholder. The Series C round increased Wayve’s total funding to about $1.3 billion from investors including SoftBank, Microsoft and Nvidia, and lifted Wayve’s valuation into “unicorn” status. In February 2026, Wayve announced a $1.2 billion Series D funding round; later that month, the company reported that $1.5 billion had been raised, primarily from Mercedes-Benz, Stellantis, Nissan, and existing backers Uber, Microsoft and Nvidia, increasing Wayve’s overall valuation to $8.6 billion. == Technology == Wayve’s self-driving approach centers on end-to-end deep learning and a vision-based AI system. Unlike conventional autonomous vehicles that depend on high-definition maps, hand-coded rules, and arrays of expensive lidar sensors, Wayve’s platform learns to drive predominantly using camera data and machine learning algorithms. The company refers to its AI-driven driving software as an “Embodied AI” or AI Driver, emphasizing that the system learns from experience (both real and simulated) to handle complex or novel situations rather than following pre-programmed instructions, not unlike Tesla’s approach. Wayve’s hardware-agnostic autonomy stack consists of a suite of video cameras and basic automotive sensors mounted on the vehicle, paired with GPU-powered onboard compute units that run the AI models. This vision-only philosophy is similar to Tesla’s Autopilot/FSDB model, but Wayve’s solution is vehicle-agnostic and mapless. Wayve’s strategy is to provide its driving AI as an OEM-ready platform; it plans to license or embed its technology into vehicles made by established automakers rather than build its own cars. 
Wayve’s development vehicles currently use Nvidia’s Orin system-on-chip as the onboard computer for running the AI model, but CEO Kendall has noted that the software can run on “whatever GPU [an automaker] already has in their vehicles”. Wayve has built a cloud infrastructure, largely on Microsoft Azure, to process petabytes of driving data, and uses simulation tools (known internally as the “Wayve Infinity” simulator) to synthetically generate and practice rare or dangerous scenarios for the AI to learn from. == Corporate affairs == Wayve is a privately held company headquartered in London, England, with its primary research and development office in the King’s Cross area of London. The company was initially incorporated as Wayve Technologies Ltd in the UK. Wayve has also established a presence in the U.S. (in Silicon Valley); in Canada, with a research hub in Vancouver; in Yokohama, Japan; in Leonberg, Germany; and in Herzliya, Israel. The leadership team includes research scientists and engineers with backgrounds in computer vision, robotics, and automotive systems. President Erez Dagan was hired in 2024, following two decades at Mobileye; chief scientist Jamie Shotton is formerly of Microsoft Research; CEO Alex Kendall, originally from New Zealand with a PhD in computer vision from Cambridge, took over as CEO in 2020 after the departure of his co-founder Amar Shah.

  • NHS COVID-19

    NHS COVID-19

    NHS COVID-19 was a voluntary contact tracing app for monitoring the spread of the COVID-19 pandemic in England and Wales, in use from 24 September 2020 until 27 April 2023. It was available for Android and iOS smartphones, and could be used by anyone aged 16 or over. Two versions of the app were created. The first was commissioned by NHSX and developed by the Pivotal division of American software company VMware. A pilot deployment began in May 2020, but on 18 June development of the app was abandoned in favour of a second design using the Apple/Google Exposure Notification system. Scotland and Northern Ireland had separate contact tracing apps. A 2023 study estimated that in its first year of use, the app's contact tracing function prevented an estimated 1 million cases, and 9,600 deaths. == Description == The app allowed users to: See the alert level of their local authority area (in Wales) or information about restrictions (in England); to enable this, the user must enter the first half of their postcode "Check in" at places displaying an NHS QR code poster (no longer required by legislation after 26 January 2022, removed from the app the next month) Be notified when they have been in close contact with someone who has tested positive for the virus Be notified when local health protection teams determine that people with the virus had attended a business or other venue around the same time as the user Check their symptoms, and book a coronavirus test if necessary If asked to self-isolate, receive information and a daily "countdown". At first, "close contact" was defined as being within 2 metres for 15 minutes, or within 4 metres for a longer time. These time durations were reduced from 29 October 2020, to as little as three minutes when the other person is at their most infectious, i.e. soon after they begin showing symptoms. === Implementation === The Android app was coded in Kotlin, and the iOS app in Swift. 
The backend was written in Java and deployed to Amazon Web Services using Terraform. The code of the app and backend is open-source and available on GitHub. == Context == The app was part of the UK's test and trace programme which was chaired by Dido Harding; from 12 May 2020 Tom Riordan, chief executive of Leeds City Council, led the tracing effort. == First phase and cancellation == === Description === In March 2020, NHSX commissioned a contact tracing app to monitor the spread of coronavirus disease 2019 (COVID-19) in the United Kingdom during the 2020 pandemic, developed by the Pivotal division of American software company VMware. The app used a centralised approach, in contrast to the Google / Apple contact tracing project. NHSX consulted ethicists and GCHQ's National Cyber Security Centre (NCSC) about the privacy aspects. The app recorded the make and model of the phone and asked the user for their postcode area. It generated a unique installation identification number and also a daily identification number. It then used Bluetooth Low Energy (BLE) to record the daily identification number of other users nearby. If a user was unwell, they could tell the app about symptoms characteristic of COVID-19, such as a fever and cough. These details were then passed to a central NHS server, which would assess the information and notify other users who had been in contact, giving them appropriate advice such as physical distancing. The NHS would also arrange for a swab test of the unwell user and the outcome would determine further notifications to contacts: if the test confirmed infection with COVID-19, the contacts would be asked to isolate. By June 2020, £11.8 million had been spent on the app; in 2020–21, £35 million was spent on the app. === Deployment === The first public trial of the app began on the Isle of Wight on 5 May 2020 and by 11 May it had been downloaded 55,000 times. 
When the first national contact tracing schemes were launched – Test, Trace, Protect in Wales on 13 May, then on 28 May NHS Test and Trace in England, and Test and Protect in Scotland – the app was not ready to be included. Replying to a question at the government's daily briefing on 8 June, Hancock was unable to give a date for rollout of the app in England, saying it would be brought in "when it's right to do so". On 17 June, Lord Bethell, junior minister for Innovation at the Department of Health and Social Care, said "we're seeking to get something going before the winter ... it isn't a priority for us at the moment". On 18 June, Health Secretary Matt Hancock announced development would switch to the Apple/Google system after admitting that Apple's restrictions on usage of Bluetooth prevented the app from working effectively. At the same press briefing Dido Harding, leader of the UK's test and trace programme, said "What we've done in really rigorously testing both our own Covid-19 app and the Google-Apple version is demonstrate that none of them are working sufficiently well enough to be actually reliable to determine whether any of us should self-isolate for two weeks [and] that's true across the world". === Concerns === The first, ultimately rejected, version of the app was subject to privacy concerns, the government backtracking on initial statements that the data collected from the app would not be shared outside the NHS. Matthew Gould, CEO of NHSX, the government department responsible for the app, said the data would be accessible to other organisations, but did not disclose which. Data collected would not necessarily be anonymised and would be held in a centralised repository. Over 150 of the UK's security and privacy experts warned the app's data could be used by 'a bad actor (state, private sector, or hacker)' to spy on citizens. 
Fears were discussed by the House of Commons' Human Rights Select Committee about plans for the app to record user location data. Parliament's Joint Committee on Human Rights said this version of the app should not be released without proper privacy protections. The second version of the app, released nationwide, addressed these concerns by employing a decentralised framework, the Apple/Google Exposure Notification system. Under this system, users remain pseudonymous: a person diagnosed with COVID-19 does not know which people are informed about an encounter, and contacted persons do not receive any information about the person diagnosed with COVID-19. The functionality of the app was also questioned in late April and early May 2020, as the software's use of Bluetooth required the app to be constantly running, meaning users could not use other apps or lock their device if the app was to function properly. The developers of the app were said to have found a way of working around this restriction. === Related contracts === Faculty – a company linked to Cambridge Analytica – provided research and modelling to NHSX in support of the response to the pandemic. Palantir, also linked to Cambridge Analytica, provided their data management platform. These contracts began in February and March respectively. == Second phase == As outlined on cancellation of the first app on 18 June 2020, the Department of Health and Social Care published on 30 July a brief description of the "next phase" app. Users would be able to scan a QR code at venues they visit, and later be notified if they had visited a place which was the source of a number of infections; the app would also assist with identifying symptoms and ordering a test. By using the Exposure Notification system from Apple and Google, personal data would be decentralised. Zuhlke Engineering Ltd, the UK branch of Swiss-based Zühlke Group, used 70 staff to complete the development of the app in 12 weeks. 
Zuhlke Engineering was awarded the "Development Team of the Year" title at the UK IT Industry Awards in November 2021 for its development of the NHS COVID-19 application. === Timeline === Testing of the app by NHS volunteer responders, and selected residents of the Isle of Wight and the London Borough of Newham, began around 13 August. The app was made available to the public (aged 16 or over) in England and Wales on 24 September. An updated app released on 29 October, in part from collaboration with the Alan Turing Institute, improved the accuracy of measurements of the distance between the user's phone and other phones. At the same time, the duration threshold for determining exposure was reduced; this was expected to lead to an increase in the number of users told to self-isolate. An update to the app in April 2021, timed to coincide with easing of restrictions on hospitality businesses, was blocked by Apple and Google. It was intended that users who tested positive would be asked to share their history of visited venues, to assist in warning others, but this would have contravened assurances by Apple and Google that location data from devices would not be shared. === Statistics and effectiveness === The app was downloaded six million times on the first day it was generally availa

  • Trevor Paglen

    Trevor Paglen

    Trevor Paglen (born 1974) is an American artist, geographer, and author whose work covers mass surveillance and data collection. In 2016, Paglen won the Deutsche Börse Photography Foundation Prize and he has also won The Cultural Award from the German Society for Photography. In 2017, he was a recipient of a MacArthur Fellowship. On March 17, 2026, Paglen was awarded the 2026 LG Guggenheim Award (a collaboration between LG and Guggenheim New York). == Early life and education == Paglen earned a B.A. degree in religious studies in 1998 from the University of California at Berkeley, an M.F.A. degree in 2002 from the School of the Art Institute of Chicago, and a Ph.D. in Geography in 2008 from the University of California at Berkeley. While at UC Berkeley, Paglen lived in the Berkeley Student Cooperative, residing in Chateau, Fenwick, and Rochdale co-ops. == Work == Sean O'Hagan, writing in The Guardian in 2015, said that Paglen, whose "ongoing grand project [is] the murky world of global state surveillance and the ethics of drone warfare", "is one of the most conceptually adventurous political artists working today, and has collaborated with scientists and human rights activists on his always ambitious multimedia projects." His visual works, such as his "Limit Telephotography" and "The Other Night Sky" series, have received widespread attention both for his technical innovations and for his conceptual project, which involves simultaneously making and negating documentary-style truth-claims. Paglen’s work relies on contemporary technology in two meaningful ways. First, the views he photographs would be impossible to shoot without media technology, including cameras, microscopes, and even helicopters. Second, the shots would not be possible without the existence of their subject. The contrasts between secrecy and revelation, evidence and abstraction distinguish Paglen's work. 
With that the artist presents not so much "evidence" as admonitions to awareness. He was an Eyebeam Commissioned Artist in 2007. In 2008 the Berkeley Art Museum devoted a comprehensive solo exhibition to his work. In the next year, Paglen took part in the Istanbul Biennial, and in 2010 he exhibited at the Vienna Secession. Autonomy Cube was a project by Paglen and Jacob Appelbaum that placed relays for the anonymous communication network Tor in traditional art museums. He contributed to the Oscar-winning documentary film Citizenfour (2014), directed by Laura Poitras. Paglen features in the nerd-culture documentary Traceroute (2016). Orbital Reflector was a reflective, mylar sculpture by Paglen intended to be the first "purely artistic" object in space. The temporary satellite, containing an inflatable mylar balloon with reflective surface, launched into space 3 December 2018. A mid-career survey in 2018–2019, Trevor Paglen: Sites Unseen, was a traveling exhibition shown at the Smithsonian American Art Museum in Washington DC and the Museum of Contemporary Art San Diego. In September 2020, Pace Gallery in London held an exhibition of Paglen's work, exploring "the weird, partial ways computers look back at us". His work is included in the permanent collections of the San Francisco Museum of Modern Art, the Columbus Museum of Art, and the Metropolitan Museum. === Experimental Geography === Paglen is credited with coining the term "Experimental Geography" to describe practices coupling experimental cultural production and art-making with ideas from critical human geography about the production of space, materialism, and praxis. The 2009 book Experimental Geography: Radical Approaches to Landscape, Cartography, and Urbanism is largely inspired by Paglen's work. == Publications == Paglen has published a number of books. Torture Taxi (2006) (co-authored with investigative journalist A. C. 
Thompson) was the first book to comprehensively describe the CIA's extraordinary rendition program. I Could Tell You But Then You Would Have to be Destroyed by Me (2007) is a look at the world of black projects through unit patches and memorabilia created for top-secret programs. Blank Spots on the Map: The Dark Geography of the Pentagon's Secret World (2009) is a broader look at secrecy in the United States. The Last Pictures (2012) is a collection of 100 images to be placed on permanent media and launched into space on EchoStar XVI, as a repository available for future civilizations (alien or human) to find. === Publications by Paglen === I Could Tell You But Then You Would Have to be Destroyed by Me. Brooklyn, NY: Melville House, 2007. ISBN 1-933633-32-8. Blank Spots on the Map: The Dark Geography of the Pentagon's Secret World. New York: Dutton, 2009. ISBN 9781101011492. Invisible: Covert Operations and Classified Landscapes, Photographs by Trevor Paglen. New York: Aperture, 2010. ISBN 9781597111300. With an essay by Rebecca Solnit. The Last Pictures. Oakland, CA: University of California, 2012. ISBN 9780520275003. Trevor Paglen. London: Phaidon, 2018. ISBN 0714873446. With essays by Lauren Cornell, Julia Bryan-Wilson, Omar Kholeif. === Publications co-authored === Torture Taxi. Co-authored with A. C. Thompson. Brooklyn, NY: Melville House Publishing, 2006. ISBN 1-933633-09-3. Icon, 2007. ISBN 9781840468304. === Publications with contributions by Paglen === Experimental Geography: Radical Approaches to Landscape, Cartography, and Urbanism. Brooklyn, NY: Melville House, 2009. ISBN 978-0091636586. Edited by Nato Thompson. With essays by Paglen, Thompson, and Jeffrey Kastner. Trevor Paglen and Jacob Appelbaum – Autonomy Cube. Revolver, 2016. ISBN 978-3957633026. Essays by Luke Skrebowski and Keller Easterling on Autonomy Cube, a piece of sculpture by Paglen and Jacob Appelbaum. In English and German. 
== Exhibitions == Bellwether Gallery, New York, November–December 2006 The Other Night Sky, Berkeley Art Museum, 2008 A Compendium of Secrets, Cologne Still Revolution: Suspended in Time, Museum of Contemporary Canadian Art, Toronto, May–June 2009. Group exhibition with Paglen, Barbara Astman, Walead Beshty, Mat Collishaw, Stan Douglas, Idris Khan, Martha Rosler, and Mikhael Subotzky A Hidden Landscape, Aksioma, Ljubljana, Slovenia Geographies of Seeing, Lighthouse, Brighton, England, October–November 2012 The Last Pictures, New York, 2012–13 Trevor Paglen, Altman Siegel gallery, San Francisco, CA, March–May 2015 The Octopus, Frankfurter Kunstverein, Frankfurt am Main, 2015 Autonomy Cube, Edith-Russ-Haus, Oldenburg, Germany, October 2015 – January 2016. Sculpture by Paglen and Jacob Appelbaum. Deutsche Börse Photography Foundation Prize 2016, The Photographers' Gallery, London, April–July 2016. Deutsche Börse Photography Prize shortlist with Paglen, Erik Kessels, Laura El-Tantawy, and Tobias Zielony. Radical Landscapes, di Rosa, Napa, February–April 2016 L’Image volée, Americas II, Bahamas Internet Cable System (BICS-1) and Globenet, Fondazione Prada, Milan (group exhibition), 2016 A Study of Invisible Images, Metro Pictures, New York, September–October 2017 == Awards == 2014: Pioneer Award from the Electronic Frontier Foundation. 2015: The Cultural Award from the German Society for Photography (DGPh) 2015: Academy Award as cameraman and director for the documentary film Citizenfour. 2016: Deutsche Börse Photography Foundation Prize 2017: MacArthur Fellowship, John D. and Catherine T. MacArthur Foundation, Chicago, IL 2018: Nam June Paik Art Center Prize == Films about Paglen == Unseen Skies (2021) == Works ==

  • Belief–desire–intention software model

    Belief–desire–intention software model

    The belief–desire–intention software model (BDI) is a software model developed for programming intelligent agents. Superficially characterized by the implementation of an agent's beliefs, desires and intentions, it actually uses these concepts to solve a particular problem in agent programming. In essence, it provides a mechanism for separating the activity of selecting a plan (from a plan library or an external planner application) from the execution of currently active plans. Consequently, BDI agents are able to balance the time spent on deliberating about plans (choosing what to do) and executing those plans (doing it). A third activity, creating the plans in the first place (planning), is not within the scope of the model, and is left to the system designer and programmer. == Overview == In order to achieve this separation, the BDI software model implements the principal aspects of Michael Bratman's theory of human practical reasoning (also referred to as Belief-Desire-Intention, or BDI). That is to say, it implements the notions of belief, desire and (in particular) intention, in a manner inspired by Bratman. For Bratman, desire and intention are both pro-attitudes (mental attitudes concerned with action). He identifies commitment as the distinguishing factor between desire and intention, noting that it leads to (1) temporal persistence in plans and (2) further plans being made on the basis of those to which it is already committed. The BDI software model partially addresses these issues. Temporal persistence, in the sense of explicit reference to time, is not explored. The hierarchical nature of plans is more easily implemented: a plan consists of a number of steps, some of which may invoke other plans. The hierarchical definition of plans itself implies a kind of temporal persistence, since the overarching plan remains in effect while subsidiary plans are being executed. 
An important aspect of the BDI software model (in terms of its research relevance) is the existence of logical models through which it is possible to define and reason about BDI agents. Research in this area has led, for example, to the axiomatization of some BDI implementations, as well as to formal logical descriptions such as Anand Rao and Michael Georgeff's BDICTL. The latter combines a multiple-modal logic (with modalities representing beliefs, desires and intentions) with the temporal logic CTL. More recently, Michael Wooldridge has extended BDICTL to define LORA (the Logic Of Rational Agents), by incorporating an action logic. In principle, LORA allows reasoning not only about individual agents, but also about communication and other interaction in a multi-agent system. The BDI software model is closely associated with intelligent agents, but does not, of itself, ensure all the characteristics associated with such agents. For example, it allows agents to have private beliefs, but does not force them to be private. It also has nothing to say about agent communication. Ultimately, the BDI software model is an attempt to solve a problem that has more to do with plans and planning (the choice and execution thereof) than it has to do with the programming of intelligent agents. This approach has recently been proposed by Steven Umbrello and Roman Yampolskiy as a means of designing autonomous vehicles for human values. == BDI agents == A BDI agent is a particular type of bounded rational software agent, imbued with particular mental attitudes, viz: Beliefs, Desires and Intentions (BDI). === Architecture === This section defines the idealized architectural components of a BDI system. Beliefs: Beliefs represent the informational state of the agent–its beliefs about the world (including itself and other agents). Beliefs can also include inference rules, allowing forward chaining to lead to new beliefs. 
Using the term belief rather than knowledge recognizes that what an agent believes may not necessarily be true (and in fact may change in the future). Beliefset: Beliefs are stored in a database (sometimes called a belief base or a belief set), although that is an implementation decision. Desires: Desires represent the motivational state of the agent. They represent objectives or situations that the agent would like to accomplish or bring about. Examples of desires might be: find the best price, go to the party or become rich. Goals: A goal is a desire that has been adopted for active pursuit by the agent. Usage of the term goals adds the further restriction that the set of active desires must be consistent. For example, one should not have concurrent goals to go to a party and to stay at home – even though they could both be desirable. Intentions: Intentions represent the deliberative state of the agent – what the agent has chosen to do. Intentions are desires to which the agent has to some extent committed. In implemented systems, this means the agent has begun executing a plan. Plans: Plans are sequences of actions (recipes or knowledge areas) that an agent can perform to achieve one or more of its intentions. Plans may include other plans: my plan to go for a drive may include a plan to find my car keys. This reflects that in Bratman's model, plans are initially only partially conceived, with details being filled in as they progress. Events: These are triggers for reactive activity by the agent. An event may update beliefs, trigger plans or modify goals. Events may be generated externally and received by sensors or integrated systems. Additionally, events may be generated internally to trigger decoupled updates or plans of activity. BDI was also extended with an obligations component, giving rise to the BOID agent architecture to incorporate obligations, norms and commitments of agents that act within a social environment. 
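
The architectural components described above can be illustrated with a small data-structure sketch. This is only one possible reading, not code from any BDI framework: the names `Plan`, `Agent`, `expand`, `adopt_goal` and the `conflicts` set are all hypothetical, and representing "consistency" as a set of mutually exclusive goal pairs is an assumption made for the example. It shows two points from the text: plans may contain other plans, and a desire becomes a goal only if it is consistent with the goals already adopted.

```python
from dataclasses import dataclass, field

# Every name below is hypothetical and chosen for illustration only.


@dataclass
class Plan:
    name: str
    # Each step is either a primitive action (str) or a nested Plan.
    steps: list = field(default_factory=list)


def expand(plan: Plan) -> list:
    """Flatten a hierarchical plan into its primitive actions, in order."""
    actions = []
    for step in plan.steps:
        if isinstance(step, Plan):
            actions.extend(expand(step))  # sub-plan: recurse into it
        else:
            actions.append(step)
    return actions


@dataclass
class Agent:
    beliefs: set = field(default_factory=set)      # belief set
    desires: set = field(default_factory=set)      # may conflict with each other
    goals: set = field(default_factory=set)        # must remain consistent
    intentions: list = field(default_factory=list) # plans being executed
    # Assumption: inconsistency is modelled as pairs of exclusive goals.
    conflicts: set = field(default_factory=set)

    def adopt_goal(self, desire: str) -> bool:
        """Promote a desire to a goal unless it clashes with a current goal."""
        if desire not in self.desires:
            return False
        for goal in self.goals:
            if frozenset({desire, goal}) in self.conflicts:
                return False
        self.goals.add(desire)
        return True


# The drive / car keys example from the text, as a nested plan.
find_keys = Plan("find car keys", ["check pockets", "check hallway table"])
drive = Plan("go for a drive", [find_keys, "start car", "drive off"])

# The party / stay at home example: both desirable, but not both goals.
agent = Agent(
    desires={"go to the party", "stay at home"},
    conflicts={frozenset({"go to the party", "stay at home"})},
)
```

Calling `expand(drive)` yields the primitive actions of the sub-plan followed by the remaining steps; once `agent.adopt_goal("go to the party")` has succeeded, adopting "stay at home" is refused.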
=== BDI interpreter === This section defines an idealized BDI interpreter that provides the basis of SRI's PRS lineage of BDI systems:

    initialize-state
    repeat
        options: option-generator(event-queue)
        selected-options: deliberate(options)
        update-intentions(selected-options)
        execute()
        get-new-external-events()
        drop-unsuccessful-attitudes()
        drop-impossible-attitudes()
    end repeat

=== Limitations and criticisms === The BDI software model is one example of a reasoning architecture for a single rational agent, and one concern in a broader multi-agent system. This section bounds the scope of concerns for the BDI software model, highlighting known limitations of the architecture. Learning: BDI agents lack any specific mechanisms within the architecture to learn from past behavior and adapt to new situations. Three attitudes: Classical decision theorists and planning researchers question the necessity of having all three attitudes, while distributed AI research questions whether the three attitudes are sufficient. Logics: The multi-modal logics that underlie BDI (that do not have complete axiomatizations and are not efficiently computable) have little relevance in practice. Multiple agents: In addition to not explicitly supporting learning, the framework may not be appropriate to learning behavior. Further, the BDI model does not explicitly describe mechanisms for interaction with other agents and integration into a multi-agent system. Explicit goals: Most BDI implementations do not have an explicit representation of goals. Lookahead: The architecture does not have (by design) any lookahead deliberation or forward planning. This may not be desirable because adopted plans may use up limited resources, actions may not be reversible, task execution may take longer than forward planning, and actions may have undesirable side effects if unsuccessful. 
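
The idealized interpreter loop given in the BDI interpreter section can be made concrete with a toy sketch. Everything here is an assumption made for illustration: the class and method names, the plan library keyed by event, the `failing_actions` set standing in for the environment, and the `impossible_goals` set standing in for a belief-based feasibility check. It is not the API of PRS or any other system; it only follows the order of steps in the pseudocode, running one action per intention per cycle.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical names throughout: a toy rendering of the idealized loop,
# not the API of PRS or of any other BDI system.


@dataclass
class Intention:
    goal: str
    actions: deque          # primitive actions still to run
    failed: bool = False


@dataclass
class Interpreter:
    plan_library: dict      # event -> (goal, [actions])
    failing_actions: set = field(default_factory=set)   # stand-in for the environment
    impossible_goals: set = field(default_factory=set)  # stand-in for a belief check
    event_queue: deque = field(default_factory=deque)
    intentions: list = field(default_factory=list)
    log: list = field(default_factory=list)             # actions that succeeded

    def option_generator(self):
        """Turn each queued event into the plan it triggers, if any."""
        options = []
        while self.event_queue:
            event = self.event_queue.popleft()
            if event in self.plan_library:
                options.append(self.plan_library[event])
        return options

    def deliberate(self, options):
        """Select options whose goal is not already being pursued."""
        active = {i.goal for i in self.intentions}
        selected = []
        for goal, actions in options:
            if goal not in active:
                active.add(goal)
                selected.append((goal, actions))
        return selected

    def update_intentions(self, selected):
        for goal, actions in selected:
            self.intentions.append(Intention(goal, deque(actions)))

    def execute(self):
        """Run one action per intention, then retire completed intentions."""
        for intention in self.intentions:
            if intention.failed or not intention.actions:
                continue
            action = intention.actions.popleft()
            if action in self.failing_actions:
                intention.failed = True
            else:
                self.log.append(action)
        # Retiring finished intentions is an addition of this sketch.
        self.intentions = [i for i in self.intentions if i.failed or i.actions]

    def drop_unsuccessful(self):
        self.intentions = [i for i in self.intentions if not i.failed]

    def drop_impossible(self):
        self.intentions = [i for i in self.intentions
                           if i.goal not in self.impossible_goals]

    def step(self, external_events=()):
        """One pass of the repeat loop, in the order given in the text."""
        options = self.option_generator()
        selected = self.deliberate(options)
        self.update_intentions(selected)
        self.execute()
        self.event_queue.extend(external_events)  # get-new-external-events
        self.drop_unsuccessful()
        self.drop_impossible()


agent = Interpreter(
    plan_library={
        "hungry": ("eat", ["find food", "eat food"]),
        "tired": ("rest", ["lie down"]),
    },
    failing_actions={"lie down"},
)
agent.event_queue.append("hungry")      # initialize-state
agent.step(external_events=["tired"])   # cycle 1: adopt "eat", queue "tired"
agent.step()                            # cycle 2: finish "eat", "rest" fails
print(agent.log)
```

Because selecting a plan (`deliberate`) is separate from running it (`execute`), the loop interleaves deliberation and execution one step at a time, which is the separation the model is built around. Note that nothing in the loop plans ahead or learns, matching the limitations listed above.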
== BDI agent implementations == === 'Pure' BDI === Procedural Reasoning System (PRS); IRMA (not implemented but can be considered as PRS with non-reconsideration); UM-PRS; OpenPRS; Distributed Multi-Agent Reasoning System (dMARS); AgentSpeak(L) – see Jason below; AgentSpeak(RT); Agent Real-Time System (ARTS); JAM; JACK Intelligent Agents; JADEX (open source project); JaKtA; JASON; GORITE; SPARK; 3APL; 2APL; GOAL agent programming language; CogniTAO (Think-As-One); Living Systems Process Suite; PROFETA; Gwendolen (Part of the Model Checking Agent Programming Languages Framework). === Extensions and hybrid systems === JACK Teams; CogniTAO (Think-As-One); Living Systems Process Suite; Brahms; JaCaMo.
