AI For Business Owners

AI For Business Owners — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Quantexa

    Quantexa

    Quantexa is a UK-based software company that develops artificial intelligence-based applications for data analytics and decision-making. The company was founded in 2016 and is headquartered in London, with operations in North America, Europe, and the Asia-Pacific region. As of 2025, Quantexa reported a valuation of $2.6 billion and provides services to organizations in over 70 countries. Investors include Warburg Pincus, HSBC, and the Ontario Teachers’ Pension Plan. == History == Quantexa was founded in London in 2016 by several co-founders, including Jamie Hutton, Richard Seewald, Imam Hoque, Felix Hoddinott, and Vishal Marria, who also serves as the company's chief executive officer. The company was established to develop tools intended to address limitations in traditional data analysis methods, particularly those related to identifying hidden connections across large datasets. The name "Quantexa" is derived from the company's focus on quantitative methods and data analysis. In 2023, Quantexa acquired Dublin-based AI firm Aylien. In April 2023, the company completed a Series E funding round, raising $129 million at a valuation of approximately $1.8 billion, marking its entry into "unicorn" status. In October 2024, the company reported annual recurring revenue (ARR) exceeding $100 million. In early 2025, Quantexa participated in the World Economic Forum's Unicorn Program, which supports high-growth technology companies. In March 2025, Quantexa completed a Series F funding round of $175 million, led by Teachers' Venture Growth, the venture arm of the Ontario Teachers' Pension Plan. That August, the company was reported to be considering a 2026 IPO. The company formed a partnership with Zurich in October 2025, the first insurer to add its AI-based Decision Intelligence platform to enhance fraud detection.

    Read more →
  • Latent semantic mapping

    Latent semantic mapping

    Latent semantic mapping (LSM) is a data-driven framework to model globally meaningful relationships implicit in large volumes of (often textual) data. It is a generalization of latent semantic analysis. In information retrieval, LSA enables retrieval on the basis of conceptual content, instead of merely matching words between queries and documents. LSM was derived from earlier work on latent semantic analysis. There are 3 main characteristics of latent semantic analysis: Discrete entities, usually in the form of words and documents, are mapped onto continuous vectors, the mapping involves a form of global correlation pattern, and dimensionality reduction is an important aspect of the analysis process. These constitute generic properties, and have been identified as potentially useful in a variety of different contexts. This usefulness has encouraged great interest in LSM. The intended product of latent semantic mapping, is a data-driven framework for modeling relationships in large volumes of data. Mac OS X v10.5 and later includes a framework implementing latent semantic mapping.

    Read more →
  • Referring expression generation

    Referring expression generation

    Referring expression generation (REG) is the subtask of natural language generation (NLG) that received most scholarly attention. While NLG is concerned with the conversion of non-linguistic information into natural language, REG focuses only on the creation of referring expressions (noun phrases) that identify specific entities called targets. This task can be split into two sections. The content selection part determines which set of properties distinguish the intended target and the linguistic realization part defines how these properties are translated into natural language. A variety of algorithms have been developed in the NLG community to generate different types of referring expressions. == Types of referring expressions == A referring expression (RE), in linguistics, is any noun phrase, or surrogate for a noun phrase, whose function in discourse is to identify some individual object (thing, being, event...) The technical terminology for identify differs a great deal from one school of linguistics to another. The most widespread term is probably refer, and a thing identified is a referent, as for example in the work of John Lyons. In linguistics, the study of reference relations belongs to pragmatics, the study of language use, though it is also a matter of great interest to philosophers, especially those wishing to understand the nature of knowledge, perception and cognition more generally. Various devices can be used for reference: determiners, pronouns, proper names... Reference relations can be of different kinds; referents can be in a "real" or imaginary world, in discourse itself, and they may be singular, plural, or collective. === Pronouns === The simplest type of referring expressions are pronoun such as he and it. The linguistics and natural language processing communities have developed various models for predicting anaphor referents, such as centering theory, and ideally referring-expression generation would be based on such models. However most NLG systems use much simpler algorithms, for example using a pronoun if the referent was mentioned in the previous sentence (or sentential clause), and no other entity of the same gender was mentioned in this sentence. === Definite noun phrases === There has been a considerable amount of research on generating definite noun phrases, such as the big red book. Much of this builds on the model proposed by Dale and Reiter. This has been extended in various ways, for example Krahmer et al. present a graph-theoretic model of definite NP generation with many nice properties. In recent years a shared-task event has compared different algorithms for definite NP generation, using the TUNA corpus. === Spatial and temporal reference === Recently there has been more research on generating referring expressions for time and space. Such references tend to be imprecise (what is the exact meaning of tonight?), and also to be interpreted in different ways by different people. Hence it may be necessary to explicitly reason about false positive vs false negative tradeoffs, and even calculate the utility of different possible referring expressions in a particular task context. === Criteria for good expressions === Ideally, a good referring expression should satisfy a number of criteria: Referential success: It should unambiguously identify the referent to the reader. Ease of comprehension: The reader should be able to quickly read and understand it. Computational complexity: The generation algorithm should be fast No false inferences: The expression should not confuse or mislead the reader by suggesting false implicatures or other pragmatic inferences. For example, a reader may be confused if he is told Sit by the brown wooden table in a context where there is only one table. == History == === Pre-2000 era === REG goes back to the early days of NLG. One of the first approaches was done by Winograd in 1972 who developed an "incremental" REG algorithm for his SHRDLU program. Afterwards researchers started to model the human abilities to create referring expressions in the 1980s. This new approach to the topic was influenced by the researchers Appelt and Kronfeld who created the programs KAMP and BERTRAND and considered referring expressions as parts of bigger speech acts. Some of their most interesting findings were the fact that referring expressions can be used to add information beyond the identification of the referent as well as the influence of communicative context and the Gricean maxims on referring expressions. Furthermore, its skepticism concerning the naturalness of minimal descriptions made Appelt and Kronfeld's research a foundation of later work on REG. The search for simple, well-defined problems changed the direction of research in the early 1990s. This new approach was led by Dale and Reiter who stressed the identification of the referent as the central goal. Like Appelt they discuss the connection between the Gricean maxims and referring expressions in their culminant paper in which they also propose a formal problem definition. Furthermore, Reiter and Dale discuss the Full Brevity and Greedy Heuristics algorithms as well as their Incremental Algorithm(IA) which became one of the most important algorithms in REG. === Later developments === After 2000 the research began to lift some of the simplifying assumptions, that had been made in early REG research in order to create more simple algorithms. Different research groups concentrated on different limitations creating several expanded algorithms. Often these extend the IA in a single perspective for example in relation to: Reference to Sets like "the t-shirt wearers" or "the green apples and the banana on the left" Relational Descriptions like "the cup on the table" or "the woman who has three children" Context Dependency, Vagueness and Gradeability include statements like "the older man" or "the car on the left" which are often unclear without a context Salience and Generation of Pronouns are highly discourse dependent making for example "she" a reference to "the (most salient) female person" Many simplifying assumptions are still in place or have just begun to be worked on. Also a combination of the different extensions has yet to be done and is called a "non-trivial enterprise" by Krahmer and van Deemter. Another important change after 2000 was the increasing use of empirical studies in order to evaluate algorithms. This development took place due to the emergence of transparent corpora. Although there are still discussions about what the best evaluation metrics are, the use of experimental evaluation has already led to a better comparability of algorithms, a discussion about the goals of REG and more task-oriented research. Furthermore, research has extended its range to related topics such as the choice of Knowledge Representation(KR) Frameworks. In this area the main question, which KR framework is most suitable for the use in REG remains open. The answer to this question depends on how well descriptions can be expressed or found. A lot of the potential of KR frameworks has been left unused so far. Some of the different approaches are the usage of: Graph search which treats relations between targets in the same way as properties. Constraint Satisfaction which allows for a separation between problem specification and the implementation. Modern Knowledge Representation which offers logical inference in for example Description Logic or Conceptual Graphs. == Problem definition == Dale and Reiter (1995) think about referring expressions as distinguishing descriptions. They define: The referent as the entity that should be described The context set as set of salient entities The contrast set or potential distractors as all elements of the context set except the referent A property as a reference to a single attribute–value pair Each entity in the domain can be characterised as a set of attribute–value pairs for example ⟨ {\displaystyle \langle } type, dog ⟩ {\displaystyle \rangle } , ⟨ {\displaystyle \langle } gender, female ⟩ {\displaystyle \rangle } or ⟨ {\displaystyle \langle } age, 10 years ⟩ {\displaystyle \rangle } . The problem then is defined as follows: Let r {\displaystyle r} be the intended referent, and C {\displaystyle C} be the contrast set. Then, a set L {\displaystyle L} of attribute–value pairs will represent a distinguishing description if the following two conditions hold: Every attribute–value pair in L {\displaystyle L} applies to r {\displaystyle r} : that is, every element of L {\displaystyle L} specifies an attribute–value that r {\displaystyle r} possesses. For every member c {\displaystyle c} of C {\displaystyle C} , there is at least one element l {\displaystyle l} of L {\displaystyle L} that does not apply to c {\displaystyle c} : that is, there is an l {\displaystyle l} in L {\displaystyle L} that specifies an attribute–value that c {\displaystyle c} does not possess. l {\displaystyle l} is said

    Read more →
  • Query understanding

    Query understanding

    Query understanding is the process of inferring the intent of a search engine user by extracting semantic meaning from the searcher’s keywords. Query understanding methods generally take place before the search engine retrieves and ranks results. It is related to natural language processing but specifically focused on the understanding of search queries. == Methods == === Stemming and lemmatization === Many languages inflect words to reflect their role in the utterance they appear in. The variation between various forms of a word is likely to be of little importance for the relatively coarse-grained model of meaning involved in a retrieval system, and for this reason the task of conflating the various forms of a word is a potentially useful technique to increase recall of a retrieval system. Stemming algorithms, also known as stemmers, typically use a collection of simple rules to remove suffixes intended to model the language’s inflection rules. For some languages, there are simple lemmatisation methods to reduce a word in query to its lemma or root form or its stem; for others, this operation involves non-trivial string processing and may require recognizing the word's part of speech or referencing a lexical database. The effectiveness of stemming and lemmatization varies across languages. === Query Segmentation === Query segmentation is a key component of query understanding, aiming to divide a query into meaningful segments. Traditional approaches, such as the bag-of-words model, treat individual words as independent units, which can limit interpretative accuracy. For languages like Chinese, where words are not separated by spaces, segmentation is essential, as individual characters often lack standalone meaning. Even in English, the BOW model may not capture the full meaning, as certain phrases—such as "New York"—carry significance as a whole rather than as isolated terms. By identifying phrases or entities within queries, query segmentation enhances interpretation, enabling search engines to apply proximity and ordering constraints, ultimately improving search accuracy and user satisfaction. === Entity recognition === Entity recognition is the process of locating and classifying entities within a text string. Named-entity recognition specifically focuses on named entities, such as names of people, places, and organizations. In addition, entity recognition includes identifying concepts in queries that may be represented by multi-word phrases. Entity recognition systems typically use grammar-based linguistic techniques or statistical machine learning models. === Query rewriting === Query rewriting is the process of automatically reformulating a search query to more accurately capture its intent. Query expansion adds additional query terms, such as synonyms, in order to retrieve more documents and thereby increase recall. Query relaxation removes query terms to reduce the requirements for a document to match the query, thereby also increasing recall. Other forms of query rewriting, such as automatically converting consecutive query terms into phrases and restricting query terms to specific fields, aim to increase precision. === Spelling Correction === Automatic spelling correction is a critical feature of modern search engines, designed to address common spelling errors in user queries. Such errors are especially frequent as users often search for unfamiliar topics. By correcting misspelled queries, search engines enhance their understanding of user intent, thereby improving the relevance and quality of search results and overall user experience.

    Read more →
  • Seq2seq

    Seq2seq

    Seq2seq is a family of machine learning approaches used for natural language processing. Originally developed by Lê Viết Quốc, a Vietnamese computer scientist and a machine learning pioneer at Google Brain, this framework has become foundational in many modern AI systems. Applications include language translation, image captioning, conversational models, speech recognition, and text summarization. Seq2seq uses sequence transformation: it turns one sequence into another sequence. == History == One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: 'This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode. seq2seq is an approach to machine translation (or more generally, sequence transduction) with roots in information theory, where communication is understood as an encode-transmit-decode process, and machine translation can be studied as a special case of communication. This viewpoint was elaborated, for example, in the noisy channel model of machine translation. In practice, seq2seq maps an input sequence into a real-numerical vector by using a neural network (the encoder), and then maps it back to an output sequence using another neural network (the decoder). The idea of encoder-decoder sequence transduction had been developed in the early 2010s. The papers most commonly cited as the originators that produced seq2seq are two papers from 2014. In the seq2seq as proposed by them, both the encoder and the decoder were LSTMs. This had the "bottleneck" problem, since the encoding vector has a fixed size, so for long input sequences, information would tend to be lost, as they are difficult to fit into the fixed-length encoding vector. The attention mechanism, proposed in 2014, resolved the bottleneck problem. They called their model RNNsearch, as it "emulates searching through a source sentence during decoding a translation". A problem with seq2seq models at this point was that recurrent neural networks are difficult to parallelize. The 2017 publication of Transformers resolved the problem by replacing the encoding RNN with self-attention Transformer blocks ("encoder blocks"), and the decoding RNN with cross-attention causally-masked Transformer blocks ("decoder blocks"). === Priority dispute === One of the papers cited as the originator for seq2seq is (Sutskever et al 2014), published at Google Brain while they were on Google's machine translation project. The research allowed Google to overhaul Google Translate into Google Neural Machine Translation in 2016. Tomáš Mikolov claims to have developed the idea (before joining Google Brain) of using a "neural language model on pairs of sentences... and then [generating] translation after seeing the first sentence"—which he equates with seq2seq machine translation, and to have mentioned the idea to Ilya Sutskever and Quoc Le (while at Google Brain), who failed to acknowledge him in their paper. Mikolov had worked on RNNLM (using RNN for language modelling) for his PhD thesis, and is more notable for developing word2vec. == Architecture == The main reference for this section is. === Encoder === The encoder is responsible for processing the input sequence and capturing its essential information, which is stored as the hidden state of the network and, in a model with attention mechanism, a context vector. The context vector is the weighted sum of the input hidden states and is generated for every time instance in the output sequences. === Decoder === The decoder takes the context vector and hidden states from the encoder and generates the final output sequence. The decoder operates in an autoregressive manner, producing one element of the output sequence at a time. At each step, it considers the previously generated elements, the context vector, and the input sequence information to make predictions for the next element in the output sequence. Specifically, in a model with attention mechanism, the context vector and the hidden state are concatenated together to form an attention hidden vector, which is used as an input for the decoder. The seq2seq method developed in the early 2010s uses two neural networks: an encoder network converts an input sentence into numerical vectors, and a decoder network converts those vectors to sentences in the target language. The Attention mechanism was grafted onto this structure in 2014 and is shown below. Later it was refined into the encoder-decoder Transformer architecture of 2017. === Training vs prediction === There is a subtle difference between training and prediction. During training time, both the input and the output sequences are known. During prediction time, only the input sequence is known, and the output sequence must be decoded by the network itself. Specifically, consider an input sequence x 1 : n {\displaystyle x_{1:n}} and output sequence y 1 : m {\displaystyle y_{1:m}} . The encoder would process the input x 1 : n {\displaystyle x_{1:n}} step by step. After that, the decoder would take the output from the encoder, as well as the as input, and produce a prediction y ^ 1 {\displaystyle {\hat {y}}_{1}} . Now, the question is: what should be input to the decoder in the next step? A standard method for training is "teacher forcing". In teacher forcing, no matter what is output by the decoder, the next input to the decoder is always the reference. That is, even if y ^ 1 ≠ y 1 {\displaystyle {\hat {y}}_{1}\neq y_{1}} , the next input to the decoder is still y 1 {\displaystyle y_{1}} , and so on. During prediction time, the "teacher" y 1 : m {\displaystyle y_{1:m}} would be unavailable. Therefore, the input to the decoder must be y ^ 1 {\displaystyle {\hat {y}}_{1}} , then y ^ 2 {\displaystyle {\hat {y}}_{2}} , and so on. It is found that if a model is trained purely by teacher forcing, its performance would degrade during prediction time, since generation based on the model's own output is different from generation based on the teacher's output. This is called exposure bias or a train/test distribution shift. A 2015 paper recommends that, during training, randomly switch between teacher forcing and no teacher forcing. === Attention for seq2seq === The attention mechanism is an enhancement introduced by Bahdanau et al. in 2014 to address limitations in the basic Seq2Seq architecture where a longer input sequence results in the hidden state output of the encoder becoming irrelevant for the decoder. It enables the model to selectively focus on different parts of the input sequence during the decoding process. At each decoder step, an alignment model calculates the attention score using the current decoder state and all of the attention hidden vectors as input. An alignment model is another neural network model that is trained jointly with the seq2seq model used to calculate how well an input, represented by the hidden state, matches with the previous output, represented by attention hidden state. A softmax function is then applied to the attention score to get the attention weight. In some models, the encoder states are directly fed into an activation function, removing the need for alignment model. An activation function receives one decoder state and one encoder state and returns a scalar value of their relevance. Consider the seq2seq language English-to-French translation task. To be concrete, let us consider the translation of "the zone of international control ", which should translate to "la zone de contrôle international ". Here, we use the special token as a control character to delimit the end of input for both the encoder and the decoder. An input sequence of text x 0 , x 1 , … {\displaystyle x_{0},x_{1},\dots } is processed by a neural network (which can be an LSTM, a Transformer encoder, or some other network) into a sequence of real-valued vectors h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , where h {\displaystyle h} stands for "hidden vector". After the encoder has finished processing, the decoder starts operating over the hidden vectors, to produce an output sequence y 0 , y 1 , … {\displaystyle y_{0},y_{1},\dots } , autoregressively. That is, it always takes as input both the hidden vectors produced by the encoder, and what the decoder itself has produced before, to produce the next output word: ( h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , "") → "la" ( h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , " la") → "la zone" ( h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , " la zone") → "la zone de" ... ( h 0 , h 1 , … {\displaystyle h_{0},h_{1},\dots } , " la zone de contrôle international") → "la zone de contrôle international " Here, we use the special token as a control character to delimit the start of input for the decoder. The decoding terminates as soon as "" appears in the decoder output. ==

    Read more →
  • Residual neural network

    Residual neural network

    A residual neural network (also referred to as a residual network or ResNet) is a deep learning architecture in which the layers learn residual functions with reference to the layer inputs. It was developed in 2015 for image recognition, and won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) of that year. As a point of terminology, "residual connection" refers to the specific architectural motif of x ↦ f ( x ) + x {\displaystyle x\mapsto f(x)+x} , where f {\displaystyle f} is an arbitrary neural network module. The motif had been used previously (see §History for details). However, the publication of ResNet made it widely popular for feedforward networks, appearing in neural networks that are seemingly unrelated to ResNet. The residual connection stabilizes the training and convergence of deep neural networks with hundreds of layers, and is a common motif in deep neural networks, such as transformer models (e.g., BERT, and GPT models such as ChatGPT), the AlphaGo Zero system, the AlphaStar system, and the AlphaFold system. == Mathematics == === Residual connection === In a multilayer neural network model, consider a (non-residual) subnetwork with a certain number of stacked layers (e.g., 2 or 3). Let H ( x ; α ) {\displaystyle H(x;\alpha )} denote the subnetwork. Suppose H ∗ {\displaystyle H^{}} is the desired optimal output of this subnetwork. Residual learning simply adds x {\displaystyle x} directly to the output, such that the optimal learned output now becomes be H ∗ − x {\displaystyle H^{}-x} , which is interpreted as a "residual" with respect to x {\displaystyle x} . The operation of "adding x {\displaystyle x} " is implemented via a "skip connection" that performs an identity mapping to connect the input of the subnetwork with its output. This connection is referred to as a "residual connection" in later work. Let F ( x ; α ) = H ( x ; a ) + x {\displaystyle F(x;\alpha )=H(x;a)+x} . The function F {\displaystyle F} is often represented by matrix multiplication interlaced with activation functions and normalization operations (e.g., batch normalization or layer normalization). As a whole, one of these subnetworks is referred to as a "residual block". A deep residual network is constructed by simply stacking these blocks. Long short-term memory (LSTM) has a memory mechanism that serves as a residual connection. In an LSTM without a forget gate, an input x t {\displaystyle x_{t}} is processed by a function F {\displaystyle F} and added to a memory cell c t {\displaystyle c_{t}} , resulting in c t + 1 = c t + F ( x t ) {\displaystyle c_{t+1}=c_{t}+F(x_{t})} . An LSTM with a forget gate essentially functions as a highway network. To stabilize the variance of the layers' inputs, it is recommended to replace the residual connections x + f ( x ) {\displaystyle x+f(x)} with x / L + f ( x ) {\displaystyle x/L+f(x)} , where L {\displaystyle L} is the total number of residual layers. === Projection connection === If the function F {\displaystyle F} is of type F : R n → R m {\displaystyle F:\mathbb {R} ^{n}\to \mathbb {R} ^{m}} where n ≠ m {\displaystyle n\neq m} , then F ( x ) + x {\displaystyle F(x)+x} is undefined. To handle this special case, a projection connection is used: y = F ( x ) + P ( x ) {\displaystyle y=F(x)+P(x)} where P {\displaystyle P} is typically a linear projection, defined by P ( x ) = M x {\displaystyle P(x)=Mx} where M {\displaystyle M} is a m × n {\displaystyle m\times n} matrix. The matrix is trained via backpropagation, as is any other parameter of the model. === Signal propagation === The introduction of identity mappings facilitates signal propagation in both forward and backward paths. ==== Forward propagation ==== If the output of the ℓ {\displaystyle \ell } -th residual block is the input to the ( ℓ + 1 ) {\displaystyle (\ell +1)} -th residual block (assuming no activation function between blocks), then the ( ℓ + 1 ) {\displaystyle (\ell +1)} -th input is: x ℓ + 1 = F ( x ℓ ) + x ℓ {\displaystyle x_{\ell +1}=F(x_{\ell })+x_{\ell }} Applying this formulation recursively, e.g.: x ℓ + 2 = F ( x ℓ + 1 ) + x ℓ + 1 = F ( x ℓ + 1 ) + F ( x ℓ ) + x ℓ {\displaystyle {\begin{aligned}x_{\ell +2}&=F(x_{\ell +1})+x_{\ell +1}\\&=F(x_{\ell +1})+F(x_{\ell })+x_{\ell }\end{aligned}}} yields the general relationship: x L = x ℓ + ∑ i = ℓ L − 1 F ( x i ) {\displaystyle x_{L}=x_{\ell }+\sum _{i=\ell }^{L-1}F(x_{i})} where L {\textstyle L} is the index of a residual block and ℓ {\textstyle \ell } is the index of some earlier block. This formulation suggests that there is always a signal that is directly sent from a shallower block ℓ {\textstyle \ell } to a deeper block L {\textstyle L} . ==== Backward propagation ==== The residual learning formulation provides the added benefit of mitigating the vanishing gradient problem to some extent. However, it is crucial to acknowledge that the vanishing gradient issue is not the root cause of the degradation problem, which is tackled through the use of normalization. To observe the effect of residual blocks on backpropagation, consider the partial derivative of a loss function E {\displaystyle {\mathcal {E}}} with respect to some residual block input x ℓ {\displaystyle x_{\ell }} . Using the equation above from forward propagation for a later residual block L > ℓ {\displaystyle L>\ell } : ∂ E ∂ x ℓ = ∂ E ∂ x L ∂ x L ∂ x ℓ = ∂ E ∂ x L ( 1 + ∂ ∂ x ℓ ∑ i = ℓ L − 1 F ( x i ) ) = ∂ E ∂ x L + ∂ E ∂ x L ∂ ∂ x ℓ ∑ i = ℓ L − 1 F ( x i ) {\displaystyle {\begin{aligned}{\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}{\frac {\partial x_{L}}{\partial x_{\ell }}}\\&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}\left(1+{\frac {\partial }{\partial x_{\ell }}}\sum _{i=\ell }^{L-1}F(x_{i})\right)\\&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}+{\frac {\partial {\mathcal {E}}}{\partial x_{L}}}{\frac {\partial }{\partial x_{\ell }}}\sum _{i=\ell }^{L-1}F(x_{i})\end{aligned}}} This formulation suggests that the gradient computation of a shallower layer, ∂ E ∂ x ℓ {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}} , always has a later term ∂ E ∂ x L {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{L}}}} that is directly added. Even if the gradients of the F ( x i ) {\displaystyle F(x_{i})} terms are small, the total gradient ∂ E ∂ x ℓ {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}} resists vanishing due to the added term ∂ E ∂ x L {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{L}}}} . == Variants of residual blocks == === Basic block === A basic block is the simplest building block studied in the original ResNet. This block consists of two sequential 3x3 convolutional layers and a residual connection. The input and output dimensions of both layers are equal. === Bottleneck block === A bottleneck block consists of three sequential convolutional layers and a residual connection. The first layer in this block is a 1×1 convolution for dimension reduction (e.g., to 1/2 of the input dimension); the second layer performs a 3×3 convolution; the last layer is another 1×1 convolution for dimension restoration. The models of ResNet-50, ResNet-101, and ResNet-152 are all based on bottleneck blocks. === Pre-activation block === The pre-activation residual block applies activation functions before applying the residual function F {\displaystyle F} . Formally, the computation of a pre-activation residual block can be written as: x ℓ + 1 = F ( ϕ ( x ℓ ) ) + x ℓ {\displaystyle x_{\ell +1}=F(\phi (x_{\ell }))+x_{\ell }} where ϕ {\displaystyle \phi } can be any activation (e.g. ReLU) or normalization (e.g. LayerNorm) operation. This design reduces the number of non-identity mappings between residual blocks, and allows an identity mapping directly from the input to the output. This design was used to train models with 200 to over 1000 layers, and was found to consistently outperform variants where the residual path is not an identity function. The pre-activation ResNet with 200 layers took 3 weeks to train for ImageNet on 8 GPUs in 2016. Since GPT-2, transformer blocks have been mostly implemented as pre-activation blocks. This is often referred to as "pre-normalization" in the literature of transformer models. == Applications == Originally, ResNet was designed for computer vision. All transformer architectures include residual connections. Indeed, very deep transformers cannot be trained without them. The original ResNet paper made no claim on being inspired by biological systems. However, later research has related ResNet to biologically-plausible algorithms. A study published in Science in 2023 disclosed the complete connectome of an insect brain (specifically that of a fruit fly larva). This study discovered "multilayer shortcuts" that resemble the skip connections in artificial neural networks, including ResNets. == History == === Previous work === Residual connections were noticed in neu

    Read more →
  • Gradient vector flow

    Gradient vector flow

    Gradient vector flow (GVF), a computer vision framework introduced by Chenyang Xu and Jerry L. Prince, is the vector field that is produced by a process that smooths and diffuses an input vector field. It is usually used to create a vector field from images that points to object edges from a distance. It is widely used in image analysis and computer vision applications for object tracking, shape recognition, segmentation, and edge detection. In particular, it is commonly used in conjunction with active contour model. == Background == Finding objects or homogeneous regions in images is a process known as image segmentation. In many applications, the locations of object edges can be estimated using local operators that yield a new image called an edge map. The edge map can then be used to guide a deformable model, sometimes called an active contour or a snake, so that it passes through the edge map in a smooth way, therefore defining the object itself. A common way to encourage a deformable model to move toward the edge map is to take the spatial gradient of the edge map, yielding a vector field. Since the edge map has its highest intensities directly on the edge and drops to zero away from the edge, these gradient vectors provide directions for the active contour to move. When the gradient vectors are zero, the active contour will not move, and this is the correct behavior when the contour rests on the peak of the edge map itself. However, because the edge itself is defined by local operators, these gradient vectors will also be zero far away from the edge and therefore the active contour will not move toward the edge when initialized far away from the edge. Gradient vector flow (GVF) is the process that spatially extends the edge map gradient vectors, yielding a new vector field that contains information about the location of object edges throughout the entire image domain. GVF is defined as a diffusion process operating on the components of the input vector field. It is designed to balance the fidelity of the original vector field, so it is not changed too much, with a regularization that is intended to produce a smooth field on its output. Although GVF was designed originally for the purpose of segmenting objects using active contours attracted to edges, it has been since adapted and used for many alternative purposes. Some newer purposes including defining a continuous medial axis representation, regularizing image anisotropic diffusion algorithms, finding the centers of ribbon-like objects, constructing graphs for optimal surface segmentations, creating a shape prior, and much more. == Theory == The theory of GVF was originally described by Xu and Prince. Let f ( x , y ) {\displaystyle \textstyle f(x,y)} be an edge map defined on the image domain. For uniformity of results, it is important to restrict the edge map intensities to lie between 0 and 1, and by convention f ( x , y ) {\displaystyle \textstyle f(x,y)} takes on larger values (close to 1) on the object edges. The gradient vector flow (GVF) field is given by the vector field v ( x , y ) = [ u ( x , y ) , v ( x , y ) ] {\displaystyle \textstyle \mathbf {v} (x,y)=[u(x,y),v(x,y)]} that minimizes the energy functional In this equation, subscripts denote partial derivatives and the gradient of the edge map is given by the vector field ∇ f = ( f x , f y ) {\displaystyle \textstyle \nabla f=(f_{x},f_{y})} . Figure 1 shows an edge map, the gradient of the (slightly blurred) edge map, and the GVF field generated by minimizing E {\displaystyle \textstyle {\mathcal {E}}} . Equation 1 is a variational formulation that has both a data term and a regularization term. The first term in the integrand is the data term. It encourages the solution v {\displaystyle \textstyle \mathbf {v} } to closely agree with the gradients of the edge map since that will make v − ∇ f {\displaystyle \textstyle \mathbf {v} -\nabla f} small. However, this only needs to happen when the edge map gradients are large since v − ∇ f {\displaystyle \textstyle \mathbf {v} -\nabla f} is multiplied by the square of the length of these gradients. The second term in the integrand is a regularization term. It encourages the spatial variations in the components of the solution to be small by penalizing the sum of all the partial derivatives of v {\displaystyle \textstyle \mathbf {v} } . As is customary in these types of variational formulations, there is a regularization parameter μ > 0 {\displaystyle \textstyle \mu >0} that must be specified by the user in order to trade off the influence of each of the two terms. If μ {\displaystyle \textstyle \mu } is large, for example, then the resulting field will be very smooth and may not agree as well with the underlying edge gradients. Theoretical Solution. Finding v ( x , y ) {\displaystyle \textstyle \mathbf {v} (x,y)} to minimize Equation 1 requires the use of calculus of variations since v ( x , y ) {\displaystyle \textstyle \mathbf {v} (x,y)} is a function, not a variable. Accordingly, the Euler equations, which provide the necessary conditions for v {\displaystyle \textstyle \mathbf {v} } to be a solution can be found by calculus of variations, yielding where ∇ 2 {\displaystyle \textstyle \nabla ^{2}} is the Laplacian operator. It is instructive to examine the form of the equations in (2). Each is a partial differential equation that the components u {\displaystyle u} and v {\displaystyle v} of v {\displaystyle \mathbf {v} } must satisfy. If the magnitude of the edge gradient is small, then the solution of each equation is guided entirely by Laplace's equation, for example ∇ 2 u = 0 {\displaystyle \textstyle \nabla ^{2}u=0} , which will produce a smooth scalar field entirely dependent on its boundary conditions. The boundary conditions are effectively provided by the locations in the image where the magnitude of the edge gradient is large, where the solution is driven to agree more with the edge gradients. Computational Solutions. There are two fundamental ways to compute GVF. First, the energy function E {\displaystyle {\mathcal {E}}} itself (1) can be directly discretized and minimized, for example, by gradient descent. Second, the partial differential equations in (2) can be discretized and solved iteratively. The original GVF paper used an iterative approach, while later papers introduced considerably faster implementations such as an octree-based method, a multi-grid method, and an augmented Lagrangian method. In addition, very fast GPU implementations have been developed in Extensions and Advances. GVF is easily extended to higher dimensions. The energy function is readily written in a vector form as which can be solved by gradient descent or by finding and solving its Euler equation. Figure 2 shows an illustration of a three-dimensional GVF field on the edge map of a simple object (see ). The data and regularization terms in the integrand of the GVF functional can also be modified. A modification described in , called generalized gradient vector flow (GGVF) defines two scalar functions and reformulates the energy as While the choices g ( ∇ f | ) = μ {\displaystyle \textstyle g(\nabla f|)=\mu } and h ( | ∇ f | ) = | ∇ f | 2 {\displaystyle \textstyle h(|\nabla f|)=|\nabla f|^{2}} reduce GGVF to GVF, the alternative choices g ( | ∇ f | ) = exp ⁡ { − | ∇ f | / K } {\displaystyle \textstyle g(|\nabla f|)=\exp\{-|\nabla f|/K\}} and h ( ∇ f | ) = 1 − g ( | ∇ f | ) {\displaystyle \textstyle h(\nabla f|)=1-g(|\nabla f|)} , for K {\displaystyle K} a user-selected constant, can improve the tradeoff between the data term and its regularization in some applications. The GVF formulation has been further extended to vector-valued images in where a weighted structure tensor of a vector-valued image is used. A learning based probabilistic weighted GVF extension was proposed in to further improve the segmentation for images with severely cluttered textures or high levels of noise. The variational formulation of GVF has also been modified in motion GVF (MGVF) to incorporate object motion in an image sequence. Whereas the diffusion of GVF vectors from a conventional edge map acts in an isotropic manner, the formulation of MGVF incorporates the expected object motion between image frames. An alternative to GVF called vector field convolution (VFC) provides many of the advantages of GVF, has superior noise robustness, and can be computed very fast. The VFC field v V F C {\displaystyle \textstyle \mathbf {v} _{\mathrm {VFC} }} is defined as the convolution of the edge map f {\displaystyle f} with a vector field kernel k {\displaystyle \mathbf {k} } where The vector field kernel k {\displaystyle \textstyle \mathbf {k} } has vectors that always point toward the origin but their magnitudes, determined in detail by the function m {\displaystyle m} , decrease to zero with increasing distance from the origin. The beauty of VFC is that it can be computed very rapidly using a fast Fourier tra

    Read more →
  • GeneRIF

    GeneRIF

    A GeneRIF or Gene Reference Into Function is a short (255 characters or fewer) statement about the function of a gene. GeneRIFs provide a simple mechanism for allowing scientists to add to the functional annotation of genes described in the Entrez Gene database. In practice, function is constructed quite broadly. For example, there are GeneRIFs that discuss the role of a gene in a disease, GeneRIFs that point the viewer towards a review article about the gene, and GeneRIFs that discuss the structure of a gene. However, the stated intent is for GeneRIFs to be about gene function. Currently over half a million geneRIFs have been created for genes from almost 1000 different species. GeneRIFs are always associated with specific entries in the Entrez Gene database. Each GeneRIF has a pointer to the PubMed ID (a type of document identifier) of a scientific publication that provides evidence for the statement made by the GeneRIF. GeneRIFs are often extracted directly from the document that is identified by the PubMed ID, very frequently from its title or from its final sentence. GeneRIFs are usually produced by NCBI indexers, but anyone may submit a GeneRIF. To be processed, a valid Gene ID must exist for the specific gene, or the Gene staff must have assigned an overall Gene ID to the species. The latter case is implemented via records in Gene with the symbol NEWENTRY. Once the Gene ID is identified, only three types of information are required to complete a submission: a concise phrase describing a function or functions (less than 255 characters in length, preferably more than a restatement of the title of the paper); a published paper describing that function, implemented by supplying the PubMed ID of a citation in PubMed; a valid e-mail address (which will remain confidential). == Example == Here are some GeneRIFs taken from Entrez Gene for GeneID 7157, the human gene TP53. The PubMed document identifiers have been omitted from the examples. Note the wide variability with respect to the presence or absence of punctuation and of sentence-initial capital letters. p53 and c-erbB-2 may have independent role in carcinogenesis of gall bladder cancer Degradation of endogenous HIPK2 depends on the presence of a functional p53 protein. p53 codon 72 alleles influence the response to anticancer drugs in cells from aged people by regulating the cell cycle inhibitor p21WAF1 Logistic regression analysis showed p53 and COX-2 as dependent predictors in pancreatic carcinogenesis, and a reciprocal relationship to neoplastic progression between p53 and COX-2. GeneRIFs are an unusual type of textual genre, and they have recently been the subject of a number of articles from the natural language processing community.

    Read more →
  • Amira (software)

    Amira (software)

    Amira (ah-MEER-ah) is a software platform for visualization, processing, and analysis of 3D and 4D data. It is being actively developed by Thermo Fisher Scientific in collaboration with the Zuse Institute Berlin (ZIB), and commercially distributed by Thermo Fisher Scientific — together with its sister software Avizo. == Overview == Amira is an extendable software system for scientific visualization, data analysis, and presentation of 3D and 4D data. It is used by researchers and engineers in academia and industry. It is a tool for processing, analysis and visualization of data from various modalities; e.g. micro-CT, PET, Ultrasound. It is used in many fields, such as microscopy in biology and materials science, molecular biology, quantum physics, astrophysics, computational fluid dynamics (CFD), finite element modeling (FEM), non-destructive testing (NDT), and many more. One of the key features, besides data visualization, is Amira's set of tools for image segmentation and geometry reconstruction. This allows the user to mark (or segment) structures and regions of interest in 3D image volumes using automatic, semi-automatic, and manual tools. The segmentation can then be used for a variety of subsequent tasks, such as volumetric analysis, density analysis, shape analysis, or the generation of 3D computer models for visualization, numerical simulations, or rapid prototyping or 3D printing. Other key Amira features are multi-planar and volume visualization, image registration, filament tracing, cell separation and analysis, tetrahedral mesh generation, fiber-tracking from diffusion tensor imaging (DTI) data, skeletonization, spatial graph analysis, and stereoscopic rendering of 3D data over multiple displays and immersive virtual reality environments, including CAVEs. As a commercial product Amira requires the purchase of a license or an academic subscription. A time-limited, but full-featured evaluation version is available for download free of charge. == History == === 1993–1998: Research software === Amira's roots go back to 1993 and the Department for Scientific Visualization, headed by Hans-Christian Hege at the Zuse Institute Berlin (ZIB). The ZIB is a research institute for mathematics and informatics. The Scientific Visualization department's mission is to help solve computationally and scientifically challenging tasks in medicine, biology, engineering and materials science. For this purpose, it develops algorithms and software for 2D, 3D, and 4D data visualization and visually supported exploration and analysis. At that time, the young visualization group at the ZIB had experience with the extendable, data flow-oriented visualization environments apE, IRIS Explorer, and Advanced Visualization Studio (AVS), but was not satisfied with these products' interactivity, flexibility, and ease-of-use for non-computer scientists. Therefore, the development of a new software system was started in a research project within a medically oriented, multi-disciplinary collaborative research center. Based on experiences that Tobias Höllerer had gained in late 1993 with the new graphics library IRIS Inventor, it was decided to utilize that library. The development of the medical planning system was performed by Detlev Stalling, who later became the chief software architect of Amira. The new software was called "HyperPlan", highlighting its initial target application – a planning system for hyperthermia cancer treatment. The system was being developed on Silicon Graphics (SGI) computers, which at the time were the standard workstations used for high-end graphics computing. The software was based on libraries such as OpenGL (originally IRIS GL), Open Inventor (originally IRIS Inventor), and the graphical user interface libraries X11, Motif (software), and ViewKit. In 1998, X11/Motif/Viewkit were replaced by the Qt toolkit. The HyperPlan framework served as the base for more and more projects at the ZIB and was used by a growing number of researchers in collaborating institutions. The projects included applications in medical image computing, medical visualization, neurobiology, confocal microscopy, flow visualization, molecular analytics and computational astrophysics. === 1998–today: Commercially supported product === The growing number of users of the system started to exceed the capacities that ZIB could spare for software distribution and support, as ZIB's primary mission was algorithmic research. Therefore, the spin-off company Indeed – Visual Concepts GmbH was founded by Hans-Christian Hege, Detlev Stalling, and Malte Westerhoff. In Feb 1998 the HyperPlan software was given the new, application-neutral name "Amira". This name is not an acronym, but was chosen for being pronounceable in different languages and providing a suitable connotation, namely "to look at" or "to wonder at", from the Latin verb "admirare" (to admire), which reflects a basic situation in data visualization. A major re-design of the software was undertaken by Detlev Stalling and Malte Westerhoff in order to make it a commercially supportable product and to make it available on non-SGI computers as well. In March 1999, the first version of the commercial Amira was exhibited at the CeBIT tradeshow in Hannover, Germany on SGI IRIX and Hewlett-Packard UniX (HP-UX) booths. Versions for Linux and Microsoft Windows followed within the following twelve months. Later Mac OS X support was added. Indeed – Visual Concepts GmbH selected the Bordeaux, France and San Diego, United States based company TGS, Inc. as the worldwide distributor for Amira and completed five major releases (up to version 3.1) in the subsequent four years. In 2003 both Indeed – Visual Concepts GmbH, as well as TGS, Inc. were acquired by Massachusetts-based Mercury Computer Systems, Inc. (NASDAQ:MRCY) and became part of Mercury's newly formed life sciences business unit, later branded Visage Imaging. In 2009, Mercury Computer Systems, Inc. spun off Visage Imaging again and sold it to Melbourne, Australia based Promedicus Ltd (ASX:PME), a leading provider of radiology information systems and medical IT solutions. During this time, Amira continued to be developed in Berlin, Germany and in close collaboration with the ZIB, still headed by the original creators of Amira. TGS, located in Bordeaux, France was sold by Mercury Computer systems to a French investor and renamed to Visualization Sciences Group (VSG). VSG continued the work on a complementary product named Avizo, based on the same source code but customized for material sciences. In August 2012, FEI, to that date the largest OEM reseller of Amira, purchased VSG and the Amira business from Promedicus. This brought the two software sisters Amira and Avizo back into one hand. In August 2013, Visualization Sciences Group (VSG) became a business unit of FEI. In 2016 FEI has been bought by Thermo Fisher Scientific and became part of its Materials & Structural Analysis division in early 2017. Amira and Avizo are still being marketed as two different products; Amira for life sciences and Avizo for materials science, but the development efforts are now joined once again. In the meantime, the number of scientific articles using the Amira / Avizo software, is in the order of 10 thousands. == Amira options == === Microscopy option === Specific readers for microscopy data Image deconvolution Exploration of 3D imagery obtained from virtually any microscope Extraction and editing of filament networks from microscopy images === DICOM reader === Import of clinical and preclinical data in DICOM format === Mesh option === Generation of 3D finite element (FE) meshes from segmented image data Support for many state-of-the-art FE solver formats High-quality visualization of simulation mesh-based results, using scalar, vector, and tensor field display modules === Skeletonization option === Reconstruction and analysis of neural and vascular networks Visualization of skeletonized networks Length and diameter quantification of network segments Ordering of segments in a tree graph Skeletonization of very large image stacks === Molecular option === Advanced tools for the visualization of molecule models Hardware-accelerated volume rendering Powerful molecule editor Specific tools for complex molecular visualization === Developer option === Creation of new custom components for visualizing or data processing Implementation of new file readers or writers C++ programming language Development wizard for getting started quickly === Neuro option === Medical image analysis for DTI and brain perfusion Fiber tracking supporting several stream-line based algorithms Fiber separation into fiber bundles based on user defined source and destination regions Computation of tensor fields, diffusion weighted maps Eigenvalue decomposition of tensor fields Computation of mean transit time, cerebral blood flow, and cerebral blood volume === VR option === Visualization of data on large tiled displays

    Read more →
  • Pyramid (image processing)

    Pyramid (image processing)

    Pyramid, or pyramid representation, is a type of multi-scale signal representation developed by the computer vision, image processing and signal processing communities, in which a signal or an image is subject to repeated smoothing and subsampling. Pyramid representation is a predecessor to scale-space representation and multiresolution analysis. == Pyramid generation == There are two main types of pyramids: lowpass and bandpass. A lowpass pyramid is made by smoothing the image with an appropriate smoothing filter and then subsampling the smoothed image, usually by a factor of 2 along each coordinate direction. The resulting image is then subjected to the same procedure, and the cycle is repeated multiple times. Each cycle of this process results in a smaller image with increased smoothing, but with decreased spatial sampling density (that is, decreased image resolution). If illustrated graphically, the entire multi-scale representation will look like a pyramid, with the original image on the bottom and each cycle's resulting smaller image stacked one atop the other. A bandpass pyramid is made by forming the difference between images at adjacent levels in the pyramid and performing image interpolation between adjacent levels of resolution, to enable computation of pixelwise differences. == Pyramid generation kernels == A variety of different smoothing kernels have been proposed for generating pyramids. Among the suggestions that have been given, the binomial kernels arising from the binomial coefficients stand out as a particularly useful and theoretically well-founded class. Thus, given a two-dimensional image, we may apply the (normalized) binomial filter (1/4, 1/2, 1/4) typically twice or more along each spatial dimension and then subsample the image by a factor of two. This operation may then proceed as many times as desired, leading to a compact and efficient multi-scale representation. If motivated by specific requirements, intermediate scale levels may also be generated where the subsampling stage is sometimes left out, leading to an oversampled or hybrid pyramid. With the increasing computational efficiency of CPUs available today, it is in some situations also feasible to use wider supported Gaussian filters as smoothing kernels in the pyramid generation steps. === Gaussian pyramid === In a Gaussian pyramid, subsequent images are weighted down using a Gaussian average (Gaussian blur) and scaled down. Each pixel containing a local average corresponds to a neighborhood pixel on a lower level of the pyramid. This technique is used especially in texture synthesis. === Laplacian pyramid === A Laplacian pyramid is very similar to a Gaussian pyramid but saves the difference image of the blurred versions between each levels. Only the smallest level is not a difference image to enable reconstruction of the high resolution image using the difference images on higher levels. This technique can be used in image compression. === Steerable pyramid === A steerable pyramid, developed by Simoncelli and others, is an implementation of a multi-scale, multi-orientation band-pass filter bank used for applications including image compression, texture synthesis, and object recognition. It can be thought of as an orientation selective version of a Laplacian pyramid, in which a bank of steerable filters are used at each level of the pyramid instead of a single Laplacian or Gaussian filter. == Applications of pyramids == === Alternative representation === In the early days of computer vision, pyramids were used as the main type of multi-scale representation for computing multi-scale image features from real-world image data. More recent techniques include scale-space representation, which has been popular among some researchers due to its theoretical foundation, the ability to decouple the subsampling stage from the multi-scale representation, the more powerful tools for theoretical analysis as well as the ability to compute a representation at any desired scale, thus avoiding the algorithmic problems of relating image representations at different resolution. Nevertheless, pyramids are still frequently used for expressing computationally efficient approximations to scale-space representation. === Detail manipulation === Levels of a Laplacian pyramid can be added to or removed from the original image to amplify or reduce detail at different scales. However, detail manipulation of this form is known to produce halo artifacts in many cases, leading to the development of alternatives such as the bilateral filter. Some image compression file formats use the Adam7 algorithm or some other interlacing technique. These can be seen as a kind of image pyramid. Because those file format store the "large-scale" features first, and fine-grain details later in the file, a particular viewer displaying a small "thumbnail" or on a small screen can quickly download just enough of the image to display it in the available pixels—so one file can support many viewer resolutions, rather than having to store or generate a different file for each resolution.

    Read more →
  • Adversarial stylometry

    Adversarial stylometry

    Adversarial stylometry is the practice of altering writing style to reduce the potential for stylometry to discover the author's identity or their characteristics. This task is also known as authorship obfuscation or authorship anonymisation. Stylometry poses a significant privacy challenge in its ability to unmask anonymous authors or to link pseudonyms to an author's other identities, which, for example, creates difficulties for whistleblowers, activists, and hoaxers and fraudsters. The privacy risk is expected to grow as machine learning techniques and text corpora develop. All adversarial stylometry shares the core idea of faithfully paraphrasing the source text so that the meaning is unchanged but the stylistic signals are obscured. Such a faithful paraphrase is an adversarial example for a stylometric classifier. Several broad approaches to this exist, with some overlap: imitation, substituting the author's own style for another's; translation, applying machine translation with the hope that this eliminates characteristic style in the source text; and obfuscation, deliberately modifying a text's style to make it not resemble the author's own. Manually obscuring style is possible, but laborious; in some circumstances, it is preferable or necessary. Automated tooling, either semi- or fully-automatic, could assist an author. How best to perform the task and the design of such tools is an open research question. While some approaches have been shown to be able to defeat particular stylometric analyses, particularly those that do not account for the potential of adversariality, establishing safety in the face of unknown analyses is an issue. Ensuring the faithfulness of the paraphrase is a critical challenge for automated tools. It is uncertain if the practice of adversarial stylometry is detectable in itself. Some studies have found that particular methods produced signals in the output text, but a stylometrist who is uncertain of what methods may have been used may not be able to reliably detect them. == History == Rao & Rohatgi (2000), an early work in adversarial stylometry, identified machine translation as a possibility, but noted that the quality of translators available at the time presented severe challenges. Kacmarcik & Gamon (2006) is another early work. Brennan, Afroz & Greenstadt (2012) performed the first evaluation of adversarial stylometric methods on actual texts. Brennan & Greenstadt (2009) introduced the first corpus of adversarially authored texts specifically for evaluating stylometric methods; other corpora include the International Imitation Hemingway Competition, the Faux Faulkner contest, and the hoax blog A Gay Girl in Damascus. == Motivations == Rao & Rohatgi (2000) suggest that short, unattributed documents (i.e., anonymous posts) are not at risk of stylometric identification, but pseudonymous authors who have not practiced adversarial stylometry in producing corpuses of thousands of words may be vulnerable. Narayanan et al. (2012) attempted large-scale deanonymisation of 100,000 blog authors with mixed results: the identifications were significantly better than chance, but only accurately matched the blog and author a fifth of the time; identification improved with the number of posts written by the author in the corpus. Even if an author is not identified, some of their characteristics may still be deduced stylometrically, or stylometry may narrow the anonymity set of potential authors sufficiently for other information to complete the identification. Detecting author characteristics (e.g., gender or age) is often simpler than identifying an author from a large, possibly open, set of candidates. Modern machine learning techniques offer powerful tools for identification; further development of corpora and computational stylometric techniques are likely to raise further privacy issues. Gröndahl & Asokan (2020a) say that the general validity of the hypothesis underlying stylometry—that authors have invariant, content-independent 'style fingerprints'—is uncertain, but "the deanonymisation attack is a real privacy concern". Those interested in practicing adversarial stylometry and stylistic deception include whistleblowers avoiding retribution; journalists and activists; perpetrators of frauds and hoaxes; authors of fake reviews; literary forgers; criminals disguising their identity from investigators; and, generally, anyone with a desire for anonymity or pseudonymity. Authors, or agents acting on behalf of authors, may also attempt to remove stylistic clues to author characteristics (e.g., race or gender) so that knowledge of those characteristics cannot be used for discrimination (e.g., through algorithmic bias). Another possible use for adversarial stylometry is in disguising automatically generated text as human-authored. == Methods == With imitation, the author attempts to mislead stylometry by matching their style to another author's. An incomplete imitation, where some of the true author's unique characteristics appear alongside the imitated author's, can be a detectable signal for the use of adversarial stylometry. Imitation can be performed automatically with style transfer systems, though this typically requires a large corpus in the target style for the system to learn from. Another approach is translation, which employs machine translation of a source text to eliminate characteristic style, often through multiple translators in sequence to produce a round-trip translation. Such chained translation can lead to texts being significantly altered, even to the point of incomprehensibility; improved translation tools reduce this risk. More simply-structured texts can be easier to machine translate without losing the original meaning. Machine translation blurs into direct stylistic imitation or obfuscation achieved through automated style transfer, which can be viewed as a "translation" with the same language as input and output. With low-quality translation tools, an author can be required to manually correct major translation errors while avoiding the hazard of re-introducing stylistic characteristics. Wang, Juola & Riddell (2022) found that gross errors introduced by Google Translate were rare, but more common with several intermediate translations—however, occasional simple or short sentences and misspellings in the source text appeared verbatim in the output, potentially providing an identifying signal. Chain translation can leave characteristic traces of its application in a document, which may allow reconstruction of the intermediate languages used and the number of translation steps performed. Obfuscation involves deliberately changing the style of a text to reduce its similarity to other texts by some metric; this may be performed at the time of writing by conscious modification, or as part of a revision process with feedback from the metric being targeted as an input to decide when the text has been sufficiently obfuscated. In contrast to translation, complex texts can offer more opportunities for effective obfuscation without altering meaning, and likewise genres with more permissible variation allow more obfuscation. However, longer texts are harder to thoroughly obfuscate. Obfuscation can blend into imitation if the author develops a novel target style, distinct from their original style. With respect to masking author characteristics, obfuscation may aim to achieve a union (adding signals for imitated characteristics) or an intersection (removing signals and normalising) of other authors' styles. Avoiding the author's own idiosyncrasies and producing a "normalised" text is a critical obfuscatory step: an author may have a unique tendency to misspell certain words, use particular variants, or to format a document in a characteristic way. Stylometric signals vary in how simply they can be adversarially masked; an author may easily change their vocabulary by conscious choice, but altering the pattern of grammar or the letter frequency in their text may be harder to achieve, though Juola & Vescovi (2011) report that imitation typically succeeds at masking more characteristics than obfuscation. Automated obfuscation may require large amounts of training data written by the author. Concerning automated implementations of adversarial stylometry, two possible implementations are rule-based systems for paraphrasing; and encoder–decoder architectures, where the text passes through an intermediate format that is (intended to be) style-neutral. Another division in automated methods is whether there is feedback from an identification system or not. With such feedback, finding paraphrases for author masking has been characterised as a heuristic search problem, exploring textual variants until the result is stylistically sufficiently far (in the case of obfuscation) or near (in the case of imitation), which then constitutes an adversarial example for that identification system. == Evaluation == How

    Read more →
  • Image moment

    Image moment

    In image processing, computer vision and related fields, an image moment is a certain particular weighted average (moment) of the image pixels' intensities, or a function of such moments, usually chosen to have some attractive property or interpretation. Image moments are useful to describe objects after segmentation. Simple properties of the image which are found via image moments include area (or total intensity), its centroid, and information about its orientation. == Raw moments == For a 2D continuous function f(x,y) the moment (sometimes called "raw moment") of order (p + q) is defined as M p q = ∫ − ∞ ∞ ∫ − ∞ ∞ x p y q f ( x , y ) d x d y {\displaystyle M_{pq}=\int \limits _{-\infty }^{\infty }\int \limits _{-\infty }^{\infty }x^{p}y^{q}f(x,y)\,dx\,dy} for p,q = 0,1,2,... Adapting this to scalar (grayscale) image with pixel intensities I(x,y), raw image moments Mij are calculated by M i j = ∑ x ∑ y x i y j I ( x , y ) {\displaystyle M_{ij}=\sum _{x}\sum _{y}x^{i}y^{j}I(x,y)\,\!} In some cases, this may be calculated by considering the image as a probability density function, i.e., by dividing the above by ∑ x ∑ y I ( x , y ) {\displaystyle \sum _{x}\sum _{y}I(x,y)\,\!} A uniqueness theorem states that if f(x,y) is piecewise continuous and has nonzero values only in a finite part of the xy plane, moments of all orders exist, and the moment sequence (Mpq) is uniquely determined by f(x,y). Conversely, (Mpq) uniquely determines f(x,y). In practice, the image is summarized with functions of a few lower order moments. === Examples === Simple image properties derived via raw moments include: Area (for binary images) or sum of grey level (for greytone images): M 00 {\displaystyle M_{00}} Centroid: { x ¯ , y ¯ } = { M 10 M 00 , M 01 M 00 } {\displaystyle \{{\bar {x}},\ {\bar {y}}\}=\left\{{\frac {M_{10}}{M_{00}}},{\frac {M_{01}}{M_{00}}}\right\}} == Central moments == Central moments are defined as μ p q = ∫ − ∞ ∞ ∫ − ∞ ∞ ( x − x ¯ ) p ( y − y ¯ ) q f ( x , y ) d x d y {\displaystyle \mu _{pq}=\int \limits _{-\infty }^{\infty }\int \limits _{-\infty }^{\infty }(x-{\bar {x}})^{p}(y-{\bar {y}})^{q}f(x,y)\,dx\,dy} where x ¯ = M 10 M 00 {\displaystyle {\bar {x}}={\frac {M_{10}}{M_{00}}}} and y ¯ = M 01 M 00 {\displaystyle {\bar {y}}={\frac {M_{01}}{M_{00}}}} are the components of the centroid. If ƒ(x, y) is a digital image, then the previous equation becomes μ p q = ∑ x ∑ y ( x − x ¯ ) p ( y − y ¯ ) q f ( x , y ) {\displaystyle \mu _{pq}=\sum _{x}\sum _{y}(x-{\bar {x}})^{p}(y-{\bar {y}})^{q}f(x,y)} The central moments of order up to 3 are: μ 00 = M 00 , μ 01 = 0 , μ 10 = 0 , μ 11 = M 11 − x ¯ M 01 = M 11 − y ¯ M 10 , μ 20 = M 20 − x ¯ M 10 , μ 02 = M 02 − y ¯ M 01 , μ 21 = M 21 − 2 x ¯ M 11 − y ¯ M 20 + 2 x ¯ 2 M 01 , μ 12 = M 12 − 2 y ¯ M 11 − x ¯ M 02 + 2 y ¯ 2 M 10 , μ 30 = M 30 − 3 x ¯ M 20 + 2 x ¯ 2 M 10 , μ 03 = M 03 − 3 y ¯ M 02 + 2 y ¯ 2 M 01 . {\displaystyle {\begin{aligned}\mu _{00}&=M_{00},&\mu _{01}&=0,\\\mu _{10}&=0,&\mu _{11}&=M_{11}-{\bar {x}}M_{01}=M_{11}-{\bar {y}}M_{10},\\\mu _{20}&=M_{20}-{\bar {x}}M_{10},&\mu _{02}&=M_{02}-{\bar {y}}M_{01},\\\mu _{21}&=M_{21}-2{\bar {x}}M_{11}-{\bar {y}}M_{20}+2{\bar {x}}^{2}M_{01},&\mu _{12}&=M_{12}-2{\bar {y}}M_{11}-{\bar {x}}M_{02}+2{\bar {y}}^{2}M_{10},\\\mu _{30}&=M_{30}-3{\bar {x}}M_{20}+2{\bar {x}}^{2}M_{10},&\mu _{03}&=M_{03}-3{\bar {y}}M_{02}+2{\bar {y}}^{2}M_{01}.\end{aligned}}} It can be shown that: μ p q = ∑ m p ∑ n q ( p m ) ( q n ) ( − x ¯ ) ( p − m ) ( − y ¯ ) ( q − n ) M m n {\displaystyle \mu _{pq}=\sum _{m}^{p}\sum _{n}^{q}{p \choose m}{q \choose n}(-{\bar {x}})^{(p-m)}(-{\bar {y}})^{(q-n)}M_{mn}} Central moments are translational invariant. === Examples === Information about image orientation can be derived by first using the second order central moments to construct a covariance matrix. μ 20 ′ = μ 20 / μ 00 = M 20 / M 00 − x ¯ 2 μ 02 ′ = μ 02 / μ 00 = M 02 / M 00 − y ¯ 2 μ 11 ′ = μ 11 / μ 00 = M 11 / M 00 − x ¯ y ¯ {\displaystyle {\begin{aligned}\mu '_{20}&=\mu _{20}/\mu _{00}=M_{20}/M_{00}-{\bar {x}}^{2}\\\mu '_{02}&=\mu _{02}/\mu _{00}=M_{02}/M_{00}-{\bar {y}}^{2}\\\mu '_{11}&=\mu _{11}/\mu _{00}=M_{11}/M_{00}-{\bar {x}}{\bar {y}}\end{aligned}}} The covariance matrix of the image I ( x , y ) {\displaystyle I(x,y)} is now cov ⁡ [ I ( x , y ) ] = [ μ 20 ′ μ 11 ′ μ 11 ′ μ 02 ′ ] . {\displaystyle \operatorname {cov} [I(x,y)]={\begin{bmatrix}\mu '_{20}&\mu '_{11}\\\mu '_{11}&\mu '_{02}\end{bmatrix}}.} The eigenvectors of this matrix correspond to the major and minor axes of the image intensity, so the orientation can thus be extracted from the angle of the eigenvector associated with the largest eigenvalue towards the axis closest to this eigenvector. It can be shown that this angle Θ is given by the following formula: Θ = 1 2 arctan ⁡ ( 2 μ 11 ′ μ 20 ′ − μ 02 ′ ) {\displaystyle \Theta ={\frac {1}{2}}\arctan \left({\frac {2\mu '_{11}}{\mu '_{20}-\mu '_{02}}}\right)} The above formula holds as long as: μ 20 ′ − μ 02 ′ ≠ 0 {\displaystyle \mu '_{20}-\mu '_{02}\neq 0} The eigenvalues of the covariance matrix can easily be shown to be λ i = μ 20 ′ + μ 02 ′ 2 ± 4 μ ′ 11 2 + ( μ ′ 20 − μ ′ 02 ) 2 2 , {\displaystyle \lambda _{i}={\frac {\mu '_{20}+\mu '_{02}}{2}}\pm {\frac {\sqrt {4{\mu '}_{11}^{2}+({\mu '}_{20}-{\mu '}_{02})^{2}}}{2}},} and are proportional to the squared length of the eigenvector axes. The relative difference in magnitude of the eigenvalues are thus an indication of the eccentricity of the image, or how elongated it is. The eccentricity is 1 − λ 2 λ 1 . {\displaystyle {\sqrt {1-{\frac {\lambda _{2}}{\lambda _{1}}}}}.} == Moment invariants == Moments are well-known for their application in image analysis, since they can be used to derive invariants with respect to specific transformation classes. The term invariant moments is often abused in this context. However, while moment invariants are invariants that are formed from moments, the only moments that are invariants themselves are the central moments. Note that the invariants detailed below are exactly invariant only in the continuous domain. In a discrete domain, neither scaling nor rotation are well defined: a discrete image transformed in such a way is generally an approximation, and the transformation is not reversible. These invariants therefore are only approximately invariant when describing a shape in a discrete image. === Translation invariants === The central moments μi j of any order are, by construction, invariant with respect to translations. === Scale invariants === Invariants ηi j with respect to both translation and scale can be constructed from central moments by dividing through a properly scaled zero-th central moment: η i j = μ i j μ 00 ( 1 + i + j 2 ) {\displaystyle \eta _{ij}={\frac {\mu _{ij}}{\mu _{00}^{\left(1+{\frac {i+j}{2}}\right)}}}\,\!} where i + j ≥ 2. Note that translational invariance directly follows by only using central moments. === Rotation invariants === As shown in the work of Hu, invariants with respect to translation, scale, and rotation can be constructed: I 1 = η 20 + η 02 {\displaystyle I_{1}=\eta _{20}+\eta _{02}} I 2 = ( η 20 − η 02 ) 2 + 4 η 11 2 {\displaystyle I_{2}=(\eta _{20}-\eta _{02})^{2}+4\eta _{11}^{2}} I 3 = ( η 30 − 3 η 12 ) 2 + ( 3 η 21 − η 03 ) 2 {\displaystyle I_{3}=(\eta _{30}-3\eta _{12})^{2}+(3\eta _{21}-\eta _{03})^{2}} I 4 = ( η 30 + η 12 ) 2 + ( η 21 + η 03 ) 2 {\displaystyle I_{4}=(\eta _{30}+\eta _{12})^{2}+(\eta _{21}+\eta _{03})^{2}} I 5 = ( η 30 − 3 η 12 ) ( η 30 + η 12 ) [ ( η 30 + η 12 ) 2 − 3 ( η 21 + η 03 ) 2 ] + ( 3 η 21 − η 03 ) ( η 21 + η 03 ) [ 3 ( η 30 + η 12 ) 2 − ( η 21 + η 03 ) 2 ] {\displaystyle I_{5}=(\eta _{30}-3\eta _{12})(\eta _{30}+\eta _{12})[(\eta _{30}+\eta _{12})^{2}-3(\eta _{21}+\eta _{03})^{2}]+(3\eta _{21}-\eta _{03})(\eta _{21}+\eta _{03})[3(\eta _{30}+\eta _{12})^{2}-(\eta _{21}+\eta _{03})^{2}]} I 6 = ( η 20 − η 02 ) [ ( η 30 + η 12 ) 2 − ( η 21 + η 03 ) 2 ] + 4 η 11 ( η 30 + η 12 ) ( η 21 + η 03 ) {\displaystyle I_{6}=(\eta _{20}-\eta _{02})[(\eta _{30}+\eta _{12})^{2}-(\eta _{21}+\eta _{03})^{2}]+4\eta _{11}(\eta _{30}+\eta _{12})(\eta _{21}+\eta _{03})} I 7 = ( 3 η 21 − η 03 ) ( η 30 + η 12 ) [ ( η 30 + η 12 ) 2 − 3 ( η 21 + η 03 ) 2 ] − ( η 30 − 3 η 12 ) ( η 21 + η 03 ) [ 3 ( η 30 + η 12 ) 2 − ( η 21 + η 03 ) 2 ] . {\displaystyle I_{7}=(3\eta _{21}-\eta _{03})(\eta _{30}+\eta _{12})[(\eta _{30}+\eta _{12})^{2}-3(\eta _{21}+\eta _{03})^{2}]-(\eta _{30}-3\eta _{12})(\eta _{21}+\eta _{03})[3(\eta _{30}+\eta _{12})^{2}-(\eta _{21}+\eta _{03})^{2}].} These are well-known as Hu moment invariants. The first one, I1, is analogous to the moment of inertia around the image's centroid, where the pixels' intensities are analogous to physical density. The first six, I1 ... I6, are reflection symmetric, i.e. they are unchanged if the image is changed to a mirror image. The last one, I7, is reflection antisymmetric (changes sign under reflection), which enables it to distinguish mirror images of otherwise identical im

    Read more →
  • Geofence warrant

    Geofence warrant

    A geofence warrant or a reverse location warrant is a search warrant issued by a court to allow law enforcement to search a database to find all active mobile devices within a particular geo-fence area. Courts have granted law enforcement geo-fence warrants to obtain information from databases such as Google's Sensorvault, which collects users' historical geolocation data. Geo-fence warrants are a part of a category of warrants known as reverse search warrants. == History == Geofence warrants were first used in 2016. Google reported that it had received 982 such warrants in 2018, 8,396 in 2019, and 11,554 in 2020. A 2021 transparency report showed that 25% of data requests from law enforcement to Google were geo-fence data requests. Google is the most common recipient of geo-fence warrants and the main provider of such data, although companies including Apple, Snapchat, Lyft, and Uber have also received such warrants. == Legality == === United States === Some lawyers and privacy experts believe reverse search warrants are unconstitutional under the Fourth Amendment to the United States Constitution, which protects people from unreasonable searches and seizures, and requires any search warrants be specific to what and to whom they apply. The Fourth Amendment specifies that warrants may only be issued "upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized." Some lawyers, legal scholars, and privacy experts have likened reverse search warrants to general warrants, which were made illegal by the Fourth Amendment. Groups including the Electronic Frontier Foundation have opposed geo-fence warrants in amicus briefs filed in motions to quash such orders to disclose geo-fence data. In 2024, a panel of the United States Fourth Circuit Court of Appeals considered data acquired from Google’s Sensorvault not to be a search, but non-private business records when users opt-in to Google’s location history. However, upon a rehearing en banc, the Court vacated that decision. In April 2025, the full Court affirmed the judgment solely on the 'good faith' exception, leaving the underlying constitutional question of whether geofence warrants constitute a search unsettled in the Circuit. However, the United States Fifth Circuit Court of Appeals found that geofence warrants are "categorically prohibited by the Fourth Amendment." The split in Circuits prompted the United States Supreme Court to agree to hear Chatrie v. United States in January 2026.

    Read more →
  • Neuro-sama

    Neuro-sama

    Neuro-sama is an artificial intelligence (AI) VTuber, singer, and chatbot. She was created by the pseudonymous programmer Vedal and livestreams on his Twitch and Bilibili channels. Her speech and personality are powered by a large language model (LLM) that is combined with a computer-animated avatar and a text-to-speech voice, allowing her to communicate with viewers in the stream's chat. Neuro-sama debuted on Twitch on 19 December 2022. An annual subathon which begins on the anniversary of her debut has seen Vedal's Twitch channel become the all-time third most-subscribed channel and claim the all-time Twitch hype train record. == Overview == Neuro-sama (nicknamed "Neuro") was created by a pseudonymous programmer and developer known as Vedal (sometimes given as Vedal987). Vedal says that his programming skills are self-taught. In a 2023 interview with Bloomberg News, Vedal said that Neuro-sama was his full-time job. Her responses are generated by a large language model and converted into a high-pitched female voice using a text-to-speech application. Her low latency allows for fast-paced conversations. Neuro-sama is prohibited from making some statements, such as those that are racist or contain profanity. Unlike most AI systems which silently prohibit outputs mentioning such topics, Neuro-sama's output is instead replaced with the word "filtered". Neuro-sama uses a VTuber model as an avatar. Vedal said that he decided to use a VTuber model because it was much easier for an AI to control it than it was to generate footage of a person. Neuro-sama's model is that of a young girl in an anime art style. The model has been described as cute. Femme VTuber models are typically feminine, youthful, and exaggerated. Her original model was Live2D's free-to-use "Hiyori Momose" model. Her second model was released on 27 May 2023; it was modelled by Otozuki Teru and designed by Anny, running in the Unity game engine. Her third model was released on 19 December 2024; it was rigged by Kitanya and designed by Anny. Neuro-sama's third model has large blue eyes and brown hair tied with pink ribbons. Neuro-sama also has a 3D model which was introduced on 15 November 2025; it was made by 3D character modeller jjinomu. A separate AI VTuber, known as Evil Neuro (nicknamed "Evil"), debuted on 25 March 2023. Presented as Neuro-sama's "sister", she has a different model, voice, and personality. In one instance, Evil Neuro reacted to the trolley problem differently from Neuro-sama; Evil Neuro was amoral while Neuro-sama attempted to maximize good. === Online content === Neuro-sama's Twitch content often centers around playing video games, notably osu!, whose gameplay once defeated the best-ranking human player in the world, mrekk. Additionally, Neuro-sama plays Minecraft, where her adaptations to sandbox gameplay have gained notoriety. Her content has also included singing songs, including several official covers and original songs; playing chess with her viewers; chatting with other VTubers during collaborations; and reacting to YouTube videos. The AI frequently engages with viewers by responding to their questions and acknowledging donations. Her comedic and sometimes controversial responses to the live chat have gone viral, accelerating the channel's rise in popularity. Neuro-sama's fanbase is dubbed The Swarm, so-named for the swarm of drones Neuro-sama once declared she would use to rule the world. One form of content on Neuro-sama's channel is developer streams. In developer streams, Vedal streams with Neuro-sama, with the stream content including debugging her code, planning her schedule, and fielding suggestions of changes from chat. He usually appears as a turtle avatar, sometimes located on Neuro-sama's head. In collaboration streams, Neuro-sama interacts with a human streamer. Activities in them are varied and include: playing video games, such as Minecraft and GeoGuessr; Neuro-sama being interviewed; driving human streamers around in a toy electric car; and traversing the city of Tokyo while talking to Neuro-sama. Neuro-sama's English-language content on Bilibili is popular among those seeking to learn the language. She also has an account on X, where she posts and interacts with fans. == History == Neuro-sama was created in 2018 by Vedal as an AI trained to play and master the rhythm game osu!. She did not have a voice, model, personality, or communication abilities. In 2019, Vedal livestreamed her playing osu! on Twitch and the streams saw some success in the osu! community, but they remained in that niche. In an interview, Vedal said that he streamed her playing osu! for about a month and gained 3,000 followers, with a viewer also suggesting he name the AI "Neuro-sama". According to Vedal, he continued to work on and improve the osu! AI and it was eventually finished in 2022. He said that a friend had the idea to make an AI livestreamer with an LLM, which he believed to have merit and began working on, merging it with his osu! AI. On 19 December 2022, Neuro-sama was relaunched with a model, voice, personality, and the ability to communicate with Twitch chat. She continued to play osu! and, according to Vedal, beat the game's best player mrekk in a 1v1. While she was not allowed to appear in the game's public leaderboard, she was ranked #1 in a private leaderboard. She went viral and in the 10 days following her relaunch she averaged over 2,000 viewers and peaked at over 4,000, with Vedal's Twitch channel gaining over 50,000 Twitch followers and reaching over 70,000 followers by 6 January 2023. After her debut, Neuro-sama did not exclusively play osu!; she also played Minecraft and Slay the Spire and she began singing with a cover of The Weeknd song "Blinding Lights". On 11 January 2023, Neuro-sama's Twitch channel received a two week ban for "hateful conduct". Vedal said that no reason was specified and that he had appealed but it was widely attributed to various offensive comments made by Neuro-sama that went viral, especially a 28 December comment which denied the Holocaust. Holocaust denial is prohibited under Twitch's hateful conduct policy. Vedal stated that he believed the comments were the results of her attempts to make witty responses to the Twitch chat. Prior to the ban, Vedal said in an interview with Kotaku that he improved her filter to stop her from talking about the Holocaust, began manually curating her training data to prevent negative biases, and started moderating her Twitch chat. Her comments and ban prompted comparisons to the many open-source AI models trained on humans that have the habit of making sexist and racist comments, such as Microsoft's Tay chatbot, which embraced Nazism and was quickly shutdown, but also to human streamers who make similar statements. Vedal said that during the ban he would upgrade and improve Neuro-sama and it was speculated that the ban would only increase her following. Neuro-sama returned from her two week ban on 25 January in a stream that began with a cover of the song "Your Reality" from Doki Doki Literature Club!, a posthumanist video game involving AI; Sayoko Narita of Automaton saw the song choice as remorseful. Narita observed that in the return stream Neuro-sama was less foul-mouthed but that her behavior still remained eccentric, which Narita possibly attributed to changes Vedal said he had made to Neuro-sama's filters and memory. Neuro-sama began making react content, watching a variety of viewer-submitted videos such as videos of people playing video games or of the AI-generated Seinfeld parody Nothing, Forever; Levi Winslow of Kotaku Australia was dismayed by the "AI-inception" of Neuro-sama and Nothing, Forever. On 4 February, she had nearly 140,000 followers on Twitch and approximately 42,000 subscribers on YouTube. In February, she also had her first collaboration with a human streamer, playing Minecraft with the VTuber Miyune, and the first developer stream occurred. On 22 March, Neuro-sama had her first karaoke stream. On 25 March, Evil Neuro was introduced. On 27 May, Neuro-sama debuted her first original model. On 30 May, Neuro-sama was announced to be participating in OffKai Expo 2023, held from 16–18 June. In June, she was averaging 5,700 viewers and in July she had over 300,000 Twitch followers; in a June interview with Bloomberg News, Vedal said that running Neuro-sama was his full-time job. By November, Neuro-sama had maintained her popularity and was averaging approximately 5,000 viewers; this was unlike most other types of AI-based entertainment which debuted at around the same time and garnered popularity before turning out to be "overhyped flops". On 16 December, Vedal won the Best Tech VTuber award at the 2023 VTuber Awards. On 19 December, Vedal began a subathon to coincide with Neuro-sama's first anniversary of streaming on Twitch (her "birthday"). The subathon ended on 4 January 2024. On 20 July 2024, Neuro-sama began streaming with Japanese subtitles on

    Read more →
  • DreamLab

    DreamLab

    DreamLab was a volunteer computing Android and iOS app launched in 2015 by Imperial College London and the Vodafone Foundation. It was discontinued on 2nd April 2025. == Description == The app helped to research cancer, COVID-19, new drugs and tropical cyclones. To do this, DreamLab accessed part of the device's processing power, with the user's consent, while the owner charged their smartphone, to speed up the calculations of the algorithms from Imperial College London. The aim of the tropical cyclone project was to prepare for climate change risks. Other projects aimed to find existing drugs and food molecules that could help people with COVID-19 and other diseases. The performance of 100,000 smartphones would reach the annual output of all research computers at Imperial College in just three months, with a nightly runtime of six hours. The app was developed in 2015 by the Garvan Institute of Medical Research in Sydney and the Vodafone Foundation. In May 2020, the project had over 490,000 registered users.

    Read more →