AI App Gemini

AI App Gemini — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Hancom Office

    Hancom Office

    Hancom Office is a proprietary office suite that includes a word processor, spreadsheet software, presentation software, and a PDF editor as well as their online versions accessible via a web browser. It is primarily addressed to Korean users. Hancom Office is written in Java and C++ that runs on Android, iOS, macOS and Windows platforms. == Products == Hangul - Hangul is a word processor developed by Hancom. It is a product that eliminates the inconvenience of the original Hangul word processor, which was limited to Hangul cards or PC models. Originally, the name was written using the '아래아' character, a vowel letter that is obsolete in modern Korean, and it was referred to as 'HWP' (an abbreviation for Hangul Word Processor), '아래아 한글' (Arae-a Hangul), '한/글' (Han/Geul), and so on. Hangul is currently the most widely used word processor in South Korea, often used alongside Microsoft Word. HanWord - word processor compatible with Word HanCell - spreadsheet program HanShow - presentation program Hancom Office Hanword Viewer - For viewing documents created by Hancom Office or Microsoft Office

    Read more →
  • Concept class

    Concept class

    In computational learning theory in mathematics, a concept over a domain X is a total Boolean function over X. A concept class is a class of concepts. Concept classes are a subject of computational learning theory. Concept class terminology frequently appears in model theory associated with probably approximately correct (PAC) learning. In this setting, if one takes a set Y as a set of (classifier output) labels, and X is a set of examples, the map c : X → Y {\displaystyle c:X\to Y} , i.e. from examples to classifier labels (where Y = { 0 , 1 } {\displaystyle Y=\{0,1\}} and where c is a subset of X), c is then said to be a concept. A concept class C {\displaystyle C} is then a collection of such concepts. Given a class of concepts C, a subclass D is reachable if there exists a sample s such that D contains exactly those concepts in C that are extensions to s. Not every subclass is reachable. == Background == A sample s {\displaystyle s} is a partial function from X {\displaystyle X} to { 0 , 1 } {\displaystyle \{0,1\}} . Identifying a concept with its characteristic function mapping X {\displaystyle X} to { 0 , 1 } {\displaystyle \{0,1\}} , it is a special case of a sample. Two samples are consistent if they agree on the intersection of their domains. A sample s ′ {\displaystyle s'} extends another sample s {\displaystyle s} if the two are consistent and the domain of s {\displaystyle s} is contained in the domain of s ′ {\displaystyle s'} . == Examples == Suppose that C = S + ( X ) {\displaystyle C=S^{+}(X)} . Then: the subclass { { x } } {\displaystyle \{\{x\}\}} is reachable with the sample s = { ( x , 1 ) } {\displaystyle s=\{(x,1)\}} ; the subclass S + ( Y ) {\displaystyle S^{+}(Y)} for Y ⊆ X {\displaystyle Y\subseteq X} are reachable with a sample that maps the elements of X − Y {\displaystyle X-Y} to zero; the subclass S ( X ) {\displaystyle S(X)} , which consists of the singleton sets, is not reachable. == Applications == Let C {\displaystyle C} be some concept class. For any concept c ∈ C {\displaystyle c\in C} , we call this concept 1 / d {\displaystyle 1/d} -good for a positive integer d {\displaystyle d} if, for all x ∈ X {\displaystyle x\in X} , at least 1 / d {\displaystyle 1/d} of the concepts in C {\displaystyle C} agree with c {\displaystyle c} on the classification of x {\displaystyle x} . The fingerprint dimension F D ( C ) {\displaystyle FD(C)} of the entire concept class C {\displaystyle C} is the least positive integer d {\displaystyle d} such that every reachable subclass C ′ ⊆ C {\displaystyle C'\subseteq C} contains a concept that is 1 / d {\displaystyle 1/d} -good for it. This quantity can be used to bound the minimum number of equivalence queries needed to learn a class of concepts according to the following inequality: F D ( C ) − 1 ≤ # E Q ( C ) ≤ ⌈ F D ( C ) ln ⁡ ( | C | ) ⌉ {\textstyle FD(C)-1\leq \#EQ(C)\leq \lceil FD(C)\ln(|C|)\rceil } .

    Read more →
  • Latent and observable variables

    Latent and observable variables

    In statistics, latent variables (from Latin: present participle of lateo 'lie hidden') are variables that can only be inferred indirectly through a mathematical model from other observable variables that can be directly observed or measured. Such latent variable models are used in many disciplines, including engineering, medicine, ecology, physics, machine learning/artificial intelligence, natural language processing, bioinformatics, chemometrics, demography, economics, management, political science, psychology and the social sciences. Latent variables may correspond to aspects of physical reality. These could in principle be measured, but may not be for practical reasons. Among the earliest expressions of this idea is Francis Bacon's polemic the Novum Organum, itself a challenge to the more traditional logic expressed in Aristotle's Organon: But the latent process of which we speak, is far from being obvious to men’s minds, beset as they now are. For we mean not the measures, symptoms, or degrees of any process which can be exhibited in the bodies themselves, but simply a continued process, which, for the most part, escapes the observation of the senses. In this situation, the term hidden variables is commonly used, reflecting the fact that the variables are meaningful, but not observable. Other latent variables correspond to abstract concepts, like categories, behavioral or mental states, or data structures. The terms hypothetical variables or hypothetical constructs may be used in these situations. The use of latent variables can serve to reduce the dimensionality of data. Many observable variables can be aggregated in a model to represent an underlying concept, making it easier to understand the data. In this sense, they serve a function similar to that of scientific theories. At the same time, latent variables link observable "sub-symbolic" data in the real world to symbolic data in the modeled world. == Examples == === Psychology === Latent variables, as created by factor analytic methods, generally represent "shared" variance, or the degree to which variables "move" together. Variables that have no correlation cannot result in a latent construct based on the common factor model. The "Big Five personality traits" have been inferred using factor analysis. extraversion spatial ability wisdom: “Two of the more predominant means of assessing wisdom include wisdom-related performance and latent variable measures.” Spearman's g, or the general intelligence factor in psychometrics === Economics === Examples of latent variables from the field of economics include quality of life, business confidence, morale, happiness and conservatism: these are all variables which cannot be measured directly. However, by linking these latent variables to other, observable variables, the values of the latent variables can be inferred from measurements of the observable variables. Quality of life is a latent variable which cannot be measured directly, so observable variables are used to infer quality of life. Observable variables to measure quality of life include wealth, employment, environment, physical and mental health, education, recreation and leisure time, and social belonging. === Medicine === Latent-variable methodology is used in many branches of medicine. A class of problems that naturally lend themselves to latent variables approaches are longitudinal studies where the time scale (e.g. age of participant or time since study baseline) is not synchronized with the trait being studied. For such studies, an unobserved time scale that is synchronized with the trait being studied can be modeled as a transformation of the observed time scale using latent variables. Examples of this include disease progression modeling and modeling of growth (see box). == Inferring latent variables == There exists a range of different model classes and methodology that make use of latent variables and allow inference in the presence of latent variables. Models include: linear mixed-effects models and nonlinear mixed-effects models Hidden Markov models Factor analysis Item response theory Analysis and inference methods include: Principal component analysis Instrumented principal component analysis Partial least squares regression Latent semantic analysis and probabilistic latent semantic analysis EM algorithms Metropolis–Hastings algorithm === Bayesian algorithms and methods === Bayesian statistics is often used for inferring latent variables. Latent Dirichlet allocation The Chinese restaurant process is often used to provide a prior distribution over assignments of objects to latent categories. The Indian buffet process is often used to provide a prior distribution over assignments of latent binary features to objects.

    Read more →
  • Tucker decomposition

    Tucker decomposition

    In mathematics, Tucker decomposition decomposes a tensor into a set of matrices and one small core tensor. It is named after Ledyard R. Tucker although it goes back to Hitchcock in 1927. Initially described as a three-mode extension of factor analysis and principal component analysis it may actually be generalized to higher mode analysis, which is also called higher-order singular value decomposition (HOSVD) or the M-mode SVD. The algorithm to which the literature typically refers when discussing the Tucker decomposition or the HOSVD is the M-mode SVD algorithm introduced by Vasilescu and Terzopoulos, but misattributed to Tucker or De Lathauwer etal. It may be regarded as a more flexible PARAFAC (parallel factor analysis) model. In PARAFAC the core tensor is restricted to be "diagonal". In practice, Tucker decomposition is used as a modelling tool. For instance, it is used to model three-way (or higher way) data by means of relatively small numbers of components for each of the three or more modes, and the components are linked to each other by a three- (or higher-) way core array. The model parameters are estimated in such a way that, given fixed numbers of components, the modelled data optimally resemble the actual data in the least squares sense. The model gives a summary of the information in the data, in the same way as principal components analysis does for two-way data. For a 3rd-order tensor T ∈ F n 1 × n 2 × n 3 {\displaystyle T\in F^{n_{1}\times n_{2}\times n_{3}}} , where F {\displaystyle F} is either R {\displaystyle \mathbb {R} } or C {\displaystyle \mathbb {C} } , Tucker Decomposition can be denoted as follows, T = T × 1 U ( 1 ) × 2 U ( 2 ) × 3 U ( 3 ) {\displaystyle T={\mathcal {T}}\times _{1}U^{(1)}\times _{2}U^{(2)}\times _{3}U^{(3)}} where T ∈ F d 1 × d 2 × d 3 {\displaystyle {\mathcal {T}}\in F^{d_{1}\times d_{2}\times d_{3}}} is the core tensor, a 3rd-order tensor that contains the 1-mode, 2-mode and 3-mode singular values of T {\displaystyle T} , which are defined as the Frobenius norm of the 1-mode, 2-mode and 3-mode slices of tensor T {\displaystyle {\mathcal {T}}} respectively. U ( 1 ) , U ( 2 ) , U ( 3 ) {\displaystyle U^{(1)},U^{(2)},U^{(3)}} are unitary matrices in F d 1 × n 1 , F d 2 × n 2 , F d 3 × n 3 {\displaystyle F^{d_{1}\times n_{1}},F^{d_{2}\times n_{2}},F^{d_{3}\times n_{3}}} respectively. The k-mode product (k = 1, 2, 3) of T {\displaystyle {\mathcal {T}}} by U ( k ) {\displaystyle U^{(k)}} is denoted as T × U ( k ) {\displaystyle {\mathcal {T}}\times U^{(k)}} with entries as ( T × 1 U ( 1 ) ) ( i 1 , j 2 , j 3 ) = ∑ j 1 = 1 d 1 T ( j 1 , j 2 , j 3 ) U ( 1 ) ( j 1 , i 1 ) ( T × 2 U ( 2 ) ) ( j 1 , i 2 , j 3 ) = ∑ j 2 = 1 d 2 T ( j 1 , j 2 , j 3 ) U ( 2 ) ( j 2 , i 2 ) ( T × 3 U ( 3 ) ) ( j 1 , j 2 , i 3 ) = ∑ j 3 = 1 d 3 T ( j 1 , j 2 , j 3 ) U ( 3 ) ( j 3 , i 3 ) {\displaystyle {\begin{aligned}({\mathcal {T}}\times _{1}U^{(1)})(i_{1},j_{2},j_{3})&=\sum _{j_{1}=1}^{d_{1}}{\mathcal {T}}(j_{1},j_{2},j_{3})U^{(1)}(j_{1},i_{1})\\({\mathcal {T}}\times _{2}U^{(2)})(j_{1},i_{2},j_{3})&=\sum _{j_{2}=1}^{d_{2}}{\mathcal {T}}(j_{1},j_{2},j_{3})U^{(2)}(j_{2},i_{2})\\({\mathcal {T}}\times _{3}U^{(3)})(j_{1},j_{2},i_{3})&=\sum _{j_{3}=1}^{d_{3}}{\mathcal {T}}(j_{1},j_{2},j_{3})U^{(3)}(j_{3},i_{3})\end{aligned}}} Altogether, the decomposition may also be written more directly as T ( i 1 , i 2 , i 3 ) = ∑ j 1 = 1 d 1 ∑ j 2 = 1 d 2 ∑ j 3 = 1 d 3 T ( j 1 , j 2 , j 3 ) U ( 1 ) ( j 1 , i 1 ) U ( 2 ) ( j 2 , i 2 ) U ( 3 ) ( j 3 , i 3 ) {\displaystyle T(i_{1},i_{2},i_{3})=\sum _{j_{1}=1}^{d_{1}}\sum _{j_{2}=1}^{d_{2}}\sum _{j_{3}=1}^{d_{3}}{\mathcal {T}}(j_{1},j_{2},j_{3})U^{(1)}(j_{1},i_{1})U^{(2)}(j_{2},i_{2})U^{(3)}(j_{3},i_{3})} Taking d i = n i {\displaystyle d_{i}=n_{i}} for all i {\displaystyle i} is always sufficient to represent T {\displaystyle T} exactly, but often T {\displaystyle T} can be compressed or efficiently approximately by choosing d i < n i {\displaystyle d_{i} Read more →

  • Coalition for App Fairness

    Coalition for App Fairness

    The Coalition for App Fairness (CAF) is a coalition comprised by companies, who aim to reach a fairer deal for the inclusion of their apps into the Apple App Store or the Google Play Store. The organization's executive director is Meghan DiMuzio and its headquarters are located in Washington, D.C. == Background == In July 2015, Spotify launched an email campaign to urge its App Store subscribers to cancel their subscriptions and start new ones through its website, bypassing the 30% transaction fee for in-app purchases required for iOS applications by technology company Apple Inc. A later update to the Spotify app on iOS was rejected by Apple, prompting Spotify's general counsel Horacio Gutierrez to write a letter to Apple's then-general counsel Bruce Sewell, stating: "This latest episode raises serious concerns under both U.S. and EU competition law. It continues a troubling pattern of behavior by Apple to exclude and diminish the competitiveness of Spotify on iOS and as a rival to Apple Music, particularly when seen against the backdrop of Apple's previous anticompetitive conduct aimed at Spotify … we cannot stand by as Apple uses the App Store approval process as a weapon to harm competitors." In August 2020, Epic Games updated their Fortnite Battle Royale game app on both Apple's App Store and Google's Google Play to include its own storefront that offered a 20% discount on V-Bucks, the in-game currency, if players bought through there rather than through the app stores' storefront, both which take a 30% revenue cut of the sale. Both Apple and Google removed the Fortnite app within hours, as this alternate storefront violated their terms of use that required all in-app purchases to be made through their storefronts. Epic immediately filed lawsuits against both companies challenging their storefront policies on antitrust principles, arguing that their non-negotiable 30% revenue cut is too high and the restrictions against alternate storefronts anticompetitive. Apple countersued Epic over its behavior, leading to a highly publicized 2021 bench trial. Ultimately, Epic largely lost its lawsuit against Apple, though the court did order Apple to allow developers to point users to alternative payment methods. Conversely, Epic won its antitrust lawsuit against Google in late 2023. == Foundation == On 24 September 2020, Epic Games joined forces with thirteen other prominent companies—including the music streaming platform Spotify, Tinder owner Match Group, the encrypted mail service Proton Mail, and the crypto currency website Blockchain.com—to establish the Coalition for App Fairness. It also includes Basecamp. The coalition criticizes the fact that for now the app stores of both Apple and Google charge their clients a 30% fee on any purchases made over their stores. Apple and Google defended themselves by arguing that the 30% transaction fee is a standard in the industry while the Coalition for App Fairness states that there is no other transaction fee which is even close to the 30%. In October 2020, it was reported that the coalition grew from 13 to 40 members since its foundation and received more than 400 applications for membership. In October 2025, X (formerly Twitter) joined CAF. This was seen as a larger pushback in the industry against Apple and Google, and a step towards hopefully passing the Bipartisan Open App Markets Act. == Aims == The group has broadened their demands for the app stores and now also aim for a better treatment for the apps available in the App Store. They claim that Apple favors its own services before other services available on the market and unjustifiably excludes other apps from their App Store. The group has also been viewing other transaction fees like the 5% fee which is charged by credit card companies, and states that Apple charges up to 600% more and would like the 30% fee, which was only included in 2011 by Apple, adapted to a comparable percentage that charge other providers of payment solutions. Its demands are mainly directed at Apple's strict control over its App Store, but to a lesser extent are also directed towards Google. Google allows apps to be downloaded over an independent web link or also another App Store, such as the Epic Game App Store. The organization emphasizes that no app developer should come into the position in which they are discriminated and are not granted the same rights as to the developers of the owner of the app store. == Reactions == In October 2020, Microsoft presented a new framework concerning the access to its Windows 10 operating system by app stores other than the one offered by Microsoft. The new framework is based on the demands of the Coalition for App Fairness. Microsoft emphasized though, that these principles would not apply to the Xbox. In December 2020, Apple announced that they would be lowering the revenue cut Apple takes for app developers making $1M or less from 30% to 15% if app developers fill out an application for the lowered revenue cut. In March 2021, Google followed suit by also lowering the revenue cut from the Play Store from 30% to 15% for the first million in revenue earned by a developer each year. == Notable members == Members listed are notable companies listed as members the groups website: Blockchain.com Deezer Epic Games European Digital SME Alliance Fanfix Life360 Masimo Nium Proton Mail Spotify TapTap Threema Vipps

    Read more →
  • Types of artificial neural networks

    Types of artificial neural networks

    Types of neural networks (NN) include a family of techniques. The simplest types have static components, including number of units, number of layers, unit weights and topology. Dynamic NNs evolve via learning. Some types allow/require learning to be "supervised" by the operator, while others operate independently. Some types operate purely in hardware, while others are purely software and run on general purpose computers. The main types are: Transformers: these use attention to analyze every token in the input stream against every other token in the stream. That technique has enabled neural networks to reach the general public via chatbots, code generators and many other forms. Convolutional neural networks (CNN): a FNN that uses kernels and regularization to evade problems in prior generations of NNs. They are typically used to analyze visual and other two-dimensional data. Generative adversarial networks set networks (of varying structure) against each other, each trying to push the other(s) to produce better results such as winning a game or to deceive the opponent about the authenticity of an input. == Feedforward == In feedforward neural networks the information moves from the input to output directly in every layer. There can be hidden layers with or without cycles/loops to sequence inputs. Feedforward networks can be constructed with various types of units, such as binary McCulloch–Pitts neurons, the simplest of which is the perceptron. Continuous neurons, frequently with sigmoidal activation, are used in the context of backpropagation. == Group method of data handling == The Group Method of Data Handling (GMDH) features fully automatic structural and parametric model optimization. The node activation functions are Kolmogorov–Gabor polynomials that permit additions and multiplications. It uses a deep multilayer perceptron with eight layers. It is a supervised learning network that grows layer by layer, where each layer is trained by regression analysis. Useless items are detected using a validation set, and pruned through regularization. The size and depth of the resulting network depends on the task. == Autoencoder == An autoencoder, autoassociator or Diabolo network is similar to the multilayer perceptron (MLP) – with an input layer, an output layer and one or more hidden layers connecting them. However, the output layer has the same number of units as the input layer. Its purpose is to reconstruct its own inputs (instead of emitting a target value). Therefore, autoencoders are unsupervised learning models. An autoencoder is used for unsupervised learning of efficient codings, typically for the purpose of dimensionality reduction and for learning generative models of data. == Probabilistic == A probabilistic neural network (PNN) is a four-layer feedforward neural network. The layers are Input, hidden pattern, hidden summation, and output. In the PNN algorithm, the parent probability distribution function (PDF) of each class is approximated by a Parzen window and a non-parametric function. Then, using PDF of each class, the class probability of a new input is estimated and Bayes’ rule is employed to allocate it to the class with the highest posterior probability. It was derived from the Bayesian network and a statistical algorithm called Kernel Fisher discriminant analysis. It is used for classification and pattern recognition. == Time delay == A time delay neural network (TDNN) is a feedforward architecture for sequential data that recognizes features independent of sequence position. In order to achieve time-shift invariance, delays are added to the input so that multiple data points (points in time) are analyzed together. It usually forms part of a larger pattern recognition system. It has been implemented using a perceptron network whose connection weights were trained with back propagation (supervised learning). == Convolutional == A convolutional neural network (CNN, or ConvNet or shift invariant or space invariant) is a class of deep network, composed of one or more convolutional layers with fully connected layers (matching those in typical ANNs) on top. It uses tied weights and pooling layers. In particular, max-pooling. It is often structured via Fukushima's convolutional architecture. They are variations of multilayer perceptrons that use minimal preprocessing. This architecture allows CNNs to take advantage of the 2D structure of input data. Its unit connectivity pattern is inspired by the organization of the visual cortex. Units respond to stimuli in a restricted region of space known as the receptive field. Receptive fields partially overlap, over-covering the entire visual field. Unit response can be approximated mathematically by a convolution operation. CNNs are suitable for processing visual and other two-dimensional data. They have shown superior results in both image and speech applications. They can be trained with standard backpropagation. CNNs are easier to train than other regular, deep, feed-forward neural networks and have many fewer parameters to estimate. Capsule Neural Networks (CapsNet) add structures called capsules to a CNN and reuse output from several capsules to form more stable (with respect to various perturbations) representations. Examples of applications in computer vision include DeepDream and robot navigation. They have wide applications in image and video recognition, recommender systems and natural language processing. == Deep stacking network == A deep stacking network (DSN) (deep convex network) is based on a hierarchy of blocks of simplified neural network modules. It was introduced in 2011 by Deng and Yu. It formulates the learning as a convex optimization problem with a closed-form solution, emphasizing the mechanism's similarity to stacked generalization. Each DSN block is a simple module that is easy to train by itself in a supervised fashion without backpropagation for the entire blocks. Each block consists of a simplified multi-layer perceptron (MLP) with a single hidden layer. The hidden layer h has logistic sigmoidal units, and the output layer has linear units. Connections between these layers are represented by weight matrix U; input-to-hidden-layer connections have weight matrix W. Target vectors t form the columns of matrix T, and the input data vectors x form the columns of matrix X. The matrix of hidden units is H = σ ( W T X ) {\displaystyle {\boldsymbol {H}}=\sigma ({\boldsymbol {W}}^{T}{\boldsymbol {X}})} . Modules are trained in order, so lower-layer weights W are known at each stage. The function performs the element-wise logistic sigmoid operation. Each block estimates the same final label class y, and its estimate is concatenated with original input X to form the expanded input for the next block. Thus, the input to the first block contains the original data only, while downstream blocks' input adds the output of preceding blocks. Then learning the upper-layer weight matrix U given other weights in the network can be formulated as a convex optimization problem: min U T f = ‖ U T H − T ‖ F 2 , {\displaystyle \min _{U^{T}}f=\|{\boldsymbol {U}}^{T}{\boldsymbol {H}}-{\boldsymbol {T}}\|_{F}^{2},} which has a closed-form solution. Unlike other deep architectures, such as DBNs, the goal is not to discover the transformed feature representation. The structure of the hierarchy of this kind of architecture makes parallel learning straightforward, as a batch-mode optimization problem. In purely discriminative tasks, DSNs outperform conventional DBNs. === Tensor deep stacking networks === This architecture is a DSN extension. It offers two important improvements: it uses higher-order information from covariance statistics, and it transforms the non-convex problem of a lower-layer to a convex sub-problem of an upper-layer. TDSNs use covariance statistics in a bilinear mapping from each of two distinct sets of hidden units in the same layer to predictions, via a third-order tensor. While parallelization and scalability are not considered seriously in conventional DNNs, all learning for DSNs and TDSNs is done in batch mode, to allow parallelization. Parallelization allows scaling the design to larger (deeper) architectures and data sets. The basic architecture is suitable for diverse tasks such as classification and regression. == Physics-informed == Such a neural network is designed for the numerical solution of mathematical equations, such as differential, integral, delay, fractional and others. As input parameters, PINN accepts variables (spatial, temporal, and others), transmits them through the network block. At the output, it produces an approximate solution and substitutes it into the mathematical model, considering the initial and boundary conditions. If the solution does not satisfy the required accuracy, one uses the backpropagation and rectify the solution. Besides PINN, other architectures have been developed to produce surrogate models for scientific comput

    Read more →
  • Confirmatory blockmodeling

    Confirmatory blockmodeling

    Confirmatory blockmodeling is a deductive approach in blockmodeling, where a blockmodel (or part of it) is prespecify before the analysis, and then the analysis is fit to this model. When only a part of analysis is prespecify (like individual cluster(s) or location of the block types), it is called partially confirmatory blockmodeling. This is so-called indirect approach, where the blockmodeling is done on the blockmodel fitting (e.g., a priori hypothesized blockmodel). Opposite approach to the confirmatory blockmodeling is an inductive exploratory blockmodeling.

    Read more →
  • List of text mining software

    List of text mining software

    Text mining computer programs are available from many commercial and open source companies and sources. == Commercial == Angoss – Angoss Text Analytics provides entity and theme extraction, topic categorization, sentiment analysis and document summarization capabilities via the embedded AUTINDEX – is a commercial text mining software package based on sophisticated linguistics by IAI (Institute for Applied Information Sciences), Saarbrücken. DigitalMR – social media listening & text+image analytics tool for market research. FICO Score – leading provider of analytics. General Sentiment – Social Intelligence platform that uses natural language processing to discover affinities between the fans of brands with the fans of traditional television shows in social media. Stand alone text analytics to capture social knowledge base on billions of topics stored to 2004. IBM LanguageWare – the IBM suite for text analytics (tools and Runtime). IBM SPSS – provider of Modeler Premium (previously called IBM SPSS Modeler and IBM SPSS Text Analytics), which contains advanced NLP-based text analysis capabilities (multi-lingual sentiment, event and fact extraction), that can be used in conjunction with Predictive Modeling. Text Analytics for Surveys provides the ability to categorize survey responses using NLP-based capabilities for further analysis or reporting. Inxight – provider of text analytics, search, and unstructured visualization technologies. (Inxight was bought by Business Objects that was bought by SAP AG in 2008). Language Computer Corporation – text extraction and analysis tools, available in multiple languages. Lexalytics – provider of a text analytics engine used in Social Media Monitoring, Voice of Customer, Survey Analysis, and other applications. Salience Engine. The software provides the unique capability of merging the output of unstructured, text-based analysis with structured data to provide additional predictive variables for improved predictive models and association analysis. Linguamatics – provider of natural language processing (NLP) based enterprise text mining and text analytics software, I2E, for high-value knowledge discovery and decision support. Mathematica – provides built in tools for text alignment, pattern matching, clustering and semantic analysis. See Wolfram Language, the programming language of Mathematica. MATLAB offers Text Analytics Toolbox for importing text data, converting it to numeric form for use in machine and deep learning, sentiment analysis and classification tasks. Medallia – offers one system of record for survey, social, text, written and online feedback. NetMiner – software for network analysis and text mining. Supports social media and bibliographic data collection, NLP for english and chinese, sentiment analysis, work co-occurrence network(text network analysis) and visualization. NetOwl – suite of multilingual text and entity analytics products, including entity extraction, link and event extraction, sentiment analysis, geotagging, name translation, name matching, and identity resolution, among others. PolyAnalyst - text analytics environment. PoolParty Semantic Suite - graph-based text mining platform. RapidMiner with its Text Processing Extension – data and text mining software. SAS – SAS Text Miner and Teragram; commercial text analytics, natural language processing, and taxonomy software used for Information Management. Sketch Engine – a corpus manager and analysis software which providing creating text corpora from uploaded texts or the Web including part-of-speech tagging and lemmatization or detecting a particular website. Sysomos – provider social media analytics software platform, including text analytics and sentiment analysis on online consumer conversations. WordStat – Content analysis and text mining add-on module of QDA Miner for analyzing large amounts of text data. == Open source == Carrot2 – text and search results clustering framework. GATE – general Architecture for Text Engineering, an open-source toolbox for natural language processing and language engineering. Gensim – large-scale topic modelling and extraction of semantic information from unstructured text (Python). KH Coder – for Quantitative Content Analysis or Text Mining The KNIME Text Processing extension. Natural Language Toolkit (NLTK) – a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language. OpenNLP – natural language processing. Orange with its text mining add-on. The PLOS Text Mining Collection. The programming language R provides a framework for text mining applications in the package tm. The Natural Language Processing task view contains tm and other text mining library packages. spaCy – open-source Natural Language Processing library for Python Stanbol – an open source text mining engine targeted at semantic content management. Voyant Tools – a web-based text analysis environment, created as a scholarly project.

    Read more →
  • Personoid

    Personoid

    Personoid is the concept coined by Stanisław Lem, a Polish science-fiction writer, in Non Serviam, from his book A Perfect Vacuum (1971). His personoids are an abstraction of functions of human mind and they live in computers; they do not need any human-like physical body. In cognitive and software modeling, personoid is a research approach to the development of intelligent autonomous agents. In frame of the IPK (Information, Preferences, Knowledge) architecture, it is a framework of abstract intelligent agent with a cognitive and structural intelligence. It can be seen as an essence of high intelligent entities. From the philosophical and systemics perspectives, personoid societies can also be seen as the carriers of a culture. According to N. Gessler, the personoids study can be a base for the research on artificial culture and culture evolution. == Personoids on TV and cinema == Welt am Draht (1973) The Thirteenth Floor (1999)

    Read more →
  • Latent class model

    Latent class model

    In statistics, a latent class model (LCM) is a model for clustering multivariate discrete data. It assumes that the data arise from a mixture of discrete distributions, within each of which the variables are independent. It is called a latent class model because the class to which each data point belongs is unobserved (or latent). Latent class analysis (LCA) is a subset of structural equation modeling used to find groups or subtypes of cases in multivariate categorical data. These groups or subtypes of cases are called "latent classes". When faced with the following situation, a researcher might opt to use LCA to better understand the data: Symptoms a, b, c, and d have been recorded in a variety of patients diagnosed with diseases X, Y, and Z. Disease X is associated with symptoms a, b, and c; disease Y is linked to symptoms b, c, and d; and disease Z is connected to symptoms a, c, and d. In this context, the LCA would attempt to detect the presence of latent classes (i.e., the disease entities), thus creating patterns of association in the symptoms. As in factor analysis, LCA can also be used to classify cases according to their maximum likelihood class membership probability. The key criterion for resolving the LCA is identifying latent classes in which the observed symptom associations are effectively rendered null. This is because within each class, the diseases responsible for the symptoms create a structure of dependencies. As a result, the symptoms become conditionally independent, meaning that, given the class a case belongs to, the symptoms are no longer related to one another. == Model == Within each latent class, the observed variables are statistically independent—an essential aspect of latent class modeling. Usually, the observed variables are statistically dependent. By introducing the latent variable, independence is restored in the sense that within classes, variables are independent (local independence). Therefore, the association between the observed variables is explained by the classes of the latent variable (McCutcheon, 1987). In one form, the LCM is written as p i 1 , i 2 , … , i N ≈ ∑ t T p t ∏ n N p i n , t n , {\displaystyle p_{i_{1},i_{2},\ldots ,i_{N}}\approx \sum _{t}^{T}p_{t}\,\prod _{n}^{N}p_{i_{n},t}^{n},} where T {\displaystyle T} is the number of latent classes and p t {\displaystyle p_{t}} are the so-called recruitment or unconditional probabilities that should sum to one. p i n , t n {\displaystyle p_{i_{n},t}^{n}} are the marginal or conditional probabilities. For a two-way latent class model, the form is p i j ≈ ∑ t T p t p i t p j t . {\displaystyle p_{ij}\approx \sum _{t}^{T}p_{t}\,p_{it}\,p_{jt}.} This two-way model is related to probabilistic latent semantic analysis and non-negative matrix factorization. The probability model used in LCA is closely related to the Naive Bayes classifier. The main difference is that in LCA, the class membership of an individual is a latent variable, whereas in Naive Bayes classifiers, the class membership is an observed label. == Related methods == There are a number of methods with distinct names and uses that share a common relationship. Cluster analysis is, like LCA, used to discover taxon-like groups of cases in data. Multivariate mixture estimation (MME) is applicable to continuous data and assumes that such data arise from a mixture of distributions, such as a set of heights arising from a mixture of men and women. If a multivariate mixture estimation is constrained so that measures must be uncorrelated within each distribution, it is termed latent profile analysis. Modified to handle discrete data, this constrained analysis is known as LCA. Discrete latent trait models further constrain the classes to form from segments of a single dimension, allocating members to classes based on that dimension. An example would be assigning cases to social classes based on ability or merit. In a practical instance, the variables could be multiple choice items of a political questionnaire. In this case, the data consists of an N-way contingency table with answers to the items for a number of respondents. In this example, the latent variable refers to political opinion, and the latent classes to political groups. Given group membership, the conditional probabilities specify the chance that certain answers are chosen. == Application == LCA may be used in many fields, such as: collaborative filtering, Behavior Genetics and Evaluation of diagnostic tests.

    Read more →
  • Optical neural network

    Optical neural network

    An optical neural network is a physical implementation of an artificial neural network with optical components. Early optical neural networks used a photorefractive Volume hologram to interconnect arrays of input neurons to arrays of output with synaptic weights in proportion to the multiplexed hologram's strength. Volume holograms were further multiplexed using spectral hole burning to add one dimension of wavelength to space to achieve four dimensional interconnects of two dimensional arrays of neural inputs and outputs. This research led to extensive research on alternative methods using the strength of the optical interconnect for implementing neuronal communications. Some artificial neural networks that have been implemented as optical neural networks include the Hopfield neural network and the Kohonen self-organizing map with liquid crystal spatial light modulators Optical neural networks can also be based on the principles of neuromorphic engineering, creating neuromorphic photonic systems. Typically, these systems encode information in the networks using spikes, mimicking the functionality of spiking neural networks in optical and photonic hardware. Photonic devices that have demonstrated neuromorphic functionalities include (among others) vertical-cavity surface-emitting lasers, integrated photonic modulators, optoelectronic systems based on superconducting Josephson junctions or systems based on resonant tunnelling diodes. == Electrochemical vs. optical neural networks == Biological neural networks function on an electrochemical basis, while optical neural networks use electromagnetic waves. Optical interfaces to biological neural networks can be created with optogenetics, but is not the same as an optical neural networks. In biological neural networks there exist a lot of different mechanisms for dynamically changing the state of the neurons, these include short-term and long-term synaptic plasticity. Synaptic plasticity is among the electrophysiological phenomena used to control the efficiency of synaptic transmission, long-term for learning and memory, and short-term for short transient changes in synaptic transmission efficiency. Implementing this with optical components is difficult, and ideally requires advanced photonic materials. Properties that might be desirable in photonic materials for optical neural networks include the ability to change their efficiency of transmitting light, based on the intensity of incoming light. == Rising Era of Optical Neural Networks == With the increasing significance of computer vision in various domains, the computational cost of these tasks has increased, making it more important to develop the new approaches of the processing acceleration. Optical computing has emerged as a potential alternative to GPU acceleration for modern neural networks, particularly considering the looming obsolescence of Moore's Law. Consequently, optical neural networks have garnered increased attention in the research community. Presently, two primary methods of optical neural computing are under research: silicon photonics-based and free-space optics. Each approach has its benefits and drawbacks; while silicon photonics may offer superior speed, it lacks the massive parallelism that free-space optics can deliver. Given the substantial parallelism capabilities of free-space optics, researchers have focused on taking advantage of it. One implementation, proposed by Lin et al., involves the training and fabrication of phase masks for a handwritten digit classifier. By stacking 3D-printed phase masks, light passing through the fabricated network can be read by a photodetector array of ten detectors, each representing a digit class ranging from 1 to 10. Although this network can achieve terahertz-range classification, it lacks flexibility, as the phase masks are fabricated for a specific task and cannot be retrained. An alternative method for classification in free-space optics, introduced by Cahng et al., employs a 4F system that is based on the convolution theorem to perform convolution operations. This system uses two lenses to execute the Fourier transforms of the convolution operation, enabling passive conversion into the Fourier domain without power consumption or latency. However, the convolution operation kernels in this implementation are also fabricated phase masks, limiting the device's functionality to specific convolutional layers of the network only. In contrast, Li et al. proposed a technique involving kernel tiling to use the parallelism of the 4F system while using a Digital Micromirror Device (DMD) instead of a phase mask. This approach allows users to upload various kernels into the 4F system and execute the entire network's inference on a single device. Unfortunately, modern neural networks are not designed for the 4F systems, as they were primarily developed during the CPU/GPU era. Mostly because they tend to use a lower resolution and a high number of channels in their feature maps. == Other Implementations == In 2007 there was one model of Optical Neural Network: the Programmable Optical Array/Analogic Computer (POAC). It had been implemented in the year 2000 and reported based on modified Joint Fourier Transform Correlator (JTC) and Bacteriorhodopsin (BR) as a holographic optical memory. Full parallelism, large array size and the speed of light are three promises offered by POAC to implement an optical CNN. They had been investigated during the last years with their practical limitations and considerations yielding the design of the first portable POAC version. The practical details – hardware (optical setups) and software (optical templates) – were published. However, POAC is a general purpose and programmable array computer that has a wide range of applications including: image processing pattern recognition target tracking real-time video processing document security optical switching == Progress in the 2020s == Taichi from Tsinghua University in Beijing is a hybrid ONN that combines the power efficiency and parallelism of optical diffraction and the configurability of optical interference. Taichi offers 13.96 million parameters. Taichi avoids the high error rates that afflict deep (multi-layer) networks by combining clusters of fewer-layer diffractive units with arrays of interferometers for reconfigurable computation. Its encoding protocol divides large network models into sub-models that can be distributed across multiple chiplets in parallel. Taichi achieved 91.89% accuracy in tests with the Omniglot database. It was also used to generate music Bach and generate images the styles of Van Gogh and Munch. The developers claimed energy efficiency of up to 160 trillion operations second−1 watt−1 and an area efficiency of 880 trillion multiply-accumulate operations mm−2 or 103 more energy efficient than the NVIDIA H100, and 102 times more energy efficient and 10 times more area efficient than previous ONNs. Time dimension has recently been introduced into diffractive neural network by fs laser lithography of perovskite hydration. The temporal behaviour of the neuron can be modulated by the fs laser at the nanoscale, enabling a programmable holographic neural network with temporal evolution functionality, i.e., the functionality can change with time under the hydration stimuli. An in-memory temporal inference functionality was demonstrated to mimic the function evolution of the human brain, i.e., the functionality can change from simple digit image classification to more complicated digit and clothing product image classification with time. This is the first time of introducing time dimension into the optical neural network, laying a foundation for future brain-like photonic chip development.

    Read more →
  • Tanagra (machine learning)

    Tanagra (machine learning)

    Tanagra is a free suite of machine learning software for research and academic purposes developed by Ricco Rakotomalala at the Lumière University Lyon 2, France. Tanagra supports several standard data mining tasks such as: Visualization, Descriptive statistics, Instance selection, feature selection, feature construction, regression, factor analysis, clustering, classification and association rule learning. Tanagra is an academic project. It is widely used in French-speaking universities. Tanagra is frequently used in real studies and in software comparison papers. == History == The development of Tanagra was started in June 2003. The first version was distributed in December 2003. Tanagra is the successor of Sipina, another free data mining tool which is intended only for supervised learning tasks (classification), especially the interactive and visual construction of decision trees. Sipina is still available online and is maintained. Tanagra is an "open source project" as every researcher can access the source code and add their own algorithms, as long as they agree and conform to the software distribution license. The main purpose of the Tanagra project is to give researchers and students a user-friendly data mining software, conforming to the present norms of the software development in this domain (especially in the design of its GUI and the way to use it), and allowing the analyzation of either real or synthetic data. From 2006, Ricco Rakotomalala made an important documentation effort. A large number of tutorials are published on a dedicated website. They describe the statistical and machine learning methods and their implementation with Tanagra on real case studies. The use of other free data mining tools on the same problems is also widely described. The comparison of the tools enables readers to understand the possible differences in the presentation of results. == Description == Tanagra works similarly to current data mining tools. The user can design visually a data mining process in a diagram. Each node is a statistical or machine learning technique, the connection between two nodes represents the data transfer. But unlike the majority of tools which are based on the workflow paradigm, Tanagra is very simplified. The treatments are represented in a tree diagram. The results are displayed in an HTML format. This makes it is easy to export the outputs in order to visualize the results in a browser. It is also possible to copy the result tables to a spreadsheet. Tanagra makes a good compromise between statistical approaches (e.g. parametric and nonparametric statistical tests), multivariate analysis methods (e.g. factor analysis, correspondence analysis, cluster analysis, regression) and machine learning techniques (e.g. neural network, support vector machine, decision trees, random forest).

    Read more →
  • AI Mode

    AI Mode

    AI Mode is a search feature used within Google Search. In March 2025, Google introduced an experimental "AI Mode" within its search platform, enabling users to input complex, multi-part queries and receive comprehensive, AI-generated responses. This feature uses Google's Gemini model, which enhances the system's reasoning capabilities and supports multimodal inputs, including text, images, and voice. Users need to be signed in to be able to use the image generation features. Initially, AI Mode was available to Google One AI Premium subscribers in the United States, who could access it through the Search Labs platform. This phased rollout allowed Google to gather user feedback and refine the feature before a broader release.

    Read more →
  • Mean squared error

    Mean squared error

    In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between the estimated values and the true value. MSE is a risk function, corresponding to the expected value of the squared error loss. The fact that MSE is almost always strictly positive (and not zero) is because of randomness or because the estimator does not account for information that could produce a more accurate estimate. In machine learning, specifically empirical risk minimization, MSE may refer to the empirical risk (the average loss on an observed data set), as an estimate of the true MSE (the true risk: the average loss on the actual population distribution). The MSE is a measure of the quality of an estimator. As it is derived from the square of Euclidean distance, it is always a positive value that decreases as the error approaches zero. The MSE is the second moment (about the origin) of the error, and thus incorporates both the variance of the estimator (how widely spread the estimates are from one data sample to another) and its bias (how far off the average estimated value is from the true value). For an unbiased estimator, the MSE is the variance of the estimator. Like the variance, MSE has the same units of measurement as the square of the quantity being estimated. In an analogy to standard deviation, taking the square root of MSE yields the root-mean-square error or root-mean-square deviation (RMSE or RMSD), which has the same units as the quantity being estimated; for an unbiased estimator, the RMSE is the square root of the variance, known as the standard error. == Definition and basic properties == The MSE either assesses the quality of a predictor (i.e., a function mapping arbitrary inputs to a sample of values of some random variable), or of an estimator (i.e., a mathematical function mapping a sample of data to an estimate of a parameter of the population from which the data is sampled). In the context of prediction, understanding the prediction interval can also be useful as it provides a range within which a future observation will fall, with a certain probability. The definition of an MSE differs according to whether one is describing a predictor or an estimator. === Predictor === If a vector of n {\displaystyle n} predictions is generated from a sample of n {\displaystyle n} data points on all variables, and Y {\displaystyle Y} is the vector of observed values of the variable being predicted, with Y ^ {\displaystyle {\hat {Y}}} being the predicted values (e.g. as from a least-squares fit), then the within-sample MSE of the predictor is computed as MSE = 1 n ∑ i = 1 n ( Y i − Y i ^ ) 2 {\displaystyle \operatorname {MSE} ={\frac {1}{n}}\sum _{i=1}^{n}\left(Y_{i}-{\hat {Y_{i}}}\right)^{2}} In other words, the MSE is the mean ( 1 n ∑ i = 1 n ) {\textstyle \left({\frac {1}{n}}\sum _{i=1}^{n}\right)} of the squares of the errors ( Y i − Y i ^ ) 2 {\textstyle \left(Y_{i}-{\hat {Y_{i}}}\right)^{2}} . This is an easily computable quantity for a particular sample (and hence is sample-dependent). In matrix notation, MSE = 1 n ∑ i = 1 n ( e i ) 2 = 1 n e T e {\displaystyle \operatorname {MSE} ={\frac {1}{n}}\sum _{i=1}^{n}(e_{i})^{2}={\frac {1}{n}}\mathbf {e} ^{\mathsf {T}}\mathbf {e} } where e i {\displaystyle e_{i}} is Y i − Y i ^ {\displaystyle Y_{i}-{\hat {Y_{i}}}} and e {\displaystyle \mathbf {e} } is a n × 1 {\displaystyle n\times 1} column vector. The MSE can also be computed on q data points that were not used in estimating the model, either because they were held back for this purpose, or because these data have been newly obtained. Within this process, known as cross-validation, the MSE is often called the test MSE, and is computed as MSE = 1 q ∑ i = n + 1 n + q ( Y i − Y i ^ ) 2 {\displaystyle \operatorname {MSE} ={\frac {1}{q}}\sum _{i=n+1}^{n+q}\left(Y_{i}-{\hat {Y_{i}}}\right)^{2}} === Estimator === The MSE of an estimator θ ^ {\displaystyle {\hat {\theta }}} with respect to an unknown parameter θ {\displaystyle \theta } is defined as MSE ⁡ ( θ ^ ) = E θ ⁡ [ ( θ ^ − θ ) 2 ] . {\displaystyle \operatorname {MSE} ({\hat {\theta }})=\operatorname {E} _{\theta }\left[({\hat {\theta }}-\theta )^{2}\right].} This definition depends on the unknown parameter, therefore the MSE is a priori property of an estimator. The MSE could be a function of unknown parameters, in which case any estimator of the MSE based on estimates of these parameters would be a function of the data (and thus a random variable). If the estimator θ ^ {\displaystyle {\hat {\theta }}} is derived as a sample statistic and is used to estimate some population parameter, then the expectation is with respect to the sampling distribution of the sample statistic. The MSE can be written as the sum of the variance of the estimator and the squared bias of the estimator, providing a useful way to calculate the MSE and implying that in the case of unbiased estimators, the MSE and variance are equivalent. MSE ⁡ ( θ ^ ) = Var θ ⁡ ( θ ^ ) + Bias ⁡ ( θ ^ , θ ) 2 . {\displaystyle \operatorname {MSE} ({\hat {\theta }})=\operatorname {Var} _{\theta }({\hat {\theta }})+\operatorname {Bias} ({\hat {\theta }},\theta )^{2}.} ==== Proof of variance and bias relationship ==== MSE ⁡ ( θ ^ ) = E θ ⁡ [ ( θ ^ − θ ) 2 ] = E θ ⁡ [ ( θ ^ − E θ ⁡ [ θ ^ ] + E θ ⁡ [ θ ^ ] − θ ) 2 ] = E θ ⁡ [ ( θ ^ − E θ ⁡ [ θ ^ ] ) 2 + 2 ( θ ^ − E θ ⁡ [ θ ^ ] ) ( E θ ⁡ [ θ ^ ] − θ ) + ( E θ ⁡ [ θ ^ ] − θ ) 2 ] = E θ ⁡ [ ( θ ^ − E θ ⁡ [ θ ^ ] ) 2 ] + E θ ⁡ [ 2 ( θ ^ − E θ ⁡ [ θ ^ ] ) ( E θ ⁡ [ θ ^ ] − θ ) ] + E θ ⁡ [ ( E θ ⁡ [ θ ^ ] − θ ) 2 ] = E θ ⁡ [ ( θ ^ − E θ ⁡ [ θ ^ ] ) 2 ] + 2 ( E θ ⁡ [ θ ^ ] − θ ) E θ ⁡ [ θ ^ − E θ ⁡ [ θ ^ ] ] + ( E θ ⁡ [ θ ^ ] − θ ) 2 E θ ⁡ [ θ ^ ] − θ = constant = E θ ⁡ [ ( θ ^ − E θ ⁡ [ θ ^ ] ) 2 ] + 2 ( E θ ⁡ [ θ ^ ] − θ ) ( E θ ⁡ [ θ ^ ] − E θ ⁡ [ θ ^ ] ) + ( E θ ⁡ [ θ ^ ] − θ ) 2 E θ ⁡ [ θ ^ ] = constant = E θ ⁡ [ ( θ ^ − E θ ⁡ [ θ ^ ] ) 2 ] + ( E θ ⁡ [ θ ^ ] − θ ) 2 = Var θ ⁡ ( θ ^ ) + Bias θ ⁡ ( θ ^ , θ ) 2 {\displaystyle {\begin{aligned}\operatorname {MSE} ({\hat {\theta }})&=\operatorname {E} _{\theta }\left[({\hat {\theta }}-\theta )^{2}\right]\\&=\operatorname {E} _{\theta }\left[\left({\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]+\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)^{2}\right]\\&=\operatorname {E} _{\theta }\left[\left({\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]\right)^{2}+2\left({\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]\right)\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)+\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)^{2}\right]\\&=\operatorname {E} _{\theta }\left[\left({\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]\right)^{2}\right]+\operatorname {E} _{\theta }\left[2\left({\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]\right)\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)\right]+\operatorname {E} _{\theta }\left[\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)^{2}\right]\\&=\operatorname {E} _{\theta }\left[\left({\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]\right)^{2}\right]+2\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)\operatorname {E} _{\theta }\left[{\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]\right]+\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)^{2}&&\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta ={\text{constant}}\\&=\operatorname {E} _{\theta }\left[\left({\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]\right)^{2}\right]+2\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\operatorname {E} _{\theta }[{\hat {\theta }}]\right)+\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)^{2}&&\operatorname {E} _{\theta }[{\hat {\theta }}]={\text{constant}}\\&=\operatorname {E} _{\theta }\left[\left({\hat {\theta }}-\operatorname {E} _{\theta }[{\hat {\theta }}]\right)^{2}\right]+\left(\operatorname {E} _{\theta }[{\hat {\theta }}]-\theta \right)^{2}\\&=\operatorname {Var} _{\theta }({\hat {\theta }})+\operatorname {Bias} _{\theta }({\hat {\theta }},\theta )^{2}\end{aligned}}} An even shorter proof can be achieved using the well-known formula that for a random variable X {\textstyle X} , E ( X 2 ) = Var ⁡ ( X ) + ( E ( X ) ) 2 {\textstyle \mathbb {E} (X^{2})=\operatorname {Var} (X)+(\mathbb {E} (X))^{2}} . By substituting X {\textstyle X} with, θ ^ − θ {\textstyle {\hat {\theta }}-\theta } , we have MSE ⁡ ( θ ^ ) = E [ ( θ ^ − θ ) 2 ] = Var ⁡ ( θ ^ − θ ) + ( E [ θ ^ − θ ] ) 2 = Var ⁡ ( θ ^ ) + Bias 2 ⁡ ( θ ^ , θ ) {\displaystyle {\begin{aligned}\operatorname {MSE} ({\hat {\theta }})&=\mathbb {E} [({\hat {\theta }}-\theta )^{2}]\\&=\operator

    Read more →
  • NSynth

    NSynth

    NSynth (a portmanteau of "Neural Synthesis") is a WaveNet-based autoencoder for synthesizing audio, outlined in a paper in April 2017. == Overview == The model generates sounds through a neural network based synthesis, employing a WaveNet-style autoencoder to learn its own temporal embeddings from four different sounds. Google then released an open source hardware interface for the algorithm called NSynth Super, used by notable musicians such as Grimes and YACHT to generate experimental music using artificial intelligence. The research and development of the algorithm was part of a collaboration between Google Brain, Magenta and DeepMind. == Technology == === Dataset === The NSynth dataset is composed of 305,979 one-shot instrumental notes featuring a unique pitch, timbre, and envelope, sampled from 1,006 instruments from commercial sample libraries. For each instrument the dataset contains four-second 16 kHz audio snippets by ranging over every pitch of a standard MIDI piano, as well as five different velocities. The dataset is made available under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. === Machine learning model === A spectral autoencoder model and a WaveNet autoencoder model are publicly available on GitHub. The baseline model uses a spectrogram with fft_size 1024 and hop_size 256, MSE loss on the magnitudes, and the Griffin-Lim algorithm for reconstruction. The WaveNet model trains on mu-law encoded waveform chunks of size 6144. It learns embeddings with 16 dimensions that are downsampled by 512 in time. == NSynth Super == In 2018 Google released a hardware interface for the NSynth algorithm, called NSynth Super, designed to provide an accessible physical interface to the algorithm for musicians to use in their artistic production. Design files, source code and internal components are released under an open source Apache License 2.0, enabling hobbyists and musicians to freely build and use the instrument. At the core of the NSynth Super there is a Raspberry Pi, extended with a custom printed circuit board to accommodate the interface elements. == Influence == Despite not being publicly available as a commercial product, NSynth Super has been used by notable artists, including Grimes and YACHT. Grimes reported using the instrument in her 2020 studio album Miss Anthropocene. YACHT announced an extensive use of NSynth Super in their album Chain Tripping. Claire L. Evans compared the potential influence of the instrument to the Roland TR-808. The NSynth Super design was honored with a D&AD Yellow Pencil award in 2018.

    Read more →