AI Face Combiner

AI Face Combiner — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Research software engineering

    Research software engineering

    Research software engineering is the application of software engineering practices, methods and techniques for research software, i.e. software that was made for and is mainly used within research projects. As usual for software engineering, this also includes knowledge of other (and in this case varying) research fields as well as open science that need to be incorporated into a software development process. The term was proposed in a research paper in 2010 in response to an empirical survey on tools used for software development in research projects. It started to be used in United Kingdom in 2012, when it was needed to define the type of software development needed in research. This focuses on reproducibility, reusability, and accuracy of data analysis and applications created for research. == Support == Various type of associations and organisations have been created around this role to support the creation of posts in universities and research institutes. In 2014 a Research Software Engineer Association was created in UK, which attracted 160 members in the first three months and which lead to the creation of the Society of Research Software Engineering in 2019. Other countries like the Netherlands, Germany, and the USA followed creating similar communities and there are similar efforts being pursued in Asia, Australia, Canada, New Zealand, the Nordic countries, and Belgium. In January 2021 the International Council of RSE Associations was introduced. UK counts over 40 universities and institutes with groups that provide access to software expertise to different areas of research. Additionally, the Engineering and Physical Sciences Research Council created a Research Software Engineer fellowship to promote this role and help the creation of RSE groups across UK, with calls in 2015, 2017, and 2020. The world first RSE conference took place in UK in September 2016 and it has been repeated annually (except for a gap in 2020) since. In 2019 the first national RSE conferences in Germany and the Netherlands were held, next editions were planned for 2020 and then cancelled. US-RSE held its first national conference in 2023. The Research Software Alliance was formed in 2019 to advance the global research software ecosystem by collaborating with decision makers and key influencers. The SORSE (A Series of Online Research Software Events) community was established in late‑2020 in response to the COVID-19 pandemic and ran its first online event in September 2020.

    Read more →
  • Noisy text analytics

    Noisy text analytics

    Noisy text analytics is a process of information extraction whose goal is to automatically extract structured or semistructured information from noisy unstructured text data. While Text analytics is a growing and mature field that has great value because of the huge amounts of data being produced, processing of noisy text is gaining in importance because a lot of common applications produce noisy text data. Noisy unstructured text data is found in informal settings such as online chat, text messages, e-mails, message boards, newsgroups, blogs, wikis and web pages. Also, text produced by processing spontaneous speech using automatic speech recognition and printed or handwritten text using optical character recognition contains processing noise. Text produced under such circumstances is typically highly noisy containing spelling errors, abbreviations, non-standard words, false starts, repetitions, missing punctuations, missing letter case information, pause filling words such as “um” and “uh” and other texting and speech disfluencies. Such text can be seen in large amounts in contact centers, chat rooms, optical character recognition (OCR) of text documents, short message service (SMS) text, etc. Documents with historical language can also be considered noisy with respect to today's knowledge about the language. Such text contains important historical, religious, ancient medical knowledge that is useful. The nature of the noisy text produced in all these contexts warrants moving beyond traditional text analysis techniques. == Techniques for noisy text analysis == Missing punctuation and the use of non-standard words can often hinder standard natural language processing tools such as part-of-speech tagging and parsing. Techniques to both learn from the noisy data and then to be able to process the noisy data are only now being developed. == Possible source of noisy text == World Wide Web: Poorly written text is found in web pages, online chat, blogs, wikis, discussion forums, newsgroups. Most of these data are unstructured and the style of writing is very different from, say, well-written news articles. Analysis for the web data is important because they are sources for market buzz analysis, market review, trend estimation, etc. Also, because of the large amount of data, it is necessary to find efficient methods of information extraction, classification, automatic summarization and analysis of these data. Contact centers: This is a general term for help desks, information lines and customer service centers operating in domains ranging from computer sales and support to mobile phones to apparels. On an average a person in the developed world interacts at least once a week with a contact center agent. A typical contact center agent handles over a hundred calls per day. They operate in various modes such as voice, online chat and E-mail. The contact center industry produces gigabytes of data in the form of E-mails, chat logs, voice conversation transcriptions, customer feedback, etc. A bulk of the contact center data is voice conversations. Transcription of these using state of the art automatic speech recognition results in text with 30-40% word error rate. Further, even written modes of communication like online chat between customers and agents and even the interactions over email tend to be noisy. Analysis of contact center data is essential for customer relationship management, customer satisfaction analysis, call modeling, customer profiling, agent profiling, etc., and it requires sophisticated techniques to handle poorly written text. Printed Documents: Many libraries, government organizations and national defence organizations have vast repositories of hard copy documents. To retrieve and process the content from such documents, they need to be processed using Optical Character Recognition. In addition to printed text, these documents may also contain handwritten annotations. OCRed text can be highly noisy depending on the font size, quality of the print etc. It can range from 2-3% word error rates to as high as 50-60% word error rates. Handwritten annotations can be particularly hard to decipher, and error rates can be quite high in their presence. Short Messaging Service (SMS): Language usage over computer mediated discourses, like chats, emails and SMS texts, significantly differs from the standard form of the language. An urge towards shorter message length facilitating faster typing and the need for semantic clarity, shape the structure of this non-standard form known as the texting language.

    Read more →
  • Semantic compression

    Semantic compression

    In natural language processing, semantic compression is a process of compacting a lexicon used to build a textual document (or a set of documents) by reducing language heterogeneity, while maintaining text semantics. As a result, the same ideas can be represented using a smaller set of words. In most applications, semantic compression is a lossy compression. Increased prolixity does not compensate for the lexical compression and an original document cannot be reconstructed in a reverse process. == By generalization == Semantic compression is basically achieved in two steps, using frequency dictionaries and semantic network: determining cumulated term frequencies to identify target lexicon, replacing less frequent terms with their hypernyms (generalization) from target lexicon. Step 1 requires assembling word frequencies and information on semantic relationships, specifically hyponymy. Moving upwards in word hierarchy, a cumulative concept frequency is calculating by adding a sum of hyponyms' frequencies to frequency of their hypernym: c u m f ( k i ) = f ( k i ) + ∑ j c u m f ( k j ) {\displaystyle cumf(k_{i})=f(k_{i})+\sum _{j}cumf(k_{j})} where k i {\displaystyle k_{i}} is a hypernym of k j {\displaystyle k_{j}} . Then a desired number of words with top cumulated frequencies are chosen to build a target lexicon. In the second step, compression mapping rules are defined for the remaining words in order to handle every occurrence of a less frequent hyponym as its hypernym in output text. Example The below fragment of text has been processed by the semantic compression. Words in bold have been replaced by their hypernyms. They are both nest building social insects, but paper wasps and honey bees organize their colonies in very different ways. In a new study, researchers report that despite their differences, these insects rely on the same network of genes to guide their social behavior.The study appears in the Proceedings of the Royal Society B: Biological Sciences. Honey bees and paper wasps are separated by more than 100 million years of evolution, and there are striking differences in how they divvy up the work of maintaining a colony. The procedure outputs the following text: They are both facility building insect, but insects and honey insects arrange their biological groups in very different structure. In a new study, researchers report that despite their difference of opinions, these insects act the same network of genes to steer their party demeanor. The study appears in the proceeding of the institution bacteria Biological Sciences. Honey insects and insect are separated by more than hundred million years of organic processes, and there are impinging differences of opinions in how they divvy up the work of affirming a biological group. == Implicit semantic compression == A natural tendency to keep natural language expressions concise can be perceived as a form of implicit semantic compression, by omitting unmeaningful words or redundant meaningful words (especially to avoid pleonasms). == Applications and advantages == In the vector space model, compacting a lexicon leads to a reduction of dimensionality, which results in less computational complexity and a positive influence on efficiency. Semantic compression is advantageous in information retrieval tasks, improving their effectiveness (in terms of both precision and recall). This is due to more precise descriptors (reduced effect of language diversity – limited language redundancy, a step towards a controlled dictionary). As in the example above, it is possible to display the output as natural text (re-applying inflexion, adding stop words).

    Read more →
  • Document mosaicing

    Document mosaicing

    Document mosaicing is a process that stitches multiple, overlapping snapshot images of a document together to produce one large, high resolution composite. The document is slid under a stationary, over-the-desk camera by hand until all parts of the document are snapshotted by the camera's field of view. As the document slid under the camera, all motion of the document is coarsely tracked by the vision system. The document is periodically snapshotted such that the successive snapshots are overlap by about 50%. The system then finds the overlapped pairs and stitches them together repeatedly until all pairs are stitched together as one piece of document. The document mosaicing can be divided into four main processes. Tracking Feature detecting Correspondences establishing Images mosaicing. == Tracking (simple correlation process) == In this process, the motion of the document slid under the camera is coarsely tracked by the system. Tracking is performed by a process called simple correlation process. In the first frame of snapshots, a small patch is extracted from the center of the image as a correlation template. The correlation process is performed in the four times size of the patch area of the next frame. The motion of the paper is indicated by the peak in the correlation function. The peak in the correlation function indicates the motion of the paper. The template is resampled from this frame and the tracking continues until the template reaches the edge of the document. After the template reaches the edge of the document, another snapshot is taken and the tracking process performs repeatedly until the whole document is imaged. The snapshots are stored in an ordered list to facilitate pairing the overlapped images in later processes. == Feature detecting for efficient matching == Feature detection is the process of finding the transformation that aligns one image with another. There are two main approaches for feature detection. Feature-based approach : Motion parameters are estimated from point correspondences. This approach is suitable for the case that there is plenty supply of stable and detectable features. Featureless approach : When the motion between the two images is small, the motion parameters are estimated using optical flow. On the other hand, when the motion between the two images is large, the motion parameters are estimated using generalised cross-correlation. However, this approach requires a computationally expensive resources. Each image is segmented into a hierarchy of columns, lines, and words to match the organised sets of features across images. Skew angle estimation and columns, lines and words finding are the examples of feature detection operations. === Skew angle estimation === Firstly, the angle that the rows of text make with the image raster lines (skew angle) is estimated. It is assumed to lie in the range of ±20°. A small patch of text in the image is selected randomly and then rotated in the range of ±20° until the variance of the pixel intensities of the patch summed along the raster lines is maximised. To ensure that the found skew angle is accurate, the document mosaic system performs calculation at many image patches and derive the final estimation by finding the average of the individual angles weighted by the variance of the pixel intensities of each patch. === Columns, lines and words finding === In this operation, the de-skewed document is intuitively segmented into a hierarchy of columns, lines and words. The sensitivity to illumination and page coloration of the de-skewed document can be removed by applying a Sobel operator to the de-skewed image and thresholding the output to obtain the binary gradient, de-skewed image. The operation can be roughly separated into 3 steps: column segmentation, line segmentation and word segmentation. Columns are easily segmented from the binary gradient, de-skewed images by summing pixels vertically. Baselines of each row are segmented in the same way as the column segmentation process but horizontally. Finally, individual words are segmented by applying the vertical process at each segmented row. These segmentations are important because the document mosaic is created by matching the lower right corners of words in overlapping images pair. Moreover, the segmentation operation can organize the list of images in the context of a hierarchy of rows and column reliably. The segmentation operation involves a considerable amount of summing in the binary gradient, de-skewed images, which done by construct a matrix of partial sums whose elements are given by p i y = ∑ u = 1 i ∑ v = 1 j b u v {\displaystyle p_{iy}=\sum _{u=1}^{i}\sum _{v=1}^{j}b_{uv}} The matrix of partial sums is calculated in one pass through the binary gradient, de-skewed image. ∑ u = u 1 u 2 ∑ v = v 1 v 2 b u v = p u 2 v 2 + p u 1 v 1 − p u 1 v 2 − p u 2 v 1 {\displaystyle \sum _{u=u_{1}}^{u_{2}}\sum _{v=v_{1}}^{v_{2}}b_{uv}=p_{u_{2}v_{2}}+p_{u_{1}v_{1}}-p_{u_{1}v_{2}}-p_{u_{2}v_{1}}} == Correspondences establishing == The two images are now organized in hierarchy of linked lists in following structure : image=list of columns row=list of words column=list of row word=length (in pixels) At the bottom of the structure, the length of each word is recorded for establishing correspondence between two images to reduce to search only the corresponding structures for the groups of words with the matching lengths. === Seed match finding === A seed match finding is done by comparing each row in image1 with each row in image2. The two rows are then compared to each other by every word. If the length (in pixel) of the two words (one from image1 and one from image2) and their immediate neighbours agree with each other within a predefined tolerance threshold (5 pixels, for example), then they are assumed to match. The row of each image is assumed a match if there are three or more word matches between the two rows. The seed match finding operation is terminated when two pairs of consecutive row match are found. === Match list building === After finishing a seed match finding operation, the next process is to build the match list to generate the correspondences points of the two images. The process is done by searching the matching pairs of rows away from the seed row. == Images mosaicing == Given the list of corresponding points of the two images, finding the transformation of the overlapping portion of the images is the next process. Assuming a pinhole camera model, the transformation between pixels (u,v) of image 1 and pixels (u0, v0) of image 2 is demonstrated by a plane-to-plane projectivity. [ s u ′ s v ′ s ] = [ p 11 p 12 p 13 p 21 p 22 p 23 p 31 p 32 1 ] [ u v 1 ] E q .1 {\displaystyle \left[{\begin{array}{c}su'\\sv'\\s\end{array}}\right]=\left[{\begin{array}{ccc}p_{11}&p_{12}&p_{13}\\p_{21}&p_{22}&p_{23}\\p_{31}&p_{32}&1\end{array}}\right]\left[{\begin{array}{c}u\\v\\1\end{array}}\right]\qquad Eq.1} The parameters of the projectivity is found from four pairs of matching points. RANSAC regression technique is used to reject outlying matches and estimate the projectivity from the remaining good matches. The projectivity is fine-tuned using correlation at the corners of the overlapping portion to obtain four correspondences to sub-pixel accuracy. Therefore, image1 is then transformed into image2's coordinate system using Eq.1. The typical result of the process is shown in Figure 5. === Many images coping === Finally, the whole page composition is built up by mapping all the images into the coordinate system of an "anchor" image, which is normally the one nearest the page center. The transformations to the anchor frame are calculated by concatenating the pair-wise transformations found earlier. The raw document mosaic is shown in Figure 6. However, there might be a problem of non-consecutive images that are overlap. This problem can be solved by performing Hierarchical sub-mosaics. As shown in Figure 7, image1 and image2 are registered, as are image3 and image4, creating two sub-mosaics. These two sub-mosaics are later stitched together in another mosaicing process. == Applied areas == There are various areas that the technique of document mosaicing can be applied to such as : Text segmentation of images of documents Document Recognition Interaction with paper on the digital desk Video mosaics for virtual environments Image registration techniques == Relevant research papers == Huang, T.S.; Netravali, A.N. (1994). "Motion and structure from feature correspondences: A review". Proceedings of the IEEE. 82 (2): 252–268. doi:10.1109/5.265351. D.G. Lowe. [1] Perceptual Organization and Visual Recognition. Kluwer Academic Publishers, Boston, 1985. Irani, M.; Peleg, S. (1991). "Improving resolution by image registration". CVGIP: Graphical Models and Image Processing. 53 (3): 231–239. doi:10.1016/1049-9652(91)90045-L. S2CID 4834546. Shivakumara, P.; Kumar, G. Hemantha; Guru, D. S.; Nagabhushan, P. (2006). "

    Read more →
  • 30 Boxes

    30 Boxes

    30 Boxes is a minimalist calendaring IOS application created by 83 Degrees. Originating as a web application in March 2006, 30 Boxes was founded by Webshots cofounder Narendra Rocherolle. The website shut down some time in 2020, but relaunched for the IOS in February 2021. The original website was tailored towards "social media junkies". == Reception == Barry Collins of The Sunday Times appreciated the website's plain-language event adding feature, but did not appreciate that he was unable to see more than one month of events at a time. Collins was also unhappy that the website was not capable of warning him when he had two events scheduled at the same time. In a list of the best web-based calendar software for small businesses, Forbes ranked 30 Boxes second, after Google Calendar. They described 30 Boxes like “buying a new car with manual transmission and lots of extras—you don't just want to drive it, you want to fool around with it to see what it can do”.

    Read more →
  • Convolutional layer

    Convolutional layer

    In artificial neural networks, a convolutional layer is a type of network layer that applies a convolution operation to the input. Convolutional layers are some of the primary building blocks of convolutional neural networks (CNNs), a class of neural network most commonly applied to images, video, audio, and other data that have the property of uniform translational symmetry. The convolution operation in a convolutional layer involves sliding a small window (called a kernel or filter) across the input data and computing the dot product between the values in the kernel and the input at each position. This process creates a feature map that represents detected features in the input. == Concepts == === Kernel === Kernels, also known as filters, are small matrices of weights that are learned during the training process. Each kernel is responsible for detecting a specific feature in the input data. The size of the kernel is a hyperparameter that affects the network's behavior. === Convolution === For a 2D input x {\displaystyle x} and a 2D kernel w {\displaystyle w} , the 2D convolution operation can be expressed as: y [ i , j ] = ∑ m = 0 k h − 1 ∑ n = 0 k w − 1 x [ i + m , j + n ] ⋅ w [ m , n ] {\displaystyle y[i,j]=\sum _{m=0}^{k_{h}-1}\sum _{n=0}^{k_{w}-1}x[i+m,j+n]\cdot w[m,n]} where k h {\displaystyle k_{h}} and k w {\displaystyle k_{w}} are the height and width of the kernel, respectively. This generalizes immediately to nD convolutions. Commonly used convolutions are 1D (for audio and text), 2D (for images), and 3D (for spatial objects, and videos). === Stride === Stride determines how the kernel moves across the input data. A stride of 1 means the kernel shifts by one pixel at a time, while a larger stride (e.g., 2 or 3) results in less overlap between convolutions and produces smaller output feature maps. === Padding === Padding involves adding extra pixels around the edges of the input data. It serves two main purposes: Preserving spatial dimensions: Without padding, each convolution reduces the size of the feature map. Handling border pixels: Padding ensures that border pixels are given equal importance in the convolution process. Common padding strategies include: No padding/valid padding. This strategy typically causes the output to shrink. Same padding: Any method that ensures the output size same as input size is a same padding strategy. Full padding: Any method that ensures each input entry is convolved over for the same number of times is a full padding strategy. Common padding algorithms include: Zero padding: Add zero entries to the borders of input. Mirror/reflect/symmetric padding: Reflect the input array on the border. Circular padding: Cycle the input array back to the opposite border, like a torus. The exact numbers used in convolutions is complicated, for which we refer to (Dumoulin and Visin, 2018) for details. == Variants == === Standard === The basic form of convolution as described above, where each kernel is applied to the entire input volume. === Depthwise separable === Depthwise separable convolution separates the standard convolution into two steps: depthwise convolution and pointwise convolution. The depthwise separable convolution decomposes a single standard convolution into two convolutions: a depthwise convolution that filters each input channel independently and a pointwise convolution ( 1 × 1 {\displaystyle 1\times 1} convolution) that combines the outputs of the depthwise convolution. This factorization significantly reduces computational cost. It was first developed by Laurent Sifre during an internship at Google Brain in 2013 as an architectural variation on AlexNet to improve convergence speed and model size. === Dilated === Dilated convolution, or atrous convolution, introduces gaps between kernel elements, allowing the network to capture a larger receptive field without increasing the kernel size. === Transposed === Transposed convolution, also known as deconvolution, fractionally strided convolution, and upsampling convolution, is a convolution where the output tensor is larger than its input tensor. It's often used in encoder-decoder architectures for upsampling. It's used in image generation, semantic segmentation, and super-resolution tasks. == History == The concept of convolution in neural networks was inspired by the visual cortex in biological brains. Early work by Hubel and Wiesel in the 1960s on the cat's visual system laid the groundwork for artificial convolution networks. An early convolution neural network was developed by Kunihiko Fukushima in 1969. It had mostly hand-designed kernels inspired by convolutions in mammalian vision. In 1979 he improved it to the Neocognitron, which learns all convolutional kernels by unsupervised learning (in his terminology, "self-organized by 'learning without a teacher'"). During the 1988 to 1998 period, a series of CNN were introduced by Yann LeCun et al., ending with LeNet-5 in 1998. It was an early influential CNN architecture for handwritten digit recognition, trained on the MNIST dataset, and was used in ATM. (Olshausen & Field, 1996) discovered that simple cells in the mammalian primary visual cortex implement localized, oriented, bandpass receptive fields, which could be recreated by fitting sparse linear codes for natural scenes. This was later found to also occur in the lowest-level kernels of trained CNNs. The field saw a resurgence in the 2010s with the development of deeper architectures and the availability of large datasets and powerful GPUs. AlexNet, developed by Alex Krizhevsky et al. in 2012, was a catalytic event in modern deep learning. In that year’s ImageNet competition, the AlexNet model achieved a 16% top-five error rate, significantly outperforming the next best entry, which had a 26% error rate. The network used eight trainable layers, approximately 650,000 neurons, and around 60 million parameters, highlighting the impact of deeper architectures and GPU acceleration on image recognition performance. From the 2013 ImageNet competition, most entries adopted deep convolutional neural networks, building on the success of AlexNet. Over the following years, performance steadily improved, with the top-five error rate falling from 16% in 2012 and 12% in 2013 to below 3% by 2017, as networks grew increasingly deep.

    Read more →
  • Grammar induction

    Grammar induction

    Grammar induction (or grammatical inference) is the process in machine learning of learning a formal grammar (usually as a collection of re-write rules or productions or alternatively as a finite-state machine or automaton of some kind) from a set of observations, thus constructing a model which accounts for the characteristics of the observed objects. More generally, grammatical inference is that branch of machine learning where the instance space consists of discrete combinatorial objects such as strings, trees and graphs. == Grammar classes == Grammatical inference has often been very focused on the problem of learning finite-state machines of various types (see the article Induction of regular languages for details on these approaches), since there have been efficient algorithms for this problem since the 1980s. Since the beginning of the century, these approaches have been extended to the problem of inference of context-free grammars and richer formalisms, such as multiple context-free grammars and parallel multiple context-free grammars. Other classes of grammars for which grammatical inference has been studied are combinatory categorial grammars, stochastic context-free grammars, contextual grammars and pattern languages. == Learning models == The simplest form of learning is where the learning algorithm merely receives a set of examples drawn from the language in question: the aim is to learn the language from examples of it (and, rarely, from counter-examples, that is, example that do not belong to the language). However, other learning models have been studied. One frequently studied alternative is the case where the learner can ask membership queries as in the exact query learning model or minimally adequate teacher model introduced by Angluin. == Methodologies == There is a wide variety of methods for grammatical inference. Two of the classic sources are Fu (1977) and Fu (1982). Duda, Hart & Stork (2001) also devote a brief section to the problem, and cite a number of references. The basic trial-and-error method they present is discussed below. For approaches to infer subclasses of regular languages in particular, see Induction of regular languages. A more recent textbook is de la Higuera (2010), which covers the theory of grammatical inference of regular languages and finite state automata. D'Ulizia, Ferri and Grifoni provide a survey that explores grammatical inference methods for natural languages. === Induction of probabilistic grammars === There are several methods for induction of probabilistic context-free grammars. === Grammatical inference by trial-and-error === The method proposed in Section 8.7 of Duda, Hart & Stork (2001) suggests successively guessing grammar rules (productions) and testing them against positive and negative observations. The rule set is expanded so as to be able to generate each positive example, but if a given rule set also generates a negative example, it must be discarded. This particular approach can be characterized as "hypothesis testing" and bears some similarity to Mitchel's version space algorithm. The Duda, Hart & Stork (2001) text provide a simple example which nicely illustrates the process, but the feasibility of such an unguided trial-and-error approach for more substantial problems is dubious. === Grammatical inference by genetic algorithms === Grammatical induction using evolutionary algorithms is the process of evolving a representation of the grammar of a target language through some evolutionary process. Formal grammars can easily be represented as tree structures of production rules that can be subjected to evolutionary operators. Algorithms of this sort stem from the genetic programming paradigm pioneered by John Koza. Other early work on simple formal languages used the binary string representation of genetic algorithms, but the inherently hierarchical structure of grammars couched in the EBNF language made trees a more flexible approach. Koza represented Lisp programs as trees. He was able to find analogues to the genetic operators within the standard set of tree operators. For example, swapping sub-trees is equivalent to the corresponding process of genetic crossover, where sub-strings of a genetic code are transplanted into an individual of the next generation. Fitness is measured by scoring the output from the functions of the Lisp code. Similar analogues between the tree structured lisp representation and the representation of grammars as trees, made the application of genetic programming techniques possible for grammar induction. In the case of grammar induction, the transplantation of sub-trees corresponds to the swapping of production rules that enable the parsing of phrases from some language. The fitness operator for the grammar is based upon some measure of how well it performed in parsing some group of sentences from the target language. In a tree representation of a grammar, a terminal symbol of a production rule corresponds to a leaf node of the tree. Its parent nodes corresponds to a non-terminal symbol (e.g. a noun phrase or a verb phrase) in the rule set. Ultimately, the root node might correspond to a sentence non-terminal. === Grammatical inference by greedy algorithms === Like all greedy algorithms, greedy grammar inference algorithms make, in iterative manner, decisions that seem to be the best at that stage. The decisions made usually deal with things like the creation of new rules, the removal of existing rules, the choice of a rule to be applied or the merging of some existing rules. Because there are several ways to define 'the stage' and 'the best', there are also several greedy grammar inference algorithms. These context-free grammar generating algorithms make the decision after every read symbol: Lempel-Ziv-Welch algorithm creates a context-free grammar in a deterministic way such that it is necessary to store only the start rule of the generated grammar. Sequitur and its modifications. These context-free grammar generating algorithms first read the whole given symbol-sequence and then start to make decisions: Byte pair encoding and its optimizations. === Distributional learning === A more recent approach is based on distributional learning. Algorithms using these approaches have been applied to learning context-free grammars and mildly context-sensitive languages and have been proven to be correct and efficient for large subclasses of these grammars. === Learning of pattern languages === Angluin defines a pattern to be "a string of constant symbols from Σ and variable symbols from a disjoint set". The language of such a pattern is the set of all its nonempty ground instances i.e. all strings resulting from consistent replacement of its variable symbols by nonempty strings of constant symbols. A pattern is called descriptive for a finite input set of strings if its language is minimal (with respect to set inclusion) among all pattern languages subsuming the input set. Angluin gives a polynomial algorithm to compute, for a given input string set, all descriptive patterns in one variable x. To this end, she builds an automaton representing all possibly relevant patterns; using sophisticated arguments about word lengths, which rely on x being the only variable, the state count can be drastically reduced. Erlebach et al. give a more efficient version of Angluin's pattern learning algorithm, as well as a parallelized version. Arimura et al. show that a language class obtained from limited unions of patterns can be learned in polynomial time. === Pattern theory === Pattern theory, formulated by Ulf Grenander, is a mathematical formalism to describe knowledge of the world as patterns. It differs from other approaches to artificial intelligence in that it does not begin by prescribing algorithms and machinery to recognize and classify patterns; rather, it prescribes a vocabulary to articulate and recast the pattern concepts in precise language. In addition to the new algebraic vocabulary, its statistical approach was novel in its aim to: Identify the hidden variables of a data set using real world data rather than artificial stimuli, which was commonplace at the time. Formulate prior distributions for hidden variables and models for the observed variables that form the vertices of a Gibbs-like graph. Study the randomness and variability of these graphs. Create the basic classes of stochastic models applied by listing the deformations of the patterns. Synthesize (sample) from the models, not just analyze signals with it. Broad in its mathematical coverage, pattern theory spans algebra and statistics, as well as local topological and global entropic properties. == Applications == The principle of grammar induction has been applied to other aspects of natural language processing, and has been applied (among many other problems) to semantic parsing, natural language understanding, example-based translation, language acquisition, grammar-based compre

    Read more →
  • Lesk algorithm

    Lesk algorithm

    The Lesk algorithm is a classical algorithm for word sense disambiguation introduced by Michael E. Lesk in 1986. It operates on the premise that words within a given context are likely to share a common meaning. This algorithm compares the dictionary definitions of an ambiguous word with the words in its surrounding context to determine the most appropriate sense. Variations, such as the Simplified Lesk algorithm, have demonstrated improved precision and efficiency. However, the Lesk algorithm has faced criticism for its sensitivity to definition wording and its reliance on brief glosses. Researchers have sought to enhance its accuracy by incorporating additional resources like thesauruses and syntactic models. == Overview == The Lesk algorithm is based on the assumption that words in a given "neighborhood" (section of text) will tend to share a common topic. A simplified version of the Lesk algorithm is to compare the dictionary definition of an ambiguous word with the terms contained in its neighborhood. Versions have been adapted to use WordNet. An implementation might look like this: for every sense of the word being disambiguated one should count the number of words that are in both the neighborhood of that word and in the dictionary definition of that sense the sense that is to be chosen is the sense that has the largest number of this count. A frequently used example illustrating this algorithm is for the context "pine cone". The following dictionary definitions are used: PINE 1. kinds of evergreen tree with needle-shaped leaves 2. waste away through sorrow or illness CONE 1. solid body which narrows to a point 2. something of this shape whether solid or hollow 3. fruit of certain evergreen trees As can be seen, the best intersection is Pine #1 ⋂ Cone #3 = 2. == Simplified Lesk algorithm == In Simplified Lesk algorithm, the correct meaning of each word in a given context is determined individually by locating the sense that overlaps the most between its dictionary definition and the given context. Rather than simultaneously determining the meanings of all words in a given context, this approach tackles each word individually, independent of the meaning of the other words occurring in the same context. "A comparative evaluation performed by Vasilescu et al. (2004) has shown that the simplified Lesk algorithm can significantly outperform the original definition of the algorithm, both in terms of precision and efficiency. By evaluating the disambiguation algorithms on the Senseval-2 English all words data, they measure a 58% precision using the simplified Lesk algorithm compared to the only 42% under the original algorithm. Note: Vasilescu et al. implementation considers a back-off strategy for words not covered by the algorithm, consisting of the most frequent sense defined in WordNet. This means that words for which all their possible meanings lead to zero overlap with current context or with other word definitions are by default assigned sense number one in WordNet." Simplified LESK Algorithm with smart default word sense (Vasilescu et al., 2004) The COMPUTEOVERLAP function returns the number of words in common between two sets, ignoring function words or other words on a stop list. The original Lesk algorithm defines the context in a more complex way. == Criticisms == Unfortunately, Lesk’s approach is very sensitive to the exact wording of definitions, so the absence of a certain word can radically change the results. Further, the algorithm determines overlaps only among the glosses of the senses being considered. This is a significant limitation in that dictionary glosses tend to be fairly short and do not provide sufficient vocabulary to relate fine-grained sense distinctions. A lot of work has appeared offering different modifications of this algorithm. These works use other resources for analysis (thesauruses, synonyms dictionaries or morphological and syntactic models): for instance, it may use such information as synonyms, different derivatives, or words from definitions of words from definitions. == Lesk variants == Original Lesk (Lesk, 1986) Adapted/Extended Lesk (Banerjee and Pederson, 2002/2003): In the adaptive lesk algorithm, a word vector is created corresponds to every content word in the wordnet gloss. Concatenating glosses of related concepts in WordNet can be used to augment this vector. The vector contains the co-occurrence counts of words co-occurring with w in a large corpus. Adding all the word vectors for all the content words in its gloss creates the Gloss vector g for a concept. Relatedness is determined by comparing the gloss vector using the Cosine similarity measure. There are a lot of studies concerning Lesk and its extensions: Wilks and Stevenson, 1998, 1999; Mahesh et al., 1997; Cowie et al., 1992; Yarowsky, 1992; Pook and Catlett, 1988; Kilgarriff and Rosensweig, 2000; Kwong, 2001; Nastase and Szpakowicz, 2001; Gelbukh and Sidorov, 2004.

    Read more →
  • Machine-learned interatomic potential

    Machine-learned interatomic potential

    Machine-learned interatomic potentials (MLIPs), or simply machine learning potentials (MLPs), are interatomic potentials constructed using machine learning. Beginning in the 1990s, researchers have employed such programs to construct interatomic potentials by mapping atomic structures to their potential energies. These potentials are referred to as MLIPs or MLPs. Such machine learning potentials promised to fill the gap between density functional theory, a highly accurate but computationally intensive modelling method, and empirically derived or intuitively-approximated potentials, which were far lighter computationally but substantially less accurate. Improvements in artificial intelligence technology heightened the accuracy of MLPs while lowering their computational cost, increasing the role of machine learning in fitting potentials. Machine learning potentials began by using neural networks to tackle low-dimensional systems. While promising, these models could not systematically account for interatomic energy interactions; they could be applied to small molecules in a vacuum, or molecules interacting with frozen surfaces, but not much else – and even in these applications, the models often relied on force fields or potentials derived empirically or with simulations. These models thus remained confined to academia. Modern neural networks construct highly accurate and computationally light potentials, as theoretical understanding of materials science was increasingly built into their architectures and preprocessing. Almost all are local, accounting for all interactions between an atom and its neighbor up to some cutoff radius. There exist some nonlocal models, but these have been experimental for almost a decade. For most systems, reasonable cutoff radii enable highly accurate results. Almost all neural networks intake atomic coordinates and output potential energies. For some, these atomic coordinates are converted into atom-centered symmetry functions. From this data, a separate atomic neural network is trained for each element; each atomic network is evaluated whenever that element occurs in the given structure, and then the results are pooled together at the end. This process – in particular, the atom-centered symmetry functions which convey translational, rotational, and permutational invariances – has greatly improved machine learning potentials by significantly constraining the neural network search space. Other models use a similar process but emphasize bonds over atoms, using pair symmetry functions and training one network per atom pair. Other models to learn their own descriptors rather than using predetermined symmetry-dictating functions. These models, called message-passing neural networks (MPNNs), are graph neural networks. Treating molecules as three-dimensional graphs (where atoms are nodes and bonds are edges), the model takes feature vectors describing the atoms as input, and iteratively updates these vectors as information about neighboring atoms is processed through message functions and convolutions. These feature vectors are then used to predict the final potentials. The flexibility of this method often results in stronger, more generalizable models. In 2017, the first-ever MPNN model (a deep tensor neural network) was used to calculate the properties of small organic molecules. == Gaussian Approximation Potential (GAP) == One popular class of machine-learned interatomic potential is the Gaussian Approximation Potential (GAP), which combines compact descriptors of local atomic environments with Gaussian process regression to machine learn the potential energy surface of a given system. To date, the GAP framework has been used to successfully develop a number of MLIPs for various systems, including for elemental systems such as carbon, silicon, phosphorus, and tungsten, as well as for multicomponent systems such as Ge2Sb2Te5 and austenitic stainless steel, Fe7Cr2Ni. == Equivariant graph neural networks == A significant limitation of early MPNNs was that they were not inherently equivariant to rotations and reflections of atomic structures — meaning predictions could change depending on how a molecule was oriented in space. Beginning around 2021, a new class of models addressed this by incorporating equivariance directly into the message-passing layers using spherical harmonics and irreducible representations. Notable examples include NequIP (2021), MACE (2022), and GemNet-OC (2022). These equivariant architectures proved substantially more data-efficient and accurate than their predecessors, and became the dominant paradigm for high-accuracy MLIPs. == Universal MLIPs and large-scale datasets == Early MLIPs were system-specific, trained on a few thousand structures of a single material. A major shift occurred with the creation of large, chemically diverse datasets enabling models that generalize across many elements, bonding environments, and application domains — so-called universal MLIPs. A key driver was the Open Catalyst Project (OC20, OC22), a collaboration between Meta AI (FAIR) and Carnegie Mellon University launched in 2020. OC20 comprises approximately 1.3 million DFT relaxations across 82 elements, designed to accelerate the discovery of catalysts for renewable energy applications. It was among the first datasets large enough to train GNNs that generalize across diverse chemical systems, and established a widely-used benchmark for the field. A subsequent dataset, Open Direct Air Capture (OpenDAC 2023 and OpenDAC 2025), applied the same approach to carbon capture, providing a large computational database of metal-organic frameworks and sorbent candidates evaluated for CO₂ capture, generated using nearly 400 million CPU hours of quantum chemistry calculations in collaboration with Georgia Tech. These datasets revealed a new challenge: the GNN architectures most effective for atomic simulations were memory-intensive, as they model higher-order interactions between triplets or quadruplets of atoms, making it difficult to scale model size. Graph Parallelism, introduced by Sriram et al. (ICLR 2022), addressed this by distributing a single input graph across multiple GPUs — a distinct strategy from data parallelism (which distributes training examples) or model parallelism (which distributes layers). This enabled training GNNs with hundreds of millions to billions of parameters for the first time. Building on these foundations, Meta FAIR released the Universal Model for Atoms (UMA) in 2025, trained on approximately 500 million unique 3D atomic structures spanning molecules, materials, and catalysts — the largest training run to date for an MLIP. UMA introduced a Mixture of Linear Experts (MoLE) architecture, enabling one model to learn from datasets generated by different DFT codes and settings without significant inference overhead. It matches or surpasses specialized models across catalysis, materials, and molecular benchmarks without task-specific fine-tuning, and has been described as marking a "pre/post-UMA" divide in the field. == Applications == Catalyst discovery: MLIPs have significantly accelerated the computational screening of heterogeneous catalysts by replacing expensive DFT relaxations with fast neural network surrogates. The Open Catalyst Project explicitly targets this application, aiming to identify new catalysts for green hydrogen production and other renewable energy reactions. Carbon capture: The OpenDAC project applies universal MLIPs to screening sorbent materials for direct air capture of CO₂, a key technology for climate change mitigation. AI-accelerated screening allows evaluation of orders of magnitude more candidate materials than traditional DFT workflows. Drug discovery and molecular design: MLIPs are increasingly used in pharmaceutical research to model molecular conformations and binding energies. The Open Molecules 2025 (OMol25) dataset, released by Meta FAIR in 2025, provides high-accuracy calculations for a large set of molecular systems to support this use case. Materials discovery: Universal MLIPs enable high-throughput screening of novel inorganic materials, including battery electrolytes, semiconductors, and superconductors, by rapidly estimating stability and properties across large chemical spaces.

    Read more →
  • NetOwl

    NetOwl

    NetOwl is a suite of multilingual text and identity analytics products that analyze big data in the form of text data – reports, web, social media, etc. – as well as structured entity data about people, organizations, places, and things. NetOwl utilizes artificial intelligence (AI)-based approaches, including natural language processing (NLP), machine learning (ML), and computational linguistics, to extract entities, relationships, and events; to perform sentiment analysis; to assign latitude/longitude to geographical references in text; to translate names written in foreign languages; and to perform name matching and identity resolution. NetOwl's uses include semantic search and discovery, geospatial analysis, intelligence analysis, content enrichment, compliance monitoring, cyber threat monitoring, risk management, and bioinformatics. == History == The first NetOwl product was NetOwl Extractor, which was initially released in 1996. Since then, Extractor has added many new capabilities, including relationship and event extraction, categorization, name translation, geotagging, and sentiment analysis, as well as entity extraction in other languages. Other products were added later to the NetOwl suite, namely TextMiner, NameMatcher, and EntityMatcher. NetOwl has participated in several 3rd party-sponsored text and entity analytics software benchmarking events. NetOwl Extractor was the top-scoring named entity extraction system at the DARPA-sponsored Message Understanding Conference MUC-6 and the top-scoring link and event extraction system in MUC-7. It was also the top-scoring system at several of the NIST-sponsored Automatic Content Extraction (ACE) evaluation tasks. NetOwl NameMatcher was the top-scoring system at the MITRE Challenge for Multicultural Person Name Matching. == Products == The NetOwl suite includes, among others, the following text and entity analytics products: === Text Analytics === NetOwl Extractor performs entity extraction from unstructured texts using natural language processing (NLP), machine learning (ML), and computational linguistics. Extractor also performs semantic relationship and event extraction as well as geotagging of text. It is used for a variety of data sources including both traditional sources (e.g., news, reports, web pages, email) and social media (e.g., Twitter, Facebook, chats, blogs). It runs on a variety of Big Data analytics platforms, including Apache Hadoop and LexisNexis’s High-Performance Computer Cluster (HPCC) technology. It has been integrated with a number of 3rd party analytical tools such as Esri ArcGIS and Google Earth/Maps. === Identity Analytics === NetOwl NameMatcher and EntityMatcher perform name matching and identity resolution for large multicultural and multilingual entity databases using machine learning (ML) and computational linguistics approaches. They are used for applications such as anti–money laundering (AML), watch lists, regulatory compliance, fraud detection, etc.

    Read more →
  • Image formation

    Image formation

    The study of image formation encompasses the radiometric and geometric processes by which 2D images of 3D objects are formed. In the case of digital images, the image formation process also includes analog to digital conversion and sampling. == Imaging == The imaging process is a mapping of an object to an image plane. Each point on the image corresponds to a point on the object. An illuminated object will scatter light toward a lens and the lens will collect and focus the light to create the image. The ratio of the height of the image to the height of the object is the magnification. The spatial extent of the image surface and the focal length of the lens determines the field of view of the lens. Image formation of mirror these have a center of curvature and its focal length of the mirror is half of the center of curvature. == Illumination == An object may be illuminated by the light from an emitting source such as the sun, a light bulb or a Light Emitting Diode. The light incident on the object is reflected in a manner dependent on the surface properties of the object. For rough surfaces, the reflected light is scattered in a manner described by the Bi-directional Reflectance Distribution Function (BRDF) of the surface. The BRDF of a surface is the ratio of the exiting power per square meter per steradian (radiance) to the incident power per square meter (irradiance). The BRDF typically varies with angle and may vary with wavelength, but a specific important case is a surface that has constant BRDF. This surface type is referred to as Lambertian and the magnitude of the BRDF is R/π, where R is the reflectivity of the surface. The portion of scattered light that propagates toward the lens is collected by the entrance pupil of the imaging lens over the field of view. == Field of view and imagery == The Field of view of a lens is limited by the size of the image plane and the focal length of the lens. The relationship between a location on the image and a location on the object is y = ftan(θ), where y is the max extent of the image plane, f is the focal length of the lens and θ is the field of view. If y is the max radial size of the image then θ is the field of view of the lens. While the image created by a lens is continuous, it can be modeled as a set of discrete field points, each representing a point on the object. The quality of the image is limited by the aberrations in the lens and the diffraction created by the finite aperture stop. == Pupils and stops == The aperture stop of a lens is a mechanical aperture which limits the light collection for each field point. The entrance pupil is the image of the aperture stop created by the optical elements on the object side of the lens. The light scattered by an object is collected by the entrance pupil and focused onto the image plane via a series of refractive elements. The cone of the focused light at the image plane is set by the size of the entrance pupil and the focal length of the lens. This is often referred to as the f-stop or f-number of the lens. f/# = f/D where D is the diameter of the entrance pupil. == Pixelation and color vs. monochrome == In typical digital imaging systems, a sensor is placed at the image plane. The light is focused on to the sensor and the continuous image is pixelated. The light incident on each pixel in the sensor will be integrated within the pixel and a proportional electronic signal will be generated. The angular geometric resolution of a pixel is given by atan(p/f), where p is the pitch of the pixel. This is also called the pixel field of view. The sensor may be monochrome or color. In the case of a monochrome sensor, the light incident on each pixel is integrated and the resulting image is a grayscale like picture. For color images, a mosaic color filter is typically placed over the pixels to create a color image. An example is a Bayer filter. The signal incident on each pixel is then digitized to a bit stream. == Image quality == The quality of an image is dependent upon both geometric and physical items. Geometrically, higher density of pixels across an image will give less blocky pixelation and thus a better geometric image quality. Lens aberrations also contribute to the quality of the image. Physically, diffraction due to the aperture stop will limit the resolvable spatial frequencies as a function of f-number. In the frequency domain, Modulation Transfer Function (MTF) is a measure of the quality of the imaging system. The MTF is a measure of the visibility of a sinusoidal variation in irradiance on the image plane as a function of the frequency of the sinusoid. It includes the effects of diffraction, aberrations and pixelation. For the lens, the MTF is the autocorrelation of the pupil function, so it accounts for the finite pupil extent and the lens aberrations. The sensor MTF is the Fourier Transform of the pixel geometry. For a square pixel, MTF(ξ) = sin(πξp)/πξp where p is the pixel width and ξ is the spatial frequency. The MTF of the combination of the lens and detector is the product of the two component MTFs. == Perception == Color images can be perceived via two means. In the case of computer vision the light incident on the sensor comprises the image. In the case of visual perception, the human eye has a color dependent response to light so this must be accounted for. This is important consideration when converting to grayscale. == Image formation in eye == The principal difference between the lens of the eye and an ordinary optical lens is that the former is flexible. The radius of the curvature of the anterior surface of the lens is greater than the radius of its posterior surface. The shape of the lens is controlled by tension in the fibers of the ciliary body. To focus on distant objects, the controlling muscles cause the lens to be relatively flattened. Similarly, these muscles allow the lens to become thicker in order to focus on objects near the eye. The distance between the center of the lens and the retina (focal length) varies from approximately 17 mm to about 14 mm, as the refractive power of the lens increases from its minimum to its maximum. When the eye focuses on an object farther away than about 3 m, the lens exhibits its lowest refractive power. When the eye focuses on a close object, the lens is most strongly refractive.

    Read more →
  • Speculative decoding

    Speculative decoding

    Speculative decoding is an inference-time optimization for autoregressive large language models (LLMs) that generates multiple tokens per decoding step instead of one. A smaller draft model proposes a sequence of candidate tokens, and the larger target model verifies them in a single forward pass through a modified rejection sampling scheme. The verification preserves the target model's original output distribution, so the technique produces the same results as standard decoding while cutting latency by roughly two to three times. The name is an analogy to speculative execution in CPU design, where a processor runs instructions along a predicted branch before the outcome is known. == Background == Standard autoregressive decoding in large language models generates one token at a time. The model computes a probability distribution over its vocabulary, samples the next token, and feeds that token back as input. For large models, this process is bottlenecked by memory bandwidth rather than arithmetic throughput: loading the model's parameters from high-bandwidth memory (HBM) to the processor takes up most of the wall-clock time at each step. Because of this, a forward pass over one token and a forward pass over several tokens in a batch take roughly the same time. Speculative decoding relies on this property. == Mechanism == The technique alternates between two phases: drafting and verification. During drafting, a fast approximation model generates a short run of K candidate tokens, typically between 3 and 12. The draft model is usually a much smaller version of the target model or a lightweight auxiliary network. During verification, the target model scores the entire draft sequence in one batched forward pass. A modified rejection sampling algorithm compares the draft and target probabilities at each position. If the target model would have been at least as likely to produce a given token, that token is accepted; the first token that fails is resampled from a corrected distribution, and everything after it is thrown out. The result is that the output distribution is the same as if each token had been generated one at a time. How many tokens get accepted per cycle depends on how well the draft model matches the target. For common words and predictable continuations the match tends to be good, so the target model can confirm several tokens at once. == History == An early precursor was blockwise parallel decoding, proposed in 2018 by Stern, Shazeer, and Uszkoreit. Their method predicted multiple future tokens through auxiliary prediction heads and validated them against the autoregressive model, but it only worked with greedy decoding and did not preserve the full sampling distribution. The modern form of the technique came from Yaniv Leviathan, Matan Kalman, and Yossi Matias at Google Research, who posted "Fast Inference from Transformers via Speculative Decoding" on arXiv in November 2022. Separately and at about the same time, Charlie Chen and colleagues at DeepMind arrived at a closely related method they called speculative sampling, published in February 2023. Both papers introduced the use of rejection sampling to guarantee that the output distribution is unchanged. Leviathan et al. showed roughly 2–3x speedup on T5-XXL (11 billion parameters); Chen et al. reported 2–2.5x on the Chinchilla model (70 billion parameters). The Leviathan et al. paper was presented as an oral at the International Conference on Machine Learning in July 2023. == Variants == SpecInfer (Miao et al., 2024) uses multiple small language models to jointly build a tree of candidate continuations rather than a single chain. The target model verifies the whole tree in parallel and keeps the longest valid path, with reported speedups of 1.5–3.5x. Medusa (Cai et al., 2024) takes a different approach by not using a separate draft model at all. Extra lightweight decoding heads are attached to the target model itself, and each one predicts a token at a different future position. The candidates are evaluated through a tree-structured attention mechanism. The authors measured 2.2–3.6x speedup. EAGLE (Li et al., 2024) performs autoregression on the target model's internal feature representations (specifically the second-to-top layer) rather than on tokens directly. On LLaMA 2 Chat 70B, this gave a 2.7–3.5x latency reduction. Later versions added dynamic draft trees (EAGLE-2) and further optimizations (EAGLE-3), reaching 3–6.5x speedup. == Adoption == By 2024, speculative decoding had become a standard part of production LLM serving. Google uses it in the AI Overviews feature of Google Search. Open-source inference frameworks such as vLLM, NVIDIA's TensorRT-LLM, and SGLang all include built-in support for speculative decoding and its variants. Apple, AWS, and Meta have also published research extending the method or deploying it at scale.

    Read more →
  • Outline of databases

    Outline of databases

    The following is provided as an overview of and topical guide to databases: Database – organized collection of data, today typically in digital form. The data are typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels), in a way that supports processes requiring this information (for example, finding a hotel with vacancies). == What type of things are databases? == Databases can be described as all of the following: Information – sequence of symbols that can be interpreted as a message. Information can be recorded as signs, or transmitted as signals. Data – values of qualitative or quantitative variables, belonging to a set of items. Data in computing (or data processing) are often represented by a combination of items organized in rows and multiple variables organized in columns. Data are typically the results of measurements and can be visualised using graphs or images. Computer data – information in a form suitable for use with a computer. Data is often distinguished from programs. A program is a sequence of instructions that detail a task for the computer to perform. In this sense, data is everything in software that is not program code. == Types of databases == Active database – Database with event driven features Animation database – Database for storing and reusing animation fragments or motion capture data Back-end database – Organized collection of data in computingPages displaying short descriptions of redirect targets Bibliographic database – database of bibliographic records, an organized digital collection of references to published literature, including journal and newspaper articles, conference proceedings, reports, government and legal publications, patents, books, etc. Centralized database – database located and maintained in one location, unlike a distributed database. Cloud database – Database running on a cloud computing platform Collection database – collection catalog of a museum or archive implemented using a computerized database, in which the institution's objects or material are catalogued. Collective Optimization Database – open repository to enable sharing of benchmarks, data sets and optimization cases from the community, provide web services and Plug-in (computing)|plugins to analyze optimization data and predict program transformations or better hardware designs for multi-objective optimizations based on statistical and machine learning techniques provided there is enough information collected in the repository from multiple users. Configuration management database – Database used to store info on hardware and software assets Cooperative database – holds information on customers and their transactions. Current database – conventional database that stores data that is valid now. Directory – repository or database of information which is optimized for reading, under the assumption that data updates are very rare compared to data reads. Commonly, a directory supports search and browsing in addition to simple lookups. Distributed database – database in which storage devices are not all attached to a common CPU. Document-oriented database – computer program designed for storing, retrieving, and managing document-oriented, or Semi-structured model|semi structured data, information. EDA database – database specialized for the purpose of electronic design automation. Endgame tablebase – computerized database that contains precalculated exhaustive analysis of a chess endgame position. Food composition database (FCDB) – provides detailed information on the nutritional composition of foods. Full-text database – database that contains the complete text of books, dissertations, journals, magazines, newspapers or other kinds of textual documents. Also called a "complete-text database". Government database – collects personal information for various reasons (mass surveillance, Schengen Information System in the European Union, social security, statistics, etc.). Graph database – uses graph structures with nodes, edges, and properties to represent and store data. Knowledge base – special kind of database for knowledge management. A knowledge base provides a means for information to be collected, organised, shared, searched and utilised. Mobile database – can be connected to by a mobile computing device over a mobile network. Navigational database – database in which objects (or records) in it are found primarily by following references from other objects. Non-native speech database – speech database of non-native pronunciations of English. Online database – database accessible from a network, including from the Internet. Operational database – accessed by an Operational System to carry out regular operations of an organization. Parallel database – improves performance through parallelization of various operations, such as loading data, building indexes and evaluating queries. Probabilistic database – uncertain database in which the possible worlds have associated probabilities. Real-time database – processing system designed to handle workloads whose state is constantly changing (Buchmann). Relational database – collection of data items organized as a set of formally described tables from which data can be accessed easily. Spatial database – database that is optimized to store and query data that is related to objects in space, including points, lines and polygons. Temporal database – database with built-in time aspects, for example a temporal data model and a temporal version of Structured Query Language (SQL). Time series database – a time series is an associative array of numbers indexed by a datetime or a datetime range. These time series are often called profiles or curves, depending upon the market. A time series of stock prices might be called a price curve, or a time series of energy consumption might be called a load profile. Despite the disparate naming, the operations performed on them are sufficiently common as to demand special database treatment. Triplestore – purpose-built database for the storage and retrieval of triples, a triple being a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred". Very large database (VLDB) – contains an extremely high number of tuples (database rows), or occupies an extremely large physical filesystem storage space. Vulnerability database – platform aimed at collecting, maintaining, and disseminating information about discovered vulnerabilities targeting real computer systems. XLDB – Stands for "eXtremely Large Data Base". XML database – data stored in XML format, where it can be queried, exported and serialized into the desired format. == History of databases == History of databases – History of database management systems –: == Database use == Database usage requirements – Database theory – encapsulates a broad range of topics related to the study and research of the theoretical realm of databases and database management systems. Database machine – or is a computer or special hardware that stores and retrieves data from a database. Also called a "back end processor" Database server – computer program that provides database services to other computer programs or computers, as defined by the client-server model. Database application – computer program whose primary purpose is entering and retrieving information from a computer-managed database. Database management system (DBMS) – software package with computer programs that control the creation, maintenance, and use of a database. Database connection – facility in computer science that allows client software to communicate with database server software, whether on the same machine or not. Datasource – name given to the connection set up to a database from a server. The name is commonly used when creating a query to the database. The Database Source Name (DSN) does not have to be the same as the filename for the database. For example, a database file named "friends.mdb" could be set up with a DSN of "school". Then DSN "school" would then be used to refer to the database when performing a query. Data Source Name (DSN) – are data structures used to describe a connection to a data source. Sometimes known as a database source name though data sources are not limited to databases. Database administrator (DBA) – is a person responsible for the installation, configuration, upgrade, administration, monitoring and maintenance of physical databases. Lock – Comparison of database tools – (provides tables for comparing general and technical information for a number of available database administrator tools.) Database-centric architecture – software architectures in which databases play a crucial role. Also called "data-centric architecture". Intelligent database – was put forward as a system that manages information (rather than data) in a way that appears natural to users and which goes beyond simple record keeping. Two-phase locking (2PL) – is a

    Read more →
  • Ayoba

    Ayoba

    Ayoba is an African communication platform developed in South Africa. It is owned by Progressive Tech Holdings in Mauritius and managed by SIMFY Africa. Launched on May 4, 2019, as of April 2024, it has over 35 million active users. == History == Ayoba was first published on Google Play in February 2019. Its first marketing campaign and brand launch took place in Cameroon on May 4, 2019. In June 2019, the platform introduced its first eight channels. In November 2019, the platform reached one million active users, which increased to two million by June 2020. Subsequently, ayoba expanded its services, including the launch of games for Android in February 2020, Momo (Mobile Money) in Cameroon in May 2020, and MicroApps in May 2020. It also launched music and voice and video calling features in 12 territories in August 2020. The first version of ayoba for iOS was released in September 2020. In December of the same year, games and Messaging 2.0 were launched on the platform. In November 2020, it won Best Mobile Application at the African Digital Awards. In 2021, it won OTT Brand of the Year at the Marketing World Awards in Ghana. In December 2022, it received Top Innovative Technology and Telecom Product of the Year at the National Communications Awards in December 2022. In June 2023 ayoba partnered with BoomPlay and as of April 2024, it had 35 million monthly active users. Ayoba has partnered with Jumia Ghana to offer exclusive deals to users. Ayoba users can get a 10% discount on selected Jumia purchases through the app, with no data charges for MTN users. This partnership aims to make online shopping more affordable and accessible by integrating Jumia's offers into the ayoba app. Ayoba supports over 35 million users across Africa and provides services in 22 languages. To access the deals, users can download the ayoba app from the Google Play Store, iOS Store, or the official website. == Platform features == Chat, Call and Share: ayoba enables instant messaging, voice notes, picture sharing, and file sharing with contacts, even if they do not have the app installed. The app supports voice and video calls on both Android and iOS, as well as group chats, help channel and SMS continuity (non ayoba users receive messages as SMS, their responses appear in the ayoba app). Music: ayoba offers a free music player with daily updates on international and African music. Users can find playlists for different genres. Games: ayoba provides a selection of interactive games, including action, adventure, and children's games available on both Android and iOS. Mobile Money Transfers: In certain territories, ayoba supports mobile money transfers using MTN Mobile Money (MoMo) for transactions within the app. MicroApps: ayoba features individual MicroApps within the platform that offer content and services, including streaming channels, podcasts, and specialized apps. The availability of these apps may vary by country. == Operations == ayoba primarily focuses on the following territories: Nigeria, Cameroon, South Africa, Ghana, Côte d'Ivoire, Uganda, Republic of Congo, Benin, Zambia, Tanzania, Kenya, Senegal, Togo, Guinea Bissau, Guinea Conakry, Sudan, South Sudan, and Liberia. The company operates from its offices in Cape Town and Johannesburg, South Africa. David Gillaranz served as the CEO from 2019 to 2021, and Burak Akinci has been the CEO since 2021.

    Read more →
  • Semantic decomposition (natural language processing)

    Semantic decomposition (natural language processing)

    A semantic decomposition is an algorithm that breaks down the meanings of phrases or concepts into less complex concepts. The result of a semantic decomposition is a representation of meaning. This representation can be used for tasks, such as those related to artificial intelligence or machine learning. Semantic decomposition is common in natural language processing applications. The basic idea of a semantic decomposition is taken from the learning skills of adult humans, where words are explained using other words. It is based on Meaning-text theory. Meaning-text theory is used as a theoretical linguistic framework to describe the meaning of concepts with other concepts. == Background == Given that an AI does not inherently have language, it is unable to think about the meanings behind the words of a language. An artificial notion of meaning needs to be created for a strong AI to emerge. Creating an artificial representation of meaning requires the analysis of what meaning is. Many terms are associated with meaning, including semantics, pragmatics, knowledge and understanding or word sense. Each term describes a particular aspect of meaning, and contributes to a multitude of theories explaining what meaning is. These theories need to be analyzed further to develop an artificial notion of meaning best fit for our current state of knowledge. == Graph representations == Representing meaning as a graph is one of the two ways that both an AI cognition and a linguistic researcher think about meaning (connectionist view). Logicians utilize a formal representation of meaning to build upon the idea of symbolic representation, whereas description logics describe languages and the meaning of symbols. This contention between 'neat' and 'scruffy' techniques has been discussed since the 1970s. Research has so far identified semantic measures and with that word-sense disambiguation (WSD) - the differentiation of meaning of words - as the main problem of language understanding. As an AI-complete environment, WSD is a core problem of natural language understanding. AI approaches that use knowledge-given reasoning creates a notion of meaning combining the state of the art knowledge of natural meaning with the symbolic and connectionist formalization of meaning for AI. The abstract approach is shown in Figure. First, a connectionist knowledge representation is created as a semantic network consisting of concepts and their relations to serve as the basis for the representation of meaning. This graph is built out of different knowledge sources like WordNet, Wiktionary, and BabelNET. The graph is created by lexical decomposition that recursively breaks each concept semantically down into a set of semantic primes. The primes are taken from the theory of Natural Semantic Metalanguage, which has been analyzed for usefulness in formal languages. Upon this graph marker passing is used to create the dynamic part of meaning representing thoughts. The marker passing algorithm, where symbolic information is passed along relations form one concept to another, uses node and edge interpretation to guide its markers. The node and edge interpretation model is the symbolic influence of certain concepts. Future work uses the created representation of meaning to build heuristics and evaluate them through capability matching and agent planning, chatbots or other applications of natural language understanding.

    Read more →