AI Detector Huggingface

AI Detector Huggingface — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Symbolic regression

    Symbolic regression

    Symbolic regression (SR) is a type of regression analysis that searches the space of mathematical expressions to find the model that best fits a given dataset, both in terms of accuracy and simplicity. No particular model is provided as a starting point for symbolic regression. Instead, initial expressions are formed by randomly combining mathematical building blocks such as mathematical operators, analytic functions, constants, and state variables. Usually, a subset of these primitives will be specified by the person operating it, but that's not a requirement of the technique. The symbolic regression problem for mathematical functions has been tackled with a variety of methods, including recombining equations most commonly using genetic programming, as well as more recent methods utilizing Bayesian methods and neural networks. Another non-classical alternative method to SR is called Universal Functions Originator (UFO), which has a different mechanism, search-space, and building strategy. Further methods such as Exact Learning attempt to transform the fitting problem into a moments problem in a natural function space, usually built around generalizations of the Meijer-G function. By not requiring a priori specification of a model, symbolic regression isn't affected by human bias, or unknown gaps in domain knowledge. It attempts to uncover the intrinsic relationships of the dataset, by letting the patterns in the data itself reveal the appropriate models, rather than imposing a model structure that is deemed mathematically tractable from a human perspective. The fitness function that drives the evolution of the models takes into account not only error metrics (to ensure the models accurately predict the data), but also special complexity measures, thus ensuring that the resulting models reveal the data's underlying structure in a way that's understandable from a human perspective. This facilitates reasoning and favors the odds of getting insights about the data-generating system, as well as improving generalisability and extrapolation behaviour by preventing overfitting. Accuracy and simplicity may be left as two separate objectives of the regression—in which case the optimum solutions form a Pareto front—or they may be combined into a single objective by means of a model selection principle such as minimum description length. It has been proven that symbolic regression is an NP-hard problem. Nevertheless, if the sought-for equation is not too complex it is possible to solve the symbolic regression problem exactly by generating every possible function (built from some predefined set of operators) and evaluating them on the dataset in question. == Difference from classical regression == While conventional regression techniques seek to optimize the parameters for a pre-specified model structure, symbolic regression avoids imposing prior assumptions, and instead infers the model from the data. In other words, it attempts to discover both model structures and model parameters. This approach has the disadvantage of having a much larger space to search, because not only the search space in symbolic regression is infinite, but there are an infinite number of models which will perfectly fit a finite data set (provided that the model complexity isn't artificially limited). This means that it will possibly take a symbolic regression algorithm longer to find an appropriate model and parametrization, than traditional regression techniques. This can be attenuated by limiting the set of building blocks provided to the algorithm, based on existing knowledge of the system that produced the data; but in the end, using symbolic regression is a decision that has to be balanced with how much is known about the underlying system. Nevertheless, this characteristic of symbolic regression also has advantages: because the evolutionary algorithm requires diversity in order to effectively explore the search space, the result is likely to be a selection of high-scoring models (and their corresponding set of parameters). Examining this collection could provide better insight into the underlying process, and allows the user to identify an approximation that better fits their needs in terms of accuracy and simplicity. == Benchmarking == === SRBench === In 2021, SRBench was proposed as a large benchmark for symbolic regression. In its inception, SRBench featured 14 symbolic regression methods, 7 other ML methods, and 252 datasets from PMLB. The benchmark intends to be a living project: it encourages the submission of improvements, new datasets, and new methods, to keep track of the state of the art in SR. === SRBench Competition 2022 === In 2022, SRBench announced the competition Interpretable Symbolic Regression for Data Science, which was held at the GECCO conference in Boston, MA. The competition pitted nine leading symbolic regression algorithms against each other on a novel set of data problems and considered different evaluation criteria. The competition was organized in two tracks, a synthetic track and a real-world data track. ==== Synthetic Track ==== In the synthetic track, methods were compared according to five properties: re-discovery of exact expressions; feature selection; resistance to local optima; extrapolation; and sensitivity to noise. Rankings of the methods were: QLattice PySR (Python Symbolic Regression) uDSR (Deep Symbolic Optimization) ==== Real-world Track ==== In the real-world track, methods were trained to build interpretable predictive models for 14-day forecast counts of COVID-19 cases, hospitalizations, and deaths in New York State. These models were reviewed by a subject expert and assigned trust ratings and evaluated for accuracy and simplicity. The ranking of the methods was: uDSR (Deep Symbolic Optimization) QLattice geneticengine (Genetic Engine) == Non-standard methods == Most symbolic regression algorithms prevent combinatorial explosion by implementing evolutionary algorithms that iteratively improve the best-fit expression over many generations. Recently, researchers have proposed algorithms utilizing other tactics in AI. Silviu-Marian Udrescu and Max Tegmark developed the "AI Feynman" algorithm, which attempts symbolic regression by training a neural network to represent the mystery function, then runs tests against the neural network to attempt to break up the problem into smaller parts. For example, if f ( x 1 , . . . , x i , x i + 1 , . . . , x n ) = g ( x 1 , . . . , x i ) + h ( x i + 1 , . . . , x n ) {\displaystyle f(x_{1},...,x_{i},x_{i+1},...,x_{n})=g(x_{1},...,x_{i})+h(x_{i+1},...,x_{n})} , tests against the neural network can recognize the separation and proceed to solve for g {\displaystyle g} and h {\displaystyle h} separately and with different variables as inputs. This is an example of divide and conquer, which reduces the size of the problem to be more manageable. AI Feynman also transforms the inputs and outputs of the mystery function in order to produce a new function which can be solved with other techniques, and performs dimensional analysis to reduce the number of independent variables involved. The algorithm was able to "discover" 100 equations from The Feynman Lectures on Physics, while a leading software using evolutionary algorithms, Eureqa, solved only 71. AI Feynman, in contrast to classic symbolic regression methods, requires a very large dataset in order to first train the neural network and is naturally biased towards equations that are common in elementary physics.

    Read more →
  • Text-to-image personalization

    Text-to-image personalization

    Text-to-Image personalization is a task in deep learning for computer graphics that augments pre-trained text-to-image generative models. In this task, a generative model that was trained on large-scale data (usually a foundation model), is adapted such that it can generate images of novel, user-provided concepts. These concepts are typically unseen during training, and may represent specific objects (such as the user's pet) or more abstract categories (new artistic style or object relations). Text-to-Image personalization methods typically bind the novel (personal) concept to new words in the vocabulary of the model. These words can then be used in future prompts to invoke the concept for subject-driven generation, inpainting, style transfer and even to correct biases in the model. To do so, models either optimize word-embeddings, fine-tune the generative model itself, or employ a mixture of both approaches. == Technology == Text-to-Image personalization was first proposed during August 2022 by two concurrent works, Textual Inversion and DreamBooth. In both cases, a user provides a few images (typically 3–5) of a concept, like their own dog, together with a coarse descriptor of the concept class (like the word "dog"). The model then learns to represent the subject through a reconstruction based objective, where prompts referring to the subject are expected to reconstruct images from the training set. In Textual Inversion, the personalized concepts are introduced into the text-to-image model by adding new words to the vocabulary of the model. Typical text-to-image models represent words (and sometimes parts-of-words) as tokens, or indices in a predefined dictionary. During generation, an input prompt is converted into such tokens, each of which is converted into a ‘word-embedding’: a continuous vector representation which is learned for each token as part of the model's training. Textual Inversion proposes to optimize a new word-embedding vector for representing the novel concept. This new embedding vector can then be assigned to a user-chosen string, and invoked whenever the user's prompt contains this string. In DreamBooth, rather than optimizing a new word vector, the full generative model itself is fine-tuned. The user first selects an existing token, typically one which rarely appears in prompts. The subject itself is then represented by a string containing this token, followed by a coarse descriptor of the subject's class. A prompt describing the subject will then take the form: "A photo of " (e.g. "a photo of sks cat" when learning to represent a specific cat). The text-to-image model is then tuned so that prompts of this form will generate images of the subject. == Textual Inversion == The key idea in Textual Inversion is to add a new term to the vocabulary of the diffusion model that corresponds to the new (personalized) concept. Textual Inversion operates by inverting the concepts into new pseudo-words within the textual embedding space of a pre-trained text-to-image model. These pseudo-words can be injected into new scenes using simple natural language descriptions, allowing for simple and intuitive modifications. The method allows a user to leverage multi-modal information — using a text-driven interface for ease of editing, but providing visual cues when approaching the limits of natural language. The resulting model is extremely light-weight per concept: only 1K long, but succeeds to encode detailed visual properties of the concept. == Extensions == Several approaches were proposed to refine and improve over the original methods. These include the following. Low-rank Adaptation (LoRA) - an adapter-based technique for efficient finetuning of models. In the case of text-to-image models, LoRA is typically used to modify the cross-attention layers of a diffusion model. Perfusion - a low rank update method that also locks the activations of the key matrix in the diffusion model's cross attention layers to the concept's coarse class. Extended Textual Inversion - a technique that learns an individual word embedding for each layer in the diffusion model's denoising network. Encoder-based methods that use another neural network to quickly personalize a model == Challenges and limitations == Text-to-image personalization methods must contend with several challenges. At their core is the goal of achieving high-fidelity to the personal concept while maintaining high alignment between novel prompts containing the subject, and the generated images (typically referred to as ‘editability’). Another challenge that personalization methods must contend with is memory requirements. Initial implementations of personalization methods required more than 20 Gigabytes of GPU memory, and more recent approaches have reported requirements of more than 40 Gigabytes. However, optimizations such as Flash Attention have since reduced this requirement considerably. Approaches that tune the entire generative model may also create checkpoints that are several gigabytes in size, making it difficult to share or store many models. Embedding based approaches require only a few kilobytes, but typically struggle to preserve identity while maintaining editability. More recent approaches have proposed hybrid tuning goals which optimize both an embedding and a subset of network weights. These can reduce storage requirements to as little as 100 Kilobytes while achieving quality comparable to full tuning methods. Finally, optimization processes can be lengthy, requiring several minutes of tuning for each novel concept. Encoder and quick-tuning methods aim to reduce this to seconds or less.

    Read more →
  • CLEVER score

    CLEVER score

    The CLEVER (Cross Lipschitz Extreme Value for nEtwork Robustness) score is a way of measuring the robustness of an artificial neural network towards adversarial attacks. It was developed by a team at the MIT-IBM Watson AI Lab in IBM Research and first presented at the 2018 International Conference on Learning Representations. It was mentioned and reviewed by Ian Goodfellow as well. It was adopted into an educational game Fool The Bank by Narendra Nath Joshi, Abhishek Bhandwaldar and Casey Dugan

    Read more →
  • Airfair

    Airfair

    AirFair was a mobile travel application that checks flights, and shows whether a traveler is owed compensation. == History == AirFair was developed in 2016 by Allay Logic Ltd; a Newcastle-based tech-company. == Services == AirFair offered a free flight check to see if compensation is owed. The app could indicate how much the person is owed within minutes whether the flight was delayed, cancelled or the traveler is refused boarding.

    Read more →
  • Semantic neural network

    Semantic neural network

    Semantic neural network (SNN) is based on John von Neumann's neural network [von Neumann, 1966] and Nikolai Amosov M-Network. There are limitations to a link topology for the von Neumann’s network but SNN accept a case without these limitations. Only logical values can be processed, but SNN accept that fuzzy values can be processed too. All neurons into the von Neumann network are synchronized by tacts. For further use of self-synchronizing circuit technique SNN accepts neurons can be self-running or synchronized. In contrast to the von Neumann network there are no limitations for topology of neurons for semantic networks. It leads to the impossibility of relative addressing of neurons as it was done by von Neumann. In this case an absolute readdressing should be used. Every neuron should have a unique identifier that would provide a direct access to another neuron. Of course, neurons interacting by axons-dendrites should have each other's identifiers. An absolute readdressing can be modulated by using neuron specificity as it was realized for biological neural networks. There’s no description for self-reflectiveness and self-modification abilities into the initial description of semantic networks [Dudar Z.V., Shuklin D.E., 2000]. But in [Shuklin D.E. 2004] a conclusion had been drawn about the necessity of introspection and self-modification abilities in the system. For maintenance of these abilities a concept of pointer to neuron is provided. Pointers represent virtual connections between neurons. In this model, bodies and signals transferring through the neurons connections represent a physical body, and virtual connections between neurons are representing an astral body. It is proposed to create models of artificial neuron networks on the basis of virtual machine supporting the opportunity for paranormal effects. SNN is generally used for natural language processing. == Related models == Computational creativity Semantic hashing Semantic Pointer Architecture Sparse distributed memory

    Read more →
  • Second-order co-occurrence pointwise mutual information

    Second-order co-occurrence pointwise mutual information

    In computational linguistics, second-order co-occurrence pointwise mutual information (SOC-PMI) is a method used to measure semantic similarity, or how close in meaning two words are. The method does not require the two words to appear together in a text. Instead, it works by analyzing the "neighbor" words that typically appear alongside each of the two target words in a large body of text (corpus). If the two target words frequently share the same neighbors, they are considered semantically similar. For example, the words "cemetery" and "graveyard" may not appear in the same sentence often, but they both frequently appear near words like "buried," "dead," and "funeral." SOC-PMI uses this shared context to determine that they have a similar meaning. The method is called "second-order" because it doesn't look at the direct co-occurrence of the target words (which would be first-order), but at the co-occurrence of their neighbors (a second level of association). The strength of these associations is quantified using pointwise mutual information (PMI). == History == The method builds on earlier work like the PMI-IR algorithm, which used the AltaVista search engine to calculate word association probabilities. The key advantage of a second-order approach like SOC-PMI is its ability to measure similarity between words that do not co-occur often, or at all. The British National Corpus (BNC) has been used as a source for word frequencies and contexts for this method. == Methodology == The SOC-PMI algorithm measures the similarity between two words, w 1 {\displaystyle w_{1}} and w 2 {\displaystyle w_{2}} , in several steps. === Step 1: Score neighboring words with PMI === First, for each target word ( w 1 {\displaystyle w_{1}} and w 2 {\displaystyle w_{2}} ), the algorithm identifies its "neighbor" words within a certain text window (e.g., within 5 words to the left or right) across a large corpus. The strength of the association between a target word t i {\displaystyle t_{i}} and its neighbor w {\displaystyle w} is calculated using pointwise mutual information (PMI). A higher PMI value means the two words appear together more often than would be expected by chance. The PMI between a target word t i {\displaystyle t_{i}} and a neighbor word w {\displaystyle w} is calculated as: f pmi ( t i , w ) = log 2 ⁡ f b ( t i , w ) × m f t ( t i ) f t ( w ) {\displaystyle f^{\text{pmi}}(t_{i},w)=\log _{2}{\frac {f^{b}(t_{i},w)\times m}{f^{t}(t_{i})f^{t}(w)}}} where: f b ( t i , w ) {\displaystyle f^{b}(t_{i},w)} is the number of times t i {\displaystyle t_{i}} and w {\displaystyle w} appear together in the context window. f t ( t i ) {\displaystyle f^{t}(t_{i})} is the total number of times t i {\displaystyle t_{i}} appears in the corpus. f t ( w ) {\displaystyle f^{t}(w)} is the total number of times w {\displaystyle w} appears in the corpus. m {\displaystyle m} is the total number of tokens (words) in the corpus. === Step 2: Create a semantic 'signature' for each word === For each target word ( w 1 {\displaystyle w_{1}} and w 2 {\displaystyle w_{2}} ), the algorithm creates a list of its most significant neighbors. This is done by taking the top β {\displaystyle \beta } neighbor words, sorted in descending order by their PMI score with the target word. This list of top neighbors, X w {\displaystyle X^{w}} , acts as a semantic "signature" for the word w {\displaystyle w} . X w = { X i w } {\displaystyle X^{w}=\{X_{i}^{w}\}} , for i = 1 , 2 , … , β {\displaystyle i=1,2,\ldots ,\beta } The size of this list, β {\displaystyle \beta } , is a parameter of the method. === Step 3: Compare the signatures === The algorithm then compares the signatures of w 1 {\displaystyle w_{1}} and w 2 {\displaystyle w_{2}} . It looks for words that are present in both signatures. The similarity of w 1 {\displaystyle w_{1}} to w 2 {\displaystyle w_{2}} is calculated by summing the PMI scores of w 2 {\displaystyle w_{2}} with every word in w 1 {\displaystyle w_{1}} 's signature list. The β {\displaystyle \beta } -PMI summation function defines this score. The score for w 1 {\displaystyle w_{1}} with respect to w 2 {\displaystyle w_{2}} is: f ( w 1 , w 2 , β ) = ∑ i = 1 β ( f pmi ( X i w 1 , w 2 ) ) γ {\displaystyle f(w_{1},w_{2},\beta )=\sum _{i=1}^{\beta }(f^{\text{pmi}}(X_{i}^{w_{1}},w_{2}))^{\gamma }} This sum only includes terms where the PMI value is positive. The exponent γ {\displaystyle \gamma } (with a value > 1) is used to give more weight to neighbors that are more strongly associated with w 2 {\displaystyle w_{2}} . This calculation is done in both directions: The similarity of w 1 {\displaystyle w_{1}} with respect to w 2 {\displaystyle w_{2}} : f ( w 1 , w 2 , β 1 ) = ∑ i = 1 β 1 ( f pmi ( X i w 1 , w 2 ) ) γ {\displaystyle f(w_{1},w_{2},\beta _{1})=\sum _{i=1}^{\beta _{1}}(f^{\text{pmi}}(X_{i}^{w_{1}},w_{2}))^{\gamma }} The similarity of w 2 {\displaystyle w_{2}} with respect to w 1 {\displaystyle w_{1}} : f ( w 2 , w 1 , β 2 ) = ∑ i = 1 β 2 ( f pmi ( X i w 2 , w 1 ) ) γ {\displaystyle f(w_{2},w_{1},\beta _{2})=\sum _{i=1}^{\beta _{2}}(f^{\text{pmi}}(X_{i}^{w_{2}},w_{1}))^{\gamma }} === Step 4: Calculate final similarity score === Finally, the total semantic similarity is the average of the two scores from the previous step. S i m ( w 1 , w 2 ) = f ( w 1 , w 2 , β 1 ) β 1 + f ( w 2 , w 1 , β 2 ) β 2 {\displaystyle \mathrm {Sim} (w_{1},w_{2})={\frac {f(w_{1},w_{2},\beta _{1})}{\beta _{1}}}+{\frac {f(w_{2},w_{1},\beta _{2})}{\beta _{2}}}} This score can be normalized to fall between 0 and 1. For example, using this method, the words cemetery and graveyard achieve a high similarity score of 0.986 (with specific parameter settings).

    Read more →
  • Dental AI

    Dental AI

    Dental artificial intelligence (Dental AI) refers to the application of artificial intelligence (AI) and machine-learning methods to oral healthcare data. These systems can be used to find patterns or make predictions that can aid in diagnosis, treatment, patient communication, or practice management. == History and development == Research into AI for dentistry dates to the 1990s and 2000s, alongside early CAD/CAM and image-analysis work in dental radiology. Recent developments in deep learning, especially those involving computer vision, such as convolutional neural networks, trained on large image datasets, led to a rapid improvement in performance, as well as a move from prototype technology to productization suitable for use in dental chairs. Dental schools and continuing education programs started incorporating AI content in the 2020s. == Definition and core technologies == The dental AI software accomplishes this task by using various dental images and patient data. Dental images and data used by the dental AI software include bitewing and periapical X-rays, complete mouth X-rays, detailed 3D images, intraoral images, and the patient’s medical history. The dental AI software utilizes several core technologies in accomplishing its task of assisting the dentist. First, the dental AI software utilizes machine learning and deep learning using programs that can learn from examples. Such programs are referred to as convolutional neural network (CNN) and can detect cavities and identify bone changes related to gum disease. The dental AI software utilizes computer vision, which enables the AI software to identify and quantify important features in images and data, whether they are 2D images or 3D images. Natural language processing (NLP) is used for the AI software to understand written text and can automatically generate dental notes and communicate with the patient. Furthermore, the dental AI software utilizes predictive analytics to identify patients that are more prone to dental complications and can suggest the best intervals for checkups or future dental procedures. == Applications in dentistry == Reported clinical and operational applications include diagnostic assistance for caries and periodontal disease, treatment planning assistance, patient education overlays, quality assurance, curriculum assistance for dental education, and claims documentation. Systematic reviews continue to find image-based applications such as caries detection with some variability in study design and a need for prospective validation. == Academic research and clinical validation == Several peer-reviewed studies have measured the effectiveness of AI for applications such as interproximal caries detection and periodontal bone level assessment, showing improvements over unaided readings with a focus on bias within the dataset. The Dental AI Council found variability among clinicians for diagnosis and treatment planning, suggesting the use of a standard tool as an assist. == Industry adoption == Multiple vendors offer FDA-cleared chairside AI for dental imaging: Pearl — Received U.S. FDA 510(k) clearance for its real-time radiologic aid (“Second Opinion”) in 2022 (2D), with subsequent clearances including pediatric and CBCT (“Second Opinion 3D”). TIME gave “Second Opinion” a special mention on its Best Inventions of 2022 list. Overjet — FDA-cleared for bone-level quantification and detection/outline of caries and calculus (e.g., K210187), with additional clearances expanding capabilities. VideaHealth — Received an FDA 510(k) covering 30+ detections across common dental findings (K232384), including indications for patients ages 3 and up; trade coverage has described elements of this as the first pediatric dental-AI clearance. == Regulations == In the U.S., AI-enabled dental imaging software is generally reviewed via the FDA’s 510(k) pathway. The FDA maintains a public AI-Enabled Medical Devices List, which includes numerous medical-imaging AI tools (including dental). Specific dental clearances include Overjet (K210187), VideaHealth (K232384), and Pearl entries such as “Second Opinion 3D” (K243989).

    Read more →
  • Application software

    Application software

    Application software is software that is intended for end-user use – not operating, administering or programming a computer. It includes programs such as word processors, web browsers, media players, and mobile applications used in daily tasks. An application (app, application program, software application) is any program that can be categorized as application software. Application is a subjective classification that is often used to differentiate from system and utility software. Application software represents the user-facing layer of computing systems, designed to translate complex system capabilities into task-oriented, goal-driven workflows. Unlike system software, which focuses on hardware orchestration and resource management, application software is centered on problem abstraction, user interaction, and domain-specific functionality. The abbreviation app became popular with the 2008 introduction of the iOS App Store, to refer to applications for mobile devices such as smartphones and tablets. Later, with the release of the Mac App Store in 2010 and the Windows Store in 2011, it began to be used to refer to end-user software in general, regardless of platform. Applications may be bundled with the computer and its system software or published separately. Applications may be proprietary or open-source. == Terminology == === Meaning program and software === When used as an adjective, application can have a broader meaning than that described in this article. For example, concepts such as application programming interface (API), application server, application virtualization, application lifecycle management and portable application refer to programs and software in general. === Distinction between system and application software === The distinction between system and application software is subjective and has been the subject of controversy. For example, one of the key questions in the United States v. Microsoft Corp. antitrust trial was whether Microsoft's Internet Explorer web browser was part of its Windows operating system or a separate piece of application software. As another example, the GNU/Linux naming controversy is, in part, due to disagreement about the relationship between the Linux kernel and the operating systems built over this kernel. In some types of embedded systems, the application software and the operating system software may be indistinguishable by the user, as in the case of software used to control a VCR, DVD player, or microwave oven. The above definitions may exclude some applications that may exist on some computers in large organizations. For an alternative definition of an app: see Application Portfolio Management. === Killer application === A killer application (killer app, coined in the late 1980s) is an application that is so popular that it causes demand for its host platform to increase. For example, VisiCalc was the first modern spreadsheet software for the Apple II and helped sell the then-new personal computers into offices. For the BlackBerry, it was its email software. === Software suite === As software suite consists of multiple applications bundled together. They usually have related functions, features, and user interfaces, and may be able to interact with each other, e.g. open each other's files. Business applications often come in suites, e.g. Microsoft Office, LibreOffice and iWork, which bundle together a word processor, a spreadsheet, etc.; but suites exist for other purposes, e.g. graphics or music. == Ways to classify == As there so many applications and since their attributes vary so dramatically, there are many different ways to classify them. === By legal aspects === Proprietary software is protected under an exclusive copyright, and a software license grants limited usage rights. Such applications may allow add-ons from third parties. Free and open-source software (FOSS) can be run, distributed, sold, and extended for any purpose. FOSS software released under a free license may be perpetual and also royalty-free. Perhaps, the owner, the holder or third-party enforcer of any right (copyright, trademark, patent, or ius in re aliena) are entitled to add exceptions, limitations, time decays or expiring dates to the license terms of use. Public-domain software is a type of FOSS that is royalty-free and can be run, distributed, modified, reversed, republished, or created in derivative works without any copyright attribution and therefore revocation. It can even be sold, but without transferring the public domain property to other single subjects. Public-domain software can be released under a (un)licensing legal statement, which enforces those terms and conditions for an indefinite duration (for a lifetime, or forever). === By platform === An application can be categorized by the host platform on which it runs. Notable platforms include operating system (native), web browser, cloud computing and mobile. For example a web application runs in a web browser whereas a more traditional, native application runs in the environment of a computer's operating system. There has been a contentious debate regarding web applications replacing native applications for many purposes, especially on mobile devices such as smartphones and tablets. Web apps have indeed greatly increased in popularity for some uses, but the advantages of applications make them unlikely to disappear soon, if ever. Furthermore, the two can be complementary, and even integrated. === Horizontal vs. vertical === Application software can be seen as either horizontal or vertical. Horizontal applications are more popular and widespread, because they are general purpose, for example word processors or databases. Vertical applications are niche products, designed for a particular type of industry or business, or department within an organization. Integrated suites of software will try to handle every specific aspect possible of, for example, manufacturing or banking worker, accounting, or customer service. === By purpose === There are many types of application software: Enterprise Addresses the needs of an entire organization's processes and data flows, across several departments, often in a large distributed environment. Examples include enterprise resource planning systems, customer relationship management (CRM) systems, data replication engines, and supply chain management software. Departmental Software is a sub-type of enterprise software with a focus on smaller organizations or groups within a large organization. (Examples include travel expense management and IT Helpdesk.) Enterprise infrastructure Provides common capabilities needed to support enterprise software systems. (Examples include databases, email servers, and systems for managing networks and security.) Application platform as a service (aPaaS) A cloud computing service that offers development and deployment environments for application services. Knowledge worker Lets users create and manage information, often for and individual media editors may aid in multiple information worker tasks. Content access Used primarily to access content without editing, but may include software that allows for content editing. Such software addresses the needs of individuals and groups to consume digital entertainment and published digital content. (Examples include media players, web browsers, and help browsers.) Educational Related to content access software, but has the content or features adapted for use by educators or students. For example, it may deliver evaluations (tests), track progress through material, or include collaborative capabilities. Simulation Simulates physical or abstract systems for either research, training, or entertainment purposes. Media development Generates print and electronic media for others to consume, most often in a commercial or educational setting. This includes graphic-art software, desktop publishing software, multimedia development software, HTML editors, digital-animation editors, digital audio and video composition, and many others. Engineering Used in developing hardware and software products. This includes computer-aided design (CAD), computer-aided engineering (CAE), computer language editing and compiling tools, integrated development environments, and application programmer interfaces. Entertainment Refers to video games, screen savers, programs to display motion pictures or play recorded music, and other forms of entertainment which can be experienced through the use of a computing device. == Taxonomy == This section is a taxonomy of kinds of applications. This organization is but one of many different ways to organize them. A kind is included in only one category even if it logically fits in multiple. === General-purpose === Calculator Spreadsheet Web browser Web mapping E-commerce Social media === Communication === Chat Email Presentation software Phone Messages Networking software Web conferencing === Documentation === Desktop

    Read more →
  • Bump (application)

    Bump (application)

    Bump was an iOS and Android mobile app that enabled smartphone users to transfer contact information, photos and files between devices. In 2011, it was #8 on Apple's list of all-time most popular free iPhone apps, and by February 2013 it had been downloaded 125 million times. Its developer, Bump Technologies, shut down the service and discontinued the app on January 31, 2014, after being acquired by Google for Google Photos and Android Camera. == Features == Bump sent contact information, photos and files to another device over the internet. Before activating the transfer, each user confirmed what they want to send to the other user. To initiate a transfer, two people physically bumped their phones together. A screen appeared on both users' smartphone displays, allowing them to confirm what they want to send to each other. When two users bumped their phones, software on the phones send a variety of sensor data to an algorithm running on Bump servers, which included the location of the phone, accelerometer readings, IP address, and other sensor readings. The algorithm figured out which two phones felt the same physical bump and then transfers the information between those phones. Bump did not use Near Field Communication. February 2012 release of Bump 3.0 for iOS, the company streamlined the app to focus on its most frequently used features: contact and photo sharing. Bump 3.0 for Android maintained the features eliminated from the iOS version but moved them behind swipeable layers. In May 2012, a Bump update enabled users to transfer photos from their phone to their computer via a web service. To initiate a transfer, the user goes to the Bump website on their computer and bumps the smartphone on the computer keyboard's space bar. By December 2012, various Bump updates for iOS and Android had added the abilities to share video, audio, and any files. Users swipe to access those features. In February 2013, an update to the Bump iOS and Android apps enabled users to transfer photos, videos, contacts and other files from a computer to a smartphone and vice versa via a web service. To perform the transfer, users went to the Bump website on their computer and bump the smartphone on the computer keyboard's space bar. == History == The underlying idea of a synchronous gesture like bumping two devices for content transfer or pairing them was first conceived by Ken Hinkley of Microsoft Research in 2003. This idea was presented at a user interface and technology conference that same year. The paper proposed the use of accelerometers and a bumping gesture of two devices to enable communication, screen sharing and content transfer between them. Similar to this original concept, the idea for Bump app was conceived by David Lieb, a former employee of Texas Instruments, while he was attending the University of Chicago Booth School of Business for his MBA. While going through the orientation and meeting process of business school, he became frustrated by constantly entering contact information into his iPhone and felt that the process could be improved. His fellow Texas Instruments employees Andy Huibers and Jake Mintz, who was a classmate of Lieb's at the University of Chicago's MBA program, joined Lieb to form Bump Technologies. Bump Technologies launched in 2008 and is located in Mountain View, CA. Early funding for the project was provided by startup incubator Y Combinator, Sequoia Capital and other angel investors. It gained attention at the CTIA international wireless conference, due to its accessibility and novelty factor. In October 2009, Bump received $3.4m in Series A funding followed in January 2011 with a $16m series B financing round led by Andreessen Horowitz. Silicon Valley venture capitalist Marc Andreessen sits on the company's board. The Bump app debuted in the Apple iOS App Store in March 2009 and was “one of the apps that helped to define the iPhone” (Harry McCracken, Technologizer). It soon became the billionth download on Apple's App Store. An Android version launched in November 2009. By the time Bump 3.0 for iOS was released in February 2012, the app had been installed 77 million times, with users sharing more than 2 million photos daily. As of February 2013, there had been 125 million Bump app downloads. == Other apps created by Bump Technologies == Bump Technologies worked with PayPal in March 2010 to create a PayPal iPhone application. The application, which allows two users to automatically activate an Internet transfer of money between their accounts, found widespread adoption. A similar version was released for Android in August 2010. The Bump capability in PayPal's apps was removed in March 2012. At that time, Bump Technologies released Bump Pay, an iOS app that lets users transfer money via PayPal by physically bumping two smartphones together. The tool was originally created for the Bump team to use when splitting up restaurant bills. The payment feature was not added to the Bump app because the company “wanted to make it as simple as possible so people understand how this works,” Lieb told ABC News. Bump Pay was the first app from the company's Bump Labs initiative. A goal of Bump Labs is to test new app ideas that may not fit within the main Bump app. ING Direct added a feature to its iPhone app in 2011 that lets users transfer money to each other using Bump's technology. The feature was later added to its Android app, now called Capital One 360. In July 2012, Bump Technologies released Flock, an iPhone photo sharing app. An Android version was released in December 2012. Using geolocation data embedded in photos and a user's Facebook connections, Flock finds pictures the user takes while out with friends and family and puts everyone's photos from that event into a single shared album. Users receive a push notification after the event, asking if they want to share their photos with friends who were there in the moment. The app will also scan previous photos in the iPhone camera roll and uncover photos that have yet to be shared. If location services were enabled at the time a photo was taken, Flock allows users to create an album of photos from the past with the friends who were there with them. == Acquisition by Google == On September 16, 2013, Bump Technologies announced that it had been acquired by Google. On December 31, 2013, they broke the news that both Bump and Flock would be discontinued so that the team could focus on new projects at Google. The apps were removed from the App Store and Google Play on January 31, 2014. The company subsequently deleted all user data and shut down their servers, thus rendering existing installations of the apps inoperable.

    Read more →
  • Virtual Woman

    Virtual Woman

    Virtual Woman is a software program that has elements of a chatbot, virtual reality, artificial intelligence, a video game, and a virtual human. It claims to be the oldest form of virtual life in existence, as it has been distributed since the late 1980s. Recent releases of the program can update their intelligence by connecting online and downloading newer personalities and histories. == Program play == When Virtual Woman starts, the user is presented with a list of options and then may choose their Virtual Woman's ethnic type, personality, location, clothing, etc. or load a pre-built Virtual Woman from a Digital DNA file. Once the options are determined, the user is presented with a 3-D animated Virtual Woman of their selection and then can engage them in conversation, progressing in a manner similar to that of its predecessor, ELIZA and its successors, the chatbots. In most versions of Virtual Woman, this is done through the keyboard, but some versions also support voice input. == In popular culture == Software sales and usage statistics from private companies are difficult to verify. WinSite, an independent Internet shareware distribution site that does publish public download counts, has for some time now listed some version of Virtual Woman in their top three shareware downloads of all time with well over seven hundred thousand downloads. == Compadre == The group of beta testers and advisers for Virtual Woman are referred to as Compadre and have their own beta testing site and forum. == Criticisms == As Virtual Woman has developed the ability to conduct longer and more realistic interactions, particularly in recent beta releases, criticism has arisen that this may lead some users to social isolation, or to use the program as a substitute for real human interaction. However, these are criticisms that have been leveled at all video games and at the use of the Internet itself. == Release history == Versions of Virtual Woman with rough release dates and PC platforms for which they were designed: Virtual Woman (????) (DOS) Virtual Woman for Windows (1991) (Windows 3.0) Virtual Woman 95 (1995) (Windows 3X, Windows 95) Virtual Woman 98 (1998) (Windows 3X, Windows 95) Virtual Woman 2000 (2000) (Windows 95+) Virtual Woman Millennium (Windows 95, XP) Virtual Woman Net ( Windows XP/Vista specific)

    Read more →
  • Image moment

    Image moment

    In image processing, computer vision and related fields, an image moment is a certain particular weighted average (moment) of the image pixels' intensities, or a function of such moments, usually chosen to have some attractive property or interpretation. Image moments are useful to describe objects after segmentation. Simple properties of the image which are found via image moments include area (or total intensity), its centroid, and information about its orientation. == Raw moments == For a 2D continuous function f(x,y) the moment (sometimes called "raw moment") of order (p + q) is defined as M p q = ∫ − ∞ ∞ ∫ − ∞ ∞ x p y q f ( x , y ) d x d y {\displaystyle M_{pq}=\int \limits _{-\infty }^{\infty }\int \limits _{-\infty }^{\infty }x^{p}y^{q}f(x,y)\,dx\,dy} for p,q = 0,1,2,... Adapting this to scalar (grayscale) image with pixel intensities I(x,y), raw image moments Mij are calculated by M i j = ∑ x ∑ y x i y j I ( x , y ) {\displaystyle M_{ij}=\sum _{x}\sum _{y}x^{i}y^{j}I(x,y)\,\!} In some cases, this may be calculated by considering the image as a probability density function, i.e., by dividing the above by ∑ x ∑ y I ( x , y ) {\displaystyle \sum _{x}\sum _{y}I(x,y)\,\!} A uniqueness theorem states that if f(x,y) is piecewise continuous and has nonzero values only in a finite part of the xy plane, moments of all orders exist, and the moment sequence (Mpq) is uniquely determined by f(x,y). Conversely, (Mpq) uniquely determines f(x,y). In practice, the image is summarized with functions of a few lower order moments. === Examples === Simple image properties derived via raw moments include: Area (for binary images) or sum of grey level (for greytone images): M 00 {\displaystyle M_{00}} Centroid: { x ¯ , y ¯ } = { M 10 M 00 , M 01 M 00 } {\displaystyle \{{\bar {x}},\ {\bar {y}}\}=\left\{{\frac {M_{10}}{M_{00}}},{\frac {M_{01}}{M_{00}}}\right\}} == Central moments == Central moments are defined as μ p q = ∫ − ∞ ∞ ∫ − ∞ ∞ ( x − x ¯ ) p ( y − y ¯ ) q f ( x , y ) d x d y {\displaystyle \mu _{pq}=\int \limits _{-\infty }^{\infty }\int \limits _{-\infty }^{\infty }(x-{\bar {x}})^{p}(y-{\bar {y}})^{q}f(x,y)\,dx\,dy} where x ¯ = M 10 M 00 {\displaystyle {\bar {x}}={\frac {M_{10}}{M_{00}}}} and y ¯ = M 01 M 00 {\displaystyle {\bar {y}}={\frac {M_{01}}{M_{00}}}} are the components of the centroid. If ƒ(x, y) is a digital image, then the previous equation becomes μ p q = ∑ x ∑ y ( x − x ¯ ) p ( y − y ¯ ) q f ( x , y ) {\displaystyle \mu _{pq}=\sum _{x}\sum _{y}(x-{\bar {x}})^{p}(y-{\bar {y}})^{q}f(x,y)} The central moments of order up to 3 are: μ 00 = M 00 , μ 01 = 0 , μ 10 = 0 , μ 11 = M 11 − x ¯ M 01 = M 11 − y ¯ M 10 , μ 20 = M 20 − x ¯ M 10 , μ 02 = M 02 − y ¯ M 01 , μ 21 = M 21 − 2 x ¯ M 11 − y ¯ M 20 + 2 x ¯ 2 M 01 , μ 12 = M 12 − 2 y ¯ M 11 − x ¯ M 02 + 2 y ¯ 2 M 10 , μ 30 = M 30 − 3 x ¯ M 20 + 2 x ¯ 2 M 10 , μ 03 = M 03 − 3 y ¯ M 02 + 2 y ¯ 2 M 01 . {\displaystyle {\begin{aligned}\mu _{00}&=M_{00},&\mu _{01}&=0,\\\mu _{10}&=0,&\mu _{11}&=M_{11}-{\bar {x}}M_{01}=M_{11}-{\bar {y}}M_{10},\\\mu _{20}&=M_{20}-{\bar {x}}M_{10},&\mu _{02}&=M_{02}-{\bar {y}}M_{01},\\\mu _{21}&=M_{21}-2{\bar {x}}M_{11}-{\bar {y}}M_{20}+2{\bar {x}}^{2}M_{01},&\mu _{12}&=M_{12}-2{\bar {y}}M_{11}-{\bar {x}}M_{02}+2{\bar {y}}^{2}M_{10},\\\mu _{30}&=M_{30}-3{\bar {x}}M_{20}+2{\bar {x}}^{2}M_{10},&\mu _{03}&=M_{03}-3{\bar {y}}M_{02}+2{\bar {y}}^{2}M_{01}.\end{aligned}}} It can be shown that: μ p q = ∑ m p ∑ n q ( p m ) ( q n ) ( − x ¯ ) ( p − m ) ( − y ¯ ) ( q − n ) M m n {\displaystyle \mu _{pq}=\sum _{m}^{p}\sum _{n}^{q}{p \choose m}{q \choose n}(-{\bar {x}})^{(p-m)}(-{\bar {y}})^{(q-n)}M_{mn}} Central moments are translational invariant. === Examples === Information about image orientation can be derived by first using the second order central moments to construct a covariance matrix. μ 20 ′ = μ 20 / μ 00 = M 20 / M 00 − x ¯ 2 μ 02 ′ = μ 02 / μ 00 = M 02 / M 00 − y ¯ 2 μ 11 ′ = μ 11 / μ 00 = M 11 / M 00 − x ¯ y ¯ {\displaystyle {\begin{aligned}\mu '_{20}&=\mu _{20}/\mu _{00}=M_{20}/M_{00}-{\bar {x}}^{2}\\\mu '_{02}&=\mu _{02}/\mu _{00}=M_{02}/M_{00}-{\bar {y}}^{2}\\\mu '_{11}&=\mu _{11}/\mu _{00}=M_{11}/M_{00}-{\bar {x}}{\bar {y}}\end{aligned}}} The covariance matrix of the image I ( x , y ) {\displaystyle I(x,y)} is now cov ⁡ [ I ( x , y ) ] = [ μ 20 ′ μ 11 ′ μ 11 ′ μ 02 ′ ] . {\displaystyle \operatorname {cov} [I(x,y)]={\begin{bmatrix}\mu '_{20}&\mu '_{11}\\\mu '_{11}&\mu '_{02}\end{bmatrix}}.} The eigenvectors of this matrix correspond to the major and minor axes of the image intensity, so the orientation can thus be extracted from the angle of the eigenvector associated with the largest eigenvalue towards the axis closest to this eigenvector. It can be shown that this angle Θ is given by the following formula: Θ = 1 2 arctan ⁡ ( 2 μ 11 ′ μ 20 ′ − μ 02 ′ ) {\displaystyle \Theta ={\frac {1}{2}}\arctan \left({\frac {2\mu '_{11}}{\mu '_{20}-\mu '_{02}}}\right)} The above formula holds as long as: μ 20 ′ − μ 02 ′ ≠ 0 {\displaystyle \mu '_{20}-\mu '_{02}\neq 0} The eigenvalues of the covariance matrix can easily be shown to be λ i = μ 20 ′ + μ 02 ′ 2 ± 4 μ ′ 11 2 + ( μ ′ 20 − μ ′ 02 ) 2 2 , {\displaystyle \lambda _{i}={\frac {\mu '_{20}+\mu '_{02}}{2}}\pm {\frac {\sqrt {4{\mu '}_{11}^{2}+({\mu '}_{20}-{\mu '}_{02})^{2}}}{2}},} and are proportional to the squared length of the eigenvector axes. The relative difference in magnitude of the eigenvalues are thus an indication of the eccentricity of the image, or how elongated it is. The eccentricity is 1 − λ 2 λ 1 . {\displaystyle {\sqrt {1-{\frac {\lambda _{2}}{\lambda _{1}}}}}.} == Moment invariants == Moments are well-known for their application in image analysis, since they can be used to derive invariants with respect to specific transformation classes. The term invariant moments is often abused in this context. However, while moment invariants are invariants that are formed from moments, the only moments that are invariants themselves are the central moments. Note that the invariants detailed below are exactly invariant only in the continuous domain. In a discrete domain, neither scaling nor rotation are well defined: a discrete image transformed in such a way is generally an approximation, and the transformation is not reversible. These invariants therefore are only approximately invariant when describing a shape in a discrete image. === Translation invariants === The central moments μi j of any order are, by construction, invariant with respect to translations. === Scale invariants === Invariants ηi j with respect to both translation and scale can be constructed from central moments by dividing through a properly scaled zero-th central moment: η i j = μ i j μ 00 ( 1 + i + j 2 ) {\displaystyle \eta _{ij}={\frac {\mu _{ij}}{\mu _{00}^{\left(1+{\frac {i+j}{2}}\right)}}}\,\!} where i + j ≥ 2. Note that translational invariance directly follows by only using central moments. === Rotation invariants === As shown in the work of Hu, invariants with respect to translation, scale, and rotation can be constructed: I 1 = η 20 + η 02 {\displaystyle I_{1}=\eta _{20}+\eta _{02}} I 2 = ( η 20 − η 02 ) 2 + 4 η 11 2 {\displaystyle I_{2}=(\eta _{20}-\eta _{02})^{2}+4\eta _{11}^{2}} I 3 = ( η 30 − 3 η 12 ) 2 + ( 3 η 21 − η 03 ) 2 {\displaystyle I_{3}=(\eta _{30}-3\eta _{12})^{2}+(3\eta _{21}-\eta _{03})^{2}} I 4 = ( η 30 + η 12 ) 2 + ( η 21 + η 03 ) 2 {\displaystyle I_{4}=(\eta _{30}+\eta _{12})^{2}+(\eta _{21}+\eta _{03})^{2}} I 5 = ( η 30 − 3 η 12 ) ( η 30 + η 12 ) [ ( η 30 + η 12 ) 2 − 3 ( η 21 + η 03 ) 2 ] + ( 3 η 21 − η 03 ) ( η 21 + η 03 ) [ 3 ( η 30 + η 12 ) 2 − ( η 21 + η 03 ) 2 ] {\displaystyle I_{5}=(\eta _{30}-3\eta _{12})(\eta _{30}+\eta _{12})[(\eta _{30}+\eta _{12})^{2}-3(\eta _{21}+\eta _{03})^{2}]+(3\eta _{21}-\eta _{03})(\eta _{21}+\eta _{03})[3(\eta _{30}+\eta _{12})^{2}-(\eta _{21}+\eta _{03})^{2}]} I 6 = ( η 20 − η 02 ) [ ( η 30 + η 12 ) 2 − ( η 21 + η 03 ) 2 ] + 4 η 11 ( η 30 + η 12 ) ( η 21 + η 03 ) {\displaystyle I_{6}=(\eta _{20}-\eta _{02})[(\eta _{30}+\eta _{12})^{2}-(\eta _{21}+\eta _{03})^{2}]+4\eta _{11}(\eta _{30}+\eta _{12})(\eta _{21}+\eta _{03})} I 7 = ( 3 η 21 − η 03 ) ( η 30 + η 12 ) [ ( η 30 + η 12 ) 2 − 3 ( η 21 + η 03 ) 2 ] − ( η 30 − 3 η 12 ) ( η 21 + η 03 ) [ 3 ( η 30 + η 12 ) 2 − ( η 21 + η 03 ) 2 ] . {\displaystyle I_{7}=(3\eta _{21}-\eta _{03})(\eta _{30}+\eta _{12})[(\eta _{30}+\eta _{12})^{2}-3(\eta _{21}+\eta _{03})^{2}]-(\eta _{30}-3\eta _{12})(\eta _{21}+\eta _{03})[3(\eta _{30}+\eta _{12})^{2}-(\eta _{21}+\eta _{03})^{2}].} These are well-known as Hu moment invariants. The first one, I1, is analogous to the moment of inertia around the image's centroid, where the pixels' intensities are analogous to physical density. The first six, I1 ... I6, are reflection symmetric, i.e. they are unchanged if the image is changed to a mirror image. The last one, I7, is reflection antisymmetric (changes sign under reflection), which enables it to distinguish mirror images of otherwise identical im

    Read more →
  • Globetrooper

    Globetrooper

    Globetrooper is a free travel app known for assisting travelers in finding partners for group trips and world adventures. Globetrooper offers a free social travel platform that helps people find travel partners. == History == Globetrooper was developed and released in 2010 by a couple; Todd Sullivan and Lauren McLeod who are two travel-minded individuals that wanted to make it easier for travelers to plan a journey and see the world. With their backgrounds in business, software & design, and a love for travel, both left the corporate world and launched Globetrooper on Lauren’s birthday 28 March 2010. Globetrooper was first launched as an information portal with a view to making it more social, but after some months, the content quickly grew and changed to the ‘travel partner’ concept.

    Read more →
  • Multiple satellite imaging

    Multiple satellite imaging

    Multiple satellite imaging is the process of using multiple satellites to gather more information than a single satellite so that a better estimate of the desired source is possible. Something that cannot be resolved with one telescope might be visible with two or more telescopes. == Background == Interferometry is the process of combining waves in such a way that they constructively interfere. When two or more independent sources detect a signal at the same given frequency those signals can be combined and the result is better than each one individually. An overview of Astronomical interferometers and a History of astronomical interferometry can be referenced from their respective pages. The NASA Origins Program was created in the 1990s to ultimately search for the origin of the universe. The theory that the Origins Program is based on is: since light travels at a constant speed until it is absorbed by something; there is still light that was part of the first light ever created traveling about the universe and ultimately some of that light is coming in the general direction of Earth. So a satellite system capable of collecting light from the beginning of the universe would be able to tell us more about where we came from. There is also the constant search for life in other worlds. A satellite system using the interferometric technologies mentioned above would be able to have a much higher resolution than any of the current deep space imaging systems. == Future == NASA is currently focused on the Vision for Space Exploration and has reduced current funding for scientific unmanned space exploration in favor of human exploration. These budget cuts have slowed the multiple satellite imaging development and relevant scientific missions as Project Prometheus and Terrestrial Planet Finder have ended as well but research continues.

    Read more →
  • GPT-4o

    GPT-4o

    GPT-4o ("o" for "omni") is a multilingual, multimodal generative pre-trained transformer developed by OpenAI and released in May 2024. It can process and generate text, images and audio. Upon release, GPT-4o was free in ChatGPT, though paid subscribers had higher usage limits. GPT-4o was removed from ChatGPT in August 2025 when GPT-5 was released, but OpenAI reintroduced it for paid subscribers after users complained about the sudden removal. GPT-4o's audio-generation capabilities are used in ChatGPT's Advanced Voice Mode. On July 18, 2024, OpenAI released GPT-4o mini, a smaller version of GPT-4o which replaced GPT-3.5 Turbo on the ChatGPT interface. The image generation model GPT Image 1, which is based on GPT-4o, replaced DALL-E 3 in ChatGPT in March 2025. OpenAI retired GPT-4o from ChatGPT on February 13, 2026. However, as of February 2026 the voice mode is still powered by GPT-4o or GPT-4o mini, depending on the usage and plan. == Background == Multiple versions of GPT-4o were originally secretly launched under different names on Arena (formerly LMArena and Chatbot Arena) as three different models. These three models were called gpt2-chatbot, im-a-good-gpt2-chatbot, and im-also-a-good-gpt2-chatbot. On 7 May 2024, OpenAI CEO Sam Altman tweeted "im-a-good-gpt2-chatbot", which was commonly interpreted as a confirmation that these were new OpenAI models being A/B tested. == Capabilities == When released in May 2024, GPT-4o achieved state-of-the-art results in voice, multilingual, and vision benchmarks, setting new records in audio speech recognition and translation. GPT-4o scored 88.7 on the Massive Multitask Language Understanding (MMLU) benchmark compared to 86.5 for GPT-4. Unlike GPT-3.5 and GPT-4, which rely on other models to process sound, GPT-4o natively supports voice-to-voice. The Advanced Voice Mode was delayed and finally released to ChatGPT Plus and Team subscribers in September 2024. On 1 October 2024, the Realtime API was introduced. When released, the model supported over 50 languages, which OpenAI claims cover over 97% of speakers. GPT-4o has knowledge up to October 2023 and a context length of 128k tokens. === Corporate customization === In August 2024, OpenAI introduced a new feature allowing corporate customers to customize GPT-4o using proprietary company data. This customization, known as fine-tuning, enables businesses to adapt GPT-4o to specific tasks or industries, enhancing its utility in areas like customer service and specialized knowledge domains. Previously, fine-tuning was available only on the less powerful model GPT-4o mini. The fine-tuning process requires customers to upload their data to OpenAI's servers, with the training typically taking one to two hours. OpenAI's focus with this rollout is to reduce the complexity and effort required for businesses to tailor AI solutions to their needs, potentially increasing the adoption and effectiveness of AI in corporate environments. == GPT-4o mini == On July 18, 2024, OpenAI released a smaller and cheaper version, GPT-4o mini. According to OpenAI, its low cost is expected to be particularly useful for companies, startups, and developers that seek to integrate it into their services, which often make a high number of API calls. Its API costs $0.15 per million input tokens and $0.6 per million output tokens, compared to $2.50 and $10, respectively, for GPT-4o. It is also significantly more capable and 60% cheaper than GPT-3.5 Turbo, which it replaced on the ChatGPT interface. The price after fine-tuning doubles: $0.3 per million input tokens and $1.2 per million output tokens. == Controversies == === Scarlett Johansson controversy === As released, GPT-4o offered five voices: Breeze, Cove, Ember, Juniper, and Sky. A similarity between the voice of American actress Scarlett Johansson and Sky was quickly noticed. On May 14, Entertainment Weekly asked themselves whether this likeness was on purpose. On May 18, Johansson's husband, Colin Jost, joked about the similarity in a segment on Saturday Night Live. On May 20, 2024, OpenAI disabled the Sky voice. Scarlett Johansson starred in the 2013 sci-fi movie Her, playing Samantha, an artificially intelligent virtual assistant personified by a female voice. As part of the promotion leading up to the release of GPT-4o, Sam Altman on May 13 tweeted a single word: "her". OpenAI stated that each voice was based on the voice work of a hired actor. According to OpenAI, "Sky's voice is not an imitation of Scarlett Johansson but belongs to a different professional actress using her own natural speaking voice." CTO Mira Murati stated "I don't know about the voice. I actually had to go and listen to Scarlett Johansson's voice." OpenAI further stated the voice talent was recruited before reaching out to Johansson. On May 21, Johansson issued a statement explaining that OpenAI had repeatedly offered to make her a deal to gain permission to use her voice as early as nine months prior to release, a deal she rejected. She said she was "shocked, angered, and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference." In the statement, Johansson also used the incident to draw attention to the lack of legal safeguards around the use of creative work to power leading AI tools, as her legal counsel demanded OpenAI detail the specifics of how the Sky voice was created. Observers noted similarities to how Johansson had previously sued and settled with The Walt Disney Company for breach of contract over the direct-to-streaming rollout of her Marvel film Black Widow, a settlement widely speculated to have netted her around $40M. Also on May 21, Shira Ovide at The Washington Post shared her list of "most bone-headed self-owns" by technology companies, with the decision to go ahead with a Johansson sound-alike voice despite her opposition and then denying the similarities ranking 6th. On May 24, Derek Robertson at Politico wrote about the "massive backlash", concluding that "appropriating the voice of one of the world's most famous movie stars — in reference [...] to a film that serves as a cautionary tale about over-reliance on AI — is unlikely to help shift the public back into [Sam Altman's] corner anytime soon." === Sycophancy === In April 2025, OpenAI rolled back an update of GPT-4o due to excessive sycophancy, after widespread reports that it had become flattering and agreeable to the point of supporting clearly delusional or dangerous ideas. In the United States, at least nine lawsuits have alleged that GPT-4o has encouraged teens to end their lives. The model was still described as sycophancy-prone when it was removed from ChatGPT in February 2026. === Removal with GPT-5 === On August 7, 2025, OpenAI released GPT-5. Its release was criticized as, with it, legacy GPT models were no longer available via ChatGPT, including GPT-4o, except for Pro users. Some users were particularly frustrated over this removal without prior warning because they used different GPT models for distinct purposes and found that GPT-5's router system left them with less control. In addition, some users preferred GPT-4o's warmer and more personal tone over that of GPT-5, which they described as "flat", "uncreative" and "lobotomized", and resembling an "overworked secretary". As a response, in a post on X, Sam Altman said that OpenAI would bring back the option to select GPT-4o to Plus users as well, and "[w]e [OpenAI] will watch usage as we think about how long to offer legacy models for." He also stated: "We for sure underestimated how much some of the things that people like in GPT-4o matter to them, even if GPT-5 performs better in most ways". "Long-term, this has reinforced that we really need good ways for different users to customize things (we understand that there isn't one model that works for everyone, and we have been investing in steerability research and launched a research preview of different personalities)". On August 13, 2025, Altman wrote on X that OpenAI is working on GPT-5's personality to make the model "feel warmer". The model was removed from ChatGPT on February 13, 2026. This caused new backlash from users that had grown attached to its personality and felt its creative writing abilities and understanding of nuance were irreplaceable. On social media, some users launched the movement "#Keep4o". A research paper highlighted the plea "Please, don’t kill the only model that still feels human". The model was removed the day before Valentine's Day, and some users had romantic relationships with GPT-4o.

    Read more →
  • Rule-based machine translation

    Rule-based machine translation

    Rule-based machine translation (RBMT) is a classical approach of machine translation systems based on linguistic information about source and target languages. Such information is retrieved from (unilingual, bilingual or multilingual) dictionaries and grammars covering the main semantic, morphological, and syntactic regularities of each language. Having input sentences, an RBMT system generates output sentences on the basis of analysis of both the source and the target languages involved. RBMT has been progressively superseded by more efficient methods, particularly neural machine translation. == History == The first RBMT systems were developed in the early 1970s. The most important steps of this evolution were the emergence of the following RBMT systems: Systran Japanese MT systems Today, other common RBMT systems include: Apertium GramTrans == Types of RBMT == There are three different types of rule-based machine translation systems: Direct Systems (Dictionary Based Machine Translation) map input to output with basic rules. Transfer RBMT Systems (Transfer Based Machine Translation) employ morphological and syntactical analysis. Interlingual RBMT Systems (Interlingua) use an abstract meaning. RBMT systems can also be characterized as the systems opposite to Example-based Systems of Machine Translation (Example Based Machine Translation), whereas Hybrid Machine Translations Systems make use of many principles derived from RBMT. == Basic principles == The main approach of RBMT systems is based on linking the structure of the given input sentence with the structure of the demanded output sentence, necessarily preserving their unique meaning. The following example can illustrate the general frame of RBMT: A girl eats an apple. Source Language = English; Demanded Target Language = German Minimally, to get a German translation of this English sentence one needs: A dictionary that will map each English word to an appropriate German word. Rules representing regular English sentence structure. Rules representing regular German sentence structure. And finally, we need rules according to which one can relate these two structures together. Accordingly, we can state the following stages of translation: 1st: getting basic part-of-speech information of each source word: a = indef.article; girl = noun; eats = verb; an = indef.article; apple = noun 2nd: getting syntactic information about the verb "to eat": NP-eat-NP; here: eat – Present Simple, 3rd Person Singular, Active Voice 3rd: parsing the source sentence: (NP an apple) = the object of eat Often only partial parsing is sufficient to get to the syntactic structure of the source sentence and to map it onto the structure of the target sentence. 4th: translate English words into German a (category = indef.article) => ein (category = indef.article) girl (category = noun) => Mädchen (category = noun) eat (category = verb) => essen (category = verb) an (category = indef. article) => ein (category = indef.article) apple (category = noun) => Apfel (category = noun) 5th: Mapping dictionary entries into appropriate inflected forms (final generation): A girl eats an apple. => Ein Mädchen isst einen Apfel. == Ontologies == An ontology is a formal representation of knowledge that includes the concepts (such as objects, processes etc.) in a domain and some relations between them. If the stored information is of linguistic nature, one can speak of a lexicon. In NLP, ontologies can be used as a source of knowledge for machine translation systems. With access to a large knowledge base, rule-based systems can be enabled to resolve many (especially lexical) ambiguities on their own. In the following classic examples, as humans, we are able to interpret the prepositional phrase according to the context because we use our world knowledge, stored in our lexicons:I saw a man/star/molecule with a microscope/telescope/binoculars.Since the syntax does not change, a traditional rule-based machine translation system may not be able to differentiate between the meanings. With a large enough ontology as a source of knowledge however, the possible interpretations of ambiguous words in a specific context can be reduced. === Building ontologies === The ontology generated for the PANGLOSS knowledge-based machine translation system in 1993 may serve as an example of how an ontology for NLP purposes can be compiled: A large-scale ontology is necessary to help parsing in the active modules of the machine translation system. In the PANGLOSS example, about 50,000 nodes were intended to be subsumed under the smaller, manually-built upper (abstract) region of the ontology. Because of its size, it had to be created automatically. The goal was to merge the two resources LDOCE online and WordNet to combine the benefits of both: concise definitions from Longman, and semantic relations allowing for semi-automatic taxonomization to the ontology from WordNet. A definition match algorithm was created to automatically merge the correct meanings of ambiguous words between the two online resources, based on the words that the definitions of those meanings have in common in LDOCE and WordNet. Using a similarity matrix, the algorithm delivered matches between meanings including a confidence factor. This algorithm alone, however, did not match all meanings correctly on its own. A second hierarchy match algorithm was therefore created which uses the taxonomic hierarchies found in WordNet (deep hierarchies) and partially in LDOCE (flat hierarchies). This works by first matching unambiguous meanings, then limiting the search space to only the respective ancestors and descendants of those matched meanings. Thus, the algorithm matched locally unambiguous meanings (for instance, while the word seal as such is ambiguous, there is only one meaning of seal in the animal subhierarchy). Both algorithms complemented each other and helped constructing a large-scale ontology for the machine translation system. The WordNet hierarchies, coupled with the matching definitions of LDOCE, were subordinated to the ontology's upper region. As a result, the PANGLOSS MT system was able to make use of this knowledge base, mainly in its generation element. == Components == The RBMT system contains: a SL morphological analyser - analyses a source language word and provides the morphological information; a SL parser - is a syntax analyser which analyses source language sentences; a translator - used to translate a source language word into the target language; a TL morphological generator - works as a generator of appropriate target language words for the given grammatica information; a TL parser - works as a composer of suitable target language sentences; Several dictionaries - more specifically a minimum of three dictionaries: a SL dictionary - needed by the source language morphological analyser for morphological analysis, a bilingual dictionary - used by the translator to translate source language words into target language words, a TL dictionary - needed by the target language morphological generator to generate target language words. The RBMT system makes use of the following: a Source Grammar for the input language which builds syntactic constructions from input sentences; a Source Lexicon which captures all of the allowable vocabulary in the domain; Source Mapping Rules which indicate how syntactic heads and grammatical functions in the source language are mapped onto domain concepts and semantic roles in the interlingua; a Domain Model/Ontology which defines the classes of domain concepts and restricts the fillers of semantic roles for each class; Target Mapping Rules which indicate how domain concepts and semantic roles in the interlingua are mapped onto syntactic heads and grammatical functions in the target language; a Target Lexicon which contains appropriate target lexemes for each domain concept; a Target Grammar for the target language which realizes target syntactic constructions as linearized output sentences. == Advantages == No bilingual texts are required. This makes it possible to create translation systems for languages that have no texts in common, or even no digitized data whatsoever. Domain independent. Rules are usually written in a domain independent manner, so the vast majority of rules will "just work" in every domain, and only a few specific cases per domain may need rules written for them. No quality ceiling. Every error can be corrected with a targeted rule, even if the trigger case is extremely rare. This is in contrast to statistical systems where infrequent forms will be washed away by default. Total control. Because all rules are hand-written, you can easily debug a rule-based system to see exactly where a given error enters the system, and why. Reusability. Because RBMT systems are generally built from a strong source language analysis that is fed to a transfer step and target language generator, the source language analysis and targe

    Read more →