AI Assistant Reddit

AI Assistant Reddit — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Tute Genomics

    Tute Genomics

    Tute Genomics was an American genomics startup that provided a cloud-based web application for rapid and accurate annotation of human genomic data. It was built on the expertise of ANNOVAR. Tute Genomics assisted researchers in identifying disease genes and biomarkers, and assisted clinicians/labs in performing genetic diagnosis. Based in Provo, Utah, Tute was co-founded by Dr. Kai Wang, an assistant professor at the University of Southern California (USC); and Dr. Reid J. Robison, a board-certified psychiatrist with fellowship training in both neurodevelopmental genetics and bioinformatics. Tute Genomics was acquired by PierianDX in 2016. == History == The word "tute" means "personal" in the Na’vi language created for the 2009 film Avatar by Paul Frommer, a linguist and communications professor at the USC Marshall School of Business. === Timeline === 2013 Tute Genomics launched in 2013 and entered the accelerator, BoomStartup. By "demo day" of BoomStartup, Tute had raised their seed round of funding and expanded the round to include angel investors from SLC Angels, Park City Angels, Life Science Angels. Tute was the tenth ever online syndicate for AngelList and in all raised a seed round of $1.5 million. 2014 In March 2014, the company announced that Affiliated Genetics, a Utah-based CLIA-certified laboratory, selected Tute Genomics for its next-generation sequencing (NGS) analytics pipeline. In May 2014, the company announced joining the Global Alliance for Genomics and Health. In June 2014, Advanced Biological Laboratories (ABL), S.A., announced a licensing and collaboration agreement with Tute Genomics and the commercial launch of OncoChek for managing and analysing genomics data in the field of oncology. In July 2014, the company announced an agreement with Lineagen, Inc., to provide next-generation sequencing analytics for Lineagen’s NextStepDx Plus assay. Also, Brigham Young University selected the Tute Genomics genome annotation and discovery platform for analysis and interpretation of 1,000 exomes and genomes. In November 2014, the company announced addition of the Tute platform to Illumina’s BaseSpace. The company announced a Series A1 funding round of $2.3 million in December 2014. The round was led by UK-based Eurovestech. Peak Ventures and a number of angel investors also participated in this round. 2015 Tute recruits David Mittelman, founder of Arpeggi, Inc. and former CSO at FamilyTreeDNA, to Tute Genomics as Chief Scientific Officer. Tute acquires Knome and integrates the KnoSys platform into its software product. 2016 Reid Robison, Tute CEO, launches a Kickstarter campaign to sell Tute interpreted whole genome and whole exome sequencing directly to consumers. The campaign was suspended within the same month after receiving a letter from the United States Food and Drug Administration. Tute is acquired by PierianDX.

    Read more →
  • Albert One

    Albert One

    Albert One is an artificial intelligence chatbot created by Robby Garner and designed to mimic the way humans make conversations using a multi-faceted approach in natural language programming. == History == In both 1998 and 1999, Albert One won the Loebner Prize Contest, a competition between chatterbots. Some parts of Albert were deployed on the internet beginning in 1995, to gather information about what kinds of things people would say to a chatterbot. Another element of Albert One involved the building of a large database of human statements, and associated replies. This portion of the project was tested at the 1994-1997 Loebner Prize contests. Albert was the first of Robby Garner's multifaceted bots. The Albert One system was composed of several subsystems. Among those were a version of Eliza, the therapist, Elivs, another Eliza-like bot, and several other helper applications working together in a hierarchical arrangement. As a continuation of the stimulus-response library, various other database queries and assertions were tested to arrive at each of Albert's responses. Robby went on to develop networked examples of this kind of hierarchical "glue" at The Turing Hub.

    Read more →
  • Kruti

    Kruti

    Kruti is a multilingual AI agent and chatbot developed by the Indian company Ola Krutrim. It is designed to perform real-world tasks for users, such as booking taxis and ordering food, by integrating directly with various online services. It is notable for its ability to understand and respond in multiple Indian languages. Developed by a team founded by Bhavish Aggarwal, Kruti functions as an "agentic" AI, meaning it can reason, plan, and execute multi-step tasks to fulfill a user's request. The backend technology combines several open-source large language models with Ola's proprietary Krutrim V2 model. The system was developed to work primarily on smartphones, addressing the Indian market's specific needs, including language diversity and potential bandwidth constraints. Kruti was officially released in June 2025, replacing an earlier chatbot from the company that was also named Krutrim. Initially supporting 13 languages, the company plans to expand its capabilities to 22 Indian languages. == Background == Kruti is an improved version of Ola's Krutrim chatbot, which was first launched in 2023 and was intended to be replaced by Kruti. It was officially released on 12 June 2025 as an upgrade to passive chatbots, with support for text and voice in 13 Indian languages. As an agentic AI, it can execute tasks with customization and reasoning, providing adaptive answers based on user preferences and past interactions. Kruti is optimized for smartphone usage and designed to accommodate bandwidth constraints and usage patterns in India. To ensure scalability and cost-effective performance, it combines various open-source large language models with Ola's own Krutrim V2, which has 12 billion parameters. Its speech recognition is built to identify regional Indian languages, dialects, and accents. Due to its integration with numerous apps and services, Kruti is context-aware and can proactively complete tasks. Initially connected only with Ola ecosystem services, Krutrim intends to expand and incorporate various Indian services into Kruti, with the goal of adding services from Blinkit, Swiggy, and Uber with respective voice command support. On 20 June 2025, Krutrim acquired the AI platform BharatSah‘AI’yak to increase its involvement in government, education, and agriculture projects. This acquisition will allow Kruti to assist in broadening the scope of BharatSah'AI'yak's work on India-centric, vernacular retrieval-augmented generation AI bots. == Development == Kruti is designed to perform tasks with minimal user input, accepting documents, images, and text, without requiring users to switch between applications. Its agentic framework breaks queries into sub-tasks executed by multiple agents working sequentially or concurrently, with reported accuracy exceeding 90%. Kruti connects to company databases and APIs via the Model Context Protocol and presents responses as summaries, tables, or narratives adapted to user behaviour. The system supports payments via credit/debit cards and UPI. The underlying stack, which includes foundation models and AI training and inference systems, is intended to support adaptation across sectors such as healthcare, education, and finance. Ola Cabs and the Open Network for Digital Commerce have begun integrating Kruti into their platforms pending broader reliability testing.

    Read more →
  • Lexical substitution

    Lexical substitution

    Lexical substitution is the task of identifying a substitute for a word in the context of a clause. For instance, given the following text: "After the match, replace any remaining fluid deficit to prevent chronic dehydration throughout the tournament", a substitute of game might be given. Lexical substitution is strictly related to word sense disambiguation (WSD), in that both aim to determine the meaning of a word. However, while WSD consists of automatically assigning the appropriate sense from a fixed sense inventory, lexical substitution does not impose any constraint on which substitute to choose as the best representative for the word in context. By not prescribing the inventory, lexical substitution overcomes the issue of the granularity of sense distinctions and provides a level playing field for automatic systems that automatically acquire word senses (a task referred to as Word Sense Induction). == Evaluation == In order to evaluate automatic systems on lexical substitution, a task was organized at the Semeval-2007 evaluation competition held in Prague in 2007. A Semeval-2010 task on cross-lingual lexical substitution has also taken place. == Skip-gram model == The skip-gram model takes words with similar meanings into a vector space (collection of objects that can be added together and multiplied by numbers) that are found close to each other in N-dimensions (list of items). A variety of neural networks (computer system modeled after a human brain) are formed together as a result of the vectors and networks that are related together. This all occurs in the dimensions of the vocabulary that has been generated in a network. The model has been used in lexical substitution automation and prediction algorithms. One such algorithm developed by Oren Melamud, Omer Levy, and Ido Dagan uses the skip-gram model to find a vector for each word and its synonyms. Then, it calculates the cosine distance between vectors to determine which words will be the best substitutes. === Example === In a sentence like "The dog walked at a quick pace" each word has a specific vector in relation to the other. The vector for "The" would be [1,0,0,0,0,0,0] because the 1 is the word vocabulary and the 0s are the words surrounding that vocabulary, which create a vector.

    Read more →
  • Point distribution model

    Point distribution model

    The point distribution model is a model for representing the mean geometry of a shape and some statistical modes of geometric variation inferred from a training set of shapes. == Background == The point distribution model concept has been developed by Cootes, Taylor et al. and became a standard in computer vision for the statistical study of shape and for segmentation of medical images where shape priors really help interpretation of noisy and low-contrasted pixels/voxels. The latter point leads to active shape models (ASM) and active appearance models (AAM). Point distribution models rely on landmark points. A landmark is an annotating point posed by an anatomist onto a given locus for every shape instance across the training set population. For instance, the same landmark will designate the tip of the index finger in a training set of 2D hands outlines. Principal component analysis (PCA), for instance, is a relevant tool for studying correlations of movement between groups of landmarks among the training set population. Typically, it might detect that all the landmarks located along the same finger move exactly together across the training set examples showing different finger spacing for a flat-posed hands collection. == Details == First, a set of training images are manually landmarked with enough corresponding landmarks to sufficiently approximate the geometry of the original shapes. These landmarks are aligned using the generalized procrustes analysis, which minimizes the least squared error between the points. k {\displaystyle k} aligned landmarks in two dimensions are given as X = ( x 1 , y 1 , … , x k , y k ) {\displaystyle \mathbf {X} =(x_{1},y_{1},\ldots ,x_{k},y_{k})} . It's important to note that each landmark i ∈ { 1 , … k } {\displaystyle i\in \lbrace 1,\ldots k\rbrace } should represent the same anatomical location. For example, landmark #3, ( x 3 , y 3 ) {\displaystyle (x_{3},y_{3})} might represent the tip of the ring finger across all training images. Now the shape outlines are reduced to sequences of k {\displaystyle k} landmarks, so that a given training shape is defined as the vector X ∈ R 2 k {\displaystyle \mathbf {X} \in \mathbb {R} ^{2k}} . Assuming the scattering is gaussian in this space, PCA is used to compute normalized eigenvectors and eigenvalues of the covariance matrix across all training shapes. The matrix of the top d {\displaystyle d} eigenvectors is given as P ∈ R 2 k × d {\displaystyle \mathbf {P} \in \mathbb {R} ^{2k\times d}} , and each eigenvector describes a principal mode of variation along the set. Finally, a linear combination of the eigenvectors is used to define a new shape X ′ {\displaystyle \mathbf {X} '} , mathematically defined as: X ′ = X ¯ + P b {\displaystyle \mathbf {X} '={\overline {\mathbf {X} }}+\mathbf {P} \mathbf {b} } where X ¯ {\displaystyle {\overline {\mathbf {X} }}} is defined as the mean shape across all training images, and b {\displaystyle \mathbf {b} } is a vector of scaling values for each principal component. Therefore, by modifying the variable b {\displaystyle \mathbf {b} } an infinite number of shapes can be defined. To ensure that the new shapes are all within the variation seen in the training set, it is common to only allow each element of b {\displaystyle \mathbf {b} } to be within ± {\displaystyle \pm } 3 standard deviations, where the standard deviation of a given principal component is defined as the square root of its corresponding eigenvalue. PDM's can be extended to any arbitrary number of dimensions, but are typically used in 2D image and 3D volume applications (where each landmark point is R 2 {\displaystyle \mathbb {R} ^{2}} or R 3 {\displaystyle \mathbb {R} ^{3}} ). == Discussion == An eigenvector, interpreted in euclidean space, can be seen as a sequence of k {\displaystyle k} euclidean vectors associated to corresponding landmark and designating a compound move for the whole shape. Global nonlinear variation is usually well handled provided nonlinear variation is kept to a reasonable level. Typically, a twisting nematode worm is used as an example in the teaching of kernel PCA-based methods. Due to the PCA properties: eigenvectors are mutually orthogonal, form a basis of the training set cloud in the shape space, and cross at the 0 in this space, which represents the mean shape. Also, PCA is a traditional way of fitting a closed ellipsoid to a Gaussian cloud of points (whatever their dimension): this suggests the concept of bounded variation. The idea behind PDMs is that eigenvectors can be linearly combined to create an infinity of new shape instances that will 'look like' the one in the training set. The coefficients are bounded alike the values of the corresponding eigenvalues, so as to ensure the generated 2n/3n-dimensional dot will remain into the hyper-ellipsoidal allowed domain—allowable shape domain (ASD).

    Read more →
  • Natural language understanding

    Natural language understanding

    Natural language understanding (NLU) or natural language interpretation (NLI) is a subset of natural language processing in artificial intelligence that deals with machine reading comprehension. NLU has been considered an AI-hard problem. There is considerable commercial interest in the field because of its application to automated reasoning, machine translation, question answering, news-gathering, text categorization, voice-activation, archiving, and large-scale content analysis. == History == The program STUDENT, written in 1964 by Daniel Bobrow for his PhD dissertation at MIT, is one of the earliest known attempts at NLU by a computer. Eight years after John McCarthy coined the term artificial intelligence, Bobrow's dissertation (titled Natural Language Input for a Computer Problem Solving System) showed how a computer could understand simple natural language input to solve algebra word problems. A year later, in 1965, Joseph Weizenbaum at MIT wrote ELIZA, an interactive program that carried on a dialogue in English on any topic, the most popular being psychotherapy. ELIZA worked by simple parsing and substitution of key words into canned phrases and Weizenbaum sidestepped the problem of giving the program a database of real-world knowledge or a rich lexicon. Yet ELIZA gained surprising popularity as a toy project and can be seen as a very early precursor to current commercial systems such as those used by Ask.com. In 1969, Roger Schank at Stanford University introduced the conceptual dependency theory for NLU. This model, partially influenced by the work of Sydney Lamb, was extensively used by Schank's students at Yale University, such as Robert Wilensky, Wendy Lehnert, and Janet Kolodner. In 1970, William A. Woods introduced the augmented transition network (ATN) to represent natural language input. Instead of phrase structure rules ATNs used an equivalent set of finite-state automata that were called recursively. ATNs and their more general format called "generalized ATNs" continued to be used for a number of years. In 1971, Terry Winograd finished writing SHRDLU for his PhD thesis at MIT. SHRDLU could understand simple English sentences in a restricted world of children's blocks to direct a robotic arm to move items. The successful demonstration of SHRDLU provided significant momentum for continued research in the field. Winograd continued to be a major influence in the field with the publication of his book Language as a Cognitive Process. At Stanford, Winograd would later advise Larry Page, who co-founded Google. In the 1970s and 1980s, the natural language processing group at SRI International continued research and development in the field. A number of commercial efforts based on the research were undertaken, e.g., in 1982 Gary Hendrix formed Symantec Corporation originally as a company for developing a natural language interface for database queries on personal computers. However, with the advent of mouse-driven graphical user interfaces, Symantec changed direction. A number of other commercial efforts were started around the same time, e.g., Larry R. Harris at the Artificial Intelligence Corporation and Roger Schank and his students at Cognitive Systems Corp. In 1983, Michael Dyer developed the BORIS system at Yale which bore similarities to the work of Roger Schank and W. G. Lehnert. The third millennium saw the introduction of systems using machine learning for text classification, such as the IBM Watson. However, experts debate how much "understanding" such systems demonstrate: e.g., according to John Searle, Watson did not even understand the questions. John Ball, cognitive scientist and inventor of the Patom Theory, supports this assessment. Natural language processing has made inroads for applications to support human productivity in service and e-commerce, but this has largely been made possible by narrowing the scope of the application. There are thousands of ways to request something in a human language that still defies conventional natural language processing. According to Wibe Wagemans, "To have a meaningful conversation with machines is only possible when we match every word to the correct meaning based on the meanings of the other words in the sentence – just like a 3-year-old does without guesswork." == Scope and context == The umbrella term "natural language understanding" can be applied to a diverse set of computer applications, ranging from small, relatively simple tasks such as short commands issued to robots, to highly complex endeavors such as the full comprehension of newspaper articles or poetry passages. Many real-world applications fall between the two extremes, for instance text classification for the automatic analysis of emails and their routing to a suitable department in a corporation does not require an in-depth understanding of the text, but needs to deal with a much larger vocabulary and more diverse syntax than the management of simple queries to database tables with fixed schemata. Throughout the years various attempts at processing natural language or English-like sentences presented to computers have taken place at varying degrees of complexity. Some attempts have not resulted in systems with deep understanding, but have helped overall system usability. For example, Wayne Ratliff originally developed the Vulcan program with an English-like syntax to mimic the English speaking computer in Star Trek. Vulcan later became the dBase system whose easy-to-use syntax effectively launched the personal computer database industry. Systems with an easy-to-use or English-like syntax are, however, quite distinct from systems that use a rich lexicon and include an internal representation (often as first order logic) of the semantics of natural language sentences. Hence the breadth and depth of "understanding" aimed at by a system determine both the complexity of the system (and the implied challenges) and the types of applications it can deal with. The "breadth" of a system is measured by the sizes of its vocabulary and grammar. The "depth" is measured by the degree to which its understanding approximates that of a fluent native speaker. At the narrowest and shallowest, English-like command interpreters require minimal complexity, but have a small range of applications. Narrow but deep systems explore and model mechanisms of understanding, but they still have limited application. Systems that attempt to understand the contents of a document such as a news release beyond simple keyword matching and to judge its suitability for a user are broader and require significant complexity, but they are still somewhat shallow. Systems that are both very broad and very deep are beyond the current state of the art. == Components and architecture == Regardless of the approach used, most NLU systems share some common components. The system needs a lexicon of the language and a parser and grammar rules to break sentences into an internal representation. The construction of a rich lexicon with a suitable ontology requires significant effort, e.g., the Wordnet lexicon required many person-years of effort. The system also needs theory from semantics to guide the comprehension. The interpretation capabilities of a language-understanding system depend on the semantic theory it uses. Competing semantic theories of language have specific trade-offs in their suitability as the basis of computer-automated semantic interpretation. These range from naive semantics or stochastic semantic analysis to the use of pragmatics to derive meaning from context. Semantic parsers convert natural-language texts into formal meaning representations. Advanced applications of NLU also attempt to incorporate logical inference within their framework. This is generally achieved by mapping the derived meaning into a set of assertions in predicate logic, then using logical deduction to arrive at conclusions. Therefore, systems based on functional languages such as Lisp need to include a subsystem to represent logical assertions, while logic-oriented systems such as those using the language Prolog generally rely on an extension of the built-in logical representation framework. The management of context in NLU can present special challenges. A large variety of examples and counter examples have resulted in multiple approaches to the formal modeling of context, each with specific strengths and weaknesses.

    Read more →
  • VideoPoet

    VideoPoet

    VideoPoet is a large language model developed by Google Research in 2023 for video making. It can be asked to animate still images. The model accepts text, images, and videos as inputs, with a program to add feature for any input to any format generated content. VideoPoet was publicly announced on December 19, 2023. It uses an autoregressive language model.

    Read more →
  • Fred (chatbot)

    Fred (chatbot)

    Fred, or FRED, was an early chatbot written by Robby Garner. == History == The name Fred was initially suggested by Karen Lindsey, and then Robby jokingly came up with an acronym, "Functional Response Emulation Device." Fred has also been implemented as a Java application by Paco Nathan called JFRED Archived 2008-08-24 at the Wayback Machine. Fred Chatterbot is designed to explore Natural Language communications between people and computer programs. In particular, this is a study of conversation between people and ways that a computer program can learn from other people's conversations to make its own conversations. Fred used a minimalistic "stimulus-response" approach. It worked by storing a database of statements and their responses, and made its own reply by looking up the input statements made by a user and then rendering the corresponding response from the database. This approach simplified the complexity of the rule base, but required expert coding and editing for modifications. Fred was a predecessor to Albert One, which Garner used in 1998 and 1999 to win the Loebner Prize.

    Read more →
  • INaturalist

    INaturalist

    iNaturalist is an American 501(c)(3) nonprofit social network of naturalists, citizen scientists, and biologists built on the concept of mapping and sharing observations of biodiversity across the globe. iNaturalist may be accessed via its website or from its mobile applications. iNaturalist includes an automated species identification tool, and users further assist each other in identifying organisms from photographs and sound recordings. As of 5 August 2025, iNaturalist users had contributed nearly 300 million observations of plants, animals, fungi, and other organisms worldwide, and 400,000 users were active in the previous 30 days. iNaturalist serves as an important resource of open data for biodiversity research, conservation, and education, describing itself as "an online social network of people sharing biodiversity information to help each other learn about nature." It is the primary application for crowd-sourced biodiversity data in places such as Mexico, southern Africa, and Australia, and the project has been called "a standard-bearer for natural history mobile applications." Most of iNaturalist's software is open source. It has contributed to over 4,000 research papers and is widely used by scientists, land managers, and conservationists worldwide. The platform has also been active in the discovery of new species and rediscovery of species previously assumed to be extinct. == History == iNaturalist began in 2008 as a UC Berkeley School of Information Master's final project of Nate Agrin, Jessica Kline, and Ken-ichi Ueda. Agrin and Ueda continued work on the site with Sean McGregor, a web developer. In 2011, Ueda began collaboration with Scott Loarie, a research fellow at Stanford University and lecturer at UC Berkeley. Ueda and Loarie are the current co-directors of iNaturalist.org. The organization merged with the California Academy of Sciences on 24 April 2014. In 2017, iNaturalist became a joint initiative between the California Academy of Sciences and the National Geographic Society. With these collaborations and growing popularity of the site since 2012, the number of participants and observations has roughly doubled each year. In 2014, iNaturalist reached 1 million observations. Later, as of October 2023, there were 181 million observations (163 million verifiable). On 11 July 2023 iNaturalist announced its status as a newly independent 501(c)(3) nonprofit organization. === Google AI controversy === On 9 June 2025 Google announced that iNaturalist would be part of its "Generative AI Accelerator". This announcement, paired with the initial lack of information on the iNaturalist site, led to outcry from many iNaturalist users in the blog comments and forum, worrying about the consequences for the environment, volunteer engagement, reliability and raised questions about the decision making within iNaturalist, while some saw the backlash as a sign that people want to resist 'corrosive technologies'. PZ Myers, a biology professor who uses iNaturalist in his teaching, published an article on his website Pharyngula stating that "any decision that drives people away and replaces them with a hallucinating bot is a bad decision". == Platforms == Users can interact with iNaturalist in the following ways: through the iNaturalist.org website, through two mobile apps: iNaturalist (iOS/Android) and Seek by iNaturalist (iOS/Android), or through partner organizations such as the Global Biodiversity Information Facility (GBIF) website. On the iNaturalist.org website, visitors can search the public dataset and interact with other people adding observations and identifications. The website provides tools for registered users to add, identify, and discuss observations, write journal posts, explore information about species, create project pages to recruit participation, and coordinate work on their topics of interest. On the iNaturalist mobile app, users can create and share nature observations to the online dataset, explore observations both nearby and around the world, and learn about different species. Seek by iNaturalist, a separate app marketed to families, requires no online account registration and all observations may remain private. Seek incorporates features of gamification, such as providing a list of nearby organisms to find and encouraging the collection of badges and participation in challenges. Seek was initially released in the spring of 2018. == Observations == The iNaturalist platform is based on crowdsourcing of observations and identifications. An iNaturalist observation records a person's encounter with an individual organism at a particular time and place. An iNaturalist observation may also record evidence of an organism, such as animal tracks, nests, or scat. The scope of iNaturalist excludes natural but inert subjects such as geologic or hydrologic features. Users typically upload photos as evidence of their findings, though audio recordings are also accepted, and such evidence is not a strict requirement. Users may share observation locations publicly, "obscure" them to display a less precise location or make the locations completely private. iNaturalist users can add identifications to each other's observations in order to confirm or improve the identification of the observation. Observations are classified as "Casual", "Needs ID" (needs identification), or "Research Grade" based on the quality of the data provided and the community identification process. Any quality of data can be downloaded from iNaturalist and "Research Grade" observations are often incorporated into other online databases such as the Global Biodiversity Information Facility and the Atlas of Living Australia. === Automated species identification === In addition to observations being identified by others in the community, iNaturalist includes an automated species identification tool, first released in 2017. Images can be identified via a computer vision model which has been trained on the large database of the observations on iNaturalist. Multiple species suggestions are typically provided with the suggestion that the software guesses to be most likely is at the top of the list. A broader taxon such as a genus or family is commonly provided if the model is unsure of the species. It is trained once or twice a year, and the threshold for species included in the training set has changed over time. It can be difficult for the model to guess correctly if the species in question is infrequently observed or hard to identify from images alone, or if the image submitted has poor lighting, is blurry, or contains multiple subjects. In February 2023, iNaturalist released v2.1 of its computer vision model, which was trained on a new source model which performed significantly better than the previous models trained using a different source model. In April 2025 iNaturalist released an updated app for iOS, changing the original version to "iNaturalist Classic." == Projects == Users have created and contributed to tens of thousands of different projects on iNaturalist. The platform is commonly used to record observations during bioblitzes, which are biological surveying events that attempt to record all the species that occur within a designated area, and a specific project type on iNaturalist. Other project types include collections of observations by location or taxon or documenting specific types of observations such as animal tracks and signs, the spread of invasive species, roadkill, fishing catches, or discovering new species. In 2011, iNaturalist was used as a platform to power the Global Amphibian and Global Reptile BioBlitzes, in which observations were used to help monitor the occurrence and distribution of the world's reptiles and amphibian species. The US National Park Service partnered with iNaturalist to record observations from the 2016 National Parks BioBlitz. That project exceeded 100,000 observations in August 2016. In 2017, the United Nations Environment Programme teamed up with iNaturalist to celebrate World Environment Day.. In 2022, Reef Ecologic teamed up with iNaturalist to celebrate World Oceans Day. === City Nature Challenge === In 2016, Lila Higgins from the Natural History Museum of Los Angeles County and Alison Young from the California Academy of Sciences co-founded the City Nature Challenge (CNC). In the first City Nature Challenge, naturalists in Los Angeles and the San Francisco Bay Area documented over 20,000 observations with the iNaturalist platform. In 2017, the CNC expanded to 16 cities across the United States and collected over 125,000 observations of wildlife in 5 days. The CNC expanded to a global audience in 2018, with 68 cities participating from 19 countries, with some cities using community science platforms other than iNaturalist to participate. In 4 days, over 17,000 people cataloged over 440,000 nature observations in urban regions around the world. In 2019, the CNC once again expanded, with 35,000 parti

    Read more →
  • Ghana Post GPS

    Ghana Post GPS

    GhanaPostGPS is a web and smartphone application, sponsored by the government of Ghana and developed by Vokacom, to provide a digital addresses and postal codes for every 5 squared meter location in Ghana. The digital address is a composite of the postcode (region, district & area code) plus a unique address. GhanaPostGPS is the first digital addressing system created by the government of Ghana. GhanaPost GPS is a mandatory requirement for obtaining the National Identification Card and other services.

    Read more →
  • Scene text

    Scene text

    Scene text is text that appears in an image captured by a camera in an outdoor environment. The detection and recognition of scene text from camera captured images are computer vision tasks which became important after smart phones with good cameras became ubiquitous. The text in scene images varies in shape, font, colour and position. The recognition of scene text is further complicated sometimes by non-uniform illumination and focus. To improve scene text recognition, the International Conference on Document Analysis and Recognition (ICDAR) conducts a robust reading competition once in two years. The competition was held in 2003, 2005 and during every ICDAR conference. International association for pattern recognition (IAPR) has created a list of datasets as Reading systems. == Text detection == Text detection is the process of detecting the text present in the image, followed by surrounding it with a rectangular bounding box. Text detection can be carried out using image based techniques or frequency based techniques. In image based techniques, an image is segmented into multiple segments. Each segment is a connected component of pixels with similar characteristics. The statistical features of connected components are utilised to group them and form the text. Machine learning approaches such as support vector machine and convolutional neural networks are used to classify the components into text and non-text. In frequency based techniques, discrete Fourier transform (DFT) or discrete wavelet transform (DWT) are used to extract the high frequency coefficients. It is assumed that the text present in an image has high frequency components and selecting only the high frequency coefficients filters the text from the non-text regions in an image. == Word recognition == In word recognition, the text is assumed to be already detected and located and the rectangular bounding box containing the text is available. The word present in the bounding box needs to be recognized. The methods available to perform word recognition can be broadly classified into top-down and bottom-up approaches. In the top-down approaches, a set of words from a dictionary is used to identify which word suits the given image. Images are not segmented in most of these methods. Hence, the top-down approach is sometimes referred as segmentation free recognition. In the bottom-up approaches, the image is segmented into multiple components and the segmented image is passed through a recognition engine. Either an off the shelf Optical character recognition (OCR) engine or a custom-trained one is used to recognise the text.

    Read more →
  • GPT-5

    GPT-5

    GPT-5 is a multimodal large language model developed by OpenAI and the fifth in its series of generative pre-trained transformer (GPT) foundation models. Preceded in the series by GPT-4, it was launched on August 7, 2025. It is publicly accessible to users of the chatbot products ChatGPT and Microsoft Copilot as well as to developers through the OpenAI API. == Background == On April 14, 2023, Sam Altman, the chief executive officer of OpenAI, spoke at an event at the Massachusetts Institute of Technology and said that the company was not training GPT-5 at that time. He stated that OpenAI was "prioritizing GPT-4 development" and that "we are not and won't for some time" release GPT-5. On July 18, OpenAI filed for a "GPT-5" trademark in the United States. On November 13, Altman confirmed to the Financial Times that the company was working to develop GPT-5. According to The Information, "[f]or much of the second half of 2024, OpenAI was developing a model known internally as Orion and intended to become GPT-5", "[b]ut the Orion effort failed to produce a better model, and the company instead released it as GPT-4.5 in February [2025]." By late July 2025, OpenAI was widely anticipated as planning to release GPT-5 in early August. On July 30, The Verge reported that "Microsoft is getting ready for GPT-5" as "sources familiar with Microsoft's AI plans" told an editor that the company was testing a new mode for its Copilot chatbot that would offer a model that "thinks deeply or quickly based on the task". On August 5, in the leadup to the release of GPT-5, OpenAI released GPT-OSS, a set of two open-weight models that have reasoning capabilities. GPT-5 was then unveiled during a livestream event on August 7. == Capabilities == At the time of its release, GPT-5 had state-of-the-art performance on benchmarks that test mathematics, programming, finance, and multimodal understanding. According to OpenAI, improvements over its predecessor models include faster response times, better coding and writing skills, more accurate answers to health questions, and lower levels of hallucination. Also, compared to previous models, GPT-5 aims to give safe, high-level responses to potentially harmful queries rather than outright declining them, an approach that OpenAI refers to as "safe completions", aiming to result "in GPT-5 being able to refuse more unsafe questions, while offering fewer rejections to users seeking harmless information." In addition, GPT-5 was trained to give more critical, "less effusively agreeable" answers compared to its predecessor models. Days before the launch of GPT-5, two early testers of the model stated that they were "impressed" by its ability to code and to solve mathematical and scientific problems. They suggested that the model shows great improvement from GPT-4, but not as large of a gain as from GPT-3 to GPT-4. A day prior to the release of GPT-5, during a press briefing, Sam Altman, the chief executive officer of OpenAI, called GPT-5 "a significant step along the path to AGI", referring to artificial general intelligence, the hypothetical level of intelligence that OpenAI defines as the ability to perform any economically valuable task that a human can. According to Altman, GPT-5 is "significantly better" than its predecessors, offering "PhD-level" abilities across a wide range of tasks. The exact energy consumption of GPT-5 use has not been disclosed by OpenAI. Researchers at the University of Rhode Island estimated that a medium-length response consumes slightly over 18 watt-hours, equivalent to using an incandescent bulb for 18 minutes. === Architecture === GPT-5 is a system that contains a fast, high-throughput model, a deeper reasoning model, and a real-time router that decides which model to use based on conversation type, complexity, tool needs, and explicit user intent. Altman had previously criticized the manual model picker for being overly complex, suggesting a need for unification. GPT-5 also includes agentic functionality through which it can set up its own desktop and can use its browser to search autonomously for sources that relate to its task. The GPT-5 system card defines two fast, high-throughput models – gpt-5-main and gpt-5-main-mini – and two thinking models – gpt-5-thinking and gpt-5-thinking-mini. In the OpenAI API, developers can access the thinking model, its mini version, and gpt-5-thinking-nano, an even smaller and faster nano version of the thinking model. The version of GPT-5 that is accessible via the API has adjustable reasoning effort (low, medium, high, or minimal) and verbosity (low, medium, or high). Additionally, ChatGPT provides access to gpt-5-thinking with a setting that makes use of parallel test-time compute, referred to as gpt-5-thinking-pro. == Limitations == === Safety === Neuraltrust, a security research company, claimed to have successfully compromised GPT-5 within its first day of testing the model. According to its report, it enabled GPT-5 to generate detailed instructions for manufacturing explosive devices. SPLX, another company, conducted similar tests and came to similar conclusions about GPT-5's security. Their assessments suggest that GPT-5 has significant security gaps, potentially rendering it as being unsafe for use in a corporate environment. == Training == According to AIMultiple, GPT-5 is natively multimodal, meaning that it was trained from scratch on multiple modalities (like text and images) at once without relying on already-trained language or vision models. Its training process involved three stages: unsupervised pretraining, supervised fine-tuning, and reinforcement learning from human feedback. Pretraining used a large-scale multilingual dataset of books, articles, web pages, academic papers, and licensed sources. GPT-5's visual and text capabilities were described as having been developed alongside each other throughout training, unlike with GPT-4. == Use == GPT-5 is used in ChatGPT. Although GPT-5 is free for all ChatGPT users, Plus users get higher use limits while Pro users get unlimited access to GPT-5 as well as limited access to GPT-5 Pro. Standard limits for lower-tier users on responses per hour still apply. Additionally, with the introduction of GPT-5, ChatGPT's "Advanced Voice Mode" was replaced by "ChatGPT Voice", which is supposed to enable more natural-sounding conversations. OpenAI stated that "Standard Voice Mode retires on September 9, 2025, unifying all users on ChatGPT Voice". On November 24, 2025, the feature of shopping research was added to ChatGPT, claimed to be a mini model post-trained on gpt-5-thinking-mini. GPT-5 is also available in Microsoft Copilot, and Microsoft stated that it will incorporate GPT-5 into a wide variety of its products. According to 9to5Mac, Apple Inc. is planning to integrate the model into the Apple Intelligence feature in its iOS 26, iPadOS 26, and macOS Tahoe operating systems. It is also accessible via the OpenAI API. A number of American companies were reported as having received access to GPT-5 ahead of its launch. OpenAI stated that the private health insurance company Oscar Health was checking applications from its policyholders with the model. In addition, Uber was using GPT-5 for its customer support system; GitLab, Windsurf, and Cursor were using the model for software development; and the Spanish bank BBVA was using it for financial analysis. Other companies that OpenAI listed as having used GPT-5 pre-release include Amgen, Lowe's, and Notion. == Reception == === Critical reviews === Grace Huckins in MIT Technology Review found that, "[w]hereas o1 was a major technological advancement, GPT-5 is, above all else, a refined product." In response to claims that Sam Altman, the chief executive officer of OpenAI, had made about the model, she stated that "GPT-5 will furnish a more pleasant and seamless user experience. That's not nothing, but it falls far short of the transformative AI future that Altman has spent much of the past year hyping." In response to Altman's claim that GPT-5 is "a significant step along the path" to artificial general intelligence, she noted: "[M]aybe he's right—but if so, it's a very small step." In The Information, Stephanie Palazzolo praised GPT-5's coding capabilities. According to Matteo Wong in The Atlantic, GPT-5 "is intuitive, fast, and efficient; adapts to human preferences and intentions; and is easy to personalize." He stated: "At this stage of the AI boom, when every major chatbot is legitimately helpful in numerous ways, benchmarks, science, and rigor feel almost insignificant. What matters is how the chatbot feels [...]". John Herrman from the New York magazine wrote: "Casual users who encounter GPT-5 through ChatGPT aren't likely to feel like they're using a completely different product [...] while people who use it for software development or in a corporate context are more likely to notice a major change." Mashable's Christian de Looper found that "GPT-5

    Read more →
  • Actor-critic algorithm

    Actor-critic algorithm

    The actor-critic algorithm (AC) is a family of reinforcement learning (RL) algorithms that combine policy-based RL algorithms such as policy gradient methods, and value-based RL algorithms such as value iteration, Q-learning, SARSA, and TD learning. An AC algorithm consists of two main components: an "actor" that determines which actions to take according to a policy function, and a "critic" that evaluates those actions according to a value function. Some AC algorithms are on-policy, some are off-policy. Some apply to either continuous or discrete action spaces. Some work in both cases. == Overview == The actor-critic methods can be understood as an improvement over pure policy gradient methods like REINFORCE via introducing a baseline. === Actor === The actor uses a policy function π ( a | s ) {\displaystyle \pi (a|s)} , while the critic estimates either the value function V ( s ) {\displaystyle V(s)} , the action-value Q-function Q ( s , a ) , {\displaystyle Q(s,a),} the advantage function A ( s , a ) {\displaystyle A(s,a)} , or any combination thereof. The actor is a parameterized function π θ {\displaystyle \pi _{\theta }} , where θ {\displaystyle \theta } are the parameters of the actor. The actor takes as argument the state of the environment s {\displaystyle s} and produces a probability distribution π θ ( ⋅ | s ) {\displaystyle \pi _{\theta }(\cdot |s)} . If the action space is discrete, then ∑ a π θ ( a | s ) = 1 {\displaystyle \sum _{a}\pi _{\theta }(a|s)=1} . If the action space is continuous, then ∫ a π θ ( a | s ) d a = 1 {\displaystyle \int _{a}\pi _{\theta }(a|s)da=1} . The goal of policy optimization is to improve the actor. That is, to find some θ {\displaystyle \theta } that maximizes the expected episodic reward J ( θ ) {\displaystyle J(\theta )} : J ( θ ) = E π θ [ ∑ t = 0 T γ t r t ] {\displaystyle J(\theta )=\mathbb {E} _{\pi _{\theta }}\left[\sum _{t=0}^{T}\gamma ^{t}r_{t}\right]} where γ {\displaystyle \gamma } is the discount factor, r t {\displaystyle r_{t}} is the reward at step t {\displaystyle t} , and T {\displaystyle T} is the time-horizon (which can be infinite). The goal of policy gradient method is to optimize J ( θ ) {\displaystyle J(\theta )} by gradient ascent on the policy gradient ∇ J ( θ ) {\displaystyle \nabla J(\theta )} . As detailed on the policy gradient method page, there are many unbiased estimators of the policy gradient: ∇ θ J ( θ ) = E π θ [ ∑ 0 ≤ j ≤ T ∇ θ ln ⁡ π θ ( A j | S j ) ⋅ Ψ j | S 0 = s 0 ] {\displaystyle \nabla _{\theta }J(\theta )=\mathbb {E} _{\pi _{\theta }}\left[\sum _{0\leq j\leq T}\nabla _{\theta }\ln \pi _{\theta }(A_{j}|S_{j})\cdot \Psi _{j}{\Big |}S_{0}=s_{0}\right]} where Ψ j {\textstyle \Psi _{j}} is a linear sum of the following: ∑ 0 ≤ i ≤ T ( γ i R i ) {\textstyle \sum _{0\leq i\leq T}(\gamma ^{i}R_{i})} . γ j ∑ j ≤ i ≤ T ( γ i − j R i ) {\textstyle \gamma ^{j}\sum _{j\leq i\leq T}(\gamma ^{i-j}R_{i})} : the REINFORCE algorithm. γ j ∑ j ≤ i ≤ T ( γ i − j R i ) − b ( S j ) {\textstyle \gamma ^{j}\sum _{j\leq i\leq T}(\gamma ^{i-j}R_{i})-b(S_{j})} : the REINFORCE with baseline algorithm. Here b {\displaystyle b} is an arbitrary function. γ j ( R j + γ V π θ ( S j + 1 ) − V π θ ( S j ) ) {\textstyle \gamma ^{j}\left(R_{j}+\gamma V^{\pi _{\theta }}(S_{j+1})-V^{\pi _{\theta }}(S_{j})\right)} : TD(1) learning. γ j Q π θ ( S j , A j ) {\textstyle \gamma ^{j}Q^{\pi _{\theta }}(S_{j},A_{j})} . γ j A π θ ( S j , A j ) {\textstyle \gamma ^{j}A^{\pi _{\theta }}(S_{j},A_{j})} : Advantage Actor-Critic (A2C). γ j ( R j + γ R j + 1 + γ 2 V π θ ( S j + 2 ) − V π θ ( S j ) ) {\textstyle \gamma ^{j}\left(R_{j}+\gamma R_{j+1}+\gamma ^{2}V^{\pi _{\theta }}(S_{j+2})-V^{\pi _{\theta }}(S_{j})\right)} : TD(2) learning. γ j ( ∑ k = 0 n − 1 γ k R j + k + γ n V π θ ( S j + n ) − V π θ ( S j ) ) {\textstyle \gamma ^{j}\left(\sum _{k=0}^{n-1}\gamma ^{k}R_{j+k}+\gamma ^{n}V^{\pi _{\theta }}(S_{j+n})-V^{\pi _{\theta }}(S_{j})\right)} : TD(n) learning. γ j ∑ n = 1 ∞ λ n − 1 1 − λ ⋅ ( ∑ k = 0 n − 1 γ k R j + k + γ n V π θ ( S j + n ) − V π θ ( S j ) ) {\textstyle \gamma ^{j}\sum _{n=1}^{\infty }{\frac {\lambda ^{n-1}}{1-\lambda }}\cdot \left(\sum _{k=0}^{n-1}\gamma ^{k}R_{j+k}+\gamma ^{n}V^{\pi _{\theta }}(S_{j+n})-V^{\pi _{\theta }}(S_{j})\right)} : TD(λ) learning, also known as GAE (generalized advantage estimate). This is obtained by an exponentially decaying sum of the TD(n) learning terms. === Critic === In the unbiased estimators given above, certain functions such as V π θ , Q π θ , A π θ {\displaystyle V^{\pi _{\theta }},Q^{\pi _{\theta }},A^{\pi _{\theta }}} appear. These are approximated by the critic. Since these functions all depend on the actor, the critic must learn alongside the actor. The critic is learned by value-based RL algorithms. For example, if the critic is estimating the state-value function V π θ ( s ) {\displaystyle V^{\pi _{\theta }}(s)} , then it can be learned by any value function approximation method. Let the critic be a function approximator V ϕ ( s ) {\displaystyle V_{\phi }(s)} with parameters ϕ {\displaystyle \phi } . The simplest example is TD(1) learning, which trains the critic to minimize the TD(1) error: δ i = R i + γ V ϕ ( S i + 1 ) − V ϕ ( S i ) {\displaystyle \delta _{i}=R_{i}+\gamma V_{\phi }(S_{i+1})-V_{\phi }(S_{i})} The critic parameters are updated by gradient descent on the squared TD error: ϕ ← ϕ − α ∇ ϕ ( δ i ) 2 = ϕ + α δ i ∇ ϕ V ϕ ( S i ) {\displaystyle \phi \leftarrow \phi -\alpha \nabla _{\phi }(\delta _{i})^{2}=\phi +\alpha \delta _{i}\nabla _{\phi }V_{\phi }(S_{i})} where α {\displaystyle \alpha } is the learning rate. Note that the gradient is taken with respect to the ϕ {\displaystyle \phi } in V ϕ ( S i ) {\displaystyle V_{\phi }(S_{i})} only, since the ϕ {\displaystyle \phi } in γ V ϕ ( S i + 1 ) {\displaystyle \gamma V_{\phi }(S_{i+1})} constitutes a moving target, and the gradient is not taken with respect to that. This is a common source of error in implementations that use automatic differentiation, and requires "stopping the gradient" at that point. Similarly, if the critic is estimating the action-value function Q π θ {\displaystyle Q^{\pi _{\theta }}} , then it can be learned by Q-learning or SARSA. In SARSA, the critic maintains an estimate of the Q-function, parameterized by ϕ {\displaystyle \phi } , denoted as Q ϕ ( s , a ) {\displaystyle Q_{\phi }(s,a)} . The temporal difference error is then calculated as δ i = R i + γ Q θ ( S i + 1 , A i + 1 ) − Q θ ( S i , A i ) {\displaystyle \delta _{i}=R_{i}+\gamma Q_{\theta }(S_{i+1},A_{i+1})-Q_{\theta }(S_{i},A_{i})} . The critic is then updated by θ ← θ + α δ i ∇ θ Q θ ( S i , A i ) {\displaystyle \theta \leftarrow \theta +\alpha \delta _{i}\nabla _{\theta }Q_{\theta }(S_{i},A_{i})} The advantage critic can be trained by training both a Q-function Q ϕ ( s , a ) {\displaystyle Q_{\phi }(s,a)} and a state-value function V ϕ ( s ) {\displaystyle V_{\phi }(s)} , then let A ϕ ( s , a ) = Q ϕ ( s , a ) − V ϕ ( s ) {\displaystyle A_{\phi }(s,a)=Q_{\phi }(s,a)-V_{\phi }(s)} . Although, it is more common to train just a state-value function V ϕ ( s ) {\displaystyle V_{\phi }(s)} , then estimate the advantage by A ϕ ( S i , A i ) ≈ ∑ j ∈ 0 : n − 1 γ j R i + j + γ n V ϕ ( S i + n ) − V ϕ ( S i ) {\displaystyle A_{\phi }(S_{i},A_{i})\approx \sum _{j\in 0:n-1}\gamma ^{j}R_{i+j}+\gamma ^{n}V_{\phi }(S_{i+n})-V_{\phi }(S_{i})} Here, n {\displaystyle n} is a positive integer. The higher n {\displaystyle n} is, the more lower is the bias in the advantage estimation, but at the price of higher variance. The Generalized Advantage Estimation (GAE) introduces a hyperparameter λ {\displaystyle \lambda } that smoothly interpolates between Monte Carlo returns ( λ = 1 {\displaystyle \lambda =1} , high variance, no bias) and 1-step TD learning ( λ = 0 {\displaystyle \lambda =0} , low variance, high bias). This hyperparameter can be adjusted to pick the optimal bias-variance trade-off in advantage estimation. It uses an exponentially decaying average of n-step returns with λ {\displaystyle \lambda } being the decay strength. == Variants == Asynchronous Advantage Actor-Critic (A3C): Parallel and asynchronous version of A2C. Soft Actor-Critic (SAC): Incorporates entropy maximization for improved exploration. Deep Deterministic Policy Gradient (DDPG): Specialized for continuous action spaces.

    Read more →
  • Google Mobile Services

    Google Mobile Services

    Google Mobile Services (GMS) is a collection of proprietary applications and application programming interfaces (APIs) services from Google that are typically pre-installed on the majority of Android devices, such as smartphones, tablets, and smart TVs. GMS is not a part of the Android Open Source Project (AOSP), which means an Android manufacturer needs to obtain a license from Google in order to legally pre-install GMS on an Android device. This license is provided by Google without any licensing fees except in the EU. == Core applications == The following are core applications that are part of Google Mobile Services: Google Search Google Chrome YouTube Google Play Google Drive Gmail Google Meet Google Maps Google Photos Google TV YouTube Music === Historically === Google+ Google Hangouts Google Wallet Google Play Magazines Google Play Music Google Play Movies & TV Google Duo == Reception, competitors, and regulators == === FairSearch === Numerous European firms filed a complaint to the European Commission stating that Google had manipulated their power and dominance within the market to push their Services to be used by phone manufacturers. The firms were joined under the name FairSearch, and the main firms included were Microsoft, Expedia, TripAdvisor, Nokia and Oracle. FairSearch's major problem with Google's practices was that they believed Google were forcing phone manufacturers to use their Mobile Services. They claimed Google managed this by asking these manufacturers to sign a contract stating that they must preinstall specific Google Mobile Services, such as Maps, Search and YouTube, in order to get the latest version of Android. Google swiftly responded stating that they "continue to work co-operatively with the European Commission". === Aptoide === The third-party Android app store Aptoide also filed an EU competition complaint against Google once again stating that they are misusing their power within the market. Aptoide alleged that Google was blocking third-party app stores from being on Google Play, as well as blocking Google Chrome from downloading any third-party apps and app stores. As of June 2014, Google had not responded to these allegations. === Abuse of Android dominance === In May 2019, Umar Javeed, Sukarma Thapar, Aaqib Javeed vs. Google LLC & Ors. the Competition Commission of India ordered an antitrust probe against Google for abusing its dominant position with Android to block market rivals. In Prima Facie opinion the commission held that mandatory pre-installation of the entire Google Mobile Services (GMS) suite, under Mobile Application Distribution Agreements (MADA), amounts to the imposition of unfair conditions on the device manufacturers. === EU antitrust ruling === On July 18, 2018, the European Commission fined Google €4.34 billion for breaching EU antitrust rules which resulted in a change of licensing policy for the GMS in the EU. A new paid licensing agreement for smartphones and tablets shipped into the EEA was created. The change is that the GMS is now decoupled from the base Android and will be offered under a separate paid licensing agreement. === Privacy policy === At the same time, Google faced problems with various European data protection agencies, most notably In the United Kingdom and France. The problem they faced was that they had a set of 60 rules merged into one, which allowed Google to "track users more closely". Google once again came out and stated that their new policies still abide by European Union laws. === Android distributions without Google Mobile Services === After surveillance and privacy concerns, several custom android distributions have been implemented, such as GrapheneOS, LineageOS, CalyxOS, iodéOS or /e/OS, and they come either without any GMS installed by default or with microG, that adds a compatibility layer.

    Read more →
  • Example-based machine translation

    Example-based machine translation

    Example-based machine translation (EBMT) is a method of machine translation often characterized by its use of a bilingual corpus with parallel texts as its main knowledge base at run-time. It is essentially a translation by analogy and can be viewed as an implementation of a case-based reasoning approach to machine learning. == Translation by analogy == At the foundation of example-based machine translation is the idea of translation by analogy. When applied to the process of human translation, the idea that translation takes place by analogy is a rejection of the idea that people translate sentences by doing deep linguistic analysis. Instead, it is founded on the belief that people translate by first decomposing a sentence into certain phrases, then by translating these phrases, and finally by properly composing these fragments into one long sentence. Phrasal translations are translated by analogy to previous translations. The principle of translation by analogy is encoded to example-based machine translation through the example translations that are used to train such a system. Other approaches to machine translation, including statistical machine translation, also use bilingual corpora to learn the process of translation. == History == Example-based machine translation was first suggested by Makoto Nagao in 1984. He pointed out that it is especially adapted to translation between two totally different languages, such as English and Japanese. In this case, one sentence can be translated into several well-structured sentences in another language, therefore, it is no use to do the deep linguistic analysis characteristic of rule-based machine translation. == Example == Example-based machine translation systems are trained from bilingual parallel corpora containing sentence pairs like the example shown in the table above. Sentence pairs contain sentences in one language with their translations into another. The particular example shows an example of a minimal pair, meaning that the sentences vary by just one element. These sentences make it simple to learn translations of portions of a sentence. For example, an example-based machine translation system would learn three units of translation from the above example: How much is that X ? corresponds to Ano X wa ikura desu ka. red umbrella corresponds to akai kasa small camera corresponds to chiisai kamera Composing these units can be used to produce novel translations in the future. For example, if we have been trained using some text containing the sentences: President Kennedy was shot dead during the parade. and The convict escaped on July 15th., then we could translate the sentence The convict was shot dead during the parade. by substituting the appropriate parts of the sentences. == Phrasal verbs == Example-based machine translation is best suited for sub-language phenomena like phrasal verbs. Phrasal verbs have highly context-dependent meanings. They are common in English, where they comprise a verb followed by an adverb and/or a preposition, which are called the particle to the verb. Phrasal verbs produce specialized context-specific meanings that may not be derived from the meaning of the constituents. There is almost always an ambiguity during word-to-word translation from source to the target language. As an example, consider the phrasal verb "put on" and its Hindustani translation. It may be used in any of the following ways: Ram put on the lights. (Switched on) (Hindustani translation: Jalana) Ram put on a cap. (Wear) (Hindustani translation: Pahenna)

    Read more →