AI Grammar Clean Up

AI Grammar Clean Up — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Computer audition

    Computer audition

    Computer audition (CA) or machine listening is the general field of study of algorithms and systems for audio interpretation by machines. Since the notion of what it means for a machine to "hear" is very broad and somewhat vague, computer audition attempts to bring together several disciplines that originally dealt with specific problems or had a concrete application in mind. The engineer Paris Smaragdis, interviewed in Technology Review, talks about these systems — "software that uses sound to locate people moving through rooms, monitor machinery for impending breakdowns, or activate traffic cameras to record accidents." Inspired by models of human audition, CA deals with questions of representation, transduction, grouping, use of musical knowledge and general sound semantics for the purpose of performing intelligent operations on audio and music signals by the computer. Technically this requires a combination of methods from the fields of signal processing, auditory modelling, music perception and cognition, pattern recognition, and machine learning, as well as more traditional methods of artificial intelligence for musical knowledge representation. == Applications == Like computer vision versus image processing, computer audition versus audio engineering deals with understanding of audio rather than processing. It also differs from problems of speech understanding by machine since it deals with general audio signals, such as natural sounds and musical recordings. Applications of computer audition are widely varying, and include search for sounds, genre recognition, acoustic monitoring, music transcription, score following, audio texture, music improvisation, emotion in audio and so on. == Related disciplines == Computer Audition overlaps with the following disciplines: Music information retrieval: methods for search and analysis of similarity between music signals. Auditory scene analysis: understanding and description of audio sources and events. Computational musicology and mathematical music theory: use of algorithms that employ musical knowledge for analysis of music data. Computer music: use of computers in creative musical applications. Machine musicianship: audition driven interactive music systems. == Areas of study == Since audio signals are interpreted by the human ear–brain system, that complex perceptual mechanism should be simulated somehow in software for "machine listening". In other words, to perform on par with humans, the computer should hear and understand audio content much as humans do. Analyzing audio accurately involves several fields: electrical engineering (spectrum analysis, filtering, and audio transforms); artificial intelligence (machine learning and sound classification); psychoacoustics (sound perception); cognitive sciences (neuroscience and artificial intelligence); acoustics (physics of sound production); and music (harmony, rhythm, and timbre). Furthermore, audio transformations such as pitch shifting, time stretching, and sound object filtering, should be perceptually and musically meaningful. For best results, these transformations require perceptual understanding of spectral models, high-level feature extraction, and sound analysis/synthesis. Finally, structuring and coding the content of an audio file (sound and metadata) could benefit from efficient compression schemes, which discard inaudible information in the sound. Computational models of music and sound perception and cognition can lead to a more meaningful representation, a more intuitive digital manipulation and generation of sound and music in musical human-machine interfaces. The study of CA could be roughly divided into the following sub-problems: Representation: signal and symbolic. This aspect deals with time-frequency representations, both in terms of notes and spectral models, including pattern playback and audio texture. Feature extraction: sound descriptors, segmentation, onset, pitch and envelope detection, chroma, and auditory representations. Musical knowledge structures: analysis of tonality, rhythm, and harmonies. Sound similarity: methods for comparison between sounds, sound identification, novelty detection, segmentation, and clustering. Sequence modeling: matching and alignment between signals and note sequences. Source separation: methods of grouping of simultaneous sounds, such as multiple pitch detection and time-frequency clustering methods. Auditory cognition: modeling of emotions, anticipation and familiarity, auditory surprise, and analysis of musical structure. Multi-modal analysis: finding correspondences between textual, visual, and audio signals. === Representation issues === Computer audition deals with audio signals that can be represented in a variety of fashions, from direct encoding of digital audio in two or more channels to symbolically represented synthesis instructions. Audio signals are usually represented in terms of analogue or digital recordings. Digital recordings are samples of acoustic waveform or parameters of audio compression algorithms. One of the unique properties of musical signals is that they often combine different types of representations, such as graphical scores and sequences of performance actions that are encoded as MIDI files. Since audio signals usually comprise multiple sound sources, then unlike speech signals that can be efficiently described in terms of specific models (such as source-filter model), it is hard to devise a parametric representation for general audio. Parametric audio representations usually use filter banks or sinusoidal models to capture multiple sound parameters, sometimes increasing the representation size in order to capture internal structure in the signal. Additional types of data that are relevant for computer audition are textual descriptions of audio contents, such as annotations, reviews, and visual information in the case of audio-visual recordings. === Features === Description of contents of general audio signals usually requires extraction of features that capture specific aspects of the audio signal. Generally speaking, one could divide the features into signal or mathematical descriptors such as energy, description of spectral shape etc., statistical characterization such as change or novelty detection, special representations that are better adapted to the nature of musical signals or the auditory system, such as logarithmic growth of sensitivity (bandwidth) in frequency or octave invariance (chroma). Since parametric models in audio usually require very many parameters, the features are used to summarize properties of multiple parameters in a more compact or salient representation. === Musical knowledge === Finding specific musical structures is possible by using musical knowledge as well as supervised and unsupervised machine learning methods. Examples of this include detection of tonality according to distribution of frequencies that correspond to patterns of occurrence of notes in musical scales, distribution of note onset times for detection of beat structure, distribution of energies in different frequencies to detect musical chords and so on. === Sound similarity and sequence modeling === Comparison of sounds can be done by comparison of features with or without reference to time. In some cases an overall similarity can be assessed by close values of features between two sounds. In other cases when temporal structure is important, methods of dynamic time warping need to be applied to "correct" for different temporal scales of acoustic events. Finding repetitions and similar sub-sequences of sonic events is important for tasks such as texture synthesis and machine improvisation. === Source separation === Since one of the basic characteristics of general audio is that it comprises multiple simultaneously sounding sources, such as multiple musical instruments, people talking, machine noises or animal vocalization, the ability to identify and separate individual sources is very desirable. Unfortunately, there are no methods that can solve this problem in a robust fashion. Existing methods of source separation rely sometimes on correlation between different audio channels in multi-channel recordings. The ability to separate sources from stereo signals requires different techniques than those usually applied in communications where multiple sensors are available. Other source separation methods rely on training or clustering of features in mono recording, such as tracking harmonically related partials for multiple pitch detection. Some methods, before explicit recognition, rely on revealing structures in data without knowing the structures (like recognizing objects in abstract pictures without attributing them meaningful labels) by finding the least complex data representations, for instance describing audio scenes as generated by a few tone patterns and their trajectories (polyphonic voices) and acoustical contours drawn by a tone (c

    Read more →
  • Language Weaver

    Language Weaver

    Language Weaver is the machine translation (MT) technology and brand of RWS. The brand name was revived in 2021 following the acquisition of SDL and Iconic Translation Machines Ltd. and the merging of the respective teams and technologies. Language Weaver was formerly a standalone company that was acquired by SDL in 2010. == History == Language Weaver was a Los Angeles, California–based company founded in 2002 as a spin-out company from the University of Southern California. The company was founded to commercialise a statistical approach to automatic language translation and natural language processing known as statistical machine translation (SMT). The company's name is a reference to one of the pioneers of machine translation — Warren Weaver — who first proposed the idea of using computers to ‘decode’ or ‘decrypt’ language in a memorandum back in 1947. Language Weaver’s statistical approach to machine translation was cutting-edge at the time, and a significant improvement over previous approaches such as Rule-Based MT. Language Weaver grew steadily over an 8 year period, with staff numbers totalling 96 across offices in US, Europe, and Japan. The company had significant business with Government organisations where its name continues to hold strong recognition to this day. In July 2010, Language Weaver was acquired by SDL plc for $42.5 million and the company was renamed SDL Language Weaver. == SDL Language Weaver == SDL Language Weaver was the primary machine translation technology at SDL where, over time, it evolved from SMT to syntax-based MT, to Neural Machine Translation. The Language Weaver brand was retired in 2015 in favour of SDL BeGlobal for the cloud-based solution, and SDL Enterprise Translation Server for the on-premise solution. Later, these products were rebranded again as SDL Machine Translation Cloud and SDL Machine Translation Edge respectively. == 2021 Relaunch == The Language Weaver brand was revived in 2021 following the acquisition of SDL by RWS, and the merger of the SDL MT and Iconic Translation Machines teams and technologies. The combined technologies of both companies, based on state-of-the-art Transformer-based Neural Machine Translation, are now sold as "Language Weaver" for cloud-based MT, and "Language Weaver Edge" for on-premise MT. == Supported languages == As of September 2021, Language Weaver supports the following languages and language varieties:

    Read more →
  • How to Choose an Conversational AI Platform

    How to Choose an Conversational AI Platform

    Trying to pick the best conversational AI platform? An conversational AI platform is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right conversational AI platform slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Top 10 AI Text-to-image Tools Compared (2026)

    Top 10 AI Text-to-image Tools Compared (2026)

    Comparing the best AI text-to-image tool? An AI text-to-image tool is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI text-to-image tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • LaMDA

    LaMDA

    LaMDA (Language Model for Dialogue Applications) is a family of conversational large language models developed by Google. Originally developed and introduced as Meena in 2020, the first-generation LaMDA was announced during the 2021 Google I/O keynote, while the second generation was announced the following year. In June 2022, LaMDA gained widespread attention when Google engineer Blake Lemoine made claims that the chatbot had become sentient. The scientific community has largely rejected Lemoine's claims, though it has led to conversations about the efficacy of the Turing test, which measures whether a computer can pass for a human. In February 2023, Google announced Gemini (then Bard), a conversational artificial intelligence chatbot powered by LaMDA, to counter the rise of OpenAI's ChatGPT. == History == === Background === On January 28, 2020, Google unveiled Meena, a neural network-powered chatbot with 2.6 billion parameters, which Google claimed to be superior to all other existing chatbots. The company previously hired computer scientist Ray Kurzweil in 2012 to develop multiple chatbots for the company, including one named Danielle. The Google Brain research team, who developed Meena, hoped to release the chatbot to the public in a limited capacity, but corporate executives refused on the grounds that Meena violated Google's "AI principles around safety and fairness". Meena was later renamed LaMDA as its data and computing power increased, and the Google Brain team again sought to deploy the software to the Google Assistant, the company's virtual assistant software, in addition to opening it up to a public demo. Both requests were once again denied by company leadership. LaMDA's two lead researchers, Daniel de Freitas and Noam Shazeer, eventually left the company in frustration. === First generation === Google announced the LaMDA conversational large language model during the Google I/O keynote on May 18, 2021, powered by artificial intelligence. The acronym stands for "Language Model for Dialogue Applications". Built on the seq2seq architecture, transformer-based neural networks developed by Google Research in 2017, LaMDA was trained on human dialogue and stories, allowing it to engage in open-ended conversations. Google states that responses generated by LaMDA have been ensured to be "sensible, interesting, and specific to the context". LaMDA has access to multiple symbolic text processing systems, including a database, a real-time clock and calendar, a mathematical calculator, and a natural language translation system, giving it superior accuracy in tasks supported by those systems, and making it among the first dual process chatbots. LaMDA is also not stateless because its "sensibleness" metric is fine-tuned by "pre-conditioning" each dialog turn by prepending many of the most recent dialog interactions, on a user-by-user basis. LaMDA is tuned on nine unique performance metrics: sensibleness, specificity, interestingness, safety, groundedness, informativeness, citation accuracy, helpfulness, and role consistency. Tests by Google indicated that LaMDA surpassed human responses in the area of interestingness. The pre-training dataset consists of 2.97B documents, 1.12B dialogs, and 13.39B utterances, for a total of 1.56T words. The largest LaMDA model has 137B non-embedding parameters. === Second generation === On May 11, 2022, Google unveiled LaMDA 2, the successor to LaMDA, during the 2022 Google I/O keynote. The new incarnation of the model draws examples of text from numerous sources, using it to formulate unique "natural conversations" on topics that it may not have been trained to respond to. === Sentience claims === On June 11, 2022, The Washington Post reported that Google engineer Blake Lemoine had been placed on paid administrative leave after Lemoine told company executives Blaise Agüera y Arcas and Jen Gennai that LaMDA had become sentient. Lemoine came to this conclusion after the chatbot made questionable responses to questions regarding self-identity, moral values, religion, and Isaac Asimov's Three Laws of Robotics. Google refuted these claims, insisting that there was substantial evidence to indicate that LaMDA was not sentient. In an interview with Wired, Lemoine reiterated his claims that LaMDA was "a person" as dictated by the Thirteenth Amendment to the U.S. Constitution, comparing it to an "alien intelligence of terrestrial origin". He further revealed that he had been dismissed by Google after he hired an attorney on LaMDA's behalf after the chatbot requested that Lemoine do so. On July 22, Google fired Lemoine, asserting that Blake had violated their policies "to safeguard product information" and rejected his claims as "wholly unfounded". Internal controversy instigated by the incident prompted Google executives to decide against releasing LaMDA to the public, which it had previously been considering. Lemoine's claims were widely pushed back by the scientific community. Many experts rejected the idea that LaMDA was sentient, including former New York University psychology professor Gary Marcus, David Pfau of Google sister company DeepMind, Erik Brynjolfsson of the Institute for Human-Centered Artificial Intelligence at Stanford University, and University of Surrey professor Adrian Hilton. Yann LeCun, who leads Meta Platforms' AI research team, stated that neural networks such as LaMDA were "not powerful enough to attain true intelligence". University of California, Santa Cruz professor Max Kreminski noted that LaMDA's architecture did not "support some key capabilities of human-like consciousness" and that its neural network weights were "frozen", assuming it was a typical large language model. Philosopher Nick Bostrom noted, however, that the lack of precise and consensual criteria for determining whether a system is conscious warrants some uncertainty. IBM Watson lead developer David Ferrucci compared how LaMDA appeared to be human in the same way Watson did when it was first introduced. Former Google AI ethicist Timnit Gebru called Lemoine a victim of a "hype cycle" initiated by researchers and the media. Lemoine's claims have also generated discussion on whether the Turing test remained useful to determine researchers' progress toward achieving artificial general intelligence, with Will Omerus of the Post opining that the test actually measured whether machine intelligence systems were capable of deceiving humans, while Brian Christian of The Atlantic said that the controversy was an instance of the ELIZA effect. == Products == === AI Test Kitchen === With the unveiling of LaMDA 2 in May 2022, Google also launched the AI Test Kitchen, a mobile application for the Android operating system powered by LaMDA capable of providing lists of suggestions on-demand based on a complex goal. Originally open only to Google employees, the app was set to be made available to "select academics, researchers, and policymakers" by invitation sometime in the year. In August, the company began allowing users in the U.S. to sign up for early access. In November, Google released a "season 2" update to the app, integrating a limited form of Google Brain's Imagen text-to-image model. A third iteration of the AI Test Kitchen was in development by January 2023, expected to launch at I/O later that year. Following the 2023 I/O keynote in May, Google added MusicLM, an AI-powered music generator first previewed in January, to the AI Test Kitchen app. In August, the app was delisted from Google Play and the Apple App Store, instead moving completely online. === Bard === On February 6, 2023, Google announced Bard, a conversational AI chatbot powered by LaMDA, in response to the unexpected popularity of OpenAI's ChatGPT chatbot. Google positions the chatbot as a "collaborative AI service" rather than a search engine. Bard became available for early access on March 21. === Other products === In addition to Bard, Pichai also unveiled the company's Generative Language API, an application programming interface also based on LaMDA, which he announced would be opened up to third-party developers in March 2023. == Architecture == LaMDA is a decoder-only Transformer language model. It is pre-trained on a text corpus that includes both documents and dialogs consisting of 1.56 trillion words, and is then trained with fine-tuning data generated by manually annotated responses for "sensibleness, interestingness, and safety". LaMDA was retrieval-augmented to improve the accuracy of facts provided to the user. Three different models were tested, with the largest having 137 billion non-embedding parameters:

    Read more →
  • The Best Free AI Essay Writer for Beginners

    The Best Free AI Essay Writer for Beginners

    In search of the best AI essay writer? An AI essay writer is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI essay writer slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • AI Code-review Tools Reviews: What Actually Works in 2026

    AI Code-review Tools Reviews: What Actually Works in 2026

    Shopping for the best AI code-review tool? An AI code-review tool is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI code-review tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Black in AI

    Black in AI

    Black in AI, formally called the Black in AI Workshop, is a technology research organization and affinity group, founded by computer scientists Timnit Gebru and Rediet Abebe in 2017. It started as a conference workshop, later pivoting into an organization. Black in AI increases the presence and inclusion of Black people in the field of artificial intelligence (AI) by creating space for sharing ideas, fostering collaborations, mentorship, and advocacy. == History == Black in AI was created in 2017 to address issues of lack of diversity in AI workshops, and was started as its own workshop within the Conference on Neural Information Processing Systems (NeurIPS) conference. Because of algorithmic bias, ethical issues, and underrepresentation of Black people in AI roles; there has been an ongoing need for unity within the AI community to have focus on these issues. Black in AI has strived to continue the progress of improving the presence of people of color in the field of artificial intelligence. In 2018 and 2019, the Black in AI workshop had many immigration visa issues to Canada, which spurred the conference to be planned for 2020 in Addis Ababa, Ethiopia. On December 7, 2020, Black in AI held its fourth annual workshop and first virtual workshop (due to the COVID-19 pandemic). In 2021, Black in AI, alongside the groups Queer in AI and Widening NLP, released a public statement refusing funding from Google in an act of protest of Google's treatment of Timnit Gebru, Margaret Mitchell, and April Christina Curley in the events that occurred in December 2020. == Founders == Rediet Abebe is an Ethiopian computer scientist who specializes in algorithms and artificial intelligence. She is a Computer Science Assistant Professor at the University of California, Berkeley. She was previously a Junior Fellow at Harvard's Society of Fellows. She was the first Black woman to receive a Ph.D. in computer science at Cornell University. She "designs and analyzes algorithms, discrete optimizations, network-based, [and] computational strategies to increase access to opportunity for historically disadvantaged populations," according to her web bio. Timnit Gebru was born in Ethiopia and moved to the United States at the age of fifteen. She got her B.S. and M.S. in electrical engineering from Stanford University, as well as a PhD from the Stanford Artificial Intelligence Laboratory, where she studied computer vision under Fei-Fei Li. She formerly worked as a postdoctoral researcher at Microsoft Research in the Fairness Accountability Transparency, and Ethics (FATE) division. She's also worked with Apple, where she assisted in the development of signal-processing algorithms for the original iPad. == Grants == Black in AI received grants and support from private foundations like MacArthur Foundation and Rockefeller Foundation. The organization received $10,000 in 2018 for its annual workshop and $150,000 in 2019 for its long-term organizational planning. In 2020, during the pandemic, the organization received a grant of $300,000 by MacArthur Foundation in order to provide broad organizational support. In 2022, Rockefeller Foundation announced $300,000 to fight prejudice in artificial intelligence (AI) across the globe and incorporate equity into this rapidly expanding field. == Programs == "Black in AI works in academics, advocacy, entrepreneurship, financial support, and summer research programs." The Black in AI Academic Program is a resource for Black junior researchers applying to graduate schools, navigating graduate school, and transitioning into the postgraduate employment market. They provide online education sessions, offer scholarships to cover application fees, pair participants with peer and senior mentors, and distribute crowdsourced papers that simplify the application process. They also undertake research projects to investigate and highlight the difficulties that Black young researchers face, as well as push for structural reforms to eliminate these barriers and build equitable research settings. Moses Namara is a Facebook Research Fellow at Clemson University and a PhD candidate in Human-Centered Computing (HCC). He is the mentor for the new Black in AI Academic Program. During the graduate school admissions season in 2021, Black in AI served more than 200 potential graduate program candidates in some capacity. Furthermore, the organization's study identified greater problems encountered by Black graduate school candidates, such as the high cost of graduate school admissions examinations (GREs), which are known to be biased against those from low-income backgrounds. Black in AI's attempts to encourage institutions to eliminate the obstacles were supported by the findings. Black in AI is also developing a program to help and connect Black tech startups with investors. Black in AI also mentors early-career Black AI academics and is forming relationships with Historically Black Colleges and Universities to extend its academic program. In 2021, Black in AI launched two summer research programs, one for undergraduate internships and another for unconstrained research mentorship, including one aimed explicitly at empowering Black women's AI research projects. == Conferences and workshops == At NeurIPS 2017, the first Black in AI event took place in December 8, 2017 in Long Beach, California. The goal was to bring together experts in the area to share ideas and debate efforts aimed at increasing the participation of Black people in artificial intelligence, both for diversity and to avoid data bias. Black AI researchers had the opportunity to share their work at the workshop's oral and poster sessions. The second workshop was hosted in Montréal, Canada, on December 7, 2018. According to AI experts, visa issues stymie efforts to make their area more inclusive, making technology that discriminates or disadvantages individuals who aren't white or Western less likely. Hundreds of participants who were supposed to attend or present work at the Black in AI session on Friday were unable to fly to Canada; many of the participants were from African countries. The third workshop was held in NeurIPS 2019, one of the premier machine learning conferences Vancouver, Canada. The workshop was able to give travel scholarships and visa support to hundreds of academics who would not have been able to attend NeurIPS without the help of sponsors. For instance, Ramon Vilarino of the University of Sao Paulo, who presented a poster at the conference on his study of geographical and racial prejudice in credit scoring in Brazil, would not have been able to attend NeurIPS without the help of Black in AI. Twenty-four academics from Africa and South America were denied visas to attend this session during the conference, according to Victor Silva, the workshop organizer. He noted that, less than a month before the conference, 40 applicants from both continents had been given visas but that more than 70 applications were still waiting. For the second year in a row, visa restrictions have stopped several African scholars from attending the 2018 meeting in Montreal. The AAAI announced the first Black in AI lunch, which was held in conjunction with AAAI-19. The lunch was hosted on Tuesday, January 29, 2019. This event was intended to promote networking, discussion of various AI career options, and the exchange of ideas in order to boost the number of Black researchers in the area. The fourth Black in AI workshop, which was held in conjunction with NeurIPS 2020, took place the week of December 7, 2020. The workshop was scheduled to take place in Vancouver, British Columbia. Due to the pandemic, the session was held for the first time in a virtual format. Victor Silva, an AI4Society student, served as the event's chair. The fifth annual Black in AI workshop was also held virtually in 2021. Oral presentations, guest keynote speakers, a combined poster session with other affinity groups, sponsored sessions, and startup showcases was all featured. The goal of the session was to raise the visibility of black scholars at NeurIPS.

    Read more →
  • Stability (learning theory)

    Stability (learning theory)

    Stability, also known as algorithmic stability, is a notion in computational learning theory of how a machine learning algorithm output is changed with small perturbations to its inputs. A stable learning algorithm is one for which the prediction does not change much when the training data is modified slightly. For instance, consider a machine learning algorithm that is being trained to recognize handwritten letters of the alphabet, using 1000 examples of handwritten letters and their labels ("A" to "Z") as a training set. One way to modify this training set is to leave out an example, so that only 999 examples of handwritten letters and their labels are available. A stable learning algorithm would produce a similar classifier with both the 1000-element and 999-element training sets. Stability can be studied for many types of learning problems, from language learning to inverse problems in physics and engineering, as it is a property of the learning process rather than the type of information being learned. The study of stability gained importance in computational learning theory in the 2000s when it was shown to have a connection with generalization. It was shown that for large classes of learning algorithms, notably empirical risk minimization algorithms, certain types of stability ensure good generalization. == History == A central goal in designing a machine learning system is to guarantee that the learning algorithm will generalize, or perform accurately on new examples after being trained on a finite number of them. In the 1990s, milestones were reached in obtaining generalization bounds for supervised learning algorithms. The technique historically used to prove generalization was to show that an algorithm was consistent, using the uniform convergence properties of empirical quantities to their means. This technique was used to obtain generalization bounds for the large class of empirical risk minimization (ERM) algorithms. An ERM algorithm is one that selects a solution from a hypothesis space H {\displaystyle H} in such a way to minimize the empirical error on a training set S {\displaystyle S} . A general result, proved by Vladimir Vapnik for an ERM binary classification algorithms, is that for any target function and input distribution, any hypothesis space H {\displaystyle H} with VC-dimension d {\displaystyle d} , and n {\displaystyle n} training examples, the algorithm is consistent and will produce a training error that is at most O ( d n ) {\displaystyle O\left({\sqrt {\frac {d}{n}}}\right)} (plus logarithmic factors) from the true error. The result was later extended to almost-ERM algorithms with function classes that do not have unique minimizers. Vapnik's work, using what became known as VC theory, established a relationship between generalization of a learning algorithm and properties of the hypothesis space H {\displaystyle H} of functions being learned. However, these results could not be applied to algorithms with hypothesis spaces of unbounded VC-dimension. Put another way, these results could not be applied when the information being learned had a complexity that was too large to measure. Some of the simplest machine learning algorithms—for instance, for regression—have hypothesis spaces with unbounded VC-dimension. Another example is language learning algorithms that can produce sentences of arbitrary length. Stability analysis was developed in the 2000s for computational learning theory and is an alternative method for obtaining generalization bounds. The stability of an algorithm is a property of the learning process, rather than a direct property of the hypothesis space H {\displaystyle H} , and it can be assessed in algorithms that have hypothesis spaces with unbounded or undefined VC-dimension such as nearest neighbor. A stable learning algorithm is one for which the learned function does not change much when the training set is slightly modified, for instance by leaving out an example. A measure of Leave one out error is used in a Cross Validation Leave One Out (CVloo) algorithm to evaluate a learning algorithm's stability with respect to the loss function. As such, stability analysis is the application of sensitivity analysis to machine learning. == Summary of classic results == Early 1900s - Stability in learning theory was earliest described in terms of continuity of the learning map L {\displaystyle L} , traced to Andrey Nikolayevich Tikhonov. 1979 - Devroye and Wagner observed that the leave-one-out behavior of an algorithm is related to its sensitivity to small changes in the sample. 1999 - Kearns and Ron discovered a connection between finite VC-dimension and stability. 2002 - In a landmark paper, Bousquet and Elisseeff proposed the notion of uniform hypothesis stability of a learning algorithm and showed that it implies low generalization error. Uniform hypothesis stability, however, is a strong condition that does not apply to large classes of algorithms, including ERM algorithms with a hypothesis space of only two functions. 2002 - Kutin and Niyogi extended Bousquet and Elisseeff's results by providing generalization bounds for several weaker forms of stability which they called almost-everywhere stability. Furthermore, they took an initial step in establishing the relationship between stability and consistency in ERM algorithms in the Probably Approximately Correct (PAC) setting. 2004 - Poggio et al. proved a general relationship between stability and ERM consistency. They proposed a statistical form of leave-one-out-stability which they called CVEEEloo stability, and showed that it is a) sufficient for generalization in bounded loss classes, and b) necessary and sufficient for consistency (and thus generalization) of ERM algorithms for certain loss functions such as the square loss, the absolute value and the binary classification loss. 2010 - Shalev Shwartz et al. noticed problems with the original results of Vapnik due to the complex relations between hypothesis space and loss class. They discuss stability notions that capture different loss classes and different types of learning, supervised and unsupervised. 2016 - Moritz Hardt et al. proved stability of gradient descent given certain assumption on the hypothesis and number of times each instance is used to update the model. == Preliminary definitions == We define several terms related to learning algorithms training sets, so that we can then define stability in multiple ways and present theorems from the field. A machine learning algorithm, also known as a learning map L {\displaystyle L} , maps a training data set, which is a set of labeled examples ( x , y ) {\displaystyle (x,y)} , onto a function f {\displaystyle f} from X {\displaystyle X} to Y {\displaystyle Y} , where X {\displaystyle X} and Y {\displaystyle Y} are in the same space of the training examples. The functions f {\displaystyle f} are selected from a hypothesis space of functions called H {\displaystyle H} . The training set from which an algorithm learns is defined as S = { z 1 = ( x 1 , y 1 ) , . . , z m = ( x m , y m ) } {\displaystyle S=\{z_{1}=(x_{1},\ y_{1})\ ,..,\ z_{m}=(x_{m},\ y_{m})\}} and is of size m {\displaystyle m} in Z = X × Y {\displaystyle Z=X\times Y} drawn i.i.d. from an unknown distribution D. Thus, the learning map L {\displaystyle L} is defined as a mapping from Z m {\displaystyle Z_{m}} into H {\displaystyle H} , mapping a training set S {\displaystyle S} onto a function f S {\displaystyle f_{S}} from X {\displaystyle X} to Y {\displaystyle Y} . Here, we consider only deterministic algorithms where L {\displaystyle L} is symmetric with respect to S {\displaystyle S} , i.e. it does not depend on the order of the elements in the training set. Furthermore, we assume that all functions are measurable and all sets are countable. The loss V {\displaystyle V} of a hypothesis f {\displaystyle f} with respect to an example z = ( x , y ) {\displaystyle z=(x,y)} is then defined as V ( f , z ) = V ( f ( x ) , y ) {\displaystyle V(f,z)=V(f(x),y)} . The empirical error of f {\displaystyle f} is I S [ f ] = 1 n ∑ V ( f , z i ) {\displaystyle I_{S}[f]={\frac {1}{n}}\sum V(f,z_{i})} . The true error of f {\displaystyle f} is I [ f ] = E z V ( f , z ) {\displaystyle I[f]=\mathbb {E} _{z}V(f,z)} Given a training set S of size m, we will build, for all i = 1....,m, modified training sets as follows: By removing the i-th element S | i = { z 1 , . . . , z i − 1 , z i + 1 , . . . , z m } {\displaystyle S^{|i}=\{z_{1},...,\ z_{i-1},\ z_{i+1},...,\ z_{m}\}} By replacing the i-th element S i = { z 1 , . . . , z i − 1 , z i ′ , z i + 1 , . . . , z m } {\displaystyle S^{i}=\{z_{1},...,\ z_{i-1},\ z_{i}',\ z_{i+1},...,\ z_{m}\}} == Definitions of stability == === Hypothesis Stability === An algorithm L {\displaystyle L} has hypothesis stability β with respect to the loss function V if the following holds: ∀ i ∈ { 1 , . . . , m } , E S , z [ | V ( f S , z ) − V ( f S |

    Read more →
  • Philipp Koehn

    Philipp Koehn

    Philipp Koehn (born 1 August 1971 in Erlangen, West Germany) is a computer scientist and researcher in the field of machine translation. His primary research interest is statistical machine translation and he is one of the inventors of a method called phrase based machine translation. This is a sub-field of statistical translation methods that employs sequences of words (or so-called "phrases") as the basis of translation, expanding the previous word based approaches. A 2003 paper which he authored with Franz Josef Och and Daniel Marcu called Statistical phrase-based translation has attracted wide attention in Machine translation community and has been cited over a thousand times. Phrase based methods are widely used in machine translation applications in industry. Philipp Koehn received his PhD in computer science in 2003 from the University of Southern California, where he worked at the Information Sciences Institute advised by Kevin Knight. After a year as a postdoctoral fellow under Michael Collins at the Massachusetts Institute of Technology, he joined the University of Edinburgh as a lecturer in the School of Informatics in 2005. He was appointed reader in 2010 and professor in 2012. In 2014, he was appointed professor at the computer science department of The Johns Hopkins University, where he is affiliated with the Center for Language and Speech Processing. == Moses statistical machine translation decoder == The Moses machine translation decoder is an open source project that was created by and is maintained under the guidance of Philipp Koehn. The Moses decoder is a platform for developing Statistical machine translation systems given a parallel corpus for any language pair. The decoder was mainly developed by Hieu Hoang and Philipp Koehn at the University of Edinburgh and extended during a Johns Hopkins University Summer Workshop and further developed under Euromatrix and GALE project funding. The decoder (which is part of a complete statistical machine translation toolkit) is the de facto benchmark for research in the field. Although Koehn continues to play a major role in the development of Moses, the Moses decoder was supported by the European Framework 6 projects Euromatrix, TC-Star, the European Framework 7 projects EuroMatrixPlus, Let's MT, META-NET and MosesCore and the DARPA GALE project, as well as several universities such as the University of Edinburgh, the University of Maryland, ITC-irst, Massachusetts Institute of Technology, and others. Substantial additional contributors to the Moses decoder include Hieu Hoang, Chris Dyer, Josh Schroeder, Marcello Federico, Richard Zens, and Wade Shen. == Europarl corpus == The Europarl corpus is a set of documents that consists of the proceedings of the European Parliament from 1996 to the present. The corpus has been compiled and expanded by a group of researchers led by Philipp Koehn at University of Edinburgh. The data that makes up the corpus was extracted from the website of the European Parliament and then prepared for linguistic research. The latest release (2012) comprised up to 60 million words per language, with 21 European languages represented: Romanic (French, Italian, Spanish, Portuguese, Romanian), Germanic (English, Dutch, German, Danish, Swedish), Slavic (Bulgarian, Czech, Polish, Slovak, Slovene), Finno-Ugric (Finnish, Hungarian, Estonian), Baltic (Latvian, Lithuanian), and Greek. == Other interests and activities in chronological order == Koehn is a professor at Johns Hopkins University where he continues his research into machine translation through his affiliation with the Center for Language and Speech Processing Koehn is a professor and chair of machine translation at the University of Edinburgh School of Informatics and contributes to its statistical machine translation group which organises workshops, seminars and project related to the subject. Koehn has consulted to SYSTRAN periodically between 2006 and 2011. SYSTRAN was acquired by CLSI, a Korean machine translation company in April 2014. Koehn worked for Facebook/META AI Research from 2018 to 2022. Koehn is also chief scientist for Omniscien Technologies and a shareholder in Omniscien Technologies since 2007. Omniscien Technologies is a private company developing and commercialising machine translation technologies. Koehn authored a book titled "Statistical Machine Translation" in 2009 and a book titled "Neural Machine Translation" in 2020. == Awards and recognition == 2013: One of three finalists in the category of Research for the European Patent Office (EPO) 2013 European Inventor Award. Koehn was recognised for patent EP 1488338 B, Phrase-Based Joint Probability Model for Statistical Machine Translations, a translation model that uses mathematical probabilities to determine the most likely interpretation of chunks of text between foreign languages. 2015: Koehn received the Award of Honor of the International Association for Machine Translation. 2024: Koehn was named Fellow of the Association for Computational Linguistics (ACL).

    Read more →
  • Probabilistic context-free grammar

    Probabilistic context-free grammar

    In theoretical linguistics and computational linguistics, probabilistic context free grammars (PCFGs) extend context-free grammars, similar to how hidden Markov models extend regular grammars. Each production is assigned a probability. The probability of a derivation (parse) is the product of the probabilities of the productions used in that derivation. These probabilities can be viewed as parameters of the model, and for large problems it is convenient to learn these parameters via machine learning. A probabilistic grammar's validity is constrained by context of its training dataset. PCFGs originated from grammar theory, and have application in areas as diverse as natural language processing to the study the structure of RNA molecules and design of programming languages. Designing efficient PCFGs has to weigh factors of scalability and generality. Issues such as grammar ambiguity must be resolved. The grammar design affects results accuracy. Grammar parsing algorithms have various time and memory requirements. == Definitions == Derivation: The process of recursive generation of strings from a grammar. Parsing: Finding a valid derivation using an automaton. Parse Tree: The alignment of the grammar to a sequence. An example of a parser for PCFG grammars is the pushdown automaton. The algorithm parses grammar nonterminals from left to right in a stack-like manner. This brute-force approach is not very efficient. In RNA secondary structure prediction variants of the Cocke–Younger–Kasami (CYK) algorithm provide more efficient alternatives to grammar parsing than pushdown automata. Another example of a PCFG parser is the Stanford Statistical Parser which has been trained using Treebank. == Formal definition == Similar to a CFG, a probabilistic context-free grammar G can be defined by a quintuple: G = ( M , T , R , S , P ) {\displaystyle G=(M,T,R,S,P)} where M is the set of non-terminal symbols T is the set of terminal symbols R is the set of production rules S is the start symbol P is the set of probabilities on production rules == Relation with hidden Markov models == PCFGs models extend context-free grammars the same way as hidden Markov models extend regular grammars. The Inside-Outside algorithm is an analogue of the Forward-Backward algorithm. It computes the total probability of all derivations that are consistent with a given sequence, based on some PCFG. This is equivalent to the probability of the PCFG generating the sequence, and is intuitively a measure of how consistent the sequence is with the given grammar. The Inside-Outside algorithm is used in model parametrization to estimate prior frequencies observed from training sequences in the case of RNAs. Dynamic programming variants of the CYK algorithm find the Viterbi parse of a RNA sequence for a PCFG model. This parse is the most likely derivation of the sequence by the given PCFG. == Grammar construction == Context-free grammars are represented as a set of rules inspired from attempts to model natural languages. The rules are absolute and have a typical syntax representation known as Backus–Naur form. The production rules consist of terminal { a , b } {\displaystyle \left\{a,b\right\}} and non-terminal S symbols and a blank ϵ {\displaystyle \epsilon } may also be used as an end point. In the production rules of CFG and PCFG the left side has only one nonterminal whereas the right side can be any string of terminal or nonterminals. In PCFG nulls are excluded. An example of a grammar: S → a S , S → b S , S → ϵ {\displaystyle S\to aS,S\to bS,S\to \epsilon } This grammar can be shortened using the '|' ('or') character into: S → a S | b S | ϵ {\displaystyle S\to aS|bS|\epsilon } Terminals in a grammar are words and through the grammar rules a non-terminal symbol is transformed into a string of either terminals and/or non-terminals. The above grammar is read as "beginning from a non-terminal S the emission can generate either a or b or ϵ {\displaystyle \epsilon } ". Its derivation is: S ⇒ a S ⇒ a b S ⇒ a b b S ⇒ a b b {\displaystyle S\Rightarrow aS\Rightarrow abS\Rightarrow abbS\Rightarrow abb} Ambiguous grammar may result in ambiguous parsing if applied on homographs since the same word sequence can have more than one interpretation. Pun sentences such as the newspaper headline "Iraqi Head Seeks Arms" are an example of ambiguous parses. One strategy of dealing with ambiguous parses (originating with grammarians as early as Pāṇini) is to add yet more rules, or prioritize them so that one rule takes precedence over others. This, however, has the drawback of proliferating the rules, often to the point where they become difficult to manage. Another difficulty is overgeneration, where unlicensed structures are also generated. Probabilistic grammars circumvent these problems by ranking various productions on frequency weights, resulting in a "most likely" (winner-take-all) interpretation. As usage patterns are altered in diachronic shifts, these probabilistic rules can be re-learned, thus updating the grammar. Assigning probability to production rules makes a PCFG. These probabilities are informed by observing distributions on a training set of similar composition to the language to be modeled. On most samples of broad language, probabilistic grammars where probabilities are estimated from data typically outperform hand-crafted grammars. CFGs when contrasted with PCFGs are not applicable to RNA structure prediction because while they incorporate sequence-structure relationship they lack the scoring metrics that reveal a sequence structural potential == Weighted context-free grammar == A weighted context-free grammar (WCFG) is a more general category of context-free grammar, where each production has a numeric weight associated with it. The weight of a specific parse tree in a WCFG is the product (or sum ) of all rule weights in the tree. Each rule weight is included as often as the rule is used in the tree. A special case of WCFGs are PCFGs, where the weights are (logarithms of ) probabilities. An extended version of the CYK algorithm can be used to find the "lightest" (least-weight) derivation of a string given some WCFG. When the tree weight is the product of the rule weights, WCFGs and PCFGs can express the same set of probability distributions. == Applications == === RNA structure prediction === Since the 1990s, PCFG has been applied to model RNA structures. Energy minimization and PCFG provide ways of predicting RNA secondary structure with comparable performance. However structure prediction by PCFGs is scored probabilistically rather than by minimum free energy calculation. PCFG model parameters are directly derived from frequencies of different features observed in databases of RNA structures rather than by experimental determination as is the case with energy minimization methods. The types of various structure that can be modeled by a PCFG include long range interactions, pairwise structure and other nested structures. However, pseudoknots can not be modeled. PCFGs extend CFG by assigning probabilities to each production rule. A maximum probability parse tree from the grammar implies a maximum probability structure. Since RNAs preserve their structures over their primary sequence, RNA structure prediction can be guided by combining evolutionary information from comparative sequence analysis with biophysical knowledge about a structure plausibility based on such probabilities. Also search results for structural homologs using PCFG rules are scored according to PCFG derivations probabilities. Therefore, building grammar to model the behavior of base-pairs and single-stranded regions starts with exploring features of structural multiple sequence alignment of related RNAs. S → a S a | b S b | a a | b b {\displaystyle S\to aSa|bSb|aa|bb} The above grammar generates a string in an outside-in fashion, that is the basepair on the furthest extremes of the terminal is derived first. So a string such as a a b a a b a a {\displaystyle aabaabaa} is derived by first generating the distal a's on both sides before moving inwards: S ⇒ a S a ⇒ a a S a a ⇒ a a b S b a a ⇒ a a b a a b a a {\displaystyle S\Rightarrow aSa\Rightarrow aaSaa\Rightarrow aabSbaa\Rightarrow aabaabaa} A PCFG model extendibility allows constraining structure prediction by incorporating expectations about different features of an RNA . Such expectation may reflect for example the propensity for assuming a certain structure by an RNA. However incorporation of too much information may increase PCFG space and memory complexity and it is desirable that a PCFG-based model be as simple as possible. Every possible string x a grammar generates is assigned a probability weight P ( x | θ ) {\displaystyle P(x|\theta )} given the PCFG model θ {\displaystyle \theta } . It follows that the sum of all probabilities to all possible grammar productions is ∑ x P ( x | θ ) = 1 {\displaystyle \sum _{\text{x}}P(x|\theta )=1} . The scores

    Read more →
  • Katz's back-off model

    Katz's back-off model

    Katz back-off is a generative n-gram language model that estimates the conditional probability of a word given its history in the n-gram. It accomplishes this estimation by backing off through progressively shorter history models under certain conditions. By doing so, the model with the most reliable information about a given history is used to provide the better results. The model was introduced in 1987 by Slava M. Katz. Prior to that, n-gram language models were constructed by training individual models for different n-gram orders using maximum likelihood estimation and then interpolating them together. == Method == The equation for Katz's back-off model is: P b o ( w i ∣ w i − n + 1 ⋯ w i − 1 ) = { d w i − n + 1 ⋯ w i C ( w i − n + 1 ⋯ w i − 1 w i ) C ( w i − n + 1 ⋯ w i − 1 ) if C ( w i − n + 1 ⋯ w i ) > k α w i − n + 1 ⋯ w i − 1 P b o ( w i ∣ w i − n + 2 ⋯ w i − 1 ) otherwise {\displaystyle {\begin{aligned}&P_{bo}(w_{i}\mid w_{i-n+1}\cdots w_{i-1})\\[4pt]={}&{\begin{cases}d_{w_{i-n+1}\cdots w_{i}}{\dfrac {C(w_{i-n+1}\cdots w_{i-1}w_{i})}{C(w_{i-n+1}\cdots w_{i-1})}}&{\text{if }}C(w_{i-n+1}\cdots w_{i})>k\\[10pt]\alpha _{w_{i-n+1}\cdots w_{i-1}}P_{bo}(w_{i}\mid w_{i-n+2}\cdots w_{i-1})&{\text{otherwise}}\end{cases}}\end{aligned}}} where C(x) = number of times x appears in training wi = ith word in the given context Essentially, this means that if the n-gram has been seen more than k times in training, the conditional probability of a word given its history is proportional to the maximum likelihood estimate of that n-gram. Otherwise, the conditional probability is equal to the back-off conditional probability of the (n − 1)-gram. The more difficult part is determining the values for k, d and α. k {\displaystyle k} is the least important of the parameters. It is usually chosen to be 0. However, empirical testing may find better values for k. d {\displaystyle d} is typically the amount of discounting found by Good–Turing estimation. In other words, if Good–Turing estimates C {\displaystyle C} as C ∗ {\displaystyle C^{}} , then d = C ∗ C {\displaystyle d={\frac {C^{}}{C}}} To compute α {\displaystyle \alpha } , it is useful to first define a quantity β, which is the left-over probability mass for the (n − 1)-gram: β w i − n + 1 ⋯ w i − 1 = 1 − ∑ { w i : C ( w i − n + 1 ⋯ w i ) > k } d w i − n + 1 ⋯ w i C ( w i − n + 1 ⋯ w i − 1 w i ) C ( w i − n + 1 ⋯ w i − 1 ) {\displaystyle \beta _{w_{i-n+1}\cdots w_{i-1}}=1-\sum _{\{w_{i}:C(w_{i-n+1}\cdots w_{i})>k\}}d_{w_{i-n+1}\cdots w_{i}}{\frac {C(w_{i-n+1}\cdots w_{i-1}w_{i})}{C(w_{i-n+1}\cdots w_{i-1})}}} Then the back-off weight, α, is computed as follows: α w i − n + 1 ⋯ w i − 1 = β w i − n + 1 ⋯ w i − 1 ∑ { w i : C ( w i − n + 1 ⋯ w i ) ≤ k } P b o ( w i ∣ w i − n + 2 ⋯ w i − 1 ) {\displaystyle \alpha _{w_{i-n+1}\cdots w_{i-1}}={\frac {\beta _{w_{i-n+1}\cdots w_{i-1}}}{\sum _{\{w_{i}:C(w_{i-n+1}\cdots w_{i})\leq k\}}P_{bo}(w_{i}\mid w_{i-n+2}\cdots w_{i-1})}}} The above formula only applies if there is data for the "(n − 1)-gram". If not, the algorithm skips n-1 entirely and uses the Katz estimate for n-2. (and so on until an n-gram with data is found) == Discussion == This model generally works well in practice, but fails in some circumstances. For example, suppose that the bigram "a b" and the unigram "c" are very common, but the trigram "a b c" is never seen. Since "a b" and "c" are very common, it may be significant (that is, not due to chance) that "a b c" is never seen. Perhaps it's not allowed by the rules of the grammar. Instead of assigning a more appropriate value of 0, the method will back off to the bigram and estimate P(c | b), which may be too high.

    Read more →
  • Be My Eyes

    Be My Eyes

    Be My Eyes is a Danish mobile app that aims to help blind and visually impaired people to recognize objects and manage everyday situations. An online community of sighted volunteers receive photos or videos from randomly assigned affected individuals and assist via live chat. In 2023, the company launched Be My AI, an AI-based interface to help blind and visually impaired users describe images. The app is currently available for Android, iOS, and Windows. == History == === Founding and early years === The app was developed and marketed by Hans Jørgen Wiberg. He had demonstrated that although there are video chat software such as Skype and FaceTime, none is tailored for the visually impaired. For development, he joined forces with the Danish Association of the Blind, and other organizations. The app was first presented at an event for start-up companies in 2012 and first released in 2015. A version for Android was released in 2017, in addition to the iOS version. Praise was given for easy use of the app. The lack of sufficient data protection, which makes it possible to pass on data to third parties, was criticized. === Recent developments === The company has raised over $650,000, including funding from Silicon Valley, Microsoft, and other angel investors. In February 2020, $2.8 million in Series A funding was raised, allowing the company to further develop its business model while keeping visual support services free for visually impaired users. The investment allows the company to further develop its unique "purpose and profit" business model while keeping the visual support service free and unlimited for all visually impaired users. === User base and accessibility === Over 9.3 million volunteers and 900,000 blind or visually impaired people use the app. == Features == === Human-based assistance === A visually impaired person starts a live stream showing their view from their cellphone camera. They are assigned, through a phone call or chat, a random volunteer who speaks the same language and who is in the same time zone. This allows the volunteer to describe an object and assist the visually impaired person, such as guiding the person to move their camera, read instructions, or clean up a spill. Through speech synthesis, content can be read out loud. This process encourages a more independent life for blind and visually impaired people. === Be My AI === In March of 2023, Be My Eyes launched Be My AI, an AI-based virtual assistant. Be My AI is accessible through the Be My Eyes app, and is based on OpenAI's GPT-4 large language model. Through the interface, the app allows blind and visually impaired users to send images from a variety of devices to be described. The app allows users to then follow up with questions to further tailor the image description. Blind users report using Be My AI for a variety of tasks, including reading menus, identifying clothing, and describing people. The Be My AI interface is available on Android, iOS, and Windows. Within a few weeks of the interface's roll out, the company reported that it had been used one million times, and it was named among Time's best inventions of 2023. Be My AI is part of a growing number of AI-based apps and devices designed to help blind and visually impaired individuals. == Partnerships == === Microsoft === In November 2023, Be My Eyes entered a partnership with Microsoft to share data to help improve accessibility-focused AI models. === Meta === In 2024, Be My Eyes integrated with Ray-Ban Meta smart glasses, a wearable product developed by Meta and EssilorLuxottica. The partnership enabled users to receive hands-free, real-time visual descriptions and volunteer assistance by using voice commands through the smart glasses. === Hilton === In October 2024, Hilton partnered with Be My Eyes to provide live video assistance for blind and low-vision guests. The free service connects travelers to a Hilton team member that can guide them through tasks like adjusting thermostats, opening window shades, or navigating hotel amenities. This collaboration progressed from a prior arrangement where Hilton helped train Be My Eyes' GPT-4 powered AI model to better recognize objects and layouts in hotel rooms. === Tesco === In October 2025, retailer Tesco announced its partnership with Be My Eyes to launch a six-month pilot aimed at improving in-store accessibility in the UK. The initiative was launched on World Sight Day, 9 October, enabling Be My Eyes users to connect directly with Tesco staff via the app for personalised visual assistance while shopping, Euronewsweek reported. == Awards == Nordic Startup Awards for "Best Social Entrepreneurial Tech Startup" in Denmark 2021 Apple Design Award for best social impact

    Read more →
  • AI Video Generators Reviews: What Actually Works in 2026

    AI Video Generators Reviews: What Actually Works in 2026

    Comparing the best AI video generator? An AI video generator is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI video generator slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Babel Fish (website)

    Babel Fish (website)

    Yahoo! Babel Fish was a free Web-based machine translation service by Yahoo!. In May 2012 it was replaced by Bing Translator (now Microsoft Translator), to which queries were redirected. Although Yahoo! has transitioned its Babel Fish translation services to Bing Translator, it did not sell its translation application to Microsoft outright. As the oldest free online language translator, the service translated text or Web pages in 36 pairs between 13 languages, including English, Simplified Chinese, Traditional Chinese, Dutch, French, German, Greek, Italian, Japanese, Korean, Portuguese, Russian, and Spanish. The internet service derived its name from the Babel fish, a fictional species in Douglas Adams's book and radio series The Hitchhiker's Guide to the Galaxy that could instantly translate languages. In turn, the name of the fictional creature refers to the biblical account of the confusion of languages that arose in the city of Babel. == History == On December 9, 1997, Digital Equipment Corporation (DEC) and SYSTRAN S.A. launched AltaVista Translation Service at babelfish.altavista.com, which was developed by a team of researchers at DEC. In February 2003, AltaVista was bought by Overture Services, Inc. In July 2003, Overture, in turn, was taken over by Yahoo!. The web address for Babel Fish remained at babelfish.altavista.com until May 9, 2008, when the address changed to babelfish.yahoo.com. In 2012, the Web address changed again, this time redirecting babelfish.yahoo.com to www.microsofttranslator.com when Microsoft's Bing Translator replaced Yahoo Babel Fish. As of June 2013, babelfish.yahoo.com no longer redirects to the Microsoft Bing Translator. Instead, it refers directly back to the main Yahoo.com page.

    Read more →