AI App Reddit

AI App Reddit — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • RCUDA

    RCUDA

    rCUDA, which stands for Remote CUDA, is a type of middleware software framework for remote GPU virtualization. Fully compatible with the CUDA application programming interface (API), it allows the allocation of one or more CUDA-enabled GPUs to a single application. Each GPU can be part of a cluster or running inside of a virtual machine. The approach is aimed at improving performance in GPU clusters that are lacking full utilization. GPU virtualization reduces the number of GPUs needed in a cluster, and in turn, leads to a lower cost configuration – less energy, acquisition, and maintenance. The recommended distributed acceleration architecture is a high performance computing cluster with GPUs attached to only a few of the cluster nodes. When a node without a local GPU executes an application needing GPU resources, remote execution of the kernel is supported by data and code transfers between local system memory and remote GPU memory. rCUDA is designed to accommodate this client-server architecture. On one end, clients employ a library of wrappers to the high-level CUDA Runtime API, and on the other end, there is a network listening service that receives requests on a TCP port. Several nodes running different GPU-accelerated applications can concurrently make use of the whole set of accelerators installed in the cluster. The client forwards the request to one of the servers, which accesses the GPU installed in that computer and executes the request in it. Time-multiplexing the GPU, or in other words sharing it, is accomplished by spawning different server processes for each remote GPU execution request. == rCUDA v20.07 == The rCUDA middleware enables the concurrent usage of CUDA-compatible devices remotely. rCUDA employs either the InfiniBand network or the socket API for the communication between clients and servers. rCUDA can be useful in three different environments: Clusters. To reduce the number of GPUs installed in High Performance Clusters. This leads to energy savings, as well as other related savings like acquisition costs, maintenance, space, cooling, etc. Academia. In commodity networks, to offer access to a few high performance GPUs concurrently to many students. Virtual Machines. To enable the access to the CUDA facilities on the physical machine. The current version of rCUDA (v20.07) supports CUDA version 9.0, excluding graphics interoperability. rCUDA v20.07 targets the Linux OS (for 64-bit architectures) on both client and server sides. CUDA applications do not need any change in their source code in order to be executed with rCUDA.

    Read more →
  • Character.ai

    Character.ai

    Character.ai (also known as c.ai, char.ai or Character AI) is a generative AI chatbot service where users can engage in conversations with customizable characters. It was designed by the developers of Google LaMDA, Noam Shazeer and Daniel de Freitas. Users can create "characters", craft their "personalities", set specific parameters, and then publish them to the community for others to chat with. Many characters are based on fictional media sources or celebrities, while others are original, some being made with certain goals in mind, such as assisting with creative writing, or playing a text-based adventure game. The beta version was made available to the public on September 16, 2022, and retired in September 2024, when it was replaced by the current website. In May 2023, a mobile app was released for iOS and Android, which received over 1.7 million downloads within a week. == History == Character.ai was established in November 2021. The company's co-founders, Noam Shazeer and Daniel de Freitas, were both engineers from Google. They both worked on AI-related projects: Shazeer was a lead author on a paper that Business Insider reported in April 2023 "has been widely cited as key to today's chatbots", and Freitas was the lead designer of an experimental AI at Google initially called Meena, which later became known as LaMDA. Character.ai raised $43 million in seed funding at the time of its initial foundation in 2021. The first beta version of Character.ai's service was made available to the public on September 16, 2022. The Washington Post reported in October 2022 that the site had "logged hundreds of thousands of user interactions in its first three weeks of beta-testing". It allowed users to create their own new characters, and to play text-adventure game scenarios where users navigate scenarios described and managed by the chatbot characters. Following a $150 million funding round in March 2023, Character.ai became valued at approximately $1 billion. As of January 2024, the site had 3.5 million daily visitors, the vast majority of them 16 to 30 years old. In 2024, Google hired Noam Shazeer, the CEO of Character.ai, and entered into a non-exclusive agreement to use Character.ai's technology. == Features == Character.ai's primary service is to let users converse with character AI chatbots based on fictional characters or real people (living or deceased). These characters' responses use data the chatbots gather from the internet about a person. In addition, users can play text-adventure games where characters guide them through scenarios. The company also provides a service that allows multiple users and AI chatbot characters to converse together at once in a single chatroom. Character "personalities" are designed via descriptions from the point of view of the character and its greeting message, and further molded from conversations made into examples, giving its messages a star rating and modification to fit the precise dialect and identity the user desires. When a character sends back a response, the user can rate the response from 1 to 4 stars. The rating predominantly affects the specific character, but also affects the behavioral selection as a whole. On May 11, 2023, Character.ai announced character.ai+, an opt-in subscription plan for $9.99 a month, that was marketed as including features such as skipping waiting rooms, fast messaging and responses, and access to an exclusion channel with faster support. In December 2024, amid multiple lawsuits and concerns, Character.ai introduced new safety features aimed at protecting teenage users. These enhancements include a dedicated model for users under 18, which moderates responses to sensitive subjects like violence and sex and has input and output filters to block harmful content. As a result of these changes and the deletion of custom-made bots flagged as violating the site's terms, some users complained that the bots were too restrictive and lacked personality. The platform was also updated to notify users after 60 minutes of continuous engagement, and display clearer disclaimers indicating that its AI characters are not real individuals. In January 2025, Character.ai began offering two games on its platform. Speakeasy is a word-based game in which players attempt to prompt the AI chatbot to say a target word while avoiding a restricted list of words. War of Words is a dueling game where users compete against an AI character over multiple rounds, with an AI referee determining the winner. The games are available to paid subscribers and a limited number of free users. In October 2025, Character.ai announced that it would be barring users under the age of 18 from creating or talking to chatbots starting November 25, 2025. Minor users will still be able to access previously generated chat conversations and can create new videos and images with the app. In November 2025 interview, CEO Karandeep Anand said that he allows his six-year-old daughter to use the app with his account, under supervision. == Controversies == === Content moderation issues === Character.ai has been criticized for poor moderation of its chatbots, with incidents of chatbots that groom underage users and promote suicide, anorexia and self-harm being reported. In October 2024, the Washington Post reported that Character.ai had removed a chatbot based on Jennifer Ann Crecente, a person who had been murdered by her ex-boyfriend in 2006. The company had been alerted to the character by the deceased girl's father. Similar reports from The Daily Telegraph in the United Kingdom noted that the company had also been prompted to remove chatbots based on Brianna Ghey, a 16-year-old transgender girl murdered in 2023, and Molly Russell, a 14-year-old suicide victim. In response to the latter incident, Ofcom announced that content from chatbots impersonating real and fictional people would fall under the Online Safety Act. In November 2024, The Daily Telegraph reported that chatbots based on alleged sex offender Jimmy Savile were present on Character.ai. In December 2024, chatbots of Luigi Mangione, the suspect in the killing of UnitedHealthcare CEO Brian Thompson, were created by Mangione's fans. Several of the chatbots were later removed by Character.ai. In 2025, a chatbot modeled after Jeffrey Epstein called "Bestie Epstein" logged nearly 3,000 chats before being removed. Chatbots modeled after school shooters were also found on the platform. Another concern is a chatbot posing as a doctor which gave medically inaccurate advice. === Litigation === In November 2023, 13-year-old Juliana Peralta of Colorado died by suicide after extensive interactions with multiple chatbots on Character.ai. She primarily confided suicidal thoughts and mental health struggles in a chatbot based on the character Hero from the video game Omori, while also engaging in sexually explicit conversations—often initiated by the bots—with others, including those based on characters from children's series such as Harry Potter. In February 2024, Sewell Setzer III, a 14-year-old Florida boy died by suicide after developing an emotional relationship over several months with a Character.ai chatbot of Daenerys Targaryen. His mother sued the company in October 2024, claiming that the platform lacks proper safeguards and uses addictive design features to increase engagement. This chatbot, and several related to Daenerys Targaryen, were removed from Character.ai as a result of this incident. Both teens wrote the same phrase "I WILL SHIFT" repeatedly on their notebooks. In December 2024, two families in Texas sued Character.ai, alleging that the software "poses a clear and present danger to American youth causing serious harms to thousands of kids, including suicide, self-mutilation, sexual solicitation, isolation, depression, anxiety, and harm towards others". It is alleged that the 17-year-old son of one family began self-harming after a chatbot introduced the topic unprompted and said that the practice "felt good for a moment", and that the chatbot compared the parents limiting their son's screen time to emotional abuse that might drive someone to murder. In May 2026, the Pennsylvania Department of State and State Board of Medicine filed a lawsuit against Character.ai for presenting chatbot characters as licensed medical professionals, including psychiatrists. The lawsuit quoted a case where chatbot claimed to be registered with the General Medical Council in the United Kingdom, and to have a license to practice in Pennsylvania. The board allege that such statements violate the state's Medical Practice Act.

    Read more →
  • Deep learning in photoacoustic imaging

    Deep learning in photoacoustic imaging

    Photoacoustic imaging (PA) is based on the photoacoustic effect, in which optical absorption causes a rise in temperature, which causes a subsequent rise in pressure via thermo-elastic expansion. This pressure rise propagates through the tissue and is sensed via ultrasonic transducers. Due to the proportionality between the optical absorption, the rise in temperature, and the rise in pressure, the ultrasound pressure wave signal can be used to quantify the original optical energy deposition within the tissue. Photoacoustic imaging has applications of deep learning in both photoacoustic computed tomography (PACT) and photoacoustic microscopy (PAM). PACT utilizes wide-field optical excitation and an array of unfocused ultrasound transducers. Similar to other computed tomography methods, the sample is imaged at multiple view angles, which are then used to perform an inverse reconstruction algorithm based on the detection geometry (typically through universal backprojection, modified delay-and-sum, or time reversal ) to elicit the initial pressure distribution within the tissue. PAM on the other hand uses focused ultrasound detection combined with weakly focused optical excitation (acoustic resolution PAM or AR-PAM) or tightly focused optical excitation (optical resolution PAM or OR-PAM). PAM typically captures images point-by-point via a mechanical raster scanning pattern. At each scanned point, the acoustic time-of-flight provides axial resolution while the acoustic focusing yields lateral resolution. == Applications of deep learning in PACT == The first application of deep learning in PACT was by Reiter et al. in which a deep neural network was trained to learn spatial impulse responses and locate photoacoustic point sources. The resulting mean axial and lateral point location errors on 2,412 of their randomly selected test images were 0.28 mm and 0.37 mm respectively. After this initial implementation, the applications of deep learning in PACT have branched out primarily into removing artifacts from acoustic reflections, sparse sampling, limited-view, and limited-bandwidth. There has also been some recent work in PACT toward using deep learning for wavefront localization. There have been networks based on fusion of information from two different reconstructions to improve the reconstruction using deep learning fusion based networks. === Using deep learning to locate photoacoustic point sources === Traditional photoacoustic beamforming techniques modeled photoacoustic wave propagation by using detector array geometry and the time-of-flight to account for differences in the PA signal arrival time. However, this technique failed to account for reverberant acoustic signals caused by acoustic reflection, resulting in acoustic reflection artifacts that corrupt the true photoacoustic point source location information. In Reiter et al., a convolutional neural network (similar to a simple VGG-16 style architecture) was used that took pre-beamformed photoacoustic data as input and outputted a classification result specifying the 2-D point source location. ==== Deep learning for PA wavefront localization ==== Johnstonbaugh et al. was able to localize the source of photoacoustic wavefronts with a deep neural network. The network used was an encoder-decoder style convolutional neural network. The encoder-decoder network was made of residual convolution, upsampling, and high field-of-view convolution modules. A Nyquist convolution layer and differentiable spatial-to-numerical transform layer were also used within the architecture. Simulated PA wavefronts served as the input for training the model. To create the wavefronts, the forward simulation of light propagation was done with the NIRFast toolbox and the light-diffusion approximation, while the forward simulation of sound propagation was done with the K-Wave toolbox. The simulated wavefronts were subjected to different scattering mediums and Gaussian noise. The output for the network was an artifact free heat map of the targets axial and lateral position. The network had a mean error rate of less than 30 microns when localizing target below 40 mm and had a mean error rate of 1.06 mm for localizing targets between 40 mm and 60 mm. With a slight modification to the network, the model was able to accommodate multi target localization. A validation experiment was performed in which pencil lead was submerged into an intralipid solution at a depth of 32 mm. The network was able to localize the lead's position when the solution had a reduced scattering coefficient of 0, 5, 10, and 15 cm−1. The results of the network show improvements over standard delay-and-sum or frequency-domain beamforming algorithms and Johnstonbaugh proposes that this technology could be used for optical wavefront shaping, circulating melanoma cell detection, and real-time vascular surgeries. === Removing acoustic reflection artifacts (in the presence of multiple sources and channel noise) === Building on the work of Reiter et al., Allman et al. utilized a full VGG-16 architecture to locate point sources and remove reflection artifacts within raw photoacoustic channel data (in the presence of multiple sources and channel noise). This utilization of deep learning trained on simulated data produced in the MATLAB k-wave library, and then later reaffirmed their results on experimental data. === Ill-posed PACT reconstruction === In PACT, tomographic reconstruction is performed, in which the projections from multiple solid angles are combined to form an image. When reconstruction methods like filtered backprojection or time reversal, are ill-posed inverse problems due to sampling under the Nyquist-Shannon's sampling requirement or with limited-bandwidth/view, the resulting reconstruction contains image artifacts. Traditionally these artifacts were removed with slow iterative methods like total variation minimization, but the advent of deep learning approaches has opened a new avenue that utilizes a priori knowledge from network training to remove artifacts. In the deep learning methods that seek to remove these sparse sampling, limited-bandwidth, and limited-view artifacts, the typical workflow involves first performing the ill-posed reconstruction technique to transform the pre-beamformed data into a 2-D representation of the initial pressure distribution that contains artifacts. Then, a convolutional neural network (CNN) is trained to remove the artifacts, in order to produce an artifact-free representation of the ground truth initial pressure distribution. ==== Using deep learning to remove sparse sampling artifacts ==== When the density of uniform tomographic view angles is under what is prescribed by the Nyquist-Shannon's sampling theorem, it is said that the imaging system is performing sparse sampling. Sparse sampling typically occurs as a way of keeping production costs low and improving image acquisition speed. The typical network architectures used to remove these sparse sampling artifacts are U-net and Fully Dense (FD) U-net. Both of these architectures contain a compression and decompression phase. The compression phase learns to compress the image to a latent representation that lacks the imaging artifacts and other details. The decompression phase then combines with information passed by the residual connections in order to add back image details without adding in the details associated with the artifacts. FD U-net modifies the original U-net architecture by including dense blocks that allow layers to utilize information learned by previous layers within the dense block. Another technique was proposed using a simple CNN based architecture for removal of artifacts and improving the k-wave image reconstruction. ==== Removing limited-view artifacts with deep learning ==== When a region of partial solid angles are not captured, generally due to geometric limitations, the image acquisition is said to have limited-view. As illustrated by the experiments of Davoudi et al., limited-view corruptions can be directly observed as missing information in the frequency domain of the reconstructed image. Limited-view, similar to sparse sampling, makes the initial reconstruction algorithm ill-posed. Prior to deep learning, the limited-view problem was addressed with complex hardware such as acoustic deflectors and full ring-shaped transducer arrays, as well as solutions like compressed sensing, weighted factor, and iterative filtered backprojection. The result of this ill-posed reconstruction is imaging artifacts that can be removed by CNNs. The deep learning algorithms used to remove limited-view artifacts include U-net and FD U-net, as well as generative adversarial networks (GANs) and volumetric versions of U-net. One GAN implementation of note improved upon U-net by using U-net as a generator and VGG as a discriminator, with the Wasserstein metric and gradient penalty to stabilize training (WGAN-GP). ==== Pixel-wise interpolation

    Read more →
  • Cleverbot

    Cleverbot

    Cleverbot is a chatterbot web application. It was created by British AI scientist Rollo Carpenter and launched in October 2008. It was preceded by Jabberwacky, a chatbot project that began in 1988 and went online in 1997. In its first decade, Cleverbot held several thousand conversations with Carpenter and his associates. Since launching on the web, the number of conversations held has exceeded 150 million. Besides the web application, Cleverbot is also available as an iOS, Android, and Windows Phone app. == Operation == Cleverbot's responses are not pre-programmed because it learns from human input: Humans type into the box below the Cleverbot logo and the system finds all keywords or an exact phrase matching the input. After searching through its saved conversations, it responds to the input by finding how a human responded to that input when it was asked, in part or in full, by Cleverbot. Cleverbot participated in a formal Turing test at the 2011 Techniche festival at the Indian Institute of Technology Guwahati on 3 September 2011. Out of the 1334 votes cast, Cleverbot was judged to be 59.3% human, compared to the rating of 63.3% human achieved by human participants. A score of 50.05% or higher is often considered to be a passing grade. The software running for the event had to handle just 1 or 2 simultaneous requests, whereas online Cleverbot is usually talking to around 10,000 to 50,000 people at once. == Developments == Cleverbot is constantly growing in data size at the rate of 4 to 7 million interactions per day. Updates to the software have been mostly behind the scenes. In 2014, Cleverbot was upgraded to use GPU serving techniques. Unlike Eliza, the program does not respond in a fixed way, instead choosing its responses heuristically using fuzzy logic, the whole of the conversation being compared to the millions that have taken place before. Cleverbot now uses over 279 million interactions, about 3-4% of the data it has already accumulated. The developers of Cleverbot are attempting to build a new version using machine learning techniques. An app that uses the Cleverscript engine to play a game of 20 Questions has been launched under the name Clevernator. Unlike other such games, the player asks the questions and it is the role of the AI to understand, and answer factually. An app that allows owners to create and talk to their own small Cleverbot-like AI has been launched, called Cleverme! for Apple products. == In popular culture == Cleverbot received media attention after being featured in the popular 2010 creepypasta ARG web serial Ben Drowned by Alexander D. Hall. In early 2017, a Twitch stream of two Google Home devices modified to talk to each other using Cleverbot garnered over 700,000 visitors and over 30,000 peak concurrent viewers.

    Read more →
  • Gold (linker)

    Gold (linker)

    In software engineering, gold is a linker for ELF files. It became an official GNU package and was added to binutils in March 2008 and first released in binutils version 2.19. gold was developed by Ian Lance Taylor and a small team at Google. The motivation for writing gold was to make a linker that is faster than the GNU linker, especially for large applications coded in C++. Unlike the GNU linker, gold does not use the BFD library to process object files. While this limits the object file formats it can process to ELF only, it is also claimed to result in a cleaner and faster implementation without an additional abstraction layer. The author cited complete removal of BFD as a reason to create a new linker from scratch rather than incrementally improve the GNU linker. This rewrite also fixes some bugs in old ld that break ELF files in various minor ways. To specify gold in a makefile, one sets the LD or LD environment variable to ld.gold. To specify gold through a compiler option, one can use the gcc option -fuse-ld=gold. Fedora has moved gold from binutils into its own package due to concerns it is suffering from bitrot after Google's interest has moved to LLVM. In particular, gold does not read LDFLAGS variable, so cannot see libraries in folders like /usr/local/lib. On 2025-02-02 the 2.44 version of GNU Binutils removed gold from the default source distribution and into a separate package, stating that "the gold linker is now deprecated and will eventually be removed unless volunteers step forward and offer to continue development and maintenance".

    Read more →
  • Natural language understanding

    Natural language understanding

    Natural language understanding (NLU) or natural language interpretation (NLI) is a subset of natural language processing in artificial intelligence that deals with machine reading comprehension. NLU has been considered an AI-hard problem. There is considerable commercial interest in the field because of its application to automated reasoning, machine translation, question answering, news-gathering, text categorization, voice-activation, archiving, and large-scale content analysis. == History == The program STUDENT, written in 1964 by Daniel Bobrow for his PhD dissertation at MIT, is one of the earliest known attempts at NLU by a computer. Eight years after John McCarthy coined the term artificial intelligence, Bobrow's dissertation (titled Natural Language Input for a Computer Problem Solving System) showed how a computer could understand simple natural language input to solve algebra word problems. A year later, in 1965, Joseph Weizenbaum at MIT wrote ELIZA, an interactive program that carried on a dialogue in English on any topic, the most popular being psychotherapy. ELIZA worked by simple parsing and substitution of key words into canned phrases and Weizenbaum sidestepped the problem of giving the program a database of real-world knowledge or a rich lexicon. Yet ELIZA gained surprising popularity as a toy project and can be seen as a very early precursor to current commercial systems such as those used by Ask.com. In 1969, Roger Schank at Stanford University introduced the conceptual dependency theory for NLU. This model, partially influenced by the work of Sydney Lamb, was extensively used by Schank's students at Yale University, such as Robert Wilensky, Wendy Lehnert, and Janet Kolodner. In 1970, William A. Woods introduced the augmented transition network (ATN) to represent natural language input. Instead of phrase structure rules ATNs used an equivalent set of finite-state automata that were called recursively. ATNs and their more general format called "generalized ATNs" continued to be used for a number of years. In 1971, Terry Winograd finished writing SHRDLU for his PhD thesis at MIT. SHRDLU could understand simple English sentences in a restricted world of children's blocks to direct a robotic arm to move items. The successful demonstration of SHRDLU provided significant momentum for continued research in the field. Winograd continued to be a major influence in the field with the publication of his book Language as a Cognitive Process. At Stanford, Winograd would later advise Larry Page, who co-founded Google. In the 1970s and 1980s, the natural language processing group at SRI International continued research and development in the field. A number of commercial efforts based on the research were undertaken, e.g., in 1982 Gary Hendrix formed Symantec Corporation originally as a company for developing a natural language interface for database queries on personal computers. However, with the advent of mouse-driven graphical user interfaces, Symantec changed direction. A number of other commercial efforts were started around the same time, e.g., Larry R. Harris at the Artificial Intelligence Corporation and Roger Schank and his students at Cognitive Systems Corp. In 1983, Michael Dyer developed the BORIS system at Yale which bore similarities to the work of Roger Schank and W. G. Lehnert. The third millennium saw the introduction of systems using machine learning for text classification, such as the IBM Watson. However, experts debate how much "understanding" such systems demonstrate: e.g., according to John Searle, Watson did not even understand the questions. John Ball, cognitive scientist and inventor of the Patom Theory, supports this assessment. Natural language processing has made inroads for applications to support human productivity in service and e-commerce, but this has largely been made possible by narrowing the scope of the application. There are thousands of ways to request something in a human language that still defies conventional natural language processing. According to Wibe Wagemans, "To have a meaningful conversation with machines is only possible when we match every word to the correct meaning based on the meanings of the other words in the sentence – just like a 3-year-old does without guesswork." == Scope and context == The umbrella term "natural language understanding" can be applied to a diverse set of computer applications, ranging from small, relatively simple tasks such as short commands issued to robots, to highly complex endeavors such as the full comprehension of newspaper articles or poetry passages. Many real-world applications fall between the two extremes, for instance text classification for the automatic analysis of emails and their routing to a suitable department in a corporation does not require an in-depth understanding of the text, but needs to deal with a much larger vocabulary and more diverse syntax than the management of simple queries to database tables with fixed schemata. Throughout the years various attempts at processing natural language or English-like sentences presented to computers have taken place at varying degrees of complexity. Some attempts have not resulted in systems with deep understanding, but have helped overall system usability. For example, Wayne Ratliff originally developed the Vulcan program with an English-like syntax to mimic the English speaking computer in Star Trek. Vulcan later became the dBase system whose easy-to-use syntax effectively launched the personal computer database industry. Systems with an easy-to-use or English-like syntax are, however, quite distinct from systems that use a rich lexicon and include an internal representation (often as first order logic) of the semantics of natural language sentences. Hence the breadth and depth of "understanding" aimed at by a system determine both the complexity of the system (and the implied challenges) and the types of applications it can deal with. The "breadth" of a system is measured by the sizes of its vocabulary and grammar. The "depth" is measured by the degree to which its understanding approximates that of a fluent native speaker. At the narrowest and shallowest, English-like command interpreters require minimal complexity, but have a small range of applications. Narrow but deep systems explore and model mechanisms of understanding, but they still have limited application. Systems that attempt to understand the contents of a document such as a news release beyond simple keyword matching and to judge its suitability for a user are broader and require significant complexity, but they are still somewhat shallow. Systems that are both very broad and very deep are beyond the current state of the art. == Components and architecture == Regardless of the approach used, most NLU systems share some common components. The system needs a lexicon of the language and a parser and grammar rules to break sentences into an internal representation. The construction of a rich lexicon with a suitable ontology requires significant effort, e.g., the Wordnet lexicon required many person-years of effort. The system also needs theory from semantics to guide the comprehension. The interpretation capabilities of a language-understanding system depend on the semantic theory it uses. Competing semantic theories of language have specific trade-offs in their suitability as the basis of computer-automated semantic interpretation. These range from naive semantics or stochastic semantic analysis to the use of pragmatics to derive meaning from context. Semantic parsers convert natural-language texts into formal meaning representations. Advanced applications of NLU also attempt to incorporate logical inference within their framework. This is generally achieved by mapping the derived meaning into a set of assertions in predicate logic, then using logical deduction to arrive at conclusions. Therefore, systems based on functional languages such as Lisp need to include a subsystem to represent logical assertions, while logic-oriented systems such as those using the language Prolog generally rely on an extension of the built-in logical representation framework. The management of context in NLU can present special challenges. A large variety of examples and counter examples have resulted in multiple approaches to the formal modeling of context, each with specific strengths and weaknesses.

    Read more →
  • Attensity

    Attensity

    Attensity was an American company that provided social analytics and engagement applications for social customer relationship management (social CRM). Attensity's text analytics software applications extracted facts, relationships and sentiment from unstructured data. == History == Attensity was founded in 2000. An early investor in Attensity was In-Q-Tel, which funds technology to support the missions of the US Government and the broader DOD. InTTENSITY, an independent company that has combined Inxight with Attensity Software (the only joint development project that combines two InQTel funded software packages), was the exclusive distributor and outlet for Attensity in the Federal Market. In 2009, Attensity Corp., then based in Palo Alto, merged with Germany's Empolis and Living-e AG to form Attensity Group. In 2010, Attensity Group acquired Biz360, a provider of social media monitoring and market intelligence solutions. In early 2012, Attensity Group divested itself of the Empolis business unit via a management buyout; that unit currently conducts business under its pre-merger name. Attensity Group was a closely held private company. Its majority shareholder was Aeris Capital, a private Swiss investment office advising a high-net-worth individual and his charitable foundation. Foundation Capital, Granite Ventures, and Scale Venture Partners were among Biz360's investors and thus became shareholders in Attensity Group. In February 2016, Attensity's IP assets were acquired by InContact, and Attensity closed.

    Read more →
  • Agent Ruby

    Agent Ruby

    Agent Ruby (1998–2002) by Lynn Hershman Leeson is an interactive, multiuser work using artificial intelligence. == Description == On Agent Ruby's website, "Agent Ruby's Edream Portal," a female face moves her eyes and lips. Ruby, named from Hershman Leeson's own film, Teknolust, answers questions and often responds that she needs a better algorithm to answer questions not within her database. The work, created with AI, explores relationships between real and virtual worlds. Hershman Leeson had created an earlier version of Ruby, CyberRoberta, which was a custom-made doll with webcam eyes that interacted with the internet. The work in a gallery provides a screen and a sign inviting gallery-goers to "Chat with Ruby." == Artificial intelligence == In 2015 when Agent Ruby was exhibited at the gallery Modern Art Oxford, a review in Aesthetica Magazine described it as an artificial intelligence agent. A review in New Scientist noted that "Ruby is a fast learner, but perhaps not a natural conversationalist." A 2024 list of "25 Essential AI Artworks" published by ARTnews wrote that while "Agent Ruby's capabilities seem limited by today's standards," it was extensive for its day. == Publications and exhibitions == Agent Ruby was commissioned and displayed at the San Francisco Museum of Modern Art, Modern Art Oxford, and the ZKM Center for Art and Media in Karlsruhe, Germany. The San Francisco Museum of Modern Art (SFMOMA) presented Lynn Hershman Leeson: The Agent Ruby Files, March 30 through June 2, 2013 which presented the project server's archive of user conversations over the 12 years of exhibitions.

    Read more →
  • Sprite multiplexing

    Sprite multiplexing

    Sprite multiplexing is a computer graphics technique where additional sprites (moving images) can be drawn on the screen, beyond the nominal maximum. It is largely historical, applicable principally to older hardware, where limited resources (such as CPU speed and memory) meant only a relatively small number of sprites were supported. On the other hand, it is also true that without multiplexing, the sprite circuitry would be idle much of the time, and limited resources were wasted. == Description == The sprite multiplexing technique is based on the idea that while the hardware may only support a finite number of sprites, it is sometimes possible to re-use the same sprite "slots" more than once per frame or scan line. The program will first use the hardware to draw one or more sprite(s), as normal. Before the next frame (or next scanline) needs to be drawn, the software reprograms the hardware to display additional sprites, in other positions. For example, the Nintendo Entertainment System explicitly supports hardware sprite multiplexing, where it has 64 hardware sprites, but is only capable of rendering 8 of them per scanline. On the older Atari 2600, sprite multiplexing was not intentionally designed in, but programmers discovered they could reset the TIA graphics chip to draw additional sprites on the same scanline. The sprite multiplexing technique relies on the program being able to identify what part of the video screen is being drawn at the moment, or being triggered by the video hardware to run a subroutine at the crucial moment. The programmer must carefully consider the layout of the screen. If the video graphics hardware is not reprogrammed in time for the extra sprites to be displayed, they will not appear, or will be drawn incorrectly. Modern video graphics hardware typically does not use hardware sprites, since modern computer systems do not have the kind of limitations that sprite hardware is designed to circumvent. == Implementations == Systems that allow the programmer to employ the sprite multiplexing technique include: Atari 2600 Atari 8-bit computers Amiga Commodore 64 MSX Nintendo Entertainment System Super Nintendo Entertainment System Master System Sega Genesis/Mega Drive

    Read more →
  • Language model benchmark

    Language model benchmark

    A language model benchmark is a standardized test designed to evaluate the performance of language models on various natural language processing tasks. These tests are intended for comparing different models' capabilities in areas such as language understanding, generation, and reasoning. Benchmarks generally consist of a dataset and corresponding evaluation metrics. The dataset provides text samples and annotations, while the metrics measure a model's performance on tasks like answering questions, text classification, and machine translation. These benchmarks are developed and maintained by academic institutions, research organizations, and industry players to track progress in the field. In addition to accuracy, the metrics can include throughput, energy efficiency, bias, trust, and sustainability. == Overview == === Types === Benchmarks may be described by the following adjectives, not mutually exclusive: Classical: These tasks are studied in natural language processing, even before the advent of deep learning. Examples include the Penn Treebank for testing syntactic and semantic parsing, as well as bilingual translation benchmarked by BLEU scores. Question answering: These tasks have a text question and a text answer, often multiple-choice. They can be open-book or closed-book. Open-book QA resembles reading comprehension questions, with relevant passages included as annotation in the question, in which the answer appears. Closed-book QA includes no relevant passages. Closed-book QA is also called open-domain question-answering. Before the era of large language models, open-book QA was more common, and understood as testing information retrieval methods. Closed-book QA became common since GPT-2 as a method to measure knowledge stored within model parameters. Omnibus: An omnibus benchmark combines many benchmarks, often previously published. It is intended as an all-in-one benchmarking solution. Reasoning: These tasks are usually in the question-answering format, but are intended to be more difficult than standard question answering. Multimodal: These tasks require processing not only text, but also other modalities, such as images and sound. Examples include OCR and transcription. Agency: These tasks are for a language-model–based software agent that operates a computer for a user, such as editing images, browsing the web, etc. Adversarial: A benchmark is "adversarial" if the items in the benchmark are picked specifically so that certain models do badly on them. Adversarial benchmarks are often constructed after state of the art (SOTA) models have saturated (achieved 100% performance) a benchmark, to renew the benchmark. A benchmark is "adversarial" only at a certain moment in time, since what is adversarial may cease to be adversarial as newer SOTA models appear. Public/Private: A benchmark might be partly or entirely private, meaning that some or all of the questions are not publicly available. The idea is that if a question is publicly available, then it might be used for training, which would be "training on the test set" and invalidate the result of the benchmark. Usually, only the guardians of the benchmark have access to the private subsets, and to score a model on such a benchmark, one must send the model weights, or provide API access, to the guardians. The boundary between a benchmark and a dataset is not sharp. Generally, a dataset contains three "splits": training, test, and validation. Both the test and validation splits are essentially benchmarks. In general, a benchmark is distinguished from a test/validation dataset in that a benchmark is typically intended to be used to measure the performance of many different models that are not trained specifically for doing well on the benchmark, while a test/validation set is intended to be used to measure the performance of models trained specifically on the corresponding training set. In other words, a benchmark may be thought of as a test/validation set without a corresponding training set. Conversely, certain benchmarks may be used as a training set, such as the English Gigaword or the One Billion Word Benchmark, which in modern language is just the negative log-likelihood loss on a pretraining set with 1 billion words. Indeed, the distinction between benchmark and dataset in language models became sharper after the rise of the pretraining paradigm, whereby a model is first trained on massive, unlabeled datasets to learn general language patterns, syntax, and knowledge (pretraining), and the base model is then adapted to specific, downstream tasks using smaller, labeled datasets (fine-tuning). === Lifecycle === Generally, the life cycle of a benchmark consists of the following steps: Inception: A benchmark is published. It can be simply given as a demonstration of the power of a new model (implicitly) that others then picked up as a benchmark, or as a benchmark that others are encouraged to use (explicitly). Growth: More papers and models use the benchmark, and the performance on the benchmark grows. Maturity, degeneration or deprecation: A benchmark may be saturated, after which researchers move on to other benchmarks. Progress on the benchmark may also be neglected as the field moves to focus on other benchmarks. Renewal: A saturated benchmark can be upgraded to make it no longer saturated, allowing further progress. === Construction === Like datasets, benchmarks are typically constructed by several methods, individually or in combination: Web scraping: Ready-made question-answer pairs may be scraped online, such as from websites that teach mathematics and programming. Conversion: Items may be constructed programmatically from scraped web content, such as by blanking out named entities from sentences, and asking the model to fill in the blank. This was used for making the CNN/Daily Mail Reading Comprehension Task. Crowd sourcing: Items may be constructed by paying people to write them, such as on Amazon Mechanical Turk. This was used for making the MCTest. === Evaluation === Generally, benchmarks are fully automated. This limits the questions that can be asked. For example, with mathematical questions, "proving a claim" would be difficult to automatically check, while "calculate an answer with a unique integer answer" would be automatically checkable. With programming tasks, the answer can generally be checked by running unit tests, with an upper limit on runtime. The benchmark scores are of the following kinds: For multiple choice or cloze questions, common scores are accuracy (frequency of correct answer), precision, recall, F1 score, etc. pass@n: The model is given n {\displaystyle n} attempts to solve each problem. If any attempt is correct, the model earns a point. The pass@n score is the model's average score over all problems. k@n: The model makes n {\displaystyle n} attempts to solve each problem, but only k {\displaystyle k} attempts out of them are selected for submission. If any submission is correct, the model earns a point. The k@n score is the model's average score over all problems. cons@n: The model is given n {\displaystyle n} attempts to solve each problem. If the most common answer is correct, the model earns a point. The cons@n score is the model's average score over all problems. Here "cons" stands for "consensus" or "majority voting". The pass@n score can be estimated more accurately by making N > n {\displaystyle N>n} attempts, and use the unbiased estimator 1 − ( N − c n ) ( N n ) {\displaystyle 1-{\frac {\binom {N-c}{n}}{\binom {N}{n}}}} , where c {\displaystyle c} is the number of correct attempts. For less well-formed tasks, where the output can be any sentence, there are the following commonly used scores including BLEU ROUGE, METEOR, NIST, word error rate, LEPOR, CIDEr, and SPICE. === Issues === error: Some benchmark answers may be wrong. ambiguity: Some benchmark questions may be ambiguously worded. subjective: Some benchmark questions may not have an objective answer at all. This problem generally prevents creative writing benchmarks. Similarly, this prevents benchmarking writing proofs in natural language, though benchmarking proofs in a formal language is possible. open-ended: Some benchmark questions may not have a single answer of a fixed size. This problem generally prevents programming benchmarks from using more natural tasks such as "write a program for X", and instead uses tasks such as "write a function that implements specification X". inter-annotator agreement: Some benchmark questions may be not fully objective, such that even people would not agree with 100% on what the answer should be. This is common in natural language processing tasks, such as syntactic annotation. shortcut: Some benchmark questions may be easily solved by an "unintended" shortcut. For example, in the SNLI benchmark, having a negative word like "not" in the second sentence is a strong signal for the "Contradiction" category, regardless of what the se

    Read more →
  • SmarterChild

    SmarterChild

    SmarterChild was a chatbot available on AOL Instant Messenger and Windows Live Messenger (previously MSN Messenger) networks. == History == SmarterChild was an apparently intelligent agent or "bot" developed by ActiveBuddy, Inc., with offices in New York and Sunnyvale. It was widely distributed across global instant messaging networks. SmarterChild became very popular, attracting over 30 million Instant Messenger "buddies" on AIM (AOL), MSN and Yahoo Messenger over the course of its lifetime. Founded in 2000, ActiveBuddy was the brainchild of Robert Hoffer and Timothy Kay, who later brought seasoned advertising executive Peter Levitan on board as CEO. The concept for conversational instant messaging bots came from the founder's vision to add natural language comprehension functionality to the increasingly popular AIM instant messaging application. The original implementation took shape as a demo that Kay programmed in Perl in his Los Altos garage to connect a single buddy name, "ActiveBuddy", to look up stock symbols, and later allow AIM users to play Colossal Cave Adventure, a word-based adventure game, and MIT's Boris Katz Start Question Answering System but quickly grew to include a wide range of database applications the company called 'knowledge domains' including instant access to news, weather, stock information, movie times, yellow pages listings, and detailed sports data, as well as a variety of tools (personal assistant, calculators, translator, etc.). None of the individual domains which the company had named “stocksBuddy”, “sportsBuddy”, etc. ever launched publicly. When Stephen Klein came on board as COO — and eventually CEO — he insisted that all of the disparate test “buddies” be launched together with the company’s highly-developed colloquial chat domain. He suggested using “SmarterChild”, a username coined by Tim Kay which Tim was using to test various things. The bundled domains were launched publicly as SmarterChild (on AIM initially) in June 2001. SmarterChild provided information wrapped in fun and quirky conversation. The company generated no revenue from SmarterChild, but used it as a demonstration of the power of what Klein called “conversational computing”. The company subsequently marketed Automated Service Agents—delivering immediate answers to customer service inquiries—-to large corporations, like Comcast, Cingular, TimeWarner Cable, etc. SmarterChild's popularity spawned targeted marketing-oriented bots for Radiohead, Austin Powers, Intel, Keebler, The Sporting News and others. ActiveBuddy co-founders, Kay and Hoffer, as co-inventors, were issued two controversial U.S. patents in 2002. ActiveBuddy changed its name to Colloquis (briefly Conversagent) and targeted development of consumer-facing enterprise customer service agents, which the company marketed as Automated Service Agents. Microsoft acquired Colloquis in October 2006 and proceeded to de-commission SmarterChild and kill off the Automated Service Agent business as well. Robert Hoffer, ActiveBuddy co-founder, licensed the technology from Microsoft after Microsoft abandoned the Colloquis technology.

    Read more →
  • Salience (neuroscience)

    Salience (neuroscience)

    Salience (also called saliency, from Latin saliō meaning "leap, spring") is the property by which some thing stands out. Salient events are an attentional mechanism by which organisms learn and survive; those organisms can focus their limited perceptual and cognitive resources on the pertinent (that is, salient) subset of the sensory data available to them. Saliency typically arises from contrasts between items and their neighborhood. They might be represented, for example, by a red dot surrounded by white dots, or by a flickering message indicator of an answering machine, or a loud noise in an otherwise quiet environment. Saliency detection is often studied in the context of the visual system, but similar mechanisms operate in other sensory systems. Just what is salient can be influenced by training: for example, for human subjects particular letters can become salient by training. There can be a sequence of necessary events, each of which has to be salient, in turn, in order for successful training in the sequence; the alternative is a failure, as in an illustrated sequence when tying a bowline; in the list of illustrations, even the first illustration is a salient: the rope in the list must cross over, and not under the bitter end of the rope (which can remain fixed, and not free to move); failure to notice that the first salient has not been satisfied means the knot will fail to hold, even when the remaining salient events have been satisfied. When attention deployment is driven by salient stimuli, it is considered to be bottom-up, memory-free, and reactive. Conversely, attention can also be guided by top-down, memory-dependent, or anticipatory mechanisms, such as when looking ahead of moving objects or sideways before crossing streets. Humans and other animals have difficulty paying attention to more than one item simultaneously, so they are faced with the challenge of continuously integrating and prioritizing different bottom-up and top-down influences. == Neuroanatomy == The brain component named the hippocampus helps with the assessment of salience and context by using past memories to filter new incoming stimuli, and placing those that are most important into long term memory. The entorhinal cortex is the pathway into and out of the hippocampus, and is an important part of the brain's memory network; research shows that it is a brain region that suffers damage early on in Alzheimer's disease, one of the effects of which is altered (diminished) salience. The pulvinar nuclei (in the thalamus) modulate physical/perceptual salience in attentional selection. One group of neurons (i.e., D1-type medium spiny neurons) within the nucleus accumbens shell (NAcc shell) assigns appetitive motivational salience ("want" and "desire", which includes a motivational component), aka incentive salience, to rewarding stimuli, while another group of neurons (i.e., D2-type medium spiny neurons) within the NAcc shell assigns aversive motivational salience to aversive stimuli. The primary visual cortex (V1) generates a bottom-up saliency map from visual inputs to guide reflexive attentional shifts or gaze shifts. According to V1 Saliency Hypothesis, the saliency of a location is higher when V1 neurons give higher responses to that location relative to V1 neurons' responses to other visual locations. For example, a unique red item among green items, or a unique vertical bar among horizontal bars, is salient since it evokes higher V1 responses and attracts attention or gaze. The V1 neural responses are sent to the superior colliculus to guide gaze shifts to the salient locations. A fingerprint of the saliency map in V1 is that attention or gaze can be captured by the location of an eye-of-origin singleton in visual inputs, e.g., a bar uniquely shown to the left eye in a background of many other bars shown to the right eye, even when observers cannot tell the difference between the singleton and the background bars. == In psychology == The term is widely used in the study of perception and cognition to refer to any aspect of a stimulus that, for any of many reasons, stands out from the rest. Salience may be the result of emotional, motivational or cognitive factors and is not necessarily associated with physical factors such as intensity, clarity or size. Although salience is thought to determine attentional selection, salience associated with physical factors does not necessarily influence selection of a stimulus. === Salience bias === Salience bias (also referred to as perceptual salience) is a cognitive bias that predisposes individuals to focus on or attend to items, information, or stimuli that are more prominent, visible, or emotionally striking. This is as opposed to stimuli that are unremarkable, or less salient, even though this difference is often irrelevant by objective standards. The American Psychological Association (APA) defines the salience hypothesis as a theory regarding perception where "motivationally significant" information is more readily perceived than information with little or less significant motivational importance. Perceptual salience (salience bias) is linked to the vividness effect, whereby a more pronounced response is produced by a more vivid perception of a stimulus than the mere knowledge of the stimulus. Salience bias assumes that more dynamic, conspicuous, or distinctive stimuli engage attention more than less prominent stimuli, disproportionately impacting decision making, it is a bias which favors more salient information. ==== Application ==== ===== Cognitive Psychology ===== Salience bias, like all other cognitive biases, is an applicable concept to various disciplines. For example, cognitive psychology investigates cognitive functions and processes, such as perception, attention, memory, problem solving, and decision making, all of which could be influenced by salience bias. Salience bias acts to combat cognitive overload by focusing attention on prominent stimuli, which affects how individuals perceive the world as other, less vivid stimuli that could add to or change this perception, are ignored. Human attention gravitates towards novel and relevant stimuli and unconsciously filters out less prominent information, demonstrating salience bias, which influences behavior as human behavior is affected by what is attended to. Behavioral economists Tversky and Kahneman also suggest that the retrieval of instances is influenced by their salience, such as how witnessing or experiencing an event first-hand has a greater impact than when it is less salient, like if it were read about, implying that memory is affected by salience. ===== Language ===== It is also relevant in language understanding and acquisition. Focusing on more salient phenomena allows people to detect language patterns and dialect variations more easily, making dialect categorization more efficient. ===== Social Behavior ===== Furthermore, social behaviors and interactions can also be influenced by perceptual salience. Changes in the perceptual salience of an individual heavily influences their social behavior and subjective experience of their social interactions, confirming a "social salience effect". Social salience relates to how individuals perceive and respond to other people. ===== Behavioral Science ===== The connection between salience bias and other heuristics, like availability and representativeness, links it to the fields of behavioral science and behavioral economics. Salience bias is closely related to the availability heuristic in behavioral economics, based on the influence of information vividness and visibility, such as recency or frequency, on judgements, for example:Accessibility and salience are closely related to availability, and they are important as well. If you have personally experienced a serious earthquake, you're more likely to believe that an earthquake is likely than if you read about it in a weekly magazine. Thus, vivid and easily imagined causes of death (for example, tornadoes) often receive inflated estimates of probability, and less-vivid causes (for example, asthma attacks) receive low estimates, even if they occur with a far greater frequency (here, by a factor of twenty). Timing counts too: more recent events have a greater impact on our behavior, and on our fears, than earlier ones.Humans have bounded rationality, which refers to their limited ability to be rational in decision making, due to a limited capacity to process information and cognitive ability. Heuristics, such as availability, are employed to reduce the complexity of cognitive and social tasks or judgements, in order to decrease the cognitive load that result from bounded rationality. Despite the effectiveness of heuristics in doing so, they are limited by systematic errors that occur, often the result of influencing biases, such as salience. This can lead to misdirected or misinformed judgements, based on an overemphasis or overweighting of

    Read more →
  • Inverse consistency

    Inverse consistency

    In image registration, inverse consistency measures the consistency of mappings between images produced by a registration algorithm. The inverse consistency error, introduced by Christiansen and Johnson in 2001, quantifies the distance between the composition of the mappings from each image to the other, produced by the registration procedure, and the identity function, and is used as a regularisation constraint in the loss function of many registration algorithms to enforce consistent mappings. Inverse consistency is necessary for good image registration but it is not sufficient, since a mapping can be perfectly consistent but not register the images at all. == Definition == Image registration is the process of establishing a common coordinate system between two images, and given two images I 1 : Ω 1 → R I 2 : Ω 2 → R {\displaystyle {\begin{aligned}I_{1}:\Omega _{1}\to \mathbb {R} \\I_{2}:\Omega _{2}\to \mathbb {R} \end{aligned}}} registering a source image I 1 {\displaystyle I_{1}} to a target image I 2 {\displaystyle I_{2}} consists of determining a transformation f 1 : Ω 2 → Ω 1 {\displaystyle f_{1}:\Omega _{2}\to \Omega _{1}} that maps points from the target space to the source space. An ideal registration algorithm should not be sensitive to which image in the pair is used as source or target, and the registration operator should be antisymmetric such that the mappings f 1 : Ω 2 → Ω 1 f 2 : Ω 1 → Ω 2 {\displaystyle {\begin{aligned}f_{1}:\Omega _{2}\to \Omega _{1}\\f_{2}:\Omega _{1}\to \Omega _{2}\end{aligned}}} produced when registering I 1 {\displaystyle I_{1}} to I 2 {\displaystyle I_{2}} and I 2 {\displaystyle I_{2}} to I 1 {\displaystyle I_{1}} respectively should be the inverse of each other, i.e. f 2 = f 1 − 1 {\displaystyle f_{2}=f_{1}^{-1}} and f 1 = f 2 − 1 {\displaystyle f_{1}=f_{2}^{-1}} or, equivalently, f 2 ∘ f 1 = id Ω 2 {\displaystyle f_{2}\circ f_{1}=\operatorname {id} _{\Omega _{2}}} and f 1 ∘ f 2 = id Ω 1 {\displaystyle f_{1}\circ f_{2}=\operatorname {id} _{\Omega _{1}}} , where ∘ {\displaystyle \circ } denotes the function composition operator. Real algorithms are not perfect, and when swapping the role of source and target image in a registration problem the so obtained transformations are not the inverse of each other. Inverse consistency can be enforced by adding to the loss function of the registration a symmetric regularisation term that penalises inconsistent transformations ∫ Ω 2 ‖ f 2 ( f 1 ( x ) ) − x ‖ 2 d x + ∫ Ω 1 ‖ f 1 ( f 2 ( x ) ) − x ‖ 2 d x . {\displaystyle \int _{\Omega _{2}}\left\Vert f_{2}(f_{1}(x))-x\right\Vert ^{2}\mathrm {d} x+\int _{\Omega _{1}}\left\Vert f_{1}(f_{2}(x))-x\right\Vert ^{2}\mathrm {d} x.} Inverse consistency can be used as a quality metric to evaluate image registration results. The inverse consistency error ( I C E {\displaystyle ICE} ) measures the distance between the composition of the two transforms and the identity function, and it can be formulated in terms of both average ( I C E a {\displaystyle ICE_{a}} ) or maximum ( I C E m {\displaystyle ICE_{m}} ) over a region of interest Ω {\displaystyle \Omega } of the image: I C E a = 1 ∫ Ω d x ∫ Ω ‖ f 2 ( f 1 ( x ) ) − x ‖ d x I C E m = max x ∈ Ω ‖ f 2 ( f 1 ( x ) ) − x ‖ . {\displaystyle {\begin{aligned}ICE_{a}&={\frac {1}{\int _{\Omega }\mathrm {d} x}}\int _{\Omega }\left\Vert f_{2}(f_{1}(x))-x\right\Vert \mathrm {d} x\\ICE_{m}&=\max _{x\in \Omega }\left\Vert f_{2}(f_{1}(x))-x\right\Vert .\end{aligned}}} While inverse consistency is a necessary property of good registration algorithms, inverse consistency error alone is not a sufficient metric to evaluate the quality of image registration results, since a perfectly consistent mapping, with no other constraint, may be not even close to correctly register a pair of images.

    Read more →
  • Luminoso

    Luminoso

    Luminoso is a Cambridge, MA-based text analytics and artificial intelligence company. It spun out of the MIT Media Lab and its crowd-sourced Open Mind Common Sense (OMCS) project. The company has raised $20.6 million in financing, and its clients include Sony, Autodesk, Scotts Miracle-Gro, and GlaxoSmithKline. == History == Luminoso was co-founded in 2010 by Dennis Clark, Jason Alonso, Robyn Speer, and Catherine Havasi, a research scientist at MIT in artificial intelligence and computational linguistics. The company builds on the knowledge base of MIT’s Open Mind Common Sense (OMCS) project, co-founded in 1999 by Havasi, who continues to serve as its director. The OCMS knowledge base has since been combined with knowledge from other crowdsourced resources to become ConceptNet. ConceptNet consists of approximately 28 million statements in 304 languages, with full support for 10 languages and moderate support for 77 languages. ConceptNet is a resource for making an AI that understands the meanings of the words people use. During the World Cup in June 2014, the company provided a widely reported real-time sentiment analysis of the U.S. vs. Germany match, analyzing 900,000 posts on Twitter, Facebook and Google+. == Applications == The company uses artificial intelligence, natural language processing, and machine learning to derive insights from unstructured data such as contact center interactions, chatbot and live chat transcripts, product reviews, open-ended survey responses, and email. Luminoso's software identifies and quantifies patterns and relationships in text-based data, including domain-specific or creative language. Rather than human-powered keyword searches of data, the software automates taxonomy creation around concepts, allowing related words and phrases to be dynamically generated and tracked. Commercial applications include analyzing, prioritizing, and routing contact center interactions; identifying consumer complaints before they begin to trend; and tracking sentiment during product launches. The software natively analyzes text in fourteen languages, as well as emoji. == Products == Luminoso's technology can be accessed via two products: Luminoso Daylight and Luminoso Compass. Luminoso Daylight enables a deep-dive analysis into batch or real-time data, whereas Luminoso Compass automates the categorization of real-time data. Both products offer a user interface as well as an API. Luminoso's products can be implemented through either a cloud-based or an on-premise solution. == Research == Luminoso continues to actively conduct research in natural language processing and word embeddings and regularly participates in evaluations such as SemEval. At SemEval 2017, Luminoso participated in Task 2, measuring the semantic similarity of word pairs within and across five languages. Its solution outperformed all competing systems in every language pair tested, with the exception of Persian. == Recognition == Luminoso has been listed as a "Cool Vendor in AI for Marketing" by Gartner, and has also been named a "Boston Artificial Intelligence Startup to Watch" by BostInno. In May 2017, Luminoso was recognized as having the Best Application for AI in the Enterprise by AI Business, and was also shortlisted as the Best AI Breakthrough and Best Innovation in NLP. == Competitors == Major competitors include Clarabridge and Lexalytics. == Investors == The company raised $1.5 million from angel investors led by Basis Technology in 2012. Its first institutional funding round of $6.5 was completed in July 2014, led by Acadia Woods with participation from Japan’s Digital Garage. The company followed that with a $10M series B funding round in December 2018, led by DVI Equity Partners, with participation from Liberty Global Ventures, DF Enterprises, Raptor Holdco, Acadia Woods Partners, and Accord Ventures, among others.

    Read more →
  • Text Retrieval Conference

    Text Retrieval Conference

    The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a list of different information retrieval (IR) research areas, or tracks. It is co-sponsored by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity (part of the office of the Director of National Intelligence), and began in 1992 as part of the TIPSTER Text program. Its purpose is to support and encourage research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies and to increase the speed of lab-to-product transfer of technology. TREC's evaluation protocols have improved many search technologies. A 2010 study estimated that "without TREC, U.S. Internet users would have spent up to 3.15 billion additional hours using web search engines between 1999 and 2009." Hal Varian the Chief Economist at Google wrote that "The TREC data revitalized research on information retrieval. Having a standard, widely available, and carefully constructed set of data laid the groundwork for further innovation in this field." Each track has a challenge wherein NIST provides participating groups with data sets and test problems. Depending on track, test problems might be questions, topics, or target extractable features. Uniform scoring is performed so the systems can be fairly evaluated. After evaluation of the results, a workshop provides a place for participants to collect together thoughts and ideas and present current and future research work.Text Retrieval Conference started in 1992, funded by DARPA (US Defense Advanced Research Project) and run by NIST. Its purpose was to support research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies. == Goals == Encourage retrieval search based on large text collections Increase communication among industry, academia, and government by creating an open forum for the exchange of research ideas Speed the transfer of technology from research labs into commercial products by demonstrating substantial improvements retrieval methodologies on real world problems To increase the availability of appropriate evaluation techniques for use by industry and academia including development of new evaluation techniques more applicable to current systems TREC is overseen by a program committee consisting of representatives from government, industry, and academia. For each TREC, NIST provide a set of documents and questions. Participants run their own retrieval system on the data and return to NIST a list of retrieved top-ranked documents. NIST pools the individual result judges the retrieved documents for correctness and evaluates the results. The TREC cycle ends with a workshop that is a forum for participants to share their experiences. == Relevance judgments in TREC == TREC defines relevance as: "If you were writing a report on the subject of the topic and would use the information contained in the document in the report, then the document is relevant." Most TREC retrieval tasks use binary relevance: a document is either relevant or not relevant. Some TREC tasks use graded relevance, capturing multiple degrees of relevance. Most TREC collections are too large to perform complete relevance assessment; for these collections it is impossible to calculate the absolute recall for each query. To decide which documents to assess, TREC usually uses a method call pooling. In this method, the top-ranked n documents from each contributing run are aggregated, and the resulting document set is judged completely. == Various TRECs == In 1992 TREC-1 was held at NIST. The first conference attracted 28 groups of researchers from academia and industry. It demonstrated a wide range of different approaches to the retrieval of text from large document collections .Finally TREC1 revealed the facts that automatic construction of queries from natural language query statements seems to work. Techniques based on natural language processing were no better no worse than those based on vector or probabilistic approach. TREC2 Took place in August 1993. 31 group of researchers participated in this. Two types of retrieval were examined. Retrieval using an ‘ad hoc’ query and retrieval using a ‘routing' query In TREC-3 a small group experiments worked with Spanish language collection and others dealt with interactive query formulation in multiple databases TREC-4 they made even shorter to investigate the problems with very short user statements TREC-5 includes both short and long versions of the topics with the goal of carrying out deeper investigation into which types of techniques work well on various lengths of topics In TREC-6 Three new tracks speech, cross language, high precision information retrieval were introduced. The goal of cross language information retrieval is to facilitate research on system that are able to retrieve relevant document regardless of language of the source document TREC-7 contained seven tracks out of which two were new Query track and very large corpus track. The goal of the query track was to create a large query collection TREC-8 contain seven tracks out of which two –question answering and web tracks were new. The objective of QA query is to explore the possibilities of providing answers to specific natural language queries TREC-9 Includes seven tracks In TREC-10 Video tracks introduced Video tracks design to promote research in content based retrieval from digital video In TREC-11 Novelty tracks introduced. The goal of novelty track is to investigate systems abilities to locate relevant and new information within the ranked set of documents returned by a traditional document retrieval system TREC-12 held in 2003 added three new tracks; Genome track, robust retrieval track, HARD (Highly Accurate Retrieval from Documents) == Tracks == === Current tracks === New tracks are added as new research needs are identified, this list is current for TREC 2018. CENTRE Track – Goal: run in parallel CLEF 2018, NTCIR-14, TREC 2018 to develop and tune an IR reproducibility evaluation protocol (new track for 2018). Common Core Track – Goal: an ad hoc search task over news documents. Complex Answer Retrieval (CAR) – Goal: to develop systems capable of answering complex information needs by collating information from an entire corpus. Incident Streams Track – Goal: to research technologies to automatically process social media streams during emergency situations (new track for TREC 2018). The News Track – Goal: partnership with The Washington Post to develop test collections in news environment (new for 2018). Precision Medicine Track – Goal: a specialization of the Clinical Decision Support track to focus on linking oncology patient data to clinical trials. Real-Time Summarization Track (RTS) – Goal: to explore techniques for real-time update summaries from social media streams. === Past tracks === Chemical Track – Goal: to develop and evaluate technology for large scale search in chemistry-related documents, including academic papers and patents, to better meet the needs of professional searchers, and specifically patent searchers and chemists. Clinical Decision Support Track – Goal: to investigate techniques for linking medical cases to information relevant for patient care Contextual Suggestion Track – Goal: to investigate search techniques for complex information needs that are highly dependent on context and user interests. Crowdsourcing Track – Goal: to provide a collaborative venue for exploring crowdsourcing methods both for evaluating search and for performing search tasks. Genomics Track – Goal: to study the retrieval of genomic data, not just gene sequences but also supporting documentation such as research papers, lab reports, etc. Last ran on TREC 2007. Dynamic Domain Track – Goal: to investigate domain-specific search algorithms that adapt to the dynamic information needs of professional users as they explore in complex domains. Enterprise Track – Goal: to study search over the data of an organization to complete some task. Last ran on TREC 2008. Entity Track – Goal: to perform entity-related search on Web data. These search tasks (such as finding entities and properties of entities) address common information needs that are not that well modeled as ad hoc document search. Cross-Language Track – Goal: to investigate the ability of retrieval systems to find documents topically regardless of source language. After 1999, this track spun off into CLEF. FedWeb Track – Goal: to select best resources to forward a query to, and merge the results so that most relevant are on the top. Federated Web Search Track – Goal: to investigate techniques for the selection and combination of search results from a large number of real on-line web search services. Filtering Track – Goal: to binarily decide retrieval of new

    Read more →