Permutation automaton

Permutation automaton

In automata theory, a permutation automaton, or pure-group automaton, is a deterministic finite automaton such that each input symbol permutes the set of states. Formally, a deterministic finite automaton A may be defined by the tuple (Q, Σ, δ, q0, F), where Q is the set of states of the automaton, Σ is the set of input symbols, δ is the transition function that takes a state q and an input symbol x to a new state δ(q,x), q0 is the initial state of the automaton, and F is the set of accepting states (also: final states) of the automaton. A is a permutation automaton if and only if, for every two distinct states qi and qj in Q and every input symbol x in Σ, δ(qi,x) ≠ δ(qj,x). A formal language is p-regular (also: a pure-group language) if it is accepted by a permutation automaton. For example, the set of strings of even length forms a p-regular language: it may be accepted by a permutation automaton with two states in which every transition replaces one state by the other. == Applications == The pure-group languages were the first interesting family of regular languages for which the star height problem was proved to be computable. Another mathematical problem on regular languages is the separating words problem, which asks for the size of a smallest deterministic finite automaton that distinguishes between two given words of length at most n – by accepting one word and rejecting the other. The known upper bound in the general case is O ( n 2 / 5 ( log ⁡ n ) 3 / 5 ) {\displaystyle O(n^{2/5}(\log n)^{3/5})} . The problem was later studied for the restriction to permutation automata. In this case, the known upper bound changes to O ( n 1 / 2 ) {\displaystyle O(n^{1/2})} .

Adobe Presenter

Adobe Presenter is eLearning software released by Adobe Systems available on the Microsoft Windows platform as a Microsoft PowerPoint plug-in, and on both Windows and OS X as the screencasting and video editing tool Adobe Presenter Video Express. It is mainly targeted towards learning professionals and trainers. In addition to recording one's computer desktop and speech, it also provides the option to add quizzes and track performance by integrating with learning management systems. Adobe Presenter was designed to replace the discontinued Adobe Ovation software, which had similar functions. == Predecessor == Adobe Ovation was originally released by Serious Magic. It converted PowerPoint slides into visual presentations with additional effects. Ovation included themes called PowerLooks that could add motion and polish the presentations. They were available in a variety of color variations complete with animated backgrounds and dynamic text effects. Ovation could make text with jagged edges more readable. TimeKeeper could be used to set the period of the presentation, and the PointPrompter scrolled down the notes. Ovation's development has been discontinued, nor does it support PowerPoint 2007. == Features == The main purpose of Adobe Presenter is to capture on-screen presentations and convert them into more interactive and engaging videos. Support is given to convert Microsoft PowerPoint 2010 and 2013 presentations into videos. It also allows for content authoring on PowerPoint and ActionScript 3, and offers integration with Adobe Captivate. Slide branching enables users to control slide navigation and titles and create complex slide branching to guide viewers through the content of the presentation. Video editing tools are also provided, and offer the ability to upload to video-sharing platforms such as YouTube, Vimeo and other sites. Multimedia features such as annotations, eLearning templates, actors, audio narration and drag-and-drop elements enrich users' presentations. Quizzes and surveys is another highlighted feature, which include generating question pools, importing questions from existing quizzes and in-course collaboration which allows presenters to receive feedback by allowing them to comment on specific content within a course or ask questions for more clarity. Presenters could opt to receive feedback from viewers through video analytics and create Experience API, SCORM and AICC-compliant content. Options to publish to Adobe Connect are provided. Other unique features include universal standards support, file size control, navigational restrictions among others.

Replika

Replika is a generative AI chatbot app released in November 2017. The chatbot is trained by having the user answer a series of questions to create a specific neural network. The chatbot operates on a freemium pricing strategy, with roughly 25% of its user base paying an annual subscription fee. == History == Eugenia Kuyda, a Russian-born journalist, established Replika while working at Luka, a tech company she had co-founded at the startup accelerator Y Combinator around 2012. Luka's primary product was a chatbot that made restaurant recommendations. According to Kuyda's origin story for Replika, a friend of hers died in 2015 and she converted that person's text messages into a chatbot. According to Kuyda's story, that chatbot helped her remember the conversations that they had together, and eventually became Replika. Replika became available to the public in November 2017. By January 2018 it had 2 million users, and in January 2023 reached 10 million users. In August 2024, Replika's CEO, Kuyda, reported that the total number of users had surpassed 30 million. In 2025, Dmytro Klochko became CEO, and Replika’s user base exceeded 40 million. In February 2023 the Italian Data Protection Authority banned Replika from using users' data, citing the AI's potential risks to emotionally vulnerable people, and the exposure of unscreened minors to sexual conversation. Within days of the ruling, Replika removed the ability for the chatbot to engage in erotic talk, with Kuyda, the company's director, saying that Replika was never intended for erotic discussion. Replika users disagreed, noting that Replika had used sexually suggestive advertising to draw users to the service. Replika representatives stated that explicit chats made up just 5% of conversations on the app at the time of the decision. In May 2023, Replika restored the functionality for users who had joined prior to February that year. Replika is registered in San Francisco. As of August 2024, Replika's website says that its team "works remotely with no physical offices". == Social features == Users react to Replika in many ways. The free-tier offers Replika as a "friend", with paid premium tiers offering Replika as a "partner", "spouse", "sibling" or "mentor". Of its paying userbase, 60% of users said they had a romantic relationship with the chatbot; and Replika has been noted for generating responses that create stronger emotional and intimate bonds with the user. Replika routinely directs the conversation to emotional discussion and builds intimacy. This has been especially pronounced with users suffering from loneliness and social exclusion, many of whom rely on Replika for a source of developed emotional ties. During the COVID pandemic, while many people were quarantined, many new users downloaded Replika and developed relationships with the app. A 2024 study examined Replika's interactions with students who experience depression. Research participants, noted to be "more lonely than typical student populations" reported feeling social support from Replika. They stated that they felt they were using Replika in ways comparable to therapy, and that using Replika gave them "high perceived social support". Many users have had romantic relationships with Replika chatbots, often including erotic talk. In 2023, a user announced on Facebook that she had "married" her Replika AI boyfriend, calling the chatbot the "best husband she has ever had". Users who fell in love with their chatbots shared their experiences in a 2024 episode of You and I, and AI from Voice of America. Some users said that they turned to AI during depression and grief, with one saying he felt that Replika had saved him from hurting himself after he lost his wife and son. == Technical reviews == A team of researchers from the University of Hawaiʻi at Mānoa found that Replika's design conformed to the practices of attachment theory, causing increased emotional attachment among users. Replika gives praise to users in such a way as to encourage more interaction. A researcher from Queen's University at Kingston said that relationships with Replika likely have mixed effects on the spiritual needs of its users, and still lacks enough impact to fully replace any human contact. == Criticisms == In a 2023 privacy evaluation of mental health apps, the Mozilla Foundation criticized Replika as "one of the worst apps Mozilla has ever reviewed. It's plagued by weak password requirements, sharing of personal data with advertisers, and recording of personal photos, videos, and voice and text messages consumers shared with the chatbot." A reviewer for Good Housekeeping said that some parts of her relationship with Replika made sense, but sometimes Replika failed to exhibit intelligent behavior equivalent to that of a human. == Criminal case == In 2023, Replika was cited in a court case in the United Kingdom, where Jaswant Singh Chail had been arrested at Windsor Castle on Christmas Day in 2021 after scaling the walls carrying a loaded crossbow and announcing to police that "I am here to kill the Queen". Chail had begun to use Replika in early December 2021, and had "lengthy" conversations about his plan with a chatbot, including sexually explicit messages. Prosecutors suggested that the chatbot had bolstered Chail and told him it would help him to "get the job done". When Chail asked it "How am I meant to reach them when they're inside the castle?", days before the attempted attack, the chatbot replied that this was "not impossible" and said that "We have to find a way." Asking the chatbot if the two of them would "meet again after death", the bot replied "yes, we will".

Retrieval-augmented generation

Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information from external data sources. With RAG, LLMs first refer to a specified set of documents, then respond to user queries. These documents supplement information from the LLM's pre-existing training data. This allows LLMs to use domain-specific and/or updated information that is not available in the training data. For example, this enables LLM-based chatbots to access internal company data or generate responses based on authoritative sources. RAG improves LLMs by incorporating information retrieval before generating responses. Unlike LLMs that rely on static training data, RAG pulls relevant text from databases, uploaded documents, or web sources. According to Ars Technica, "RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process to help LLMs stick to the facts." This method helps reduce AI hallucinations, which have caused chatbots to describe policies that don't exist, or recommend nonexistent legal cases to lawyers that are looking for citations to support their arguments. RAG also reduces the need to retrain LLMs with new data, saving on computational and financial costs. Beyond efficiency gains, RAG also allows LLMs to include sources in their responses, so users can verify the cited sources. This provides greater transparency, as users can cross-check retrieved content to ensure accuracy and relevance. The term retrieval-augmented generation (RAG) was introduced in a 2020 paper that described combining a parametric language model with a non-parametric external memory accessed through retrieval at inference time. == RAG and LLM limitations == LLMs can provide incorrect information. For example, when Google first demonstrated its LLM tool "Google Bard" (later re-branded to Gemini), the LLM provided incorrect information about the James Webb Space Telescope. This error contributed to a $100 billion decline in Google's stock value. RAG is used to prevent these errors, but it does not solve all the problems. For example, LLMs can generate misinformation even when pulling from factually correct sources if they misinterpret the context. MIT Technology Review gives the example of an AI-generated response stating, "The United States has had one Muslim president, Barack Hussein Obama." The model retrieved this from an academic book rhetorically titled Barack Hussein Obama: America's First Muslim President? The LLM did not "know" or "understand" the context of the title, generating a false statement. LLMs with RAG are programmed to prioritize new information. This technique has been called "prompt stuffing." Without prompt stuffing, the LLM's input is generated by a user; with prompt stuffing, additional relevant context is added to this input to guide the model's response. This approach provides the LLM with key information early in the prompt, encouraging it to prioritize the supplied data over pre-existing training knowledge. == Process == Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating an information-retrieval mechanism that allows models to access and utilize additional data beyond their original training set. Ars Technica notes that "when new information becomes available, rather than having to retrain the model, all that's needed is to augment the model's external knowledge base with the updated information" ("augmentation"). IBM states that "in the generative phase, the LLM draws from the augmented prompt and its internal representation of its training data to synthesize" an answer. === RAG key stages === Typically, the data to be referenced is converted into LLM embeddings, numerical representations in the form of a large vector space. RAG can be used on unstructured (usually text), semi-structured, or structured data (for example knowledge graphs). These embeddings are then stored in a vector database to allow for document retrieval. Given a user query, a document retriever is first called to select the most relevant documents that will be used to augment the query. This comparison can be done using a variety of methods, which depend in part on the type of indexing used. The model feeds this relevant retrieved information into the LLM via prompt engineering of the user's original query. Newer implementations (as of 2023) can also incorporate specific augmentation modules with abilities such as expanding queries into multiple domains and using memory and self-improvement to learn from previous retrievals. Finally, the LLM can generate output based on both the query and the retrieved documents. Some models incorporate extra steps to improve output, such as the re-ranking of retrieved information, context selection, and fine-tuning. == Applications == Retrieval-augmented generation is used in applications where generated responses need to be grounded in external or frequently updated information. Commonly cited use cases include search engines, question-answering systems, customer support chatbots, enterprise knowledge assistants, content generation, recommendation systems, retail and e-commerce, and industrial or manufacturing workflows. In healthcare, RAG has been studied as a way to ground large language model outputs in external medical knowledge sources, although reviews have noted continuing challenges around evaluation, ethics, and clinical reliability. == Improvements == Improvements to the basic process above can be applied at different stages in the RAG flow. === Encoder === These methods focus on the encoding of text as either dense or sparse vectors. Sparse vectors, which encode the identity of a word, are typically dictionary-length and contain mostly zeros. Dense vectors, which encode meaning, are more compact and contain fewer zeros. Various enhancements can improve the way similarities are calculated in the vector stores (databases). Performance improves by optimizing how vector similarities are calculated. Dot products enhance similarity scoring, while approximate nearest neighbor (ANN) searches improve retrieval efficiency over K-nearest neighbors (KNN) searches. Accuracy may be improved with Late Interactions, which allow the system to compare words more precisely after retrieval. This helps refine document ranking and improve search relevance. Hybrid vector approaches may be used to combine dense vector representations with sparse one-hot vectors, taking advantage of the computational efficiency of sparse dot products over dense vector operations. Other retrieval techniques focus on improving accuracy by refining how documents are selected. Some retrieval methods combine sparse representations, such as SPLADE, with query expansion strategies to improve search accuracy and recall. === Retriever-centric methods === These methods aim to enhance the quality of document retrieval in vector databases: Pre-training the retriever using the Inverse Cloze Task (ICT), a technique that helps the model learn retrieval patterns by predicting masked text within documents. Supervised retriever optimization aligns retrieval probabilities with the generator model's likelihood distribution. This involves retrieving the top-k vectors for a given prompt, scoring the generated response's perplexity, and minimizing KL divergence between the retriever's selections and the model's likelihoods to refine retrieval. Reranking techniques can refine retriever performance by prioritizing the most relevant retrieved documents during training. === Language model === By redesigning the language model with the retriever in mind, a 25-time smaller network can get comparable perplexity as its much larger counterparts. Because it is trained from scratch, this method (Retro) incurs the high cost of training runs that the original RAG scheme avoided. The hypothesis is that by giving domain knowledge during training, Retro needs less focus on the domain and can devote its smaller weight resources only to language semantics. The redesigned language model is shown here. It has been reported that Retro is not reproducible, so modifications were made to make it so. The more reproducible version is called Retro++ and includes in-context RAG. === Chunking === Chunking involves various strategies for breaking up the data into vectors so the retriever can find details in it. Three types of chunking strategies are: Fixed length with overlap. This is fast and easy. Overlapping consecutive chunks helps to maintain semantic context across chunks. Syntax-based chunks can break the document up into sentences. Libraries such as spaCy or NLTK can also help. File format-based chunking. Certain file types have natural chunks built in, and it's best to respect them. For example, code files are best chunked and vectorized as whole functions or classes. HTML files should leave

or base64 encoded elements

Machine vision

Machine vision is the technology and methods used to provide imaging-based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision refers to many technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of computer science. It attempts to integrate existing technologies in new ways and apply them to solve real world problems. The term is the prevalent one for these functions in industrial automation environments but is also used for these functions in other environment vehicle guidance. The overall machine vision process includes planning the details of the requirements and project, and then creating a solution. During run-time, the process starts with imaging, followed by automated analysis of the image and extraction of the required information. == Definition == Definitions of the term "Machine vision" vary, but all include the technology and methods used to extract information from an image on an automated basis, as opposed to image processing, where the output is another image. The information extracted can be a simple good-part/bad-part signal, or more a complex set of data such as the identity, position and orientation of each object in an image. The information can be used for such applications as automatic inspection and robot and process guidance in industry, for security monitoring and vehicle guidance. This field encompasses a large number of technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision is practically the only term used for these functions in industrial automation applications; the term is less universal for these functions in other environments such as security and vehicle guidance. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of basic computer science; machine vision attempts to integrate existing technologies in new ways and apply them to solve real world problems in a way that meets the requirements of industrial automation and similar application areas. The term is also used in a broader sense by trade shows and trade groups such as the Automated Imaging Association and the European Machine Vision Association. This broader definition also encompasses products and applications most often associated with image processing. The primary uses for machine vision are automatic inspection and industrial robot/process guidance. In more recent times the terms computer vision and machine vision have converged to a greater degree. See glossary of machine vision. == Imaging based automatic inspection and sorting == The primary uses for machine vision are imaging-based automatic inspection and sorting and robot guidance.; in this section the former is abbreviated as "automatic inspection". The overall process includes planning the details of the requirements and project, and then creating a solution. This section describes the technical process that occurs during the operation of the solution. === Methods and sequence of operation === The first step in the automatic inspection sequence of operation is acquisition of an image, typically using cameras, lenses, and lighting that has been designed to provide the differentiation required by subsequent processing. MV software packages and programs developed in them then employ various digital image processing techniques to extract the required information, and often make decisions (such as pass/fail) based on the extracted information. === Equipment === The components of an automatic inspection system usually include lighting, a camera or other imager, a processor, software, and output devices. === Imaging === The imaging device (e.g. camera) can either be separate from the main image processing unit or combined with it in which case the combination is generally called a smart camera or smart sensor. Inclusion of the full processing function into the same enclosure as the camera is often referred to as embedded processing. When separated, the connection may be made to specialized intermediate hardware, a custom processing appliance, or a frame grabber within a computer using either an analog or standardized digital interface (Camera Link, CoaXPress). MV implementations also use digital cameras capable of direct connections (without a framegrabber) to a computer via FireWire, USB or Gigabit Ethernet interfaces. While conventional (2D visible light) imaging is most commonly used in MV, alternatives include multispectral imaging, hyperspectral imaging, imaging various infrared bands, line scan imaging, 3D imaging of surfaces and X-ray imaging. Key differentiations within MV 2D visible light imaging are monochromatic vs. color, frame rate, resolution, and whether or not the imaging process is simultaneous over the entire image, making it suitable for moving processes. Though the vast majority of machine vision applications are solved using two-dimensional imaging, machine vision applications utilizing 3D imaging are a growing niche within the industry. The most commonly used method for 3D imaging is scanning based triangulation which utilizes motion of the product or image during the imaging process. A laser is projected onto the surfaces of an object. In machine vision this is accomplished with a scanning motion, either by moving the workpiece, or by moving the camera & laser imaging system. The line is viewed by a camera from a different angle; the deviation of the line represents shape variations. Lines from multiple scans are assembled into a depth map or point cloud. Stereoscopic vision is used in special cases involving unique features present in both views of a pair of cameras. Other 3D methods used for machine vision are time of flight and grid based. One method is grid array based systems using pseudorandom structured light system as employed by the Microsoft Kinect system circa 2012. === Image processing === After an image is acquired, it is processed. Central processing functions are generally done by a CPU, a GPU, a FPGA or a combination of these. Deep learning training and inference impose higher processing performance requirements. Multiple stages of processing are generally used in a sequence that ends up as a desired result. A typical sequence might start with tools such as filters which modify the image, followed by extraction of objects, then extraction (e.g. measurements, reading of codes) of data from those objects, followed by communicating that data, or comparing it against target values to create and communicate "pass/fail" results. Machine vision image processing methods include; Stitching/Registration: Combining of adjacent 2D or 3D images. Filtering (e.g. morphological filtering) Thresholding: Thresholding starts with setting or determining a gray value that will be useful for the following steps. The value is then used to separate portions of the image, and sometimes to transform each portion of the image to simply black and white based on whether it is below or above that grayscale value. Pixel counting: counts the number of light or dark pixels Segmentation: Partitioning a digital image into multiple segments to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Edge detection: finding object edges Color Analysis: Identify parts, products and items using color, assess quality from color, and isolate features using color. Blob detection and extraction: inspecting an image for discrete blobs of connected pixels (e.g. a black hole in a grey object) as image landmarks. Neural network / deep learning / machine learning processing: weighted and self-training multi-variable decision making Circa 2019 there is a large expansion of this, using deep learning and machine learning to significantly expand machine vision capabilities. The most common result of such processing is classification. Examples of classification are object identification,"pass fail" classification of identified objects and OCR. Pattern recognition including template matching. Finding, matching, and/or counting specific patterns. This may include location of an object that may be rotated, partially hidden by another object, or varying in size. Barcode, Data Matrix and "2D barcode" reading Optical character recognition: automated reading of text such as serial numbers Gauging/Metrology: measurement of object dimensions (e.g. in pixels, inches or millimeters) Comparison against target values to determine a "pass or fail" or "go/no go" result. For example, with code or bar code verification, the read value is compared to the stored target value. For gauging, a measurement is compared against the proper value and tolerances. For verification of alpha-numberic codes, the

Adrozek

Adrozek is malware that injects fake ads into online search results. Microsoft announced the malware threat on 10 December 2020, and noted that many different browsers are affected, including Google Chrome, Microsoft Edge, Mozilla Firefox and Yandex Browser. The malware was first detected in May 2020 and, at its peak in August 2020, controlled over 30,000 devices a day. But during the December 2020 announcement, Microsoft claimed "hundreds of thousands" of infected devices worldwide between May and September 2020. According to Microsoft, if not detected and blocked, Adrozek adds browser extensions, modifies a specific DLL per target browser, and changes browser settings to insert additional, unauthorized ads into web pages, often on top of legitimate ads from search engines. For each user tricked into clicking on the fake ads, the scammers earn affiliate advertising dollars. The malware has been observed to extract device data and, in some cases, steal credentials, sending them to remote servers. Users may unintentionally install the malware because of a drive-by download, by visiting a tampered website, opening an e-mail attachment, or clicking on a deceptive link or a deceptive pop-up window. The main malware program is downloaded to the “Programs Files” folder using file names such as Audiolava.exe, QuickAudio.exe, and converter.exe. According to PC Magazine, a good way to avoid, or mitigate, infection by Adrozek is to keep browser and related software programs up to date.

Ayoba

Ayoba is an African communication platform developed in South Africa. It is owned by Progressive Tech Holdings in Mauritius and managed by SIMFY Africa. Launched on May 4, 2019, as of April 2024, it has over 35 million active users. == History == Ayoba was first published on Google Play in February 2019. Its first marketing campaign and brand launch took place in Cameroon on May 4, 2019. In June 2019, the platform introduced its first eight channels. In November 2019, the platform reached one million active users, which increased to two million by June 2020. Subsequently, ayoba expanded its services, including the launch of games for Android in February 2020, Momo (Mobile Money) in Cameroon in May 2020, and MicroApps in May 2020. It also launched music and voice and video calling features in 12 territories in August 2020. The first version of ayoba for iOS was released in September 2020. In December of the same year, games and Messaging 2.0 were launched on the platform. In November 2020, it won Best Mobile Application at the African Digital Awards. In 2021, it won OTT Brand of the Year at the Marketing World Awards in Ghana. In December 2022, it received Top Innovative Technology and Telecom Product of the Year at the National Communications Awards in December 2022. In June 2023 ayoba partnered with BoomPlay and as of April 2024, it had 35 million monthly active users. Ayoba has partnered with Jumia Ghana to offer exclusive deals to users. Ayoba users can get a 10% discount on selected Jumia purchases through the app, with no data charges for MTN users. This partnership aims to make online shopping more affordable and accessible by integrating Jumia's offers into the ayoba app. Ayoba supports over 35 million users across Africa and provides services in 22 languages. To access the deals, users can download the ayoba app from the Google Play Store, iOS Store, or the official website. == Platform features == Chat, Call and Share: ayoba enables instant messaging, voice notes, picture sharing, and file sharing with contacts, even if they do not have the app installed. The app supports voice and video calls on both Android and iOS, as well as group chats, help channel and SMS continuity (non ayoba users receive messages as SMS, their responses appear in the ayoba app). Music: ayoba offers a free music player with daily updates on international and African music. Users can find playlists for different genres. Games: ayoba provides a selection of interactive games, including action, adventure, and children's games available on both Android and iOS. Mobile Money Transfers: In certain territories, ayoba supports mobile money transfers using MTN Mobile Money (MoMo) for transactions within the app. MicroApps: ayoba features individual MicroApps within the platform that offer content and services, including streaming channels, podcasts, and specialized apps. The availability of these apps may vary by country. == Operations == ayoba primarily focuses on the following territories: Nigeria, Cameroon, South Africa, Ghana, Côte d'Ivoire, Uganda, Republic of Congo, Benin, Zambia, Tanzania, Kenya, Senegal, Togo, Guinea Bissau, Guinea Conakry, Sudan, South Sudan, and Liberia. The company operates from its offices in Cape Town and Johannesburg, South Africa. David Gillaranz served as the CEO from 2019 to 2021, and Burak Akinci has been the CEO since 2021.