AI Data Defense

AI Data Defense — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • 2024–present global memory supply shortage

    2024–present global memory supply shortage

    A global computer memory supply shortage started in 2024 due to supply constraints and rapid price escalation in the semiconductor memory market, particularly affecting DRAM and NAND flash memory. This shortage is sometimes labelled by tech media outlets as "RAMmageddon" or the "RAMpocalypse". Unlike the 2020–2023 global chip shortage, which stemmed primarily from pandemic-related supply chain disruptions from COVID-19, this shortage is driven by a structural reallocation of manufacturing capacity toward high-margin products for artificial intelligence infrastructure, creating scarcity of computer memory in consumer and enterprise PC markets. According to a 2026 Kearney's PERLab analysis, the shortage is expected to last at least until 2030, with CEOs agreeing with the timelines. == Background == Following a severe market downturn in 2022–2023, major memory manufacturers—Samsung Electronics, SK Hynix, and Micron Technology—implemented strategic production cuts to stabilize pricing. By mid-2024, the rapid expansion of generative AI services triggered unprecedented demand for specialized memory products, particularly High Bandwidth Memory (HBM) used in AI accelerators and data center GPUs. Specialized components of semiconductor technology are also experiencing supply constraints due to high demand in AI application. For example, glass cloth, a high-performance glass fiber substrate used for power efficient high speed data transfer and a crucial component of semiconductor manufacturing, is experiencing a supply crisis. Nitto Boseki, a Japanese firm having overwhelming monopoly in its production, is not able to meet increased demands, making chip-makers such as Qualcomm, Apple, Nvidia and AMD compete for securing supply. There are also reports of smaller electronics companies struggling to find suppliers for components such as NAND flash. Memory suppliers are adapting to increased demands and market unpredictability by requiring prepayment or shorter time-frame of payment, which makes it more difficult for smaller firms to acquire capital to survive. By 2026, due to steadily increased demand on resources, CPUs are also experiencing shortage issues due to low fabrication capacity, prioritisation of server CPUs, and increased demand, with CPU prices also being forecast to increase by as much as 15%. The demand on memory has also increased strain on other electronic components such as hard disk devices, with reports such as Western Digital's hard disk supply for 2026 being booked for enterprise applications before February 2026. A 2024 McKinsey analysis projected that global demand for AI-ready data center capacity would grow at approximately 33% annually through 2030, with AI workloads consuming roughly 70% of total data center capacity by the decade's end. In addition, according to Kearney's State of Semiconductor 2025 Report, executives were already expecting a shortage in the <8nm wafer size with memory chips being mentioned as an acute source of concern. Multiple companies mentioned being prepared for it through long-term agreements with RAM suppliers or amassing additional inventory. On 24 March 2026, Google announced TurboQuant, a memory compression technology focused on large language models (LLM) and vector search engines, which it claimed achieves 6x lower memory consumption in tested local LLMs and 8x performance enhancement in tests running on H100 accelerators. The technology is also a drop in enhancement for existing inference pipeline. Amid speculation about memory demand trends, memory manufacturers, SanDisk, Micron, Western Digital and Seagate, among other companies involved in memory manufacture experienced stock price declines. Prices of memory kits also reduced in the following months, although still at inflated prices. == Causes == === HBM production displacement === HBM manufacturing requires significantly more wafer capacity per bit than standard DRAM modules. Industry sources reported that as manufacturers allocated increasing wafer capacity to HBM production to meet contracts with AI infrastructure providers, the supply of conventional DDR4 and DDR5 modules for consumer PCs and smartphones contracted sharply. By September 2025, Samsung Electronics had reportedly expanded its 1c DRAM capacity to target 60,000 wafers per month specifically for HBM4 production, further diverting resources from consumer memory lines. === Geopolitical and trade barriers === The supply chain was further constrained by escalating trade tensions between the United States and China. Throughout 2025, fears of U.S. regulatory backlash and new tariff structures led major manufacturers like Samsung and SK Hynix to halt sales of older semiconductor manufacturing equipment to Chinese entities, effectively capping production capacity in the region. Additionally, proposed tariff policies by the U.S. administration in late 2025 prompted supply chain realignments, with Apple reportedly accelerating plans to source all U.S.-bound iPhones from India to avoid potential levies. === NAND flash capacity constraints === In the NAND flash segment, manufacturers prioritized higher-margin enterprise SSDs for data center applications while phasing out older process nodes more rapidly than anticipated. In November 2025, contract prices for NAND wafers increased by more than 60% month-over-month for certain product categories, with 512GB TLC experiencing the steepest rise as legacy manufacturing capacity was retired. == Impact on industry and consumers == === Manufacturer responses === Major PC manufacturers responded to component cost increases with significant price adjustments and supply chain strategies. Dell Technologies Chief Operating Officer Jeff Clarke stated during a November 2025 analyst call that the company had "never witnessed costs escalating at the current pace," describing tighter availability across DRAM, hard drives, and NAND flash memory. Analysts at Morgan Stanley downgraded Dell Technologies stock from "Overweight" to "Underweight" in late 2025, citing the company's heavy exposure to rising server memory costs. The firm warned that skyrocketing memory prices could significantly erode margins for server and PC OEMs. Conversely, Apple Inc. was reportedly less affected than its competitors, having secured long-term supply agreements for DRAM through the first quarter of 2026. Lenovo Chief Financial Officer Winston Cheng described the cost surge as "unprecedented" and disclosed that the company's memory inventories were approximately 50% above normal levels in anticipation of further price increases. === Consumer electronics sector === The shortage particularly affected smartphone manufacturers and other consumer electronics producers. DRAM prices reportedly rose by 172% throughout 2025, leading manufacturers like Samsung to halt new orders for DDR5 modules to reassess pricing structures and Micron to exit its 'Crucial' brand of consumer products. In Tokyo's Akihabara electronics district, retailers began limiting purchases of memory products to prevent hoarding, with prices for popular DDR5 memory modules more than doubling in some cases. Despite the broad trend of rising hardware costs, some companies engaged in aggressive pricing strategies to maintain market share; for example, Sony reduced the price of the PlayStation 5 by $100 for Black Friday 2025, potentially absorbing increased component costs to stimulate software ecosystem growth. Due to memory prices more than doubling in a single quarter, HP revealed in its Q1 2026 earnings call that memory costs account for 35% of PC build materials up from 15-18% previous quarter. Despite showing strong Q1 2026 earning driven by Windows 11 upgrade cycle and AI PC adoption, HP warned investors of low operating margins and up to double digit percentage decline for coming quarter. Trendforce, an IT analytics company, updated its forecast from 1.7% year-over-year growth in PC market to 2.6% year-over-year decline for 2026, amid backdrop of steadily increasing prices and supply crisis. Research and analytics firms, Gartner and IDC expect worldwide PC market to decline 10-11% and smartphone market to decline 8-9% in 2026. Gartner also projects that rising memory prices will make low-margin entry level laptops under 500 USD financially unviable in two years. The RAM shortage has delayed the release of Valve's second Steam Machine due to increased memory prices. The device was originally set to launch in early 2026. === AI infrastructure competition === Technology companies including Google, Amazon, Microsoft, and Meta Platforms placed open-ended orders with memory suppliers, indicating they would accept as much supply as available regardless of cost, according to Reuters sources. The limited supply of AI chips has been cited as a reason for the slow down in compute growth. In October 2025, OpenAI formally announced a strategic partnership using letters of intent with Samsung Electronics and SK Hynix

    Read more →
  • Permutation automaton

    Permutation automaton

    In automata theory, a permutation automaton, or pure-group automaton, is a deterministic finite automaton such that each input symbol permutes the set of states. Formally, a deterministic finite automaton A may be defined by the tuple (Q, Σ, δ, q0, F), where Q is the set of states of the automaton, Σ is the set of input symbols, δ is the transition function that takes a state q and an input symbol x to a new state δ(q,x), q0 is the initial state of the automaton, and F is the set of accepting states (also: final states) of the automaton. A is a permutation automaton if and only if, for every two distinct states qi and qj in Q and every input symbol x in Σ, δ(qi,x) ≠ δ(qj,x). A formal language is p-regular (also: a pure-group language) if it is accepted by a permutation automaton. For example, the set of strings of even length forms a p-regular language: it may be accepted by a permutation automaton with two states in which every transition replaces one state by the other. == Applications == The pure-group languages were the first interesting family of regular languages for which the star height problem was proved to be computable. Another mathematical problem on regular languages is the separating words problem, which asks for the size of a smallest deterministic finite automaton that distinguishes between two given words of length at most n – by accepting one word and rejecting the other. The known upper bound in the general case is O ( n 2 / 5 ( log ⁡ n ) 3 / 5 ) {\displaystyle O(n^{2/5}(\log n)^{3/5})} . The problem was later studied for the restriction to permutation automata. In this case, the known upper bound changes to O ( n 1 / 2 ) {\displaystyle O(n^{1/2})} .

    Read more →
  • Best AI Analytics Tools in 2026

    Best AI Analytics Tools in 2026

    Curious about the best AI analytics tool? An AI analytics tool is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI analytics tool slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • How to Choose an AI Sales Assistant

    How to Choose an AI Sales Assistant

    In search of the best AI sales assistant? An AI sales assistant is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI sales assistant slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • CodeCheck

    CodeCheck

    CodeCheck is a mobile app that provides consumers with information about the ingredients in cosmetic products, as well as the ingredients and nutritional values of food. Users can access this information by scanning the product’s barcode with a smartphone or by using a text-based search. The app is available for iOS and Android devices in Germany, Austria, Switzerland, the United Kingdom, the United States, and the Netherlands. == History == CodeCheck was founded in 2010 as an association, online database, and app by Roman Bleichenbacher, who was then a student in Zurich. A website of the same name had already been launched in 2002, where users could enter information about ingredients, nutritional values, and manufacturers of products. The first round of financing took place in July 2014 and raised over 1.1 million Swiss francs, which coincided with the founding of CodeCheck AG. Investors included Doodle founders Myke Näf and Paul E. Sevinç. The company subsequently expanded to Austria and Germany. In the same year, Boris Manhart became CEO. CodeCheck GmbH was established in Berlin in 2016. The app became available in the United States in 2017 and in the United Kingdom in November 2019. In 2020, it was also launched in the Netherlands. Following insolvency proceedings, the app has been owned by Producto Check GmbH since 2022. == Functions == The app can be used to scan the barcode of food and cosmetic products. It then displays information about ingredients, nutritional values, manufacturers and certification labels. For many years, users were able to enter and edit product information themselves and indicate advantages and disadvantages of individual products. Since 2020, the app has placed greater emphasis on machine text recognition. The collected data is combined with substance ratings using an algorithm. These ratings are based on scientific studies and expert assessments, including those from the Consumer Advice Centre in Hamburg, Greenpeace, the WWF and the German Association for the Environment and Nature Conservation (BUND e. V.), and cannot be modified by users or manufacturers. The app also provides information on the sugar and fat content of food products. In addition, it indicates whether a product contains hormone-active substances, microplastics, palm oil, animal-derived ingredients, lactose or gluten. Since 2020, the app has displayed a climate score for food products in cooperation with the Eaternity Institute. == Financing == CodeCheck is primarily financed through native advertising and banner ads. Since 2018, the company has also offered analysis services and survey tools directly to fast-moving consumer goods (FMCG) manufacturers. In addition, access to the API is available, enabling other companies to use the product database. With the introduction of a subscription model in 2019, the CodeCheck app can be used ad-free and in offline mode. Since 2021, CodeCheck has also offered its own “Green Label” certification for manufacturers. Products are certified if at least 90 percent of their ingredients are classified as harmless. == Awards == In May 2015, the app topped the download charts for the first time, reaching 2.3 million installations. By September 2019, the app had once again reached the top of the German app charts, surpassing five million downloads.

    Read more →
  • Paola Velardi

    Paola Velardi

    Paola Velardi (born in Rome, April 26, 1955) is a full professor of computer science at Sapienza University in Rome, Italy. Her research encompasses Artificial Intelligence and specifically, natural language processing, machine learning business intelligence and semantic web. Velardi is one of the hundred female scientists included in the database "100esperte.it" (translated from Italian with "100 female experts"). This online, open database champions the recognition of top-rated female scientists in Science, Technology, Engineering and Mathematics (STEM) areas. Among her prestigious appointments and honors, her inclusion stands out —alongside 45 other international female scientists from the past, present, and future— in the Women in Science pavilion of UNESCO’s Virtual Science Museum. == Research == Paola Velardi's research activity has focused, since the early 1980s, on Artificial Intelligence, with a particular emphasis on natural language processing (NLP), Machine learning, and data mining. Her scientific contributions have evolved over time, following the sector's primary paradigms: Semantic Web and Ontologies: She is known for her pioneering work on semantic disambiguation and automated ontology learning, collaborating on the development of systems such as OntoLearn. Social Computing and Predictive Analysis: She has conducted research on extracting information from social media for epidemiological monitoring (syndromic surveillance) and for the identification of opinion leaders. In the educational field, she has developed machine learning models to predict the risk of student dropout. AI for Health and Elder Monitoring: She has coordinated projects to support frailty in the elderly, developing systems based on ambient intelligence and wearables to detect clinical and behavioral anomalies. She has also contributed to models for analyzing behavioral changes through dynamic clustering. Generative AI and Finance: More recently, her research has expanded into the use of generative AI and deep learning for finance, including benchmark studies on price trend prediction based on Limit Order Books (LOB) and the development of diffusion models for realistic market simulation (the TRADES project). According to Google Scholar bibliometrics updated until December 2025, Velardi's scientific publications have been cited more than 8100 times. Her h-index was 42. She has published more than 200 papers in international journals and conference proceedings. Some of her publications have been published in top rated journals such as Artificial Intelligence, Computational Linguistics, Knowledge-Based Systems, IEEE Transactions on Data and Knowledge Engineering , IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Computers, IEEE Transactions on Software Engineering , Data Mining and Knowledge Discovery, and Journal of Web Semantics. == Education and previous employments == Velardi graduated in electronic engineering from Sapienza University in 1978. From 1978 to 1983, she worked for the Ugo Bordoni Foundation, a research institution focusing on ICT and working under the supervision of the Italian Ministry of Economic Development. In 1983, she was a visiting scholar at Stanford University. During this period she became passionate about Artificial Intelligence, which will remain her area of research throughout her career. From 1984 to 1986, she came back to her natal city and worked as a researcher for IBM. From 1986 to 1996 she was an associate professor in the engineering faculty of Polytechnic University of the Marches (Ancona, Italy). Starting in November 1996, she taught in and did research for the Department of Computer Science at the Sapienza University. Velardi was the head of Bachelor and Master Programs in Computer Science at Sapienza University from 2010 to 2013 and from 2015 to 2016. == Current employment == Since November 2001, Velardi has been a full professor in the department of computer science ("Dipartimento di Informatica" in Italian) at Sapienza University in Rome, Italy. Since 2013, she has been the coordinator of the Distance Learning Degree in Computer Science at Sapienza University. As of today, Velardi is a Senior Associate at the Institute of Cognitive Sciences and Technologies (ISTC) of the CNR. == Recognition == Velardi is one of the hundred female scientists included in the database "100esperte.it" (translated from Italian with "100 female experts"). This database lists top Italian female STEM scientists. Six out of one hundred scientists in the 100esperte's database are computer scientists like Velardi. Velardi is in the list of the top Italian scientists. A top scientist appearing in the Top-Italian-Scientists database is a scientist whose h-index is greater than 30. In March 2017, she was given an IBM Faculty Award for her research on social recommender systems. In December 2018, Velardi was included in the list of the 50 most influential Italian women in science and technology by Inspiring Fifty, a non-profit that aims to increase diversity in STEM by making female role models in tech more visible. In September 2019 she was the local co-organizer and Program Chair of the 6th ACM Celebration of Women in Computing. In November 2019 Velardi received the Standout Woman Award International at the seat of the Italian Parliament in Montecitorio. == Causes == Velardi aims at debunking the myth of computer science as a man-oriented and "inflexible" discipline. She is the founder of the project "NERD? Non e' roba per donne?" (translated from Italian: "NERD? Is it not stuff for women?"). This project was launched by Velardi in 2012 in the Department of Computer Science at Sapienza University. Since 2013 the project has been carried out in partnership with IBM Italy, which later created a spin-off of the project. The goal of the project is two-fold: (1) conveying computer science as creative, interdisciplinary and problem-solving-oriented science, and (2) encouraging young female students in studying computer science by, for instance, developing apps for smartphones. She has been the program chair of the 19th ACM celebration of Women in Computing. She is the creator and coordinator of the G4GRETA, an educational project that involves students of the third and fourth grades of Rome and Lazio. The project combines the development of IT skills with the themes of environmental sustainability and soft skills (teambuilding, pitching, social networking, etc.) Velardi is also involved in scientific dissemination. In 2020 and 2021 she cooperated with RaiCultura, the cultural division of RAI, the national broadcasting company.

    Read more →
  • Phraselator

    Phraselator

    The Phraselator is a weatherproof handheld language translation device developed by Applied Data Systems and VoxTec, a former division of the military contractor Marine Acoustics, located in Annapolis, Maryland, USA. It was designed to serve as a handheld computer device that translates English into one of 40 different languages. == The device == The Phraselator is a small speech translation PDA-sized device designed to aid in interpretation. The device does not produce synthesized speech like that utilized by Stephen Hawking; instead, it plays pre-recorded foreign language MP3 files. Users can select the phrase they wish to convey from an English list on the screen or speak into the device. It then uses speech recognition technology called DynaSpeak, developed by SRI International, to play the proper sound file. The accuracy of the speech recognition software is over 70 percent according to software developer Jack Buchanan. The device can also record replies for translation later. Pre-recorded phrases are stored on Secure Digital flash memory cards. A 128 MB card can hold up to 12,000 phrases in four or five languages. Users can download phrase modules from the official website, which contained over 300,000 phrases as of March 2005. Users can also construct their own custom phrase modules. Earlier devices were known to have run on an SA-1110 Strong Arm 206 MHz CPU with 32MB SDRAM and 32MB onboard Flash RAM. A newer model, the P2, was released in 2004 and developed according to feedback from U.S. soldiers. It translates one way from English to approximately 60 other languages. It has a directional microphone, a larger library of phrases and a longer battery life. The 2004 release was created by and utilizes a computer board manufactured by InHand Electronics, Inc. In the future, the device will be able to display pictures so users can ask questions such as "Have you seen this person?" Developer Ace Sarich notes that the device is inferior to human interpreter. Conclusions derived from a Nepal field test conducted by U.S. and Nepal based NGO Himalayan Aid in 2004 seemed to confirm Sarich's comparisons: The very concept of using a machine as a communication point between individuals seemed to actually encourage a more limited form of interaction between tester and respondent. Usually, when limited language skills are present between parties, the genuine struggle and desire to communicate acts as a display of good will – we openly display our weakness in this regard – and the result is a more relaxed and human encounter. This was not necessarily present with the Phraselator as all parties abandoned learning about each other and instead focused on learning how to work with the device. As a tool for bridging any cultural differences or communicating effectively at any length, the Phraselator would not be recommended. This device, at least in the form tested, would best be used in large-scale operations where there is no time for language training and there is a need to communicate fixed ideas, quickly, over the greatest distance by employing large amounts of unskilled users. Large humanitarian or natural disasters in remote areas of third-world countries might be an effective example. == Origin == The original idea for the device came from Lee Morin, a Navy doctor in Operation Desert Storm. To communicate with patients, he played Arabic audio files from his laptop. He informed Ace Sarich, the vice president of VoxTec, about the idea. VoxTec won a DARPA Small Business Innovation Research grant in early 2001 to develop a military-grade handheld phrase translator. During its development, the Phraselator was tested and evaluated by scientists from the Army Research Laboratory. The device was first field tested in Afghanistan in 2001. By 2002, about 500 Phraselators were built for soldiers around the world with another 250 ordered by the U.S. Special Forces. The device cost $2000 to develop and could convert spoken English into one of 200,000 recorded commands and questions in 30 languages. However, the device could only translate one-way. At the time, the only existing two-way voice translator that could convert speech back and forth between languages was the Audio Voice Translation Guide System, or TONGUES, which was developed by Carnegie Mellon University for Lockheed Martin. As part of a DARPA program known as the Spoken Language Communication and Translation System for Tactical Use, SRI International has further developed two-way translation software for use in Iraq called IraqComm in 2006 which contains a vocabulary of 40,000 English words and 50,000 words in Iraqi Arabic. == Notable users == The handheld translator was recently used by U.S. troops while providing relief to tsunami victims in early 2005. About 500 prototypes of the device were provided to U.S. military forces in Operation Enduring Freedom. Units loaded with Haitian dialects have been provided to U.S. troops in Haiti. Army military police have used it in Kandahar to communicate with POWs. In late 2004, the U.S. Navy began to augment some ships with a version of the device attached to large speakers in order to broadcast clear voice instructions up to 400 yards (370 m) away. Corrections officers and law enforcement in Oneida County, New York, have tested the device. Hospital emergency rooms and health departments have also evaluated it. Several Native American tribes such as the Choctaw Nation, the Ponca, and the Comanche Nation have also used the device to preserve their dying languages. Various law enforcement agencies, such as the Los Angeles Police Department, also use the phraselator in their patrol cars. == Awards == In March 2004, DARPA director Dr. Tony Tether presented the Small Business Innovative Research Award to the VoxTec division of Marine Acoustics at DARPATech 2004 in Anaheim, CA. The device was recently listed as one of "Ten Emerging Technologies That Will Change Your World" in MIT's Technology Review. == Pop culture == Software developer Jack Buchanan believes that building a device similar to the fictional universal translator seen in Star Trek would be harder than building the Enterprise. The device was mentioned in a list of "Top 10 Star Trek Tech" on Space.com.

    Read more →
  • Pumping lemma for regular languages

    Pumping lemma for regular languages

    In the theory of formal languages, the pumping lemma for regular languages is a lemma that describes an essential property of all regular languages. Informally, it says that all sufficiently long strings in a regular language may be pumped—that is, have a middle section of the string repeated an arbitrary number of times—to produce a new string that is also part of the language. The pumping lemma is useful for proving that a specific language is not a regular language, by showing that the language does not have the property. Specifically, the pumping lemma says that for any regular language L {\displaystyle L} , there exists a constant p {\displaystyle p} such that any string w {\displaystyle w} in L {\displaystyle L} with length at least p {\displaystyle p} can be split into three substrings x {\displaystyle x} , y {\displaystyle y} and z {\displaystyle z} ( w = x y z {\displaystyle w=xyz} , with y {\displaystyle y} being non-empty), such that the strings x z , x y z , x y y z , x y y y z , . . . {\displaystyle xz,xyz,xyyz,xyyyz,...} are also in L {\displaystyle L} . The process of repeating y {\displaystyle y} zero or more times is known as "pumping". Moreover, the pumping lemma guarantees that the length of x y {\displaystyle xy} will be at most p {\displaystyle p} , thus giving a "small" substring x y {\displaystyle xy} that has the desired property. Languages with a finite number of strings vacuously satisfy the pumping lemma by having p {\displaystyle p} equal to the maximum string length in L {\displaystyle L} plus one. By doing so, no strings at all in L {\displaystyle L} have length at least p {\displaystyle p} . The pumping lemma was first proven by Michael Rabin and Dana Scott in 1959, and rediscovered shortly after by Yehoshua Bar-Hillel, Micha A. Perles, and Eli Shamir in 1961, as a simplification of their pumping lemma for context-free languages. == Formal statement == Let L {\displaystyle L} be a regular language. Then there exists an integer p ≥ 1 {\displaystyle p\geq 1} depending only on L {\displaystyle L} such that every string w {\displaystyle w} in L {\displaystyle L} of length at least p {\displaystyle p} ( p {\displaystyle p} is called the "pumping length") can be written as w = x y z {\displaystyle w=xyz} (i.e., w {\displaystyle w} can be divided into three substrings), satisfying the following conditions: | y | ≥ 1 {\displaystyle |y|\geq 1} | x y | ≤ p {\displaystyle |xy|\leq p} ( ∀ n ≥ 0 ) ( x y n z ∈ L ) {\displaystyle (\forall n\geq 0)(xy^{n}z\in L)} y {\displaystyle y} is the substring that can be pumped (removed or repeated any number of times, and the resulting string is always in L {\displaystyle L} ). (1) means the loop y {\displaystyle y} to be pumped must be of length at least one, that is, not an empty string; (2) means the loop must occur within the first p {\displaystyle p} characters. | x | {\displaystyle |x|} must be smaller than p {\displaystyle p} (conclusion of (1) and (2)), but apart from that, there is no restriction on x {\displaystyle x} and z {\displaystyle z} . In simple words, for any regular language L {\displaystyle L} , any sufficiently long string w {\displaystyle w} (in L {\displaystyle L} ) can be split into 3 parts, i.e. w = x y z {\displaystyle w=xyz} , such that all the strings x y n z {\displaystyle xy^{n}z} for n ≥ 0 {\displaystyle n\geq 0} are also in L {\displaystyle L} . Below is a formal expression of the pumping lemma. ∀ L ⊆ Σ ∗ , regular ( L ) ⟹ ∃ p ≥ 1 , ∀ w ∈ L , | w | ≥ p ⟹ ∃ x , y , z ∈ Σ ∗ , ( w = x y z ) ∧ ( | y | ≥ 1 ) ∧ ( | x y | ≤ p ) ∧ ( ∀ n ≥ 0 , x y n z ∈ L ) {\displaystyle {\begin{array}{l}\forall L\subseteq \Sigma ^{},{\mbox{regular}}(L)\implies \\\quad \exists p\geq 1,\forall w\in L,|w|\geq p\implies \\\qquad \exists x,y,z\in \Sigma ^{},(w=xyz)\land (|y|\geq 1)\land (|xy|\leq p)\land (\forall n\geq 0,xy^{n}z\in L)\end{array}}} == Use of the lemma to prove non-regularity == The pumping lemma is often used to prove that a particular language is non-regular: a proof by contradiction may consist of exhibiting a string (of the required length) in the language that lacks the property outlined in the pumping lemma. Example: The language L = { a n b n : n ≥ 0 } {\displaystyle L=\{a^{n}b^{n}:n\geq 0\}} over the alphabet Σ = { a , b } {\displaystyle \Sigma =\{a,b\}} can be shown to be non-regular as follows: Assume that some constant p ≥ 1 {\displaystyle p\geq 1} exists as required by the lemma. Let w {\displaystyle w} in L {\displaystyle L} be given by w = a p b p {\displaystyle w=a^{p}b^{p}} , which is a string longer than p {\displaystyle p} . By the pumping lemma, there must exist a decomposition w = x y z {\displaystyle w=xyz} with | x y | ≤ p {\displaystyle |xy|\leq p} and | y | ≥ 1 {\displaystyle |y|\geq 1} such that x y i z {\displaystyle xy^{i}z} in L {\displaystyle L} for every i ≥ 0 {\displaystyle i\geq 0} . Since | x y | ≤ p {\displaystyle |xy|\leq p} , the string y {\displaystyle y} only consists of instances of a {\displaystyle a} . Because | y | ≥ 1 {\displaystyle |y|\geq 1} , it contains at least one instance of the letter a {\displaystyle a} . Pumping y {\displaystyle y} to give x y 2 z {\displaystyle xy^{2}z} gives a word with more instances of the letter a {\displaystyle a} than the letter b {\displaystyle b} , since some instances of a {\displaystyle a} but none of b {\displaystyle b} were added. Therefore, x y 2 z {\displaystyle xy^{2}z} is not in L {\displaystyle L} which contradicts the pumping lemma. Therefore, L {\displaystyle L} cannot be regular. The proof that the language of balanced (i.e., properly nested) parentheses is not regular follows the same idea. Given p {\displaystyle p} , there is a string of balanced parentheses that begins with more than p {\displaystyle p} left parentheses, so that y {\displaystyle y} will consist entirely of left parentheses. By repeating y {\displaystyle y} , a string can be produced that does not contain the same number of left and right parentheses, and so they cannot be balanced. == Proof of the pumping lemma == For every regular language there is a finite-state automaton (FSA) that accepts the language. The number of states in such an FSA are counted and that count is used as the pumping length p {\displaystyle p} . For a string of length at least p {\displaystyle p} , let q 0 {\displaystyle q_{0}} be the start state and let q 1 , . . . , q p {\displaystyle q_{1},...,q_{p}} be the sequence of the next p {\displaystyle p} states visited as the string is emitted. Because the FSA has only p {\displaystyle p} states, within this sequence of p + 1 {\displaystyle p+1} visited states there must be at least one state that is repeated. Write q s {\displaystyle q_{s}} for such a state. The transitions that take the machine from the first encounter of state q s {\displaystyle q_{s}} to the second encounter of state q s {\displaystyle q_{s}} match some string. This string is called y {\displaystyle y} in the lemma, and since the machine will match a string without the y {\displaystyle y} portion, or with the string y {\displaystyle y} repeated any number of times, the conditions of the lemma are satisfied. For example, the following image shows an FSA. The FSA accepts the string: abcd. Since this string has a length at least as large as the number of states, which is four (so the total number of states that the machine passes through to scan abcd would be 5), the pigeonhole principle indicates that there must be at least one repeated state among the start state and the next four visited states. In this example, only q 1 {\displaystyle q_{1}} is a repeated state. Since the substring bc takes the machine through transitions that start at state q 1 {\displaystyle q_{1}} and end at state q 1 {\displaystyle q_{1}} , that portion could be repeated and the FSA would still accept, giving the string abcbcd. Alternatively, the bc portion could be removed and the FSA would still accept giving the string ad. In terms of the pumping lemma, the string abcd is broken into an x {\displaystyle x} portion a, a y {\displaystyle y} portion bc and a z {\displaystyle z} portion d. As a side remark, the problem of checking whether a given string can be accepted by a given nondeterministic finite automaton without visiting any state repeatedly, is NP hard. == General version of pumping lemma for regular languages == If a language L {\displaystyle L} is regular, then there exists a number p ≥ 1 {\displaystyle p\geq 1} (the pumping length) such that every string u w v {\displaystyle uwv} in L {\displaystyle L} with | w | ≥ p {\displaystyle |w|\geq p} can be written in the form u w v = u x y z v {\displaystyle uwv=uxyzv} with strings x {\displaystyle x} , y {\displaystyle y} and z {\displaystyle z} such that | x y | ≤ p {\displaystyle |xy|\leq p} , | y | ≥ 1 {\displaystyle |y|\geq 1} and u x y i z v {\displaystyle uxy^{i}zv} is in L {\displaystyle L} for every integer i ≥ 0 {\displaystyle i\geq 0} . From this, the above standard v

    Read more →
  • Captions (app)

    Captions (app)

    Mirage (formerly known as Captions) is a video-generating, video-editing and AI research company headquartered in New York City. Their first app, Captions, is available on iOS, Android, and Web and offers a suite of tools aimed at streamlining the creation and editing of videos. Their enterprise platform, Mirage Studio, generates AI actors and videos for marketing assets and video campaigns. == History == Mirage was co-founded by Gaurav Misra and Dwight Churchill. During Misra's time leading design engineering at Snap Inc., he followed the rise of a new category of video, the "talking video." In 2021, Misra left Snap to found Mirage with his former colleague Churchill. Later that year, the Captions app launched with early backing from venture capital firms Sequoia Capital and Andreessen Horowitz as well as individual investors. In 2023, the company released Lipdub, an Al dubbing app which translates any video with spoken audio into 28 languages. In October 2023, Captions shared that it maintained over 100,000 daily active users with "about a million" videos being created monthly. In November 2024, Captions acquired AlpacaML, a generative AI company that focused on art and other images. In June 2025, Captions launched Mirage Studio, for marketers and advertising agencies. In September 2025, Captions rebranded their company to Mirage. This change reflects the company's focus on developing their proprietary foundation model and future video products. == Products == The Captions app offers features to automate common production tasks including captioning, editing, dubbing, script creation, and music integration. Mirage Studio allows users to generate AI avatars and create short-form videos from prompts or audio. == Awards == In 2023, the company was recognized as part of Fast Company's "Next Big Things In Tech" series. In 2024, the company won 2 Webby Awards for Best Use of AI & Machine Learning and Creative Production.

    Read more →
  • Best AI Paragraph Rewriters in 2026

    Best AI Paragraph Rewriters in 2026

    In search of the best AI paragraph rewriter? An AI paragraph rewriter is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI paragraph rewriter slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • AI Pair Programmers Reviews: What Actually Works in 2026

    AI Pair Programmers Reviews: What Actually Works in 2026

    Curious about the best AI pair programmer? An AI pair programmer is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI pair programmer slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • AI Text-to-image Tools Reviews: What Actually Works in 2026

    AI Text-to-image Tools Reviews: What Actually Works in 2026

    In search of the best AI text-to-image tool? An AI text-to-image tool is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI text-to-image tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Inductive programming

    Inductive programming

    Inductive programming (IP) is a special area of automatic programming, covering research from artificial intelligence and programming, which addresses learning of typically declarative (logic or functional) and often recursive programs from incomplete specifications, such as input/output examples or constraints. Depending on the programming language used, there are several kinds of inductive programming. Inductive functional programming, which uses functional programming languages such as Lisp or Haskell, and most especially inductive logic programming, which uses logic programming languages such as Prolog and other logical representations such as description logics, have been more prominent, but other (programming) language paradigms have also been used, such as constraint programming or probabilistic programming. == Definition == Inductive programming incorporates all approaches which are concerned with learning programs or algorithms from incomplete (formal) specifications. Possible inputs in an IP system are a set of training inputs and corresponding outputs or an output evaluation function, describing the desired behavior of the intended program, traces or action sequences which describe the process of calculating specific outputs, constraints for the program to be induced concerning its time efficiency or its complexity, various kinds of background knowledge such as standard data types, predefined functions to be used, program schemes or templates describing the data flow of the intended program, heuristics for guiding the search for a solution or other biases. Output of an IP system is a program in some arbitrary programming language containing conditionals and loop or recursive control structures, or any other kind of Turing-complete representation language. In many applications the output program must be correct with respect to the examples and partial specification, and this leads to the consideration of inductive programming as a special area inside automatic programming or program synthesis, usually opposed to 'deductive' program synthesis, where the specification is usually complete. In other cases, inductive programming is seen as a more general area where any declarative programming or representation language can be used and we may even have some degree of error in the examples, as in general machine learning, the more specific area of structure mining or the area of symbolic artificial intelligence. A distinctive feature is the number of examples or partial specification needed. Typically, inductive programming techniques can learn from just a few examples. The diversity of inductive programming usually comes from the applications and the languages that are used: apart from logic programming and functional programming, other programming paradigms and representation languages have been used or suggested in inductive programming, such as functional logic programming, constraint programming, probabilistic programming, abductive logic programming, modal logic, action languages, agent languages and many types of imperative languages. == History == The early works of Plotkin, and his "relative least general generalization (rlgg)", had an enormous impact in inductive logic programming. There were some encouraging results on learning recursive Prolog programs such as quicksort from examples together with suitable background knowledge, for example with GOLEM. However, after initial success, the community got disappointed by limited progress about the induction of recursive programs with ILP less and less focusing on recursive programs and leaning more and more towards a machine learning setting with applications in relational data mining and knowledge discovery. In parallel to work in ILP, Koza proposed genetic programming in the early 1990s as a generate-and-test based approach to learning programs. The idea of genetic programming was further developed into the inductive programming system ADATE and the systematic-search-based system MagicHaskeller. Here again, functional programs are learned from sets of positive examples together with an output evaluation (fitness) function which specifies the desired input/output behavior of the program to be learned. The early work in grammar induction (also known as grammatical inference) is related to inductive programming, as rewriting systems or logic programs can be used to represent production rules. In fact, early works in inductive inference considered grammar induction and Lisp program inference as basically the same problem. The results in terms of learnability were related to classical concepts, such as identification-in-the-limit, as introduced in the seminal work of Gold. More recently, the language learning problem was addressed by the inductive programming community. In the recent years, the classical approaches have been resumed and advanced with great success. Therefore, the synthesis problem has been reformulated on the background of constructor-based term rewriting systems taking into account modern techniques of functional programming, as well as moderate use of search-based strategies and usage of background knowledge as well as automatic invention of subprograms. Many new and successful applications have recently appeared beyond program synthesis, most especially in the area of data manipulation, programming by example and cognitive modelling (see below). Other ideas have also been explored with the common characteristic of using declarative languages for the representation of hypotheses. For instance, the use of higher-order features, schemes or structured distances have been advocated for a better handling of recursive data types and structures; abstraction has also been explored as a more powerful approach to cumulative learning and function invention. One powerful paradigm that has been recently used for the representation of hypotheses in inductive programming (generally in the form of generative models) is probabilistic programming (and related paradigms, such as stochastic logic programs and Bayesian logic programming). == Application areas == The first workshop on Approaches and Applications of Inductive Programming (AAIP) Archived 2016-03-03 at the Wayback Machine held in conjunction with ICML 2005 identified all applications where "learning of programs or recursive rules are called for, [...] first in the domain of software engineering where structural learning, software assistants and software agents can help to relieve programmers from routine tasks, give programming support for end users, or support of novice programmers and programming tutor systems. Further areas of application are language learning, learning recursive control rules for AI-planning, learning recursive concepts in web-mining or for data-format transformations". Since then, these and many other areas have shown to be successful application niches for inductive programming, such as end-user programming, the related areas of programming by example and programming by demonstration, and intelligent tutoring systems. Other areas where inductive inference has been recently applied are knowledge acquisition, artificial general intelligence, reinforcement learning and theory evaluation, and cognitive science in general. There may also be prospective applications in intelligent agents, games, robotics, personalisation, ambient intelligence and human interfaces.

    Read more →
  • Flex (lexical analyzer generator)

    Flex (lexical analyzer generator)

    Flex (fast lexical analyzer generator) is a free and open-source software alternative to lex. It is a computer program that generates lexical analyzers (also known as "scanners" or "lexers"). It is frequently used as the lex implementation together with Berkeley Yacc parser generator on BSD-derived operating systems (as both lex and yacc are part of POSIX), or together with GNU bison (a version of yacc) in BSD ports and in Linux distributions. Unlike Bison, flex is not part of the GNU Project and is not released under the GNU General Public License, although a manual for Flex was produced and published by the Free Software Foundation. == History == Flex was written in C around 1987 by Vern Paxson, with the help of many ideas and much inspiration from Van Jacobson. Original version by Jef Poskanzer. The fast table representation is a partial implementation of a design done by Van Jacobson. The implementation was done by Kevin Gong and Vern Paxson. == Example lexical analyzer == This is an example of a Flex scanner for the instructional programming language PL/0. The tokens recognized are: '+', '-', '', '/', '=', '(', ')', ',', ';', '.', ':=', '<', '<=', '<>', '>', '>='; numbers: 0-9 {0-9}; identifiers: a-zA-Z {a-zA-Z0-9} and keywords: begin, call, const, do, end, if, odd, procedure, then, var, while. == Internals == These programs perform character parsing and tokenizing via the use of a deterministic finite automaton (DFA). A DFA is a theoretical machine accepting regular languages, and is equivalent to read-only right moving Turing machines. The syntax is based on the use of regular expressions. See also nondeterministic finite automaton. == Issues == === Time complexity === A Flex lexical analyzer usually has time complexity O ( n ) {\displaystyle O(n)} in the length of the input. That is, it performs a constant number of operations for each input symbol. This constant is quite low: GCC generates 12 instructions for the DFA match loop. Note that the constant is independent of the length of the token, the length of the regular expression and the size of the DFA. However, using the REJECT macro in a scanner with the potential to match extremely long tokens can cause Flex to generate a scanner with non-linear performance. This feature is optional. In this case, the programmer has explicitly told Flex to "go back and try again" after it has already matched some input. This will cause the DFA to backtrack to find other accept states. The REJECT feature is not enabled by default, and because of its performance implications its use is discouraged in the Flex manual. === Reentrancy === By default the scanner generated by Flex is not reentrant. This can cause serious problems for programs that use the generated scanner from different threads. To overcome this issue there are options that Flex provides in order to achieve reentrancy. A detailed description of these options can be found in the Flex manual. === Usage under non-Unix environments === Normally the generated scanner contains references to the unistd.h header file, which is Unix specific. To avoid generating code that includes unistd.h, %option nounistd should be used. Another issue is the call to isatty (a Unix library function), which can be found in the generated code. The %option never-interactive forces flex to generate code that does not use isatty. === Using flex from other languages === Flex can only generate code for C and C++. To use the scanner code generated by flex from other languages a language binding tool such as SWIG can be used. === Unicode support === Flex is limited to matching 1-byte (8-bit) binary values and therefore does not support Unicode. RE/flex and other alternatives do support Unicode matching. == Flex++ == flex++ is a similar lexical scanner for C++ which is included as part of the flex package. The generated code does not depend on any runtime or external library except for a memory allocator (malloc or a user-supplied alternative) unless the input also depends on it. This can be useful in embedded and similar situations where traditional operating system or C runtime facilities may not be available. The flex++ generated C++ scanner includes the header file FlexLexer.h, which defines the interfaces of the two C++ generated classes.

    Read more →
  • Is an AI Pair Programmer Worth It in 2026?

    Is an AI Pair Programmer Worth It in 2026?

    Shopping for the best AI pair programmer? An AI pair programmer is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI pair programmer slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →