AI Generator Zootopia

AI Generator Zootopia — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

Process map

Process map is a global-system process model that is used to outline the processes that make up the business system and how they interact with each other. Process map shows the processes as objects, which means it is a static and non-algorithmic view of the processes. It should be differentiated from a detailed process model, which shows a dynamic and algorithmic view of the processes, usually known as a process flow diagram. There are different notation standards that can be used for modelling process maps, but the most notable ones are TOGAF Event Diagram, Eriksson-Penker notation, and ARIS Value Added Chain. == Global process models == Global characteristics of the business system are captured by global or system models. Global process models are presented using different methodologies and sometimes under different names. Most notably, they are named process map in Visual Paradigm and MMABP, value-added chain in ARIS, and process diagram in Eriksson-Penker notation – which can easily lead to the confusion with process flow (detailed process model). Global models are mainly object-oriented and present a static view of the business system; they do not describe dynamic aspects of processes. A process map shows the presence of processes and their mutual relationships. The requirement for the global perspective of the system as a supplementary to the internal process logic description results from the necessity of taking into consideration not only the internal process logic but also its significant surroundings. The algorithmic process model cannot take the place of this perspective since it represents the system model of the process. The detailed process model and the global process model represent different perspectives on the same business system, so these models must be mutually consistent. A macro process map represents the major processes required to deliver a product or service to the customer. These macro process maps can be further detailed in sub-diagrams. It is often the case that process maps cross different functional areas of the organization. Process maps are used by many companies to have a holistic view of all processes and the connections between them. Maps help in navigating the sub-processes and make understanding of the organization's operations easier. The process map shows relationships and dependencies between processes and its focus should be on core business processes of the organization. A process map can be seen as the most abstract level of the process architecture, and it acts as the introduction to the more detailed levels. A process map that is correctly designed is able to provide a general understanding of a company's operations. Designing the process map is an important and strategic step for the organization, and it is followed by further business process modelling implementation. == Context == Methodology for Modelling and Analysis of Business Process (MMABP) is a business process modelling methodology developed at the Department of Information Technology, Faculty of Informatics and Statistics of the Prague University of Economics and Business. The methodology is defined as a “general methodology for modelling business systems using informatics methods and approaches”. Methodology is used to analyse business processes and to develop a comprehensive model of the system. The goal of developing a model is to be used for process optimization. The model should be created following the characteristics and specifics of the organization in question and following external influences that can affect the organization. The model should be optimal from an economic perspective, but it should also be optimal from a factual perspective, meaning that it should be as simple as possible while maintaining complete functionality. Business system modelling is based on a two-dimensional approach: Real World structure (substance) – set of objects and their relationships Real World behaviour – set of mutually connected business processes Additionally, there are also two views of the systems: Global view of the system Detailed view of the system's parts This results in the need to model the system from four different perspectives in order to achieve the complete and comprehensive view of the business system. MMABP also proposes which notation languages can be used for modelling each perspective, and it also suggests some improvements to the notation languages in order to fit the purpose. Global view of the objects – Conceptual model (Class diagram) Detailed view of the objects – Object life cycle (State Chart) Global view of the processes – Process map (Eriksson-Penker Diagram/TOGAF Event Diagram/ARIS VAC) Detailed view of the processes – Model of the process flow (BPMN Diagram) Data Flow Diagram (DFD) is additional diagram used for describing the required functionalities of the information system. == Notation standards == === Eriksson-Penker Diagram === Eriksson-Penker diagram is a tool used in business model analysis and design. It is named after Hans-Erik Eriksson and Magnus Penker, who developed the concept in their book "Business modelling with UML: Business Patterns at Work”. Eriksson-Penker diagrams are used to map out the key components of a business model and how they interact with one another. The diagrams typically consist of a series of boxes and lines that represent the different elements of the business model, such as the value proposition, customer segments, channels, revenue streams, and key resources. The lines between the boxes represent the relationships and dependencies between the different elements of the business model. These diagrams are useful for visualizing and understanding the various components of a business model, and can help organizations identify potential areas for improvement or areas of risk. They can also be used as a communication tool to help stakeholders understand the business model and its underlying assumptions. These diagrams are useful for visualizing and understanding the various components of a business model, and can help organizations identify potential areas for improvement or areas of risk. They can also be used as a communication tool to help stakeholders understand the business model and its underlying assumptions. It is possible to use Eriksson-Penker diagrams to create a global process view of a business. In this case, a diagram would be used to map out the key processes and activities that are involved in the business, as well as the relationships and dependencies between these processes. For example, an Eriksson-Penker diagram could be used to depict the various steps involved in the product development process, from concept development to market launch. It could also be used to show how different functions within the organization, such as marketing, sales, and production, interact and depend on one another to support the overall business. Eriksson-Penker diagram is one of the most popular de facto standards that can be used for an object-oriented global view of business processes. It is developed as an extension of the UML, and it is often used together with the BPMN to compensate for the lack of possibility to model the global view with this widely accepted standard. === TOGAF Event Diagram === TOGAF (The Open Group Architecture Framework) is a framework for enterprise architecture that provides a common language and set of standards for designing, planning, implementing, and governing an enterprise's IT architecture. TOGAF event diagrams are diagrams used in the TOGAF framework to represent the flow of events within a system or process. The TOGAF Event Diagram is a visual representation of the events within an organization or system. It can be used to show the sequence of events that occur in a particular process, as well as the relationships between the events and the stakeholders involved. TOGAF Event Diagrams can be useful in creating a global process view because they provide a visual representation of the events, which can be helpful in understanding how the process fits into the larger context of the organization. TOGAF Event Diagram is the most perspective standard for the system view of processes today. It is used to represent the system of processes as well as their connections to the functional organizational structure. === ARIS Value Added Chain === ARIS (Architecture of Integrated Information Systems) is a methodology and a set of tools for designing and managing business processes. It is based on the idea that business processes are the core of an organization and that they can be modelled and optimized to improve efficiency and effectiveness. The ARIS methodology provides a framework for understanding and analysing business processes, as well as for designing and implementing improvements to those processes. It includes a set of graphical modelling languages and tools for creating process models, as well as a database for storing and managing pr
Read more →
Talkman

Talkman is an edutainment video game developed and published by Sony Computer Entertainment for the PlayStation Portable. It utilizes voice-activated translation software that operates in four languages, Japanese, English, Korean, and Mandarin Chinese. The name "Talkman" is a reference to Sony's Walkman line of portable audio products. It was released in Japan on November 17, 2005, and in America on August 5, 2008 (via the PlayStation Store), as Talkman Travel. In America, however, instead of receiving all the languages included in the Japanese version in one package, single-language packs are available for $2.99 each. Available packs are: Paris (French), Rome (Italian), and Tokyo (Japanese). The software is designed for travelers and entertainment, mostly containing slang and useful travel phrases. While originally sold in and designed for the Japanese market for Japanese users, its translation function operates between all four languages. In Japan, the software has proven popular with the middle-aged female demographic due to an interest in South Korean products, and Korean-language soap operas and movies; and as a fun English education aid for children. Outside of pure translations, Talkman also lets players play games to test their fluency of a language. The program comes with a USB microphone included. This microphone draws power through two gold-colored contacts on the top of the PSP, one on each side of the mini-USB port. This is uncommon due to the ability for most USB products to draw power through USB. These proprietary contacts are similar to the gold-colored contacts on the bottom-right of the device, which are used for charging. Note: The Chotto Shot (aka "Go!Cam") has a built-in microphone that also can be used with the Talkman program. Furthermore, the PSP-3000 model and PSP Go have built-in microphones that work with this application, without the need for any external attachments. == Talkman Euro == Following the success of the Asian version of Talkman, a version designed for translating European languages was developed and released on June 16, 2006. Talkman Euro is available in two versions. The Japanese version contains support for English, Italian, Spanish, German, French, and Japanese, while the Chinese version contains support for Traditional Chinese instead of Japanese. The differences on the packaging (the Japanese flag as opposed to a flag with the word "mie" in Chinese) are minimal and hard to notice. == Talkman UMD-only package == Talkman is also released as a UMD-only package, so users who already have the USB mic or camera can choose to purchase this standalone version. The Sony PSP Headset and the built-in microphone on later model PSPs have also been confirmed to work with Talkman.
Read more →
Top 10 AI Humanizers Compared (2026)

Looking for the best AI humanizer? An AI humanizer is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI humanizer slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.
Read more →
Jian Ma (computational biologist)

Jian Ma (Chinese: 马坚) is an American computer scientist and computational biologist. He is the Ray and Stephanie Lane Professor of Computational Biology in the School of Computer Science at Carnegie Mellon University. He is a faculty member in the Ray and Stephanie Lane Computational Biology Department. His lab develops AI/ML methods to study the structure and function of the human genome and cellular organization and their implications for health and disease. During his Ph.D. and postdoc training, he developed algorithms to reconstruct the ancestral mammalian genome and evolutionary history. His research group has recently pioneered a series of new machine learning solutions for 3D genome organization, single-cell epigenomics, spatial omics, and complex molecular interactions. His lab also explores large language models to uncover gene regulatory mechanisms and the intricate connections among cellular components, with the aim of driving discovery and guiding experimentation. He received an NSF CAREER award in 2011. In 2020, he was awarded a Guggenheim Fellowship in Computer Science. He received the Allen Newell Award for Research Excellence (2025). He is an elected Fellow of the American Association for the Advancement of Science, the American Institute for Medical and Biological Engineering, the International Society for Computational Biology, and the Association for Computing Machinery. He leads an NIH 4D Nucleome Center to develop machine learning algorithms to better understand the cell nucleus. He served as the Program Chair for RECOMB 2024. He is also a member of the Scientific Advisory Board of the Chan Zuckerberg Biohub Chicago (CZ Biohub Chicago) and the RECOMB Steering Committee. In 2024, he launched the Center for AI-Driven Biomedical Research (AI4BIO) at CMU, which will be a catalyst for innovations at the intersection of AI and biomedicine across the School of Computer Science and campus. == Selected Recent Publications == Chen V#, Yang M#, Cui W, Kim JS, Talwalkar A, and Ma J. Applying interpretable machine learning in computational biology - pitfalls, recommendations and opportunities for new developments. Nature Methods, 21(8):1454-1461, 2024. Xiong K#, Zhang R#, and Ma J. scGHOST: Identifying single-cell 3D genome subcompartments. Nature Methods, 21(5):814-822, 2024. Zhou T, Zhang R, Jia D, Doty RT, Munday AD, Gao D, Xin L, Abkowitz JL, Duan Z, and Ma J. GAGE-seq concurrently profiles multiscale 3D genome organization and gene expression in single cells. Nature Genetics, 56(8):1701-1711, 2024. Zhang Y, Boninsegna L, Yang M, Misteli T, Alber F, and Ma J. Computational methods for analysing multiscale 3D genome organization. Nature Reviews Genetics, 5(2):123-141, 2024. Chidester B#, Zhou T#, Alam S, and Ma J. SPICEMIX enables integrative single-cell spatial modeling of cell identity. Nature Genetics, 55(1):78-88, 2023. [Cover Article] Zhang R#, Zhou T#, and Ma J. Ultrafast and interpretable single-cell 3D genome analysis with Fast-Higashi. Cell Systems, 13(10):P798-807.E6, 2022. [Cover Article] Zhu X#, Zhang Y#, Wang Y, Tian D, Belmont AS, Swedlow JR, and Ma J. Nucleome Browser: An integrative and multimodal data navigation platform for 4D Nucleome. Nature Methods, 19(8):911-913, 2022. Zhang R, Zhou T, and Ma J. Multiscale and integrative single-cell Hi-C analysis with Higashi. Nature Biotechnology, 40:254–261, 2022.
Read more →
ArcSoft ShowBiz

ShowBiz is a video editor by ArcSoft for the Windows operating system. It can create VCD and DVDs and can also export to the formats AVI, MPEG, WMV, and MOV. ShowBiz also contains a DVD burning and menu building feature. As of 2003, it was one of the three most dominant bundled titles. == Reception == PC Magazine reviewer Jan Ozer states: "ArcSoft's ShowBiz has evolved into a competent editor that's generally more usable than Dazzle's MovieStar program, providing more configuration controls, better preview features, and a much greater range of fun effects." John Virata, senior editor of Digital Media Online, says in his three page review of ShowBiz DVD 2, "It is an easy editor to work with and has a logically laid out interface that takes you step by step through the video creation and DVD creation process"
Read more →
Moore machine

In the theory of computation, a Moore machine is a finite-state machine whose current output values are determined only by its current state. This is in contrast to a Mealy machine, whose output values are determined both by its current state and by the values of its inputs. Like other finite state machines, in Moore machines, the input typically influences the next state. Thus the input may indirectly influence subsequent outputs, but not the current or immediate output. The Moore machine is named after Edward F. Moore, who presented the concept in a 1956 paper, “Gedanken-experiments on Sequential Machines.” == Formal definition == A Moore machine can be defined as a 6-tuple ( S , s 0 , Σ , Λ , δ , G ) {\displaystyle (S,s_{0},\Sigma ,\Lambda ,\delta ,G)} consisting of the following: A finite set of states S {\displaystyle S} A start state (also called initial state) s 0 {\displaystyle s_{0}} which is an element of S {\displaystyle S} A finite set called the input alphabet Σ {\displaystyle \Sigma } A finite set called the output alphabet Λ {\displaystyle \Lambda } A transition function δ : S × Σ → S {\displaystyle \delta :S\times \Sigma \rightarrow S} mapping a state and the input alphabet to the next state An output function G : S → Λ {\displaystyle G:S\rightarrow \Lambda } mapping each state to the output alphabet "Evolution across time" is realized in this abstraction by having the state machine consult the time-changing input symbol at discrete "timer ticks" t 0 , t 1 , t 2 , . . . {\displaystyle t_{0},t_{1},t_{2},...} and react according to its internal configuration at those idealized instants, or else having the state machine wait for a next input symbol (as on a FIFO) and react whenever it arrives. A Moore machine can be regarded as a restricted type of finite-state transducer. == Visual representation == === Table === A state transition table is a table listing all the triples in the transition relation δ : S × Σ → S {\displaystyle \delta :S\times \Sigma \rightarrow S} . === Diagram === The state diagram for a Moore machine, or Moore diagram, is a state diagram that associates an output value with each state. == Relationship with Mealy machines == As Moore and Mealy machines are both types of finite-state machines, they are equally expressive: either type can be used to parse a regular language. The difference between Moore machines and Mealy machines is that in the latter, the output of a transition is determined by the combination of current state and current input ( S × Σ {\displaystyle S\times \Sigma } as the domain of G {\displaystyle G} ), as opposed to just the current state ( S {\displaystyle S} as the domain of G {\displaystyle G} ). When represented as a state diagram, for a Moore machine, each node (state) is labeled with an output value; for a Mealy machine, each arc (transition) is labeled with an output value. Every Moore machine M {\displaystyle M} is equivalent to the Mealy machine with the same states and transitions and the output function G ( s , σ ) = G M ( δ M ( s , σ ) ) {\displaystyle G(s,\sigma )=G_{M}(\delta _{M}(s,\sigma ))} , which takes each state-input pair ( s , σ ) {\displaystyle (s,\sigma )} and yields G M ( δ M ( s , σ ) ) {\displaystyle G_{M}(\delta _{M}(s,\sigma ))} , where G M {\displaystyle G_{M}} is M {\displaystyle M} 's output function and δ M {\displaystyle \delta _{M}} is M {\displaystyle M} 's transition function. However, not every Mealy machine can be converted to an equivalent Moore machine. Some can be converted only to an almost equivalent Moore machine, with outputs shifted in time. This is due to the way that state labels are paired with transition labels to form the input/output pairs. Consider a transition s i → s j {\displaystyle s_{i}\rightarrow s_{j}} from state s i {\displaystyle s_{i}} to state s j {\displaystyle s_{j}} . The input causing the transition s i → s j {\displaystyle s_{i}\rightarrow s_{j}} labels the edge ( s i , s j ) {\displaystyle (s_{i},s_{j})} . The output corresponding to that input, is the label of state s i {\displaystyle s_{i}} . Notice that this is the source state of the transition. So for each input, the output is already fixed before the input is received, and depends solely on the present state. This is the original definition by E. Moore. It is a common mistake to use the label of state s j {\displaystyle s_{j}} as output for the transition s i → s j {\displaystyle s_{i}\rightarrow s_{j}} . == Examples == Types according to number of inputs/outputs. === Simple === Simple Moore machines have one input and one output: edge detector using XOR binary adding machine clocked sequential systems (a restricted form of Moore machine where the state changes only when the global clock signal changes) Most digital electronic systems are designed as clocked sequential systems. Clocked sequential systems are a restricted form of Moore machine where the state changes only when the global clock signal changes. Typically the current state is stored in flip-flops, and a global clock signal is connected to the "clock" input of the flip-flops. Clocked sequential systems are one way to solve metastability problems. A typical electronic Moore machine includes a combinational logic chain to decode the current state into the outputs (lambda). The instant the current state changes, those changes ripple through that chain, and almost instantaneously the output gets updated. There are design techniques to ensure that no glitches occur on the outputs during that brief period while those changes are rippling through the chain, but most systems are designed so that glitches during that brief transition time are ignored or are irrelevant. The outputs then stay the same indefinitely (LEDs stay bright, power stays connected to the motors, solenoids stay energized, etc.), until the Moore machine changes state again. ==== Worked example ==== A sequential network has one input and one output. The output becomes 1 and remains 1 thereafter when at least two 0's and two 1's have occurred as inputs. A Moore machine with nine states for the above description is shown on the right. The initial state is state A, and the final state is state I. The state table for this example is as follows: === Complex === More complex Moore machines can have multiple inputs as well as multiple outputs. == Gedanken-experiments == In Moore's 1956 paper "Gedanken-experiments on Sequential Machines", the ( n ; m ; p ) {\displaystyle (n;m;p)} automata (or machines) S {\displaystyle S} are defined as having n {\displaystyle n} states, m {\displaystyle m} input symbols and p {\displaystyle p} output symbols. Nine theorems are proved about the structure of S {\displaystyle S} , and experiments with S {\displaystyle S} . Later, " S {\displaystyle S} machines" became known as "Moore machines". At the end of the paper, in Section "Further problems", the following task is stated: Another directly following problem is the improvement of the bounds given at the theorems 8 and 9. Moore's Theorem 8 is formulated as: Given an arbitrary ( n ; m ; p ) {\displaystyle (n;m;p)} machine S {\displaystyle S} , such that every two of its states are distinguishable from one another, then there exists an experiment of length n ( n − 1 ) 2 {\displaystyle {\tfrac {n(n-1)}{2}}} which determines the state of S {\displaystyle S} at the end of the experiment. In 1957, A. A. Karatsuba proved the following two theorems, which completely solved Moore's problem on the improvement of the bounds of the experiment length of his "Theorem 8". Theorem A. If S {\displaystyle S} is an ( n ; m ; p ) {\displaystyle (n;m;p)} machine, such that every two of its states are distinguishable from one another, then there exists a branched experiment of length at most ( n − 1 ) ( n − 2 ) 2 + 1 {\displaystyle {\tfrac {(n-1)(n-2)}{2}}+1} through which one may determine the state of S {\displaystyle S} at the end of the experiment. Theorem B. There exists an ( n ; m ; p ) {\displaystyle (n;m;p)} machine, every two states of which are distinguishable from one another, such that the length of the shortest experiments establishing the state of the machine at the end of the experiment is equal to ( n − 1 ) ( n − 2 ) 2 + 1 {\displaystyle {\tfrac {(n-1)(n-2)}{2}}+1} . Theorems A and B were used for the basis of the course work of a student of the fourth year, A. A. Karatsuba, "On a problem from the automata theory", which was distinguished by testimonial reference at the competition of student works of the faculty of mechanics and mathematics of Moscow State University in 1958. The paper by Karatsuba was given to the journal Uspekhi Mat. Nauk on 17 December 1958 and was published there in June 1960. Until the present day (2011), Karatsuba's result on the length of experiments is the only exact nonlinear result, both in automata theory, and in similar problems of computational complexity theory.
Read more →
Vlado Keselj

Vlado Keselj (Vlado Kešelj) is a Serbian-Canadian computer scientist known for his research in natural language processing and authorship attribution. He is a professor at Dalhousie University. == Education == As a high school student in Yugoslavia, Keselj competed in the 1987 International Mathematical Olympiad, earning a bronze medal. He earned his Ph.D. in 2002 at the University of Waterloo, with the dissertation Modular Stochastic HPSGs for Question Answering supervised by Nick Cercone. == Awards == Vlado Keselj is a recipient of the 2019 CAIAC Distinguished Service Award, awarded by the Canadian Artificial Intelligence Association (CAIAC). == Selected publications == Kešelj, V., Peng, F., Cercone, N., & Thomas, C. (2003, August). N-gram-based author profiles for authorship attribution. In Proceedings of the Conference of the Pacific Association for Computational Linguistics, PACLING 2003 (Vol. 3, pp. 255–264).
Read more →
Two-way finite automaton

In computer science, in particular in automata theory, a two-way finite automaton is a finite automaton that is allowed to re-read its input. == Two-way deterministic finite automaton == A two-way deterministic finite automaton (2DFA) is an abstract machine, a generalized version of the deterministic finite automaton (DFA) which can revisit characters already processed. As in a DFA, there are a finite number of states with transitions between them based on the current character, but each transition is also labelled with a value indicating whether the machine will move its position in the input to the left, right, or stay at the same position. Equivalently, 2DFAs can be seen as read-only Turing machines with no work tape, only a read-only input tape. 2DFAs were introduced in a seminal 1959 paper by Rabin and Scott, who proved them to have equivalent power to one-way DFAs. That is, any formal language which can be recognized by a 2DFA can be recognized by a DFA which only examines and consumes each character in order. Since DFAs are obviously a special case of 2DFAs, this implies that both kinds of machines recognize precisely the class of regular languages. However, the equivalent DFA for a 2DFA may require exponentially many states, making 2DFAs a much more practical representation for algorithms for some common problems. 2DFAs are also equivalent to read-only Turing machines that use only a constant amount of space on their work tape, since any constant amount of information can be incorporated into the finite control state via a product construction (a state for each combination of work tape state and control state). == Formal description == Formally, a two-way deterministic finite automaton can be described by the following 8-tuple: M = ( Q , Σ , L , R , δ , s , t , r ) {\displaystyle M=(Q,\Sigma ,L,R,\delta ,s,t,r)} where Q {\displaystyle Q} is the finite, non-empty set of states Σ {\displaystyle \Sigma } is the finite, non-empty set of input symbols L {\displaystyle L} is the left endmarker R {\displaystyle R} is the right endmarker δ : Q × ( Σ ∪ { L , R } ) → Q × { l e f t , r i g h t } {\displaystyle \delta :Q\times (\Sigma \cup \{L,R\})\rightarrow Q\times \{\mathrm {left,right} \}} s {\displaystyle s} is the start state t {\displaystyle t} is the end state r {\displaystyle r} is the reject state In addition, the following two conditions must also be satisfied: For all q ∈ Q {\displaystyle q\in Q} δ ( q , L ) = ( q ′ , r i g h t ) {\displaystyle \delta (q,L)=(q^{\prime },\mathrm {right} )} for some q ′ ∈ Q {\displaystyle q^{\prime }\in Q} δ ( q , R ) = ( q ′ , l e f t ) {\displaystyle \delta (q,R)=(q^{\prime },\mathrm {left} )} for some q ′ ∈ Q {\displaystyle q^{\prime }\in Q} It says that there must be some transition possible when the pointer reaches either end of the input word. For all symbols σ ∈ Σ ∪ { L } {\displaystyle \sigma \in \Sigma \cup \{L\}} δ ( t , σ ) = ( t , R ) {\displaystyle \delta (t,\sigma )=(t,R)} δ ( r , σ ) = ( r , R ) {\displaystyle \delta (r,\sigma )=(r,R)} δ ( t , R ) = ( t , L ) {\displaystyle \delta (t,R)=(t,L)} δ ( r , R ) = ( r , L ) {\displaystyle \delta (r,R)=(r,L)} It says that once the automaton reaches the accept or reject state, it stays in there forever and the pointer goes to the right most symbol and cycles there infinitely. == Two-way nondeterministic finite automaton == A two-way nondeterministic finite automaton (2NFA) may have multiple transitions defined in the same configuration. Its transition function is δ : Q × ( Σ ∪ { L , R } ) → 2 Q × { l e f t , r i g h t } {\displaystyle \delta :Q\times (\Sigma \cup \{L,R\})\rightarrow 2^{Q\times \{\mathrm {left,right} \}}} . Like a standard one-way NFA, a 2NFA accepts a string if at least one of the possible computations is accepting. Like the 2DFAs, the 2NFAs also accept only regular languages. == Two-way alternating finite automaton == A two-way alternating finite automaton (2AFA) is a two-way extension of an alternating finite automaton (AFA). Its state set is Q = Q ∃ ∪ Q ∀ {\displaystyle Q=Q_{\exists }\cup Q_{\forall }} where Q ∃ ∩ Q ∀ = ∅ {\displaystyle Q_{\exists }\cap Q_{\forall }=\emptyset } . States in Q ∃ {\displaystyle Q_{\exists }} and Q ∀ {\displaystyle Q_{\forall }} are called existential resp. universal. In an existential state a 2AFA nondeterministically chooses the next state like an NFA, and accepts if at least one of the resulting computations accepts. In a universal state 2AFA moves to all next states, and accepts if all the resulting computations accept. == State complexity tradeoffs == Two-way and one-way finite automata, deterministic and nondeterministic and alternating, accept the same class of regular languages. However, transforming an automaton of one type to an equivalent automaton of another type incurs a blow-up in the number of states. Christos Kapoutsis determined that transforming an n {\displaystyle n} -state 2DFA to an equivalent DFA requires n ( n n − ( n − 1 ) n ) {\displaystyle n(n^{n}-(n-1)^{n})} states in the worst case. If an n {\displaystyle n} -state 2DFA or a 2NFA is transformed to an NFA, the worst-case number of states required is ( 2 n n + 1 ) = O ( 4 n n ) {\displaystyle {\binom {2n}{n+1}}=O\left({\frac {4^{n}}{\sqrt {n}}}\right)} . Ladner, Lipton and Stockmeyer. proved that an n {\displaystyle n} -state 2AFA can be converted to a DFA with 2 n 2 n {\displaystyle 2^{n2^{n}}} states. The 2AFA to NFA conversion requires 2 Θ ( n log ⁡ n ) {\displaystyle 2^{\Theta (n\log n)}} states in the worst case, see Geffert and Okhotin. It is an open problem whether every 2NFA can be converted to a 2DFA with only a polynomial increase in the number of states. The problem was raised by Sakoda and Sipser, who compared it to the P vs. NP problem in the computational complexity theory. Berman and Lingas discovered a formal relation between this problem and the L vs. NL open problem, see Kapoutsis for a precise relation. == Sweeping automata == Sweeping automata are 2DFAs of a special kind that process the input string by making alternating left-to-right and right-to-left sweeps, turning only at the endmarkers. Sipser constructed a sequence of languages, each accepted by an n-state NFA, yet which is not accepted by any sweeping automata with fewer than 2 n {\displaystyle 2^{n}} states. == Two-way quantum finite automaton == The concept of 2DFAs was in 1997 generalized to quantum computing by John Watrous's "On the Power of 2-Way Quantum Finite State Automata", in which he demonstrates that these machines can recognize nonregular languages and so are more powerful than DFAs. == Two-way pushdown automaton == A pushdown automaton that is allowed to move either way on its input tape is called two-way pushdown automaton (2PDA); it has been studied by Hartmanis, Lewis, and Stearns (1965). Aho, Hopcroft, Ullman (1968) and Cook (1971) characterized the class of languages recognizable by deterministic (2DPDA) and non-deterministic (2NPDA) two-way pushdown automata; Gray, Harrison, and Ibarra (1967) investigated the closure properties of these languages.
Read more →
Free Studio

Free Studio is a freeware set of multimedia computer programs developed by DVDVideoSoft. The programs are available in one integrated package and also as separate downloads (Free Studio Manager is included in both). == Overview == The Free Studio software bundle consists of about 48 programs, grouped into several sections: YouTube, MP3 & Audio, CD-DVD-BD, DVD & Video, Photo & Images, Mobiles, Apple Devices, and 3D. The largest group is the DVD & Video section containing 14 different applications. Mobiles section is the second largest group with 13 programs. However, the YouTube section, particularly YouTube downloading programs, has gained more popularity among users. The programs have been tested and endorsed by a dozen of software portals and have won awards from these sites. Free Studio is most popular in Germany, Greece, Italy, and the United States. It is also popular in Japan, France, and the United Kingdom. Some of the programs in the package are free and open-source software. == History == DVDVideoSoft project was launched in 2006 by company Digital Wave Ltd., for software development to produce multimedia application software. The founders distributed paid software as an affiliate at the start, later their own products appeared on the site. Free YouTube Download was the first successful program, then DVDVideoSoft created and launched several other 'Free YouTube' applications. Later on upon users' requests DVDVideoSoft started developing other kinds of applications including media converters etc. Today DVDVideoSoft offers up to 49 different programs for video, audio and image processing individually or integrated into the Free Studio package. == Features == DVDVideoSoft YouTube programs can be used to download YouTube videos in their original format and convert them to AVI, DVD, MP4, WMV etc. or different audio formats. YouTube section contains Free Video Call Recorder for Skype button, but the program itself is not included into FS installation (it has to be downloaded and installed separately). The "MP3 & Audio" section consists of the programs which convert audio files between different formats, convert audio files to Flash for web, extract audio from video files, edit audio files (Free Audio Dub), rip and burn CDs. Enclosed in the CD-DVD-BD section are the applications that enable users to burn files and folders to discs, to convert videos to a DVD format and vice versa, to burn CDs, and to copy music from audio CDs into files. The "DVD and Video" section contains several desktop video and DVD converters. Some of the programs can flip, rotate and cut (Free Video Dub) videos. One of the most popular programs from the section is Free Video Dub. Converted videos are now, contrary to previous versions, watermarked if no paid membership is present. Free Studio includes several applications for Apple phones, iPods and other devices. The Mobiles section contains a dozen video converters for various mobile devices such as cell phones, Tablets and Game consoles. They convert videos to play them on (BlackBerry, HTC, LG phones, Sony/Sony Ericsson, Nintendo, Xbox, Motorola phones, etc.) The "Photo & Images" section incorporates the programs for image conversion and resizing, extracting JPEG frames from videos (Free Video To JPEG Converter), recording screen activities, making screenshots (Free Screen Recorder). The 3D section is composed of the programs to make 3D videos and 3D images. There are several algorithms which allow to create different types of 3D images. == Supported formats == === Video formats === Input: .avi; .ivf; .div; .divx; .mpg; .mpeg; .mpe; .mp4; .m4v; .wmv; .asf; .webm; .mkv; .mov; .qt; .ts; .mts; .m2t; .m2ts; .mod; .tod; .vro; .dat; .3gp2; .3gpp; .3gp; .3g2; .dvr-ms; .flv; .f4v; .amv; .rm; .rmm; .rv; .rmvb; .ogv; DVD video Output: .mp4; .wmv; .avi; .mkv; .webm; .flv; .swf; .mov; .3gp; .m2ts; DVD video === Audio formats === Input: .mp3 .wav; .aac; .m4a; .m4b; .wma; .ogg; .flac; .ra; .ram; .amr; .ape; .mka; .tta; .aiff; .au; .mpc; .spx; .ac3; audio cd Output: .mp3; .m4a; .aac; .wav; .wma; .ogg; .flac; .ape; audio CD === Image formats === Input: .jpg, .png, .bmp, .gif, .tga Output: .jpg, .png, .bmp, .gif, .tga, .pdf == Reception == The programs have been tested and endorsed by Chip Online, Tucows, SnapFiles, Brothersoft, and Softonic and have won awards from these sites. Free Studio is most popular in Germany, United States and Italy. It is also popular in Japan, France and the United Kingdom. The most popular applications, according to CNET statistics, include Free YouTube to MP3 Converter, Free Video to MP3 Converter, Free MP4 Video Converter and Free YouTube Download. Other programs with high rank: Free AVI Video Converter, Free Video Editor, Free Audio Converter and Free Studio in a whole. == Criticism == Free Studio (as can be common for freeware packages) is criticized for toolbar and Web search engine installation. Older versions have also included OpenCandy, which is loaded automatically, with no request for user approval. There can be difficulties installing only the programs needed without installing bundled extra programs. In March 2017, DVDVideoSoft announced that it had stopped showing other products' ads during installation and removed all toolbars, search engines, and OpenCandy.
Read more →
Is an AI Humanizer Worth It in 2026?

Shopping for the best AI humanizer? An AI humanizer is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI humanizer slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.
Read more →
Eric Xing

Eric Poe Xing (Chinese: 邢波) is an American computer scientist who has been serving as president of Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) since January 2021. He is also a professor in the Carnegie Mellon University School of Computer Science where he founded the SAILING Lab in 2004, and is the co-founder of the AI companies Petuum and GenBio AI. Xing's research focuses on statistical machine learning, probabilistic graphical models, and systems for distributed machine learning. He was elected a Fellow of the Institute of Electrical and Electronics Engineers in 2019 for "contributions to machine learning algorithms and systems" and a Fellow of the Association for Computing Machinery in 2022 for "contributions to algorithms, architectures, and applications in machine learning." == Education == Xing earned a B.Sc. in physics from Tsinghua University in 1993, and an M.Sc. in computer science from Rutgers University in 1998. He earned a Ph.D. in molecular biology and biochemistry from Rutgers in 1999, supervised by molecular cancer researcher Chung S. Yang. His dissertation examined the inactivation of the Rb and p53 pathways in human esophageal squamous cell carcinoma. He earned a second Ph.D. in computer science from the University of California, Berkeley in 2004, supervised by Richard Karp, Michael I. Jordan, and Stuart J. Russell. His thesis applied probabilistic graphical models to motif identification and haplotype inference in genomic data. == Career == Xing joined Carnegie Mellon University (CMU) as a faculty member in 2004, where he created the Statistical Artificial Intelligence and Integrative Genomics (SAILING) Lab. He held visiting appointments from 2010 to 2011, serving as a visiting research professor at Facebook Inc. and as a visiting associate professor in the Department of Statistics at Stanford University. He served as co-Program Chair of the International Conference on Machine Learning (ICML) in 2014 and General Chair in 2019. Xing served as the founding director of CMU’s Center for Machine Learning and Health, established in 2015 as part of the Pittsburgh Health Data Alliance, a collaboration between CMU, the University of Pittsburgh, and the University of Pittsburgh Medical Center. In 2016, Xing co-founded Petuum Inc., a US-based startup. In 2017, Petuum raised $93 million in a round of venture funding from SoftBank. In 2018 Petuum was named a World Economic Forum Technology Pioneer. In 2019, Xing received the Carnegie Science Award for Startup Entrepreneurs in recognition of his leadership of Petuum. On 29 November 2020, Xing was appointed president of the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), with the appointment taking effect in January 2021. In 2024, Xing co-founded GenBio AI where he is chief scientist. The US-based startup, which he co-founded with David Baker, Ziv Bar-Joseph, Emma Lundberg, Le Song and Fred Hu, aims to create AI-driven digital organisms (AIDO) for the purposes of modeling medical treatments. Xing has overseen the launch of the MBZUAI Institute of Foundation Models (IFM), which focuses on research and development of large-scale foundation models. In 2025–2026, IFM released the open-source reasoning model K2 Think, which was covered internationally as part of the UAE’s push to develop domestically controlled (“sovereign”) AI capabilities. IFM presented PAN as a “world model” research project and demonstrated related systems publicly. MBZUAI also collaborated with G42 and Cerebras Systems on the Jais language model, an open-source Arabic–English large language model released in 2023, according to Reuters. == Awards and honors == Xing is a recipient of the National Science Foundation (NSF) Career Award and the Alfred P. Sloan Research Fellowship. Xing is an elected Fellow of the following institutes and associations: Association for the Advancement of Artificial Intelligence (AAAI) 2016 Institute of Electrical and Electronics Engineers (IEEE) 2019 for "contributions to machine learning algorithms and systems" American Statistical Association (ASA) 2022 Association for Computing Machinery (ACM) 2022 for "contributions to algorithms, architectures, and applications in machine learning" Institute of Mathematical Statistics (IMS) 2023 International Society for Computational Biology (ISCB) 2026 == Selected publications == Eric P. Xing; Michael I. Jordan; Stuart J. Russell; Andrew Y. Ng (2003). "Distance Metric Learning with Application to Clustering with Side-Information" (PDF). Advances in Neural Information Processing Systems 15. Advances in Neural Information Processing Systems. Wikidata Q77691192. Edoardo M. Airoldi; David M. Blei; Stephen E Fienberg; Eric P Xing (1 September 2008). "Mixed Membership Stochastic Blockmodels". Journal of Machine Learning Research. 9: 1981–2014. ISSN 1533-7928. PMC 3119541. PMID 21701698. Wikidata Q35058357. Eric P. Xing; Michael I. Jordan; Richard M. Karp (28 June 2001), Feature selection for high-dimensional genomic microarray data, vol. 18, pp. 601–608, Wikidata Q138678867 Xing EP; Karp RM (1 January 2001). "CLIFF: clustering of high-dimensional microarray data via iterative feature filtering using normalized cuts". Bioinformatics. 17 Suppl 1: S306-15. doi:10.1093/BIOINFORMATICS/17.SUPPL_1.S306. ISSN 1367-4803. PMID 11473022. Wikidata Q30657299.
Read more →
State complexity

State complexity is an area of theoretical computer science dealing with the size of abstract automata, such as different kinds of finite automata. The classical result in the area is that simulating an n {\displaystyle n} -state nondeterministic finite automaton by a deterministic finite automaton requires exactly 2 n {\displaystyle 2^{n}} states in the worst case. == Transformation between variants of finite automata == Finite automata can be deterministic and nondeterministic, one-way (DFA, NFA) and two-way (2DFA, 2NFA). Other related classes are unambiguous (UFA), self-verifying (SVFA) and alternating (AFA) finite automata. These automata can also be two-way (2UFA, 2SVFA, 2AFA). All these machines can accept exactly the regular languages. However, the size of different types of automata necessary to accept the same language (measured in the number of their states) may be different. For any two types of finite automata, the state complexity tradeoff between them is an integer function f {\displaystyle f} where f ( n ) {\displaystyle f(n)} is the least number of states in automata of the second type sufficient to recognize every language recognized by an n {\displaystyle n} -state automaton of the first type. The following results are known. NFA to DFA: 2 n {\displaystyle 2^{n}} states. This is the subset construction by Rabin and Scott, proved optimal by Lupanov. UFA to DFA: 2 n {\displaystyle 2^{n}} states, see Leung, An earlier lower bound by Schmidt was smaller. NFA to UFA: 2 n − 1 {\displaystyle 2^{n}-1} states, see Leung. There was an earlier smaller lower bound by Schmidt. SVFA to DFA: Θ ( 3 n / 3 ) {\displaystyle \Theta (3^{n/3})} states, see Jirásková and Pighizzini 2DFA to DFA: n ( n n − ( n − 1 ) n ) {\displaystyle n(n^{n}-(n-1)^{n})} states, see Kapoutsis. Earlier construction by Shepherdson used more states, and an earlier lower bound by Moore was smaller. 2DFA to NFA: ( 2 n n + 1 ) = O ( 4 n n ) {\displaystyle {\binom {2n}{n+1}}=O({\frac {4^{n}}{\sqrt {n}}})} , see Kapoutsis. Earlier construction by Birget used more states. 2NFA to NFA: ( 2 n n + 1 ) {\displaystyle {\binom {2n}{n+1}}} , see Kapoutsis. 2NFA to NFA accepting the complement: O ( 4 n ) {\displaystyle O(4^{n})} states, see Vardi. AFA to DFA: 2 2 n {\displaystyle 2^{2^{n}}} states, see Chandra, Kozen and Stockmeyer. AFA to NFA: 2 n {\displaystyle 2^{n}} states, see Fellah, Jürgensen and Yu. 2AFA to DFA: 2 n 2 n {\displaystyle 2^{n2^{n}}} , see Ladner, Lipton and Stockmeyer. 2AFA to NFA: 2 Θ ( n log ⁡ n ) {\displaystyle 2^{\Theta (n\log n)}} , see Geffert and Okhotin. === The 2DFA vs. 2NFA problem and logarithmic space === It is an open problem whether all 2NFAs can be converted to 2DFAs with polynomially many states, i.e. whether there is a polynomial p ( n ) {\displaystyle p(n)} such that for every n {\displaystyle n} -state 2NFA there exists a p ( n ) {\displaystyle p(n)} -state 2DFA. The problem was raised by Sakoda and Sipser, who compared it to the P vs. NP problem in the computational complexity theory. Berman and Lingas discovered a formal relation between this problem and the L vs. NL open problem. This relation was further elaborated by Kapoutsis. == State complexity of operations for finite automata == Given a binary regularity-preserving operation on languages ∘ {\displaystyle \circ } and a family of automata X (DFA, NFA, etc.), the state complexity of ∘ {\displaystyle \circ } is an integer function f ( m , n ) {\displaystyle f(m,n)} such that for each m-state X-automaton A and n-state X-automaton B there is an f ( m , n ) {\displaystyle f(m,n)} -state X-automaton for L ( A ) ∘ L ( B ) {\displaystyle L(A)\circ L(B)} , and for all integers m, n there is an m-state X-automaton A and an n-state X-automaton B such that every X-automaton for L ( A ) ∘ L ( B ) {\displaystyle L(A)\circ L(B)} must have at least f ( m , n ) {\displaystyle f(m,n)} states. Analogous definition applies for operations with any number of arguments. The first results on state complexity of operations for DFAs were published by Maslov and by Yu, Zhuang and Salomaa. Holzer and Kutrib pioneered the state complexity of operations on NFA. The known results for basic operations are listed below. === Union === If language L 1 {\displaystyle L_{1}} requires m states and language L 2 {\displaystyle L_{2}} requires n states, how many states does L 1 ∪ L 2 {\displaystyle L_{1}\cup L_{2}} require? DFA: m n {\displaystyle mn} states, see Maslov and Yu, Zhuang and Salomaa. NFA: m + n + 1 {\displaystyle m+n+1} states, see Holzer and Kutrib. UFA: at least min ( n , m ) Ω ( log ⁡ ( min ( n , m ) ) ) {\displaystyle \min(n,m)^{\Omega (\log(\min(n,m)))}} ; between m n + m + n {\displaystyle mn+m+n} and m + n m 2 0.79 m {\displaystyle m+nm2^{0.79m}} states, see Jirásek, Jirásková and Šebej. SVFA: m n {\displaystyle mn} states, see Jirásek, Jirásková and Szabari. 2DFA: between m + n {\displaystyle m+n} and 4 m + n + 4 {\displaystyle 4m+n+4} states, see Kunc and Okhotin. 2NFA: m + n {\displaystyle m+n} states, see Kunc and Okhotin. === Intersection === How many states does L 1 ∩ L 2 {\displaystyle L_{1}\cap L_{2}} require? DFA: m n {\displaystyle mn} states, see Maslov and Yu, Zhuang and Salomaa. NFA: m n {\displaystyle mn} states, see Holzer and Kutrib. UFA: m n {\displaystyle mn} states, see Jirásek, Jirásková and Šebej. SVFA: m n {\displaystyle mn} states, see Jirásek, Jirásková and Szabari. 2DFA: between m + n {\displaystyle m+n} and m + n + 1 {\displaystyle m+n+1} states, see Kunc and Okhotin. 2NFA: between m + n {\displaystyle m+n} and m + n + 1 {\displaystyle m+n+1} states, see Kunc and Okhotin. === Complementation === If language L requires n states then how many states does its complement require? DFA: n {\displaystyle n} states, by exchanging accepting and rejecting states. NFA: 2 n {\displaystyle 2^{n}} states, see Birget. or Jirásková UFA: at least n Ω ~ ( log ⁡ n ) {\displaystyle n^{{\tilde {\Omega }}(\log n)}} states, see Göös, Kiefer and Yuan, (this follows an earlier bound by Raskin); and at most n + 1 ⋅ 2 0.5 n {\displaystyle {\sqrt {n+1}}\cdot 2^{0.5n}} states, see Indzhev and Kiefer. SVFA: n {\displaystyle n} states, by exchanging accepting and rejecting states. 2DFA: at least n {\displaystyle n} and at most 4 n {\displaystyle 4n} states, see Geffert, Mereghetti and Pighizzini. === Concatenation === How many states does L 1 L 2 = { w 1 w 2 ∣ w 1 ∈ L 1 , w 2 ∈ L 2 } {\displaystyle L_{1}L_{2}=\{w_{1}w_{2}\mid w_{1}\in L_{1},w_{2}\in L_{2}\}} require? DFA: m ⋅ 2 n − 2 n − 1 {\displaystyle m\cdot 2^{n}-2^{n-1}} states, see Maslov and Yu, Zhuang and Salomaa. NFA: m + n {\displaystyle m+n} states, see Holzer and Kutrib. UFA: 3 4 2 m + n − 1 {\displaystyle {\frac {3}{4}}2^{m+n}-1} states, see Jirásek, Jirásková and Šebej. SVFA: Θ ( 3 n / 3 2 m ) {\displaystyle \Theta (3^{n/3}2^{m})} states, see Jirásek, Jirásková and Szabari. 2DFA: at least 2 Ω ( n ) log ⁡ m {\displaystyle {\frac {2^{\Omega (n)}}{\log m}}} and at most 2 m m + 1 ⋅ 2 n n + 1 {\displaystyle 2m^{m+1}\cdot 2^{n^{n+1}}} states, see Jirásková and Okhotin. === Kleene star === DFA: 3 4 2 n {\displaystyle {\frac {3}{4}}2^{n}} states, see Maslov and Yu, Zhuang and Salomaa. NFA: n + 1 {\displaystyle n+1} states, see Holzer and Kutrib. UFA: 3 4 2 n {\displaystyle {\frac {3}{4}}2^{n}} states, see Jirásek, Jirásková and Šebej. SVFA: 3 4 2 n {\displaystyle {\frac {3}{4}}2^{n}} states, see Jirásek, Jirásková and Szabari. 2DFA: at least 1 n 2 n 2 − 1 {\displaystyle {\frac {1}{n}}2^{{\frac {n}{2}}-1}} and at most 2 O ( n n + 1 ) {\displaystyle 2^{O(n^{n+1})}} states, see Jirásková and Okhotin. === Reversal === DFA: 2 n {\displaystyle 2^{n}} states, see Mirkin, Leiss, and Yu, Zhuang and Salomaa. NFA: n + 1 {\displaystyle n+1} states, see Holzer and Kutrib. UFA: n {\displaystyle n} states. SVFA: 2 n + 1 {\displaystyle 2n+1} states, see Jirásek, Jirásková and Szabari. 2DFA: between n + 1 {\displaystyle n+1} and n + 2 {\displaystyle n+2} states, see Jirásková and Okhotin. == Finite automata over a unary alphabet == State complexity of finite automata with a one-letter (unary) alphabet, pioneered by Chrobak, is different from the multi-letter case. Let g ( n ) = e Θ ( n ln ⁡ n ) {\displaystyle g(n)=e^{\Theta ({\sqrt {n\ln n}})}} be Landau's function. === Transformation between models === For a one-letter alphabet, transformations between different types of finite automata are sometimes more efficient than in the general case. NFA to DFA: g ( n ) + O ( n 2 ) {\displaystyle g(n)+O(n^{2})} states, see Chrobak. 2DFA to DFA: g ( n ) + O ( n ) {\displaystyle g(n)+O(n)} states, see Chrobak and Kunc and Okhotin. 2NFA to DFA: O ( g ( n ) ) {\displaystyle O(g(n))} states, see Mereghetti and Pighizzini. and Geffert, Mereghetti and Pighizzini. NFA to 2DFA: at most O ( n 2 ) {\displaystyle O(n^{2})} states, see Chrobak. 2NFA to 2DFA: at most n O ( log ⁡ n ) {\displaystyle n^{O(\log n)}} states, proved by implementing the method of Savitch's theorem, see
Read more →
AI content watermarking

AI content watermarking is the process of embedding imperceptible yet detectable signals into content generated by artificial intelligence systems, such as text, images, audio, or video. The technique allows the content to be traced and identified as machine-generated without compromising its quality for the end user. AI watermarking has emerged as a key approach to address growing concerns about misinformation, deepfakes, copyright infringement, and the traceability of synthetic content in the context of the rapid development of generative artificial intelligence. Unlike traditional visible watermarks used in photography, AI content watermarks are typically invisible to humans and can only be detected and deciphered algorithmically. The concept is distinct from the watermarking of AI models themselves (to prevent model theft) and from the watermarking of training data (to combat unauthorized data use). Modern AI watermarking schemes are typically formalized as a pair of algorithms, an embedding (or generation) algorithm and a detection algorithm, sharing a secret key, whose performance is evaluated along three competing axes: quality (the watermark must not noticeably degrade outputs), detectability (the watermark must be statistically distinguishable from unwatermarked content), and robustness (the watermark must persist under adversarial or incidental modifications). == Background == Digital watermarking has been used for decades to protect physical and digital media, from paper currency to photographs. Classical schemes typically embedded a fixed bit-string into a fixed cover signal, with robustness criteria defined against a small fixed set of distortions such as JPEG compression or additive Gaussian noise. The rapid advancement of generative AI in the early 2020s, however, created a new and qualitatively different demand: rather than protecting a single artifact, watermarks for AI content must be embedded automatically across an open-ended distribution of generated outputs while remaining robust to a much wider class of adversarial transformations, including paraphrasing, image regeneration via diffusion models, and re-recording. Large image generation models such as DALL-E, Stable Diffusion, and Midjourney, along with large language models like ChatGPT, made it possible to produce highly realistic synthetic text, images, audio, and video at scale, raising significant ethical and security concerns. In July 2023, the Biden administration secured voluntary commitments from leading AI companies, including OpenAI, Alphabet, Meta, and Amazon, to develop watermarking and other provenance technologies to help users identify AI-generated content. == Formal definitions and design goals == Most modern AI watermarking schemes can be formalized as a pair of algorithms ( W m , D e t e c t ) {\displaystyle ({\mathsf {Wm}},{\mathsf {Detect}})} parameterized by a secret key k {\displaystyle k} . The embedding algorithm W m {\displaystyle {\mathsf {Wm}}} takes a generative model M {\displaystyle M} (and optionally a prompt) and returns a watermarked output x {\displaystyle x} ; the detection algorithm D e t e c t ( x , k ) {\displaystyle {\mathsf {Detect}}(x,k)} outputs a real-valued score (typically a p-value or log-likelihood ratio) used to decide whether x {\displaystyle x} was produced by the watermarked generator. The literature evaluates such schemes along several largely conflicting criteria: Criteria for evaluation include imperceptibility or quality preservation, measured for text via perplexity and human preference judgments, and for images and audio via metrics such as PSNR, SSIM, LPIPS, or PESQ. Detectability is typically expressed as the true positive rate at a fixed false positive rate (e.g. 1% or 10^-6), or as the number of tokens or pixels needed to reach a given confidence level. Robustness refers to the requirement that the watermark should survive expected modifications like JPEG or MP3 compression, cropping, noise, paraphrasing, or machine translation. Distortion-freeness is a stronger property requiring that the marginal distribution of any single watermarked output be statistically identical to the unwatermarked model's distribution. Schemes due to Aaronson, Christ et al., and Kuditipudi et al. are distortion-free in this sense, while the original Kirchenbauer et al. scheme is not. Forgery resistance or unforgeability means an adversary without the secret key should be unable to produce content that passes detection. == Techniques == AI watermarking techniques vary significantly depending on the type of content being watermarked. At its core, the process involves two main stages: embedding (or encoding) the watermark, and detection. There are two primary methods for embedding: watermarking during content generation, which requires access to the AI model itself but is generally more robust, and post-generation watermarking, which can be applied to content from any source, including closed-source models. Watermarks can be broadly classified as visible, including overt marks such as logos or text overlays, or imperceptible, which are detectable only by algorithms. They can also be classified by durability: robust watermarks are designed to withstand common transformations such as compression, cropping, and re-encoding, while fragile watermarks are easily destroyed by any alteration, making them useful for tamper detection. A further axis distinguishes zero-bit watermarks, which only signal "this content was generated by model M," from multi-bit watermarks, which embed an arbitrary payload (such as a user identifier) that can be recovered at detection time. === Text === Text watermarking is considered one of the most challenging modalities because natural language offers relatively limited redundancy compared to images or audio. Modern approaches for large language models alter the autoregressive sampling process so that some statistical signature is left in the choice of tokens, while leaving the surface form of the text unchanged. The literature distinguishes three main families of generation-time text watermarks. Logit-biasing schemes (e.g. KGW) add a fixed bias δ {\displaystyle \delta } to a pseudorandomly selected subset of vocabulary logits before softmax sampling. Reweighting or sampling-based schemes (e.g. SynthID-Text) compose multiple pseudorandom tournaments over the model's full distribution. Distortion-free schemes based on the Gumbel-max trick or inverse transform sampling (Aaronson 2022; Kuditipudi et al. 2023; Christ et al. 2024) preserve the marginal output distribution of the model. ==== KGW: token-probability shifting ==== The pioneering "green list / red list" scheme of Kirchenbauer et al. (KGW), introduced at ICML 2023, is the foundation for most subsequent text watermarks. At each decoding step t {\displaystyle t} , a pseudorandom function (PRF) keyed by a secret k {\displaystyle k} is applied to a context window of h {\displaystyle h} previous tokens to deterministically partition the vocabulary V {\displaystyle V} of size N {\displaystyle N} into a "green list" G ⊂ V {\displaystyle G\subset V} of size γ N {\displaystyle \gamma N} and its complement, the "red list" R = V ∖ G {\displaystyle R=V\setminus G} , where γ ∈ ( 0 , 1 ) {\displaystyle \gamma \in (0,1)} (typically γ = 1 / 2 {\displaystyle \gamma =1/2} ) is the green fraction. A logits processor then increments every green-list logit by a fixed bias δ > 0 {\displaystyle \delta >0} before softmax: ℓ v ′ = ℓ v + δ ⋅ 1 [ v ∈ G ] {\displaystyle \ell '_{v}=\ell _{v}+\delta \cdot \mathbf {1} [v\in G]} so that, after sampling, green tokens are over-represented but generation is not constrained to green tokens alone; high-entropy positions tolerate the bias gracefully, while low-entropy positions (where one token dominates the logits) override the watermark and preserve correctness on factual content. Detection requires only the secret key and the candidate text, not the language model itself. The detector recomputes the partition g ( ⋅ ) {\displaystyle g(\cdot )} for each token, counts the number of green hits | G | hits {\displaystyle |G|_{\text{hits}}} in a sequence of length T {\displaystyle T} , and computes a one-proportion z-test statistic: z = | G | hits − γ T T γ ( 1 − γ ) {\displaystyle z={\frac {|G|_{\text{hits}}-\gamma T}{\sqrt {T\gamma (1-\gamma )}}}} Under the null hypothesis that the text was written by an unwatermarked source (human or another model), the green-hit count is approximately binomially distributed with mean γ T {\displaystyle \gamma T} ; a large positive z {\displaystyle z} rejects the null hypothesis. The original paper reports that fewer than 25 watermarked tokens are sufficient to detect a watermark with a false positive rate below 10^-5 on the OPT-1.3B model. A follow-up study by the same group documented robustness under temperature sampling, top-p (nucleus) sampling, and human paraphrasing, and proposed sliding-window
Read more →
Heng Ji

Heng Ji is a computer scientist who works on information extraction and natural language processing. She is well known for her work on joined named entity recognition and relation extraction, as well as for her work on cross-document event extraction. She has been coordinating the popular NIST TAC Knowledge Base Population task since 2010. She has been recognised as one of AI's 10 to watch by IEEE Intelligent Systems in 2013, and has won multiple awards, including a NSF Career Award in 2009, Google Research awards in 2009 and 2014, and an IBM Watson Faculty Award in 2012. == Education == Heng Ji obtained a Bachelor's and master's degree in Computational Linguistics from Tsinghua University. She subsequently obtained a MSc, then PhD in Computer Science from New York University in 2008 under the supervision of Ralph Grishman. Her PhD thesis was on the topic of information extraction, with a particular focus on joint training of multiple components in the information extraction pipeline, as well as cross-lingual learning. == Career == Upon graduating with a PhD from New York University, Ji took up a position as assistant professor at Queens College, City University of New York, where she founded the BLENDER Lab, which focuses on research on cross-lingual, cross-documents, cross-media information extraction and fusion. In 2013, she joined Rensselaer Polytechnic Institute as an Edward P. Hamilton Development Chair and Tenured associate professor in Computer Science. Since 2019, she has been a full professor at the University of Illinois at Urbana–Champaign, as well as an Amazon Scholar. == Research == Heng Ji works in the area of natural language processing, machine learning and information extraction. She has published over 300 peer-reviewed research papers. Her work is published in the proceedings of computer science conferences, including the Annual Meeting of the Association for Computational Linguistics, The Web Conference, and the ACM Conference on Knowledge Discovery and Data Mining (KDD). Ji is a leading researcher in information extraction, having coordinated the popular NIST TAC Knowledge Base Population shared task since 2010. She is most recognised for her work on modelling interactions between subtasks in information extraction, which was also the topic of her PhD thesis, and for her work on event detection using cross-document signals. == Selected honors and distinctions == 2009 NSF Career Award 2009 Google Research Award 2012 IBM Watson Faculty Award 2013 IEEE AI's 10 to Watch 2014 Google Research Award 2016 World Economic Forum, 'Young Scientist' 2017 World Economic Forum, 'Young Scientist' 2020 Annual Meeting of the Association for Computational Linguistics, best demonstration paper
Read more →
Markov chain Monte Carlo

In statistics, Markov chain Monte Carlo (MCMC) is a class of algorithms used to draw samples from a probability distribution. Given a probability distribution, one can construct a Markov chain whose elements' distribution approximates it, i.e. the Markov chain's equilibrium distribution matches the target distribution. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution. Markov chain Monte Carlo methods are used to study probability distributions that are too complex or too high dimensional to study with analytic techniques alone. Various algorithms exist for constructing such Markov chains, including the Metropolis–Hastings algorithm. == General explanation == Markov chain Monte Carlo methods create samples from a continuous random variable, with probability density proportional to a known function. These samples can be used to evaluate an integral over that variable, as its expected value or variance. Practically, an ensemble of chains is generally developed, starting from a set of points arbitrarily chosen and sufficiently distant from each other. These chains are stochastic processes of "walkers" which move around randomly according to an algorithm that looks for places with a reasonably high contribution to the integral to move into next, assigning them higher probabilities. Random walk Monte Carlo methods are a kind of random simulation or Monte Carlo method. However, whereas the random samples of the integrand used in a conventional Monte Carlo integration are statistically independent, those used in MCMC are autocorrelated. Correlations of samples introduces the need to use the Markov chain central limit theorem when estimating the error of mean values. These algorithms create Markov chains such that they have an equilibrium distribution which is proportional to the function given. == History == The development of MCMC methods is deeply rooted in the early exploration of Monte Carlo (MC) techniques in the mid-20th century, particularly in physics. These developments were marked by the Metropolis algorithm proposed by Nicholas Metropolis, Arianna W. Rosenbluth, Marshall Rosenbluth, Augusta H. Teller, and Edward Teller in 1953, which was designed to tackle high-dimensional integration problems using early computers. Then in 1970, W. K. Hastings generalized this algorithm and inadvertently introduced the component-wise updating idea, later known as Gibbs sampling. Simultaneously, the theoretical foundations for Gibbs sampling were being developed, such as the Hammersley–Clifford theorem from Julian Besag's 1974 paper. Although the seeds of MCMC were sown earlier, including the formal naming of Gibbs sampling in image processing by Stuart Geman and Donald Geman (1984) and the data augmentation method by Martin A. Tanner and Wing Hung Wong (1987), its "revolution" in mainstream statistics largely followed demonstrations of the universality and ease of implementation of sampling methods (especially Gibbs sampling) for complex statistical (particularly Bayesian) problems, spurred by increasing computational power and software like BUGS. This transformation was accompanied by significant theoretical advancements, such as Luke Tierney's (1994) rigorous treatment of MCMC convergence, and Jun S. Liu, Wong, and Augustine Kong's (1994, 1995) analysis of Gibbs sampler structure. Subsequent developments further expanded the MCMC toolkit, including particle filters (Sequential Monte Carlo) for sequential problems, Perfect sampling aiming for exact simulation (Jim Propp and David B. Wilson, 1996), RJMCMC (Peter J. Green, 1995) for handling variable-dimension models, and deeper investigations into convergence diagnostics and the central limit theorem. Overall, the evolution of MCMC represents a paradigm shift in statistical computation, enabling the analysis of numerous previously intractable complex models and continually expanding the scope and impact of statistics. == Mathematical setting == Suppose (Xn) is a Markov Chain in the general state space X {\displaystyle {\mathcal {X}}} with specific properties. We are interested in the limiting behavior of the partial sums: S n ( h ) = 1 n ∑ i = 1 n h ( X i ) {\displaystyle S_{n}(h)={\dfrac {1}{n}}\sum _{i=1}^{n}h(X_{i})} as n goes to infinity. Particularly, we hope to establish the Law of Large Numbers and the Central Limit Theorem for MCMC. In the following, we state some definitions and theorems necessary for the important convergence results. In short, we need the existence of invariant measure and Harris recurrent to establish the Law of Large Numbers of MCMC (Ergodic Theorem). And we need aperiodicity, irreducibility and extra conditions such as reversibility to ensure the Central Limit Theorem holds in MCMC. === Irreducibility and aperiodicity === Recall that in the discrete setting, a Markov chain is said to be irreducible if it is possible to reach any state from any other state in a finite number of steps with positive probability. However, in the continuous setting, point-to-point transitions have zero probability. In this case, φ-irreducibility generalizes irreducibility by using a reference measure φ on the measurable space ( X , B ( X ) ) {\displaystyle ({\mathcal {X}},{\mathcal {B}}({\mathcal {X}}))} . Definition (φ-irreducibility) Given a measure φ {\displaystyle \varphi } defined on ( X , B ( X ) ) {\displaystyle ({\mathcal {X}},{\mathcal {B}}({\mathcal {X}}))} , the Markov chain ( X n ) {\displaystyle (X_{n})} with transition kernel K ( x , y ) {\displaystyle K(x,y)} is φ-irreducible if, for every A ∈ B ( X ) {\displaystyle A\in {\mathcal {B}}({\mathcal {X}})} with φ ( A ) > 0 {\displaystyle \varphi (A)>0} , there exists n {\displaystyle n} such that K n ( x , A ) > 0 {\displaystyle K^{n}(x,A)>0} for all x ∈ X {\displaystyle x\in {\mathcal {X}}} (Equivalently, P x ( τ A < ∞ ) > 0 {\displaystyle P_{x}(\tau _{A}<\infty )>0} , here τ A = inf { n ≥ 1 ; X n ∈ A } {\displaystyle \tau _{A}=\inf\{n\geq 1;X_{n}\in A\}} is the first n {\displaystyle n} for which the chain enters the set A {\displaystyle A} ). This is a more general definition for irreducibility of a Markov chain in non-discrete state space. In the discrete case, an irreducible Markov chain is said to be aperiodic if it has period 1. Formally, the period of a state ω ∈ X {\displaystyle \omega \in {\mathcal {X}}} is defined as: d ( ω ) := g c d { m ≥ 1 ; K m ( ω , ω ) > 0 } {\displaystyle d(\omega ):=\mathrm {gcd} \{m\geq 1\,;\,K^{m}(\omega ,\omega )>0\}} For the general (non-discrete) case, we define aperiodicity in terms of small sets: Definition (Cycle length and small sets) A φ-irreducible Markov chain ( X n ) {\displaystyle (X_{n})} has a cycle of length d if there exists a small set C {\displaystyle C} , an associated integer M {\displaystyle M} , and a probability distribution ν M {\displaystyle \nu _{M}} such that d is the greatest common divisor of: { m ≥ 1 ; ∃ δ m > 0 such that C is small for ν m ≥ δ m ν M } . {\displaystyle \{m\geq 1\,;\,\exists \,\delta _{m}>0{\text{ such that }}C{\text{ is small for }}\nu _{m}\geq \delta _{m}\nu _{M}\}.} A set C {\displaystyle C} is called small if there exists m ∈ N ∗ {\displaystyle m\in \mathbb {N} ^{}} and a nonzero measure ν m {\displaystyle \nu _{m}} such that: K m ( x , A ) ≥ ν m ( A ) , ∀ x ∈ C , ∀ A ∈ B ( X ) . {\displaystyle K^{m}(x,A)\geq \nu _{m}(A),\quad \forall x\in C,\,\forall A\in {\mathcal {B}}({\mathcal {X}}).} === Harris recurrent === Definition (Harris recurrence) A set A {\displaystyle A} is Harris recurrent if P x ( η A = ∞ ) = 1 {\displaystyle P_{x}(\eta _{A}=\infty )=1} for all x ∈ A {\displaystyle x\in A} , where η A = ∑ n = 1 ∞ I A ( X n ) {\displaystyle \eta _{A}=\sum _{n=1}^{\infty }\mathbb {I} _{A}(X_{n})} is the number of visits of the chain ( X n ) {\displaystyle (X_{n})} to the set A {\displaystyle A} . The chain ( X n ) {\displaystyle (X_{n})} is said to be Harris recurrent if there exists a measure ψ {\displaystyle \psi } such that the chain is ψ {\displaystyle \psi } -irreducible and every measurable set A {\displaystyle A} with ψ ( A ) > 0 {\displaystyle \psi (A)>0} is Harris recurrent. A useful criterion for verifying Harris recurrence is the following: Proposition If for every A ∈ B ( X ) {\displaystyle A\in {\mathcal {B}}({\mathcal {X}})} , we have P x ( τ A < ∞ ) = 1 {\displaystyle P_{x}(\tau _{A}<\infty )=1} for every x ∈ A {\displaystyle x\in A} , then P x ( η A = ∞ ) = 1 {\displaystyle P_{x}(\eta _{A}=\infty )=1} for all x ∈ X {\displaystyle x\in {\mathcal {X}}} , and the chain ( X n ) {\displaystyle (X_{n})} is Harris recurrent. This definition is only needed when the state space X {\displaystyle {\mathcal {X}}} is uncountable. In the countable case, recurrence corresponds to E x [ η x ] = ∞ {\displaystyle \mathbb {E} _{x}[\eta _{x}]=\infty } , which is equivalent to P x ( τ x < ∞ ) = 1 {\displaystyle P_{x}(\tau _{x}<\infty )=1} for all x ∈ X {\displaystyle x\i
Read more →