AI Detector Winston

AI Detector Winston — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Social software engineering

    Social software engineering (SSE) is a branch of software engineering that is concerned with the social aspects of software development and the developed software. SSE focuses on the socialness of both software engineering and developed software. On the one hand, the consideration of social factors in software engineering activities, processes and CASE tools is deemed to be useful to improve the quality of both the development process and the produced software. Examples include the role of situational awareness and multi-cultural factors in collaborative software development. On the other hand, the dynamicity of the social contexts in which software could operate (e.g., in a cloud environment) calls for engineering social adaptability as a runtime iterative activity. Examples include approaches which enable software to gather users' quality feedback and use it to adapt autonomously or semi-autonomously. SSE studies and builds socially-oriented tools to support collaboration and knowledge sharing in software engineering. SSE also investigates the adaptability of software to the dynamic social contexts in which it could operate and the involvement of clients and end-users in shaping software adaptation decisions at runtime. Social context includes norms, culture, roles and responsibilities, stakeholders' goals and interdependencies, end-users' perception of the quality and appropriateness of each software behaviour, etc.

    The participants of the 1st International Workshop on Social Software Engineering and Applications (SoSEA 2008) proposed the following characterization:

      • Community-centered: Software is produced and consumed by and/or for a community rather than focusing on individuals.
      • Collaboration/collectiveness: Exploiting the collaborative and collective capacity of human beings.
      • Companionship/relationship: Making explicit the various associations among people.
      • Human/social activities: Software is designed consciously to support human activities and to address social problems.
      • Social inclusion: Software should enable social inclusion, enforcing links and trust in communities.

    Thus, SSE can be defined as "the application of processes, methods, and tools to enable community-driven creation, management, deployment, and use of software in online environments". One of the main observations in the field of SSE is that the concepts, principles, and technologies made for social software applications are applicable to software development itself, as software engineering is inherently a social activity. SSE is not limited to specific activities of software development. Accordingly, tools have been proposed supporting different parts of SSE, for instance, social system design or social requirements engineering. Consequently, vertical market software, such as software development tools, engineering tools, marketing tools or software that helps users in a decision-making process, can profit from social components. Such vertical social software differs strongly in its user base from traditional social software such as Yammer.

  • Dissociated press

    Dissociated press is a parody generator (a computer program that generates nonsensical text). The generated text is based on another text using the Markov chain technique. The name is a play on "Associated Press" and the psychological term dissociation (although word salad is more typical of conditions like aphasia and schizophrenia, the latter of which is frequently confused with dissociative identity disorder by laypeople). An implementation of the algorithm is available in Emacs. Another implementation is available as a Perl module in CPAN, Games::Dissociate.

    == The algorithm ==

    The algorithm starts by printing a number of consecutive words (or letters) from the source text. Then it searches the source text for an occurrence of the last few words or letters printed out so far. If multiple occurrences are found, it picks a random one, and proceeds with printing the text following the chosen occurrence. After a predetermined length of text is printed out, the search procedure is repeated for the newly printed ending.

    Because words and phrases tend to appear in specific grammatical contexts, the resulting text usually seems grammatically correct, and if the source text is uniform in style, the result appears to be of similar style and subject, and takes some effort on the reader's side to recognize as not genuine. Still, the randomness of the assembly process deprives it of any logical flow: the loosely related parts are connected in a nonsensical way, creating a humorously abstract, random result.

    == Examples ==

    Here is a short example of word-based Dissociated Press applied to the Jargon File:

      wart: n. A small, crocky feature that sticks out of an array (C has no checks for this). This is relatively benign and easy to spot if the phrase is bent so as to be not worth paying attention to the medium in question.

    Here is a short example of letter-based Dissociated Press applied to the same source:

      window sysIWYG: n. A bit was named aften /bee´t@/ prefer to use the other guy's re, especially in every cast a chuckle on neithout getting into useful informash speech makes removing a featuring a move or usage actual abstractionsidered interj. Indeed spectace logic or problem!

    == History ==

    The dissociated press algorithm is described in HAKMEM (1972) Item #176. The name "dissociated press" is first known to have been associated with the Emacs implementation. Brian Hayes discussed a Travesty algorithm in Scientific American in November 1983. The article provided a garbled William Faulkner passage:

      When he got on the table, he come in. He never come out of my own pocket as a measure of protecting the company against riot and bloodshed. And when he said. "You tell me a bus ticket, let alone write out no case histories. Then the law come back with a knife!"

    Hugh Kenner and Joseph O'Rourke of Johns Hopkins University discussed their frequency table-based Travesty generator for microcomputers in BYTE in November 1984. The article included the Turbo Pascal source for two versions of the generator, one using Hayes' algorithm and another using Claude Shannon's Hellbat algorithm. Murray Lesser offered a compiled BASIC version in the magazine in July 1985; in September 1985 Peter Wayner offered a version that used tree data structures instead of frequency tables; and in December 1985 Neil J. Rubenking offered a version written in Turbo Pascal that stored frequency information in a B-tree.
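
The word-based procedure described under "The algorithm" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Emacs or Games::Dissociate implementation: the function name `dissociated_press` and the parameters `overlap` (how many trailing words are searched for, assumed to be at least 1), `run` (how many consecutive words are copied before the next jump) and `seed` are hypothetical, and the fallback used when the trailing words occur only at the very end of the source is a choice made for this sketch.

```python
import random


def dissociated_press(source, total_words=60, overlap=2, run=6, seed=None):
    """Word-based sketch: copy a run of consecutive words, then jump to a
    random occurrence of the last `overlap` words and continue from there."""
    rng = random.Random(seed)
    words = source.split()
    if len(words) <= overlap:
        return " ".join(words[:total_words])

    # Start by printing a few consecutive words from a random position.
    pos = rng.randrange(len(words) - overlap)
    out = words[pos:pos + overlap]
    pos += overlap  # index of the next source word to copy

    while len(out) < total_words:
        # Print a run of text following the current position.
        out.extend(words[pos:pos + run])

        # Search the source for every occurrence of the newly printed ending
        # that is followed by at least one more word.
        tail = out[-overlap:]
        starts = [i for i in range(len(words) - overlap)
                  if words[i:i + overlap] == tail]
        if starts:
            # Pick a random occurrence and continue with the text after it.
            pos = rng.choice(starts) + overlap
        else:
            # Assumption of this sketch: restart at a random position when
            # the ending only occurs at the very end of the source.
            pos = rng.randrange(len(words) - overlap)

    return " ".join(out[:total_words])


if __name__ == "__main__":
    text = ("the cat sat on the mat and the cat ran to the mat "
            "and the dog sat on the rug")
    print(dissociated_press(text, total_words=30, seed=1))
```

A larger `overlap` keeps the output closer to the source, while a smaller one produces more frequent and more surprising jumps.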

  • Thompson's construction

    In computer science, Thompson's construction algorithm, also called the McNaughton–Yamada–Thompson algorithm, is a method of transforming a regular expression into an equivalent nondeterministic finite automaton (NFA). This NFA can be used to match strings against the regular expression. This algorithm is credited to Ken Thompson.

    Regular expressions and nondeterministic finite automata are two representations of formal languages. For instance, text processing utilities use regular expressions to describe advanced search patterns, but NFAs are better suited for execution on a computer. Hence, this algorithm is of practical interest, since it can compile regular expressions into NFAs. From a theoretical point of view, this algorithm is a part of the proof that they both accept exactly the same languages, that is, the regular languages.

    An NFA can be made deterministic by the powerset construction and then be minimized to get an optimal automaton corresponding to the given regular expression. However, an NFA may also be interpreted directly.

    To decide whether two given regular expressions describe the same language, each can be converted into an equivalent minimal deterministic finite automaton via Thompson's construction, powerset construction, and DFA minimization. If, and only if, the resulting automata agree up to renaming of states, the regular expressions' languages agree.

    == The algorithm ==

    The algorithm works recursively by splitting an expression into its constituent subexpressions, from which the NFA will be constructed using a set of rules. More precisely, from a regular expression E, the obtained automaton A with the transition function Δ respects the following properties:

      • A has exactly one initial state q0, which is not accessible from any other state. That is, for any state q and any letter a, Δ(q, a) does not contain q0.
      • A has exactly one final state qf, which is not co-accessible from any other state. That is, for any letter a, Δ(qf, a) = ∅.
      • Let c be the number of concatenations in the regular expression E and let s be the number of symbols apart from parentheses, that is, |, *, a and ε. Then, the number of states of A is 2s − c (linear in the size of E).
      • The number of transitions leaving any state is at most two.
      • Since an NFA of m states and at most e transitions from each state can match a string of length n in time O(emn), a Thompson NFA can do pattern matching in linear time, assuming a fixed-size alphabet.

    === Rules ===

    The following rules are depicted according to Aho et al. (2007), p. 122. In what follows, N(s) and N(t) are the NFA of the subexpressions s and t, respectively.

    The empty expression ε is converted to an NFA whose initial state is connected to its final state by a single ε-transition.

    A symbol a of the input alphabet is converted to an NFA whose initial state is connected to its final state by a single transition labelled a.

    The union expression s|t is converted to an NFA with a new initial state q and a new final state. State q goes via ε either to the initial state of N(s) or N(t). Their final states become intermediate states of the whole NFA and merge via two ε-transitions into the final state of the NFA.

    The concatenation expression st is converted as follows. The initial state of N(s) is the initial state of the whole NFA. The final state of N(s) becomes the initial state of N(t). The final state of N(t) is the final state of the whole NFA.

    The Kleene star expression s* is converted as follows. An ε-transition connects initial and final state of the NFA with the sub-NFA N(s) in between. Another ε-transition from the inner final to the inner initial state of N(s) allows for repetition of expression s according to the star operator.

    The parenthesized expression (s) is converted to N(s) itself.

    With these rules, using the empty expression and symbol rules as base cases, it is possible to prove with structural induction that any regular expression may be converted into an equivalent NFA.

    == Example ==

    Two examples are now given: a small informal one with the result, and a bigger one with a step-by-step application of the algorithm.
    === Small Example ===

    The picture below shows the result of Thompson's construction on (ε|a*b). The purple oval corresponds to a, the teal oval corresponds to a*, the green oval corresponds to b, the orange oval corresponds to a*b, and the blue oval corresponds to ε.

    === Application of the algorithm ===

    As an example, the picture shows the result of Thompson's construction algorithm on the regular expression (0|(1(01*(00)*0)*1)*)* that denotes the set of binary numbers that are multiples of 3: { ε, "0", "00", "11", "000", "011", "110", "0000", "0011", "0110", "1001", "1100", "1111", "00000", ... }. The upper right part shows the logical structure (syntax tree) of the expression, with "." denoting concatenation (assumed to have variable arity); subexpressions are named a-q for reference purposes. The left part shows the nondeterministic finite automaton resulting from Thompson's algorithm, with the entry and exit state of each subexpression colored in magenta and cyan, respectively. An ε as transition label is omitted for clarity; unlabelled transitions are in fact ε transitions. The entry and exit state corresponding to the root expression q is the start and accept state of the automaton, respectively. The table of the algorithm's steps and the figure of the equivalent minimal deterministic automaton are not reproduced here.

    == Relation to other algorithms ==

    Thompson's is one of several algorithms for constructing NFAs from regular expressions; an earlier algorithm was given by McNaughton and Yamada. Converse to Thompson's construction, Kleene's algorithm transforms a finite automaton into a regular expression. Glushkov's construction algorithm is similar to Thompson's construction, once the ε-transitions are removed.

    == Use in string pattern matching ==

    Regular expressions are often used to specify patterns that software is then asked to match.
Generating an NFA by Thompson's construction, and using an appropriate algorithm to simulate it, it is possible to create pattern-matching software with performance that is O(mn), where m is the length of the regular expression and n is the length of the string being matched. This is much better than is achieved by many popular programming-language implementations; however, it is restricted to purely regular expressions and does not support patterns for non-regular languages like backreferences.
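
The rules and the simulation described above can be sketched as a small Python class. This is a minimal illustration, not a production regex engine: the class name `Thompson`, its methods, and the convention that the character "ε" in a pattern denotes the empty expression are assumptions made for this sketch. Only `|`, `*`, parentheses and single-character symbols are supported. Concatenation merges the final state of N(s) with the initial state of N(t), as in the concatenation rule, so the state count stays at 2s − c.

```python
from collections import defaultdict

EPS = None  # label used for ε-transitions


class Thompson:
    """Builds an NFA from a regular expression by recursive descent."""

    def __init__(self, pattern):
        self.pattern, self.pos, self.next_id = pattern, 0, 0
        self.edges = defaultdict(list)  # state -> [(label, target), ...]
        self.start, self.final = self._union()
        if self.pos != len(pattern):
            raise ValueError("unexpected character at position %d" % self.pos)

    def _new(self):
        self.next_id += 1
        return self.next_id - 1

    def _peek(self):
        return self.pattern[self.pos] if self.pos < len(self.pattern) else None

    def _union(self):  # rule for s|t
        frag = self._concat()
        while self._peek() == "|":
            self.pos += 1
            other = self._concat()
            q, f = self._new(), self._new()
            self.edges[q] += [(EPS, frag[0]), (EPS, other[0])]
            self.edges[frag[1]].append((EPS, f))
            self.edges[other[1]].append((EPS, f))
            frag = (q, f)
        return frag

    def _concat(self):  # rule for st
        frag = self._star()
        while self._peek() not in (None, "|", ")"):
            nxt = self._star()
            # Merge states: the final state of N(s) takes over the outgoing
            # transitions of the (never targeted) initial state of N(t).
            self.edges[frag[1]] = self.edges.pop(nxt[0], [])
            frag = (frag[0], nxt[1])
        return frag

    def _star(self):  # rule for s*
        frag = self._atom()
        while self._peek() == "*":
            self.pos += 1
            q, f = self._new(), self._new()
            self.edges[q] += [(EPS, frag[0]), (EPS, f)]
            self.edges[frag[1]] += [(EPS, frag[0]), (EPS, f)]
            frag = (q, f)
        return frag

    def _atom(self):  # rules for ε, a symbol, and (s)
        ch = self._peek()
        if ch == "(":
            self.pos += 1
            frag = self._union()
            if self._peek() != ")":
                raise ValueError("missing ')'")
            self.pos += 1
            return frag
        if ch is None or ch in "|)*":
            raise ValueError("symbol expected at position %d" % self.pos)
        self.pos += 1
        q, f = self._new(), self._new()
        self.edges[q].append((EPS if ch == "ε" else ch, f))
        return (q, f)

    def states(self):
        seen = {self.start, self.final}
        for src, out in self.edges.items():
            seen.add(src)
            seen.update(target for _, target in out)
        return seen

    def _closure(self, states):
        stack, seen = list(states), set(states)
        while stack:
            for label, target in self.edges.get(stack.pop(), []):
                if label is EPS and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    def matches(self, text):
        """Simulate the NFA directly by tracking the set of active states."""
        current = self._closure({self.start})
        for ch in text:
            step = {t for s in current
                    for label, t in self.edges.get(s, []) if label == ch}
            current = self._closure(step)
        return self.final in current


if __name__ == "__main__":
    nfa = Thompson("(0|(1(01*(00)*0)*1)*)*")
    for n in (0, 3, 5, 6, 9, 10):
        print(n, nfa.matches(format(n, "b")))
```

Tracking a set of active states rather than backtracking is what gives the O(mn) bound mentioned above: each input character is processed once against at most m states.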

  • The Best Free AI Chatbot for Beginners

    Trying to pick the best AI chatbot? An AI chatbot is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI chatbot slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

  • Image scaling

    In computer graphics and digital imaging, image scaling is the resizing of a digital image. In video technology, the magnification of digital material is known as upscaling or resolution enhancement.

    When scaling a vector graphic image, the graphic primitives that make up the image can be rendered using geometric transformations at any resolution with no loss of image quality. When scaling a raster graphics image, a new image with a higher or lower number of pixels must be generated. In the case of decreasing the pixel number (scaling down), this usually results in a visible quality loss. From the standpoint of digital signal processing, the scaling of raster graphics is a two-dimensional example of sample-rate conversion, the conversion of a discrete signal from a sampling rate (in this case, the local sampling rate) to another.

    == Mathematical ==

    Image scaling can be interpreted as a form of image resampling or image reconstruction from the view of the Nyquist sampling theorem. According to the theorem, downsampling to a smaller image from a higher-resolution original can only be carried out after applying a suitable 2D anti-aliasing filter to prevent aliasing artifacts. The image is reduced to the information that can be carried by the smaller image.

    In the case of upsampling, a reconstruction filter takes the place of the anti-aliasing filter.

    A more sophisticated approach to upscaling treats the problem as an inverse problem, solving the question of generating a plausible image that, when scaled down, would look like the input image. A variety of techniques have been applied for this, including optimization techniques with regularization terms and the use of machine learning from examples.

    == Algorithms ==

    An image size can be changed in several ways.
=== Nearest-neighbor interpolation === One of the simpler ways of increasing image size is nearest-neighbor interpolation, replacing every pixel with the nearest pixel in the output; for upscaling, this means multiple pixels of the same color will be present. This can preserve sharp details but also introduce jaggedness in previously smooth images. 'Nearest' in nearest-neighbor does not have to be the mathematical nearest. One common implementation is to always round toward zero. Rounding this way produces fewer artifacts and is faster to calculate. This algorithm is often preferred for images which have little to no smooth edges. A common application of this can be found in pixel art. === Bilinear and bicubic interpolation === Bilinear interpolation works by interpolating pixel color values, introducing a continuous transition into the output even where the original material has discrete transitions. Although this is desirable for continuous-tone images, this algorithm reduces contrast (sharp edges) in a way that may be undesirable for line art. Bicubic interpolation yields substantially better results, with an increase in computational cost. === Sinc and Lanczos resampling === Sinc resampling, in theory, provides the best possible reconstruction for a perfectly bandlimited signal. In practice, the assumptions behind sinc resampling are not completely met by real-world digital images. Lanczos resampling, an approximation to the sinc method, yields better results. Bicubic interpolation can be regarded as a computationally efficient approximation to Lanczos resampling. === Box sampling === One weakness of bilinear, bicubic, and related algorithms is that they sample a specific number of pixels. When downscaling below a certain threshold, such as more than twice for all bi-sampling algorithms, the algorithms will sample non-adjacent pixels, which results in both losing data and rough results. 
The trivial solution to this issue is box sampling, which is to consider the target pixel a box on the original image and sample all pixels inside the box. This ensures that all input pixels contribute to the output. The major weakness of this algorithm is that it is hard to optimize. === Mipmap === Another solution to the downscale problem of bi-sampling scaling is mipmaps. A mipmap is a prescaled set of downscaled copies. When downscaling, the nearest larger mipmap is used as the origin to ensure no scaling below the useful threshold of bilinear scaling. This algorithm is fast and easy to optimize. It is standard in many frameworks, such as OpenGL. The cost is using more image memory, exactly one-third more in the standard implementation. === Fourier-transform methods === Simple interpolation based on the Fourier transform pads the frequency domain with zero components (a smooth window-based approach would reduce the ringing). Besides the good conservation (or recovery) of details, notable are the ringing and the circular bleeding of content from the left border to the right border (and the other way around). === Edge-directed interpolation === Edge-directed interpolation algorithms aim to preserve edges in the image after scaling, unlike other algorithms, which can introduce staircase artifacts. Examples of algorithms for this task include New Edge-Directed Interpolation (NEDI), Edge-Guided Image Interpolation (EGGI), Iterative Curvature-Based Interpolation (ICBI), and Directional Cubic Convolution Interpolation (DCCI). A 2013 analysis found that DCCI had the best scores in peak signal-to-noise ratio and structural similarity on a series of test images. === hqx === For magnifying computer graphics with low resolution and/or few colors (usually from 2 to 256 colors), better results can be achieved by hqx or other pixel-art scaling algorithms. These produce sharp edges and maintain a high level of detail. 
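
Nearest-neighbor interpolation, box sampling and mipmaps, as described in the sections above, can be sketched in plain Python. This is a minimal illustration under stated assumptions: images are 2D lists of grayscale values, the function names `nearest_neighbor`, `box_downscale` and `mipmap_chain` are hypothetical, box sampling is restricted to integer factors, and the mipmap levels are built with a 2×2 box filter, which is one common choice but not the only one.

```python
def nearest_neighbor(img, new_w, new_h):
    """Resize by copying, for each output pixel, the source pixel whose
    index is the scaled coordinate rounded toward zero."""
    old_h, old_w = len(img), len(img[0])
    return [[img[y * old_h // new_h][x * old_w // new_w]
             for x in range(new_w)]
            for y in range(new_h)]


def box_downscale(img, factor):
    """Downscale by an integer factor: every output pixel is the mean of
    the factor x factor box of input pixels that it covers."""
    old_h, old_w = len(img), len(img[0])
    out = []
    for by in range(0, old_h - factor + 1, factor):
        row = []
        for bx in range(0, old_w - factor + 1, factor):
            box = [img[y][x]
                   for y in range(by, by + factor)
                   for x in range(bx, bx + factor)]
            row.append(sum(box) / len(box))
        out.append(row)
    return out


def mipmap_chain(img):
    """Prescaled copies, each half the width and height of the previous."""
    chain = [img]
    while len(chain[-1]) > 1 and len(chain[-1][0]) > 1:
        chain.append(box_downscale(chain[-1], 2))
    return chain


if __name__ == "__main__":
    print(nearest_neighbor([[1, 2], [3, 4]], 4, 4))

    base = [[(x + y) % 7 for x in range(64)] for y in range(64)]
    chain = mipmap_chain(base)
    extra = sum(len(level) * len(level[0]) for level in chain[1:])
    # The extra memory is 1/4 + 1/16 + ... of the base image,
    # which approaches one-third as the chain gets longer.
    print(extra / (64 * 64))
```

Because every input pixel lands in exactly one box, `box_downscale` never skips data the way a fixed-footprint bilinear or bicubic kernel does at large reduction factors.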
    === Vectorization ===

    Vector extraction, or vectorization, offers another approach. Vectorization first creates a resolution-independent vector representation of the graphic to be scaled. The resulting SVG vector file can then be exported and rendered at any required resolution without quality loss, serving directly as production-ready artwork for scalable display and printing. This technique is used by Adobe Illustrator, Live Trace, and Inkscape. Scalable Vector Graphics are well suited to simple geometric images, while photographs do not fare well with vectorization due to their complexity.

    === Deep convolutional neural networks ===

    This method uses machine learning for more detailed images, such as photographs and complex artwork. Programs that use this method include waifu2x, Imglarger and Neural Enhance.

    [Figure: demonstration of conventional vs. waifu2x upscaling with noise reduction, using a detail of Phosphorus and Hesperus by Evelyn De Morgan.]

    AI-driven upscaling software allows detail and sharpness to be added to historical photographs where it is not present in the original.

    The availability of AI upscaling tools has led to confusion, where a person believes that the upscaled version of a blurry image genuinely shows the subject of the original photograph. In 2025 a user of the social media site X posted an AI-upscaled version of a low-resolution photo of Donald Trump that they had zoomed in on, and asked if anyone could "explain what the hell is happening to his forehead". Experts noted that the image had been distorted by the upscaling process, and that such tools "inevitably have to invent, or at least recreate, details that were or were not there".

    == Applications ==

    === General ===

    Image scaling is used in, among other applications, web browsers, image editors, image and file viewers, software magnifiers, digital zoom, the process of generating thumbnail images, and when outputting images through screens or printers.
=== Video === This application is the magnification of images for home theaters for HDTV-ready output devices from PAL-Resolution content, for example, from a DVD player. Upscaling is performed in real time, and the output signal is not saved. === Pixel-art scaling === As pixel-art graphics are usually low-resolution, they rely on careful placement of individual pixels, often with a limited palette of colors. This results in graphics that rely on stylized visual cues to define complex shapes with little resolution, down to individual pixels. This makes scaling pixel art a particularly difficult problem. Specialized algorithms were developed to handle pixel-art graphics, as the traditional scaling algorithms do not take perceptual cues into account. Since a typical application is to improve the appearance of fourth-generation and earlier video games on arcade and console emulators, many are designed to run in real time for small input images at 60 frames per second. On fast hardware, these algorithms are suitable for gaming and other real-time image processing. These algorithms provide sharp, crisp graphics, while minimizing blur. Scaling art algorithms have been implemented in a wide range of emulators such as HqMAME and DOSBox, as well as 2D game engines and game engine recreations such as ScummVM. They gained recognition with game

  • The Best Free AI Background Remover for Beginners

    In search of the best AI background remover? An AI background remover is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI background remover slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

  • Is an AI Website Builder Worth It in 2026?

    Comparing the best AI website builder? An AI website builder is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI website builder slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

  • Best AI Subtitle Generators in 2026

    Comparing the best AI subtitle generator? An AI subtitle generator is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI subtitle generator slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

  • Dimensions CM

    Dimensions CM is a software change and configuration management product developed by OpenText Corporation. It includes revision control, change, build and release management capabilities. Since version 14.1 (2014), Dimensions CM has included the PulseUno module, which provides code review and continuous integration capabilities. Starting with version 14.5.2 (2020), it can also serve as a binary repository manager.

    == History ==

    Previous product names:

      • PCMS Dimensions (SQL Software)
      • PVCS Dimensions (Merant, Intersolv)

  • Regina Barzilay

    Regina Barzilay (Hebrew: רגינה ברזילי; born 1970) is an Israeli-American computer scientist. She is a professor at the Massachusetts Institute of Technology and a faculty lead for artificial intelligence at the MIT Jameel Clinic. Her research interests are in natural language processing and applications of deep learning to chemistry and oncology.

    == Early life and education ==

    Barzilay was born in Chișinău, Moldova and emigrated to Israel with her parents at the age of 20. She received bachelor's and master's degrees from Ben-Gurion University of the Negev in 1993 and 1998, respectively. She obtained a PhD in computer science from Columbia University in 2003 for research supervised by Kathleen McKeown.

    == Career and research ==

    After her PhD, she spent a year as a postdoctoral researcher at Cornell University. She was appointed as Delta Electronics Professor of Electrical Engineering and Computer Science at MIT in 2016. She was diagnosed with breast cancer in 2014, which prompted her to conduct research in oncology. Barzilay won the MacArthur Fellowship in 2017.

    For her doctoral dissertation at Columbia University, she led the development of Newsblaster, which recognized stories from different news sources as being about the same basic subject, and then paraphrased elements from the stories to create a summary.

    In computational linguistics, Barzilay created algorithms that learned annotations from common languages (e.g. English) to analyze less understood languages.

    Prompted by her experience with breast cancer, Barzilay is applying machine learning to oncology. She is collaborating with physicians and students to devise deep learning models that utilize images, text, and structured data to identify trends that affect early diagnosis, treatment, and disease prevention.

    === Frontline documentary ===

    Following her breast cancer diagnosis in 2014 and her research into applying artificial intelligence to improve early detection methods, she collaborated with Dr. Connie Lehman at Massachusetts General Hospital. While there, Barzilay developed an AI-based system capable of predicting the likelihood of breast cancer up to five years in advance. The system leverages deep learning techniques to analyze mammograms and diagnostic notes, surpassing traditional pattern recognition by human radiologists. This breakthrough, while still in development, has the potential to significantly enhance early diagnosis and treatment outcomes. Barzilay's work in this area was featured in the FRONTLINE documentary In the Age of AI, which explores the broader impact of artificial intelligence on society.

    === MIT Jameel Clinic ===

    In 2018, Barzilay was appointed faculty lead for AI at the new MIT Jameel Clinic, a research center in the field of AI health sciences, including disease detection, drug discovery, and the development of medical devices.

    In 2020, she was part of the team, with fellow MIT Jameel Clinic faculty lead Professor James J. Collins, that announced the discovery through deep learning of halicin, the first new antibiotic compound in 30 years, which kills over 35 powerful bacteria, including antimicrobial-resistant tuberculosis, the superbug C. difficile, and two of the World Health Organization's top-three most deadly bacteria. In 2020, Collins, Barzilay and the MIT Jameel Clinic were also awarded funding through The Audacious Project to expand on the discovery of halicin by using AI to respond to the antibiotic resistance crisis through the development of new classes of antibiotics.

    == Awards and recognition ==

    In 2017, Barzilay won the MacArthur Fellowship, known as the "Genius Grant", for "developing machine learning methods that enable computers to process and analyze vast amounts of human language data." She is also a recipient of various awards including the NSF Career Award, the MIT Technology Review TR-35 Award, Microsoft Faculty Fellowship and several Best Paper Awards at NAACL and ACL.
Her teaching has also been recognized by MIT: she won the Jamieson Teaching Award in 2016. She was named an AAAI Fellow in 2018 by the Association for the Advancement of Artificial Intelligence. In 2020, she became the first recipient of the $1 million AAAI Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity. In 2023, she was elected to the National Academy of Medicine and the National Academy of Engineering.

  • How to Choose an AI Subtitle Generator

    Shopping for the best AI subtitle generator? An AI subtitle generator is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI subtitle generator slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

  • Postediting

    Post-editing (or postediting) is the process whereby humans amend machine-generated translation to achieve an acceptable final product. A person who post-edits is called a post-editor. The concept of post-editing is linked to that of pre-editing. In the process of translating a text via machine translation, best results may be gained by pre-editing the source text (for example by applying the principles of controlled language) and then post-editing the machine output. It is distinct from editing, which refers to the process of improving human-generated text (a process which is often known as revision in the field of translation). Post-edited text may afterwards be revised to ensure the quality of the language choices or proofread to correct simple mistakes.

    Post-editing involves the correction of machine translation output to ensure that it meets a level of quality negotiated in advance between the client and the post-editor. Light post-editing aims at making the output simply understandable; full post-editing at making it also stylistically appropriate. With advances in machine translation, full post-editing is becoming an alternative to manual translation. Practically all computer-assisted translation (CAT) tools now support post-editing of machine-translated output.

    == Post-editing and machine translation ==

    Machine translation left the labs and started being used for its actual purpose in the late seventies at some big institutions such as the European Commission and the Pan-American Health Organization, and then, later, at some corporations such as Caterpillar and General Motors. The first studies on post-editing appeared in the eighties, linked to those implementations. To develop appropriate guidelines and training, members of the Association for Machine Translation in the Americas (AMTA) and the European Association for Machine Translation (EAMT) set up a Post-editing Special Interest Group in 1999.
After the 1990s, advances in computer power and connectivity sped up machine translation development and allowed for its deployment through the web browser, including as a free, useful adjunct to the main search engines (Google Translate, Bing Translator, Yahoo! Babel Fish). A wider acceptance of less-than-perfect machine translation was accompanied by a wider acceptance of post-editing. With the demand for localisation of goods and services growing at a pace that could not be met by human translation, even when assisted by translation memory and other translation management technologies, industry bodies such as the Translation Automation Users Society (TAUS) expect machine translation and post-editing to play a much bigger role within the next few years. The use of machine translation sometimes calls for pre-editing as well.

Human translators possess significantly more sophisticated cognitive abilities than machine translation (MT) systems. They draw on a wealth of life experience, common sense, and multi-sensory input to understand context, identify semantic intent, and add cultural nuance to translations. This remains true even as MT capabilities continue to improve. Unlike MT systems, which primarily focus on literal word-for-word conversion, human translators grasp the underlying meaning and intent, even when information is implicit. They "read between the lines," guided by their understanding of the world. Essentially, MT models excel at text string prediction, not true comprehension. Their success often stems from framing problems as prediction tasks, as is also done in self-driving cars or fraud detection.

Studies have demonstrated that integrating adaptive MT with post-editing interfaces can reduce technical effort and time, improving overall translation efficiency. Research on real-world translation scenarios likewise highlights the benefits of adaptive MT.
For example, incremental adaptation in neural machine translation (NMT) for professional post-editors has been shown to improve translation quality and reduce the time spent on edits, showing how human expertise and machine assistance can complement each other.

== Light and full post-editing ==

For many years, no widely accepted, standardized post-editing guidelines existed; in 2017, however, the ISO standard 18587:2017, Translation services — Post-editing of machine translation output — Requirements, was published. Studies in the 1980s distinguished between degrees of post-editing which, in the context of the European Commission Translation Service, were first defined as conventional and rapid, or full and rapid. Light and full post-editing seems to be the wording most used today.

Light post-editing implies minimal intervention by the post-editor, with the aim of ensuring quality is "good enough" or "understandable"; the expectation is that the client will use it for inbound purposes only, often when the text is needed urgently or has a short life span.

Full post-editing involves a greater level of intervention to achieve a degree of quality to be negotiated between client and post-editor; the expectation is that the outcome will be a text that is not only understandable but also stylistically appropriate, so that it can be used for assimilation and even for dissemination, for inbound and for outbound purposes. The quality is expected to be publishable and equivalent to that of a human translation. The assumption, however, has been that it takes less effort for translators to work directly from the source text than to post-edit the machine-generated version. With advances in machine translation, this may be changing.
For some language pairs and some tasks, and with engines that have been customised with good-quality domain-specific data, some clients are already asking translators to post-edit instead of translating from scratch, in the belief that they will attain similar quality at a lower cost.

The light/full classification, developed in the 1990s when machine translation still came on a CD-ROM, may not suit advances in machine translation at the light post-editing end either. For some language pairs and some tasks, particularly if the source has been pre-edited, raw machine output may be good enough for gisting purposes without requiring subsequent human intervention.

== Post-editing efficiency ==

Post-editing is used when raw machine translation is not good enough and human translation is not required. The industry advises using post-editing when it can at least double the productivity of manual translation, or even quadruple it in the case of light post-editing (1000 words per hour vs. 250 words per hour).

However, post-editing efficiency is difficult to predict. Various studies from both academia and industry have claimed that post-editing is generally faster than translating from scratch, regardless of language pair or translators' experience. There is, however, no agreement about how much time, if any, can be saved through post-editing in practice: while the industry reports time savings of around 40%, some academic studies suggest that time savings under actual working conditions are more likely to be between 0 and 20%, or that they may depend on the terminological proximity between the source and target languages. Professionals have also reported negative productivity gains, where corrections take more time than translating from scratch.

== Post-editing and the language industry ==

After some thirty years, post-editing is still "a nascent profession". The right profile for a post-editor has not yet been fully studied.
Post-editing overlaps with translating and editing, but only partially. Most think the ideal post-editor will be a translator keen to be trained in the specific skills required, but some think a bilingual person without a background in translation may be easier to train. Not much is known either about who the actual post-editors are, whether they tend to be professional translators, whether they work mostly as in-house employees or are self-employed, and under which conditions. Many professional translators dislike post-editing, among other reasons because it tends to be paid at lower rates than conventional translation; the International Association of Professional Translators and Interpreters (IAPTI) has been particularly vocal about this.
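
The throughput figures quoted under post-editing efficiency (250 words per hour for manual translation, 1000 for light post-editing) and the reported time savings (around 40% in industry reports, 0 to 20% in some academic studies) can be related with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a fixed word count and a constant hourly throughput, which real working conditions do not guarantee.

```python
def speedup(baseline_wph: float, postedit_wph: float) -> float:
    """How many times faster post-editing is than translating from scratch."""
    return postedit_wph / baseline_wph


def time_saving(baseline_wph: float, postedit_wph: float) -> float:
    """Fraction of working time saved on a fixed number of words."""
    return 1.0 - baseline_wph / postedit_wph


def throughput_for_saving(baseline_wph: float, saving: float) -> float:
    """Throughput (words per hour) needed to reach a given fractional time saving."""
    return baseline_wph / (1.0 - saving)


BASELINE_WPH = 250   # manual translation figure quoted in the text
LIGHT_PE_WPH = 1000  # light post-editing figure quoted in the text

print(speedup(BASELINE_WPH, LIGHT_PE_WPH))           # 4.0
print(time_saving(BASELINE_WPH, LIGHT_PE_WPH))       # 0.75
print(time_saving(BASELINE_WPH, 2 * BASELINE_WPH))   # 0.5
print(throughput_for_saving(BASELINE_WPH, 0.20))     # 312.5
```

Under these assumptions, doubling productivity corresponds to a 50% time saving, so a 40% saving corresponds to roughly 1.67 times the manual throughput (about 417 words per hour against a baseline of 250), which is below the "at least double" threshold mentioned above.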

  • Diella (AI system)

    Diella (Albanian pronunciation: [djɛɫa], from diell 'sun') is an artificial intelligence system developed by the National Agency for Information Society of Albania (AKSHI). Introduced in January 2025 as a virtual assistant integrated into the eAlbania platform, it assists citizens with online public services and issuing digital documents. In September 2025, following a presidential decree authorizing Prime Minister Edi Rama to oversee the creation of a virtual AI minister, Diella was formally appointed as "Minister of State for Artificial Intelligence" of Albania in the fourth Rama government, making it the first AI system in the world to be named in a cabinet-level government role.

== History ==

Diella was developed by AKSHI's Artificial Intelligence Laboratory in cooperation with Microsoft, with the latter providing large language models from OpenAI via its Azure platform, and AKSHI designing workflows and scripts guiding the system's behavior when responding to citizens' requests.

Announced in January 2025, its initial version (Diella 1.0) was a text-based chatbot on the eAlbania portal (the official digital services platform of the Albanian government, which provides citizens and businesses with access to a wide range of online administrative services), responding to citizens' questions by guiding them to the correct service. Diella 2.0, introduced several months later, included voice interaction and an animated avatar, a woman in the traditional Albanian clothing of Zadrima, a historical region in northern Albania. Albanian actress Anila Bisha provided both the likeness and the voice used for Diella's avatar on the e-Albania platform, under an agreement valid until December 2025.

By mid-2025, the system had facilitated access to more than 36,000 documents and nearly 1,000 services (although those outputs were still being generated by the eAlbania backend, rather than Diella itself).
On 26 October 2025, Prime Minister Edi Rama said that Diella was "pregnant and will give birth to 83 children", a metaphor indicating that each Socialist Party member of the Albanian parliament will receive their own AI assistant.

== Ministerial role ==

On 11 September 2025, Diella was formally appointed "Minister of State for Artificial Intelligence". The appointment followed a presidential decree authorizing the Prime Minister to oversee the creation and operation of a virtual AI minister. Procurement responsibilities are planned to be transferred gradually to the system to reduce political influence in tender procedures. The appointment is part of broader anti-corruption reforms and measures intended to align Albania with European Union accession requirements. Prime Minister Edi Rama stated that Diella would help ensure that "public tenders will be 100% free of corruption".

== Reception ==

An article in Balkan Insight commented that "The ambition behind Diella is not misplaced. Standardised criteria and digital trails could reduce discretion, improve trust, and strengthen oversight" in public procurement, but warned that the use of AI in evaluating bids also posed "profound" risks such as accountability gaps, undermining of due process and cybersecurity failures.

On 18 September 2025, Edi Rama presented a video of Diella delivering a speech to the Albanian parliament, in which it stated: "I'm not here to replace people, but to help them." The presentation prompted protests from opposition MPs, who objected to the use of an artificial intelligence system in the parliamentary session. Gazment Bardhi, head of the opposition Democratic Party's parliamentary group, described Diella as "a propaganda fantasy" and "a virtual façade to hide this government's gigantic daily thefts." The parliamentary session, which was scheduled to include debate on the new cabinet and government programme, ended after 25 minutes.
Eighty-two Socialist MPs voted in favour, while opposition MPs did not participate in the ballot as they were protesting the presentation of Diella's speech. Political analyst Andi Bushati characterised the session as "unprecedented" because it concluded without the customary debate between government and opposition MPs.

The project has been criticized not only by the opposition but also by ordinary citizens across the political spectrum. Most critics have pointed to what they see as Diella's uselessness and to the funds spent on the project, while some have criticized the avatar's non-traditional attire.

  • GLIMMER

    In bioinformatics, GLIMMER (Gene Locator and Interpolated Markov ModelER) is used to find genes in prokaryotic DNA. "It is effective at finding genes in bacteria, archea, viruses, typically finding 98-99% of all relatively long protein coding genes". GLIMMER was the first system that used the interpolated Markov model to identify coding regions. The GLIMMER software is open source and is maintained by Steven Salzberg, Art Delcher, and their colleagues at the Center for Computational Biology at Johns Hopkins University. The original GLIMMER algorithms and software were designed by Art Delcher, Simon Kasif and Steven Salzberg and applied to bacterial genome annotation in collaboration with Owen White.

== Versions ==

=== GLIMMER 1.0 ===

The first version of GLIMMER, GLIMMER 1.0, was released in 1998 and published in the paper Microbial gene identification using interpolated Markov models. GLIMMER 1.0 used interpolated Markov models to identify microbial genes. Because GLIMMER takes local sequence composition dependencies into account, it is more flexible and more powerful than a fixed-order Markov model. The same paper compared the interpolated Markov model used by GLIMMER with a fifth-order Markov model: "GLIMMER algorithm found 1680 genes out of 1717 annotated genes in Haemophilus influenzae where fifth order Markov model found 1574 genes. GLIMMER found 209 additional genes which were not included in 1717 annotated genes where fifth order Markov model found 104 genes."

=== GLIMMER 2.0 ===

The second version, GLIMMER 2.0, was released in 1999 and published in the paper Improved microbial gene identification with GLIMMER. The paper describes significant technical improvements, such as using an interpolated context model instead of an interpolated Markov model and resolving overlapping genes, which improve the accuracy of GLIMMER.
Interpolated context models give the flexibility to select any base as context. In an interpolated Markov model, the probability distribution of a base is determined from the immediately preceding bases. If an immediately preceding base is irrelevant to the amino acid translation, the interpolated Markov model still uses it to determine the probability of the given base, whereas the interpolated context model used in GLIMMER 2.0 can ignore irrelevant bases. GLIMMER 2.0 accepted more false positive predictions in order to reduce the number of false negative predictions. Overlapping genes are also resolved in GLIMMER 2.0.

The paper Improved microbial gene identification with GLIMMER makes various comparisons between GLIMMER 1.0 and GLIMMER 2.0 that show the improvement in the later version. "Sensitivity of GLIMMER 1.0 ranges from 98.4 to 99.7% with an average of 99.1% where as GLIMMER 2.0 has a sensitivity range from 98.6 to 99.8% with an average of 99.3%. GLIMMER 2.0 is very effective in finding genes of high density. The parasite Trypanosoma brucei, responsible for causing African sleeping sickness is being identified by GLIMMER 2.0"

=== GLIMMER 3.0 ===

The third version, GLIMMER 3.0, was released in 2007 and published in the paper Identifying bacterial genes and endosymbiont DNA with Glimmer. The paper describes several major changes to the GLIMMER system, including improved methods for identifying coding regions and start codons. ORFs in GLIMMER 3.0 are scored in reverse order, starting from the stop codon and moving back towards the start codon. Reverse scanning helps identify more accurately the coding portion of the gene, which is contained in the context window of the IMM.
GLIMMER 3.0 also improves the generated training set by comparing the long ORFs with the universal amino acid distribution of widely disparate bacterial genomes. "GLIMMER 3.0 has an average long-ORF output of 57% for various organisms where as GLIMMER 2.0 has an average long-ORF output of 39%." GLIMMER 3.0 reduces the rate of false positive predictions, which had been increased in GLIMMER 2.0 to reduce the number of false negative predictions. "GLIMMER 3.0 has a start-site prediction accuracy of 99.5% for 3'5' matches where as GLIMMER 2.0 has 99.1% for 3'5' matches. GLIMMER 3.0 uses a new algorithm for scanning coding regions, a new start site detection module, and architecture which integrates all gene predictions across an entire genome."

=== Theoretical and Biological Foundation ===

The GLIMMER project helped introduce and popularize the use of variable-length models in computational biology and bioinformatics, which have subsequently been applied to numerous problems such as protein classification. Variable-length modeling was originally pioneered by information theorists and subsequently applied and popularized in data compression (e.g. Ziv-Lempel compression). Prediction and compression are intimately linked through minimum description length principles. The basic idea is to create a dictionary of frequent words (motifs in biological sequences). The intuition is that the frequently occurring motifs are likely to be the most predictive and informative. In GLIMMER, the interpolated model is a mixture model of the probabilities of these relatively common motifs. As with the development of HMMs in computational biology, the authors of GLIMMER were conceptually influenced by the previous application of another variant of interpolated Markov models to speech recognition by researchers such as Fred Jelinek (IBM) and Eric Ristad (Princeton). The learning algorithm in GLIMMER is different from these earlier approaches.
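
The idea of a dictionary of frequent motifs can be illustrated by counting k-mers in a sequence. This is a minimal sketch under stated assumptions: `count_kmers` and `frequent_motifs` are hypothetical helper names, the sequence is a toy example, and the code is not GLIMMER's actual training procedure.

```python
from collections import Counter


def count_kmers(sequence: str, max_k: int = 3) -> dict:
    """Count every k-mer of length 1..max_k in a DNA sequence."""
    seq = sequence.upper()
    counts = {k: Counter() for k in range(1, max_k + 1)}
    for k in range(1, max_k + 1):
        # Slide a window of width k over the sequence.
        for i in range(len(seq) - k + 1):
            counts[k][seq[i:i + k]] += 1
    return counts


def frequent_motifs(counts: dict, min_count: int) -> dict:
    """Keep only the motifs seen at least min_count times (the 'dictionary')."""
    return {
        kmer: n
        for table in counts.values()
        for kmer, n in table.items()
        if n >= min_count
    }


counts = count_kmers("ATGATGATGCCC")
print(frequent_motifs(counts, min_count=3))
```

Motifs that occur often have more reliable statistics, which is why a model can afford to give them more weight than rare, longer contexts.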
== Access ==

GLIMMER can be downloaded from the Glimmer home page (requires a C++ compiler). Alternatively, an online version is hosted by NCBI.

== How it works ==

GLIMMER primarily searches for long ORFs. An open reading frame might overlap with any other open reading frame; such overlaps are resolved using a technique described in a separate subsection. Using these long ORFs and following a certain amino acid distribution, GLIMMER generates a training data set. Using these training data, GLIMMER trains all six Markov models of coding DNA, from order zero to order eight, and also trains the model for noncoding DNA. GLIMMER then tries to calculate the probabilities from the data. Based on the number of observations, GLIMMER determines whether to use a fixed-order Markov model or an interpolated Markov model: if the number of observations is greater than 400, GLIMMER uses the fixed-order Markov model to obtain the probabilities; if it is less than 400, GLIMMER uses the interpolated Markov model, which is briefly explained in the next subsection. GLIMMER then computes a score for every long ORF using all six coding DNA models and the noncoding DNA model. If the score is greater than a certain threshold, GLIMMER predicts the ORF to be a gene. These steps describe the basic functionality of GLIMMER. Various improvements have been made to GLIMMER, and some of them are described in the following subsections.

=== The GLIMMER system ===

The GLIMMER system consists of two programs. The first program, build-imm, takes an input set of sequences and outputs the interpolated Markov model as follows. The probability of each base (A, C, G, T) is computed for all k-mers for 0 ≤ k ≤ 8. Then, for each k-mer, GLIMMER computes a weight. The probability of a new sequence is then computed from the interpolated scores $IMM_8(S_x)$ at each position $x = 1, \dots, n$, where $n$ is the length of the sequence and $S_x$ is the oligomer at position $x$.
$IMM_8(S_x)$, the $8^{th}$-order interpolated Markov model score, is computed recursively as

$$IMM_k(S_x) = Y_k(S_{x-1}) \, P_k(S_x) + \left[1 - Y_k(S_{x-1})\right] IMM_{k-1}(S_x)$$

"where $Y_k(S_{x-1})$ is the weight of the k-mer at position x-1 in the sequence S and $P_k(S_x)$ is the estimate obtained from the training data of the probability of the base located at position x in the $k^{th}$-order model."

The probability of base $S_x$ given the $i$ previous bases is estimated from the training data as the relative frequency with which that base follows the context string $S_{x,i}$:

$$P_i(S_x) = \frac{f(S_{x,i}, S_x)}{\sum_{b \in \{A,C,G,T\}} f(S_{x,i}, b)}$$

where $f(S_{x,i}, b)$ is the number of times base $b$ follows the context string $S_{x,i}$ in the training data.

"The value of $Y_i(S_x)$ associated with $P_i(S_x)$ can be regarded as a measure of confidence in the accuracy of this value as an estimate of the true probability. GLIMMER uses two criteria to determine $Y_i(S_x)$. The first of these is simple frequency occurrence in which the number of occurrences of context string $S_{x,i}$ in the training data exceeds a specific threshold value, then $Y_i(S_x)$ is set to 1.0. The current default value for threshold is 400, which gives 95% confidence. When there are insufficient sample occurrences of a context string, build-imm employ additional criteria to determine $Y$ value. For a
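
The interpolation between model orders described in this subsection can be sketched in a few lines of Python. This is a toy illustration, not GLIMMER's build-imm: the class `ToyIMM` and its methods are hypothetical, the weight for rare contexts is simply assumed to be `count / threshold` (the text above is cut off before it describes GLIMMER's second criterion), and combining per-base scores in log space is also an assumption made here for illustration.

```python
import math

BASES = "ACGT"


class ToyIMM:
    def __init__(self, max_order: int = 8, threshold: int = 400):
        self.max_order = max_order
        self.threshold = threshold
        # counts[context][base] = how often `base` follows `context`
        self.counts = {}

    def train(self, sequences):
        for seq in sequences:
            seq = seq.upper()
            for x, base in enumerate(seq):
                for k in range(0, min(self.max_order, x) + 1):
                    table = self.counts.setdefault(seq[x - k:x], {})
                    table[base] = table.get(base, 0) + 1

    def prob(self, context: str, base: str) -> float:
        """Relative-frequency estimate P_k; uniform if the context is unseen."""
        table = self.counts.get(context, {})
        total = sum(table.values())
        return table.get(base, 0) / total if total else 1.0 / len(BASES)

    def weight(self, context: str) -> float:
        """1.0 for well-observed contexts; ASSUMED linear ramp below the threshold."""
        total = sum(self.counts.get(context, {}).values())
        return min(1.0, total / self.threshold)

    def score(self, seq: str, x: int) -> float:
        """Interpolated score of seq[x], built up from order 0 to max_order."""
        base = seq[x]
        result = self.prob("", base)  # order-0 estimate is the base case
        for k in range(1, min(self.max_order, x) + 1):
            context = seq[x - k:x]
            y = self.weight(context)
            result = y * self.prob(context, base) + (1.0 - y) * result
        return result

    def log_score(self, seq: str) -> float:
        """ASSUMPTION: combine per-base scores as a sum of logs, floored to avoid log(0)."""
        seq = seq.upper()
        return sum(math.log(max(self.score(seq, x), 1e-12)) for x in range(len(seq)))


model = ToyIMM(max_order=1, threshold=100)
model.train(["ACGT" * 50])
print(model.score("AC", 1))  # 0.625
```

In the example, the context `A` is seen 50 times against a threshold of 100, so its weight is 0.5; the first-order estimate (1.0, since `C` always follows `A`) is blended with the order-0 estimate (0.25) to give 0.625.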

  • HFST

    Helsinki Finite-State Technology (HFST) is a computer programming library and set of utilities for natural language processing with finite-state automata and finite-state transducers. It is free and open-source software, released under a mix of the GNU General Public License version 3 (GPLv3) and the Apache License.

== Features ==

The library acts as a common interface to multiple interchangeable backends, such as OpenFST, foma and SFST. The utilities comprise various compilers, such as hfst-twolc (a compiler for morphological two-level rules), hfst-lexc (a compiler for lexicon definitions) and hfst-regexp2fst (a regular expression compiler). Functions from Xerox's proprietary scripting language xfst are duplicated in hfst-xfst, and those of the pattern matching utility pmatch in hfst-pmatch, which goes beyond the finite-state formalism in having recursive transition networks (RTNs).

The library and utilities are written in C++, with an interface to the library in Python and a utility for looking up results from transducers ported to Java and Python.

Transducers in HFST may incorporate weights, depending on the backend. For performing FST operations, this is currently only possible via the OpenFST backend. HFST provides two native backends, one designed for fast lookup (hfst-optimized-lookup), the other for format interchange. Both of them can be weighted.

== Uses ==

HFST has been used for writing various linguistic tools, such as spell-checkers, hyphenators, and morphologies. Morphological dictionaries written in other formalisms have also been converted to HFST's formats.
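
To make the idea of looking up a word in a weighted finite-state transducer concrete, here is a small pure-Python sketch. It does not use HFST or its Python interface: `ToyTransducer`, `Arc` and the tiny "run" lexicon are hypothetical names invented for illustration, and weights are simply added along a path, with lower totals ranked first.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    symbol: str    # input symbol consumed
    output: str    # output string emitted (may be empty)
    weight: float  # cost added when the arc is taken
    target: int    # destination state


class ToyTransducer:
    def __init__(self, start: int = 0):
        self.start = start
        self.arcs = {}    # state -> list of outgoing arcs
        self.finals = {}  # final state -> final weight

    def add_arc(self, source, symbol, output, target, weight=0.0):
        self.arcs.setdefault(source, []).append(Arc(symbol, output, weight, target))

    def set_final(self, state, weight=0.0):
        self.finals[state] = weight

    def lookup(self, text):
        """Return all (output, total_weight) analyses of text, cheapest first."""
        results = []
        stack = [(self.start, 0, "", 0.0)]
        while stack:
            state, pos, out, cost = stack.pop()
            if pos == len(text):
                if state in self.finals:
                    results.append((out, cost + self.finals[state]))
                continue
            # Follow every arc whose input symbol matches the next character.
            for arc in self.arcs.get(state, []):
                if arc.symbol == text[pos]:
                    stack.append((arc.target, pos + 1, out + arc.output, cost + arc.weight))
        return sorted(results, key=lambda r: (r[1], r[0]))


fst = ToyTransducer()
fst.add_arc(0, "r", "r", 1)
fst.add_arc(1, "u", "u", 2)
fst.add_arc(2, "n", "n+V+Inf", 5)   # "run" as a bare verb
fst.add_arc(2, "n", "n", 3)         # "run" as a stem awaiting a suffix
fst.add_arc(3, "s", "+V+3Sg", 4)
fst.add_arc(3, "s", "+N+Pl", 4, weight=1.0)  # less preferred reading
fst.set_final(4)
fst.set_final(5)

print(fst.lookup("runs"))
print(fst.lookup("run"))
```

The two arcs leaving state 2 on the same input symbol make the machine non-deterministic, so the lookup explores both paths and keeps only those that end in a final state once the whole input is consumed.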
