AI Detector Similar To Turnitin

AI Detector Similar To Turnitin — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Macromedia FreeHand

    Macromedia FreeHand

    Macromedia FreeHand (formerly Aldus FreeHand) is a discontinued computer application for creating two-dimensional vector graphics oriented primarily to professional illustration, desktop publishing and content creation for the Web. FreeHand was similar in scope, intended market, and functionality to Adobe Illustrator, CorelDRAW and Xara Designer Pro. Because of FreeHand's dedicated page layout and text control features, it also compares to Adobe InDesign and QuarkXPress. Professions using FreeHand include graphic design, illustration, cartography, fashion and textile design, product design, architects, scientific research, and multimedia production. FreeHand was created by Altsys Corporation in 1988 and licensed to Aldus Corporation, which released versions 1 through 4. In 1994, Aldus merged with Adobe Systems and because of the overlapping market with Adobe Illustrator, FreeHand was returned to Altsys by order of the Federal Trade Commission. Altsys was later bought by Macromedia, which released FreeHand versions 5 through 11 (FreeHand MX). In 2005, Adobe Systems acquired Macromedia and its product line which included FreeHand MX, under whose ownership it presently resides. Since 2003, FreeHand development has been discontinued; in the Adobe Systems catalog, FreeHand has been replaced by Adobe Illustrator. FreeHand MX continues to run under Windows 11 and under Mac OS X 10.6 (Snow Leopard) within Rosetta, a PowerPC code emulator, and requires a registration patch supplied by Adobe. FreeHand 10 runs without problems on Mac OS X Snow Leopard with Rosetta enabled, and does not require a registration patch. Later versions of macOS can use a Mac OS X Snow Leopard Server virtual machine to emulate the required PowerPC support. == History == === Altsys and Aldus FreeHand === In 1984, James R. Von Ehr founded Altsys Corporation to develop graphics applications for personal computers. Based in Plano, Texas, the company initially produced font editing and conversion software; Fontastic Plus, Metamorphosis, and the Art Importer. Their premier PostScript font-design package, Fontographer, was released in 1986 and was the first such program on the market. With the PostScript background having been established by Fontographer, Altsys also developed FreeHand (originally called Masterpiece) as a Macintosh Postscript-based illustration program that used Bézier curves for drawing and was similar to Adobe Illustrator. FreeHand was announced as "... a Macintosh graphics program described as having all the features of Adobe's Illustrator plus drawing tools such as those in Mac Paint and Mac Draft and special effects similar to those in Cricket Draw." Seattle's Aldus Corporation acquired a licensing agreement with Altsys Corporation to release FreeHand along with their flagship product, Pagemaker, and Aldus FreeHand 1.0 was released in 1988. FreeHand's product name used intercaps; the F and H were capitalized. The partnership between the two companies continued with Altsys developing FreeHand and with Aldus controlling marketing and sales. After 1988, a competitive exchange between Aldus FreeHand and Adobe Illustrator ensued on the Macintosh platform with each software advancing new tools, achieving better speed, and matching significant features. Windows PC development also allowed Illustrator 2 (aka, Illustrator 88 on the Mac) and FreeHand 3 to release Windows versions to the graphics market. FreeHand 1.0 sold for $495 in 1988. It included the standard drawing tools and features as other draw programs including special effects in fills and screens, text manipulation tools, and full support for CMYK color printing. It was also possible to create and insert PostScript routines anywhere within the program. FreeHand performed in preview mode instead of keyline mode but performance was slower. FreeHand 2.0 sold for $495 in 1989. Besides improving on the features of FreeHand 1.0, FreeHand 2 added faster operation, Pantone colors, stroked text, flexible fill patterns and automatically import graphic assets from other programs. It added accurate control over a color monitor screen display, limited only by its resolution. FreeHand 3.0 sold for $595 in 1991. New features included resizable color, style, and layer panels including an Attributes menu. Also tighter precision of both the existing tools and aligning of objects. FH3 created compound Paths. Text could be converted to paths, applied to an ellipse, or made vertical. Carried over from version 1.0, FreeHand 3 suffered by having text entered into a dialog box instead of directly to the page. In October 1991, a 3.1 upgrade made FreeHand work with System 7 but additionally, it supported pressure-sensitive drawing which offered varying line widths with a users stroke. It improved element manipulation and added more import/export options. FreeHand 4.0 sold for $595 in 1994. Altsys ported FreeHand 3.0 to the NeXT system creating a new program named Virtuoso. Virtuoso continued its development at Altsys and version 2.0 of Virtuoso was feature-equivalent to FreeHand 4 (with the addition of NeXT-specific features such as Services and Display PostScript) and file compatible, with Virtuoso 2 able to open FreeHand 4 files and vice versa. A prominent feature of this version was the ability to type directly into the page and wrap inside or outside any shape. It also included drag-and-drop color imaging, a larger pasteboard, and a user interface that featured floating, rollup panels. The colors palette included a color mixer for adding new colors to the swatch list. Speed increases were made. In the same year of FreeHand 4 release, Adobe Systems announced merger plans with Aldus Corporation for $525 million. Fear about the end of competition between these two leading applications was reported in the media and expressed by customers (Illustrator versus FreeHand and Adobe Photoshop versus Aldus PhotoStyler.) Because of this overlapping of the market, Altsys stepped in by suing Aldus, saying that the merger deal was "a prima facie violation of a non-compete clause within the FreeHand licensing agreement." Altsys CEO Jim Von Ehr explained, "No one loves FreeHand more than we do. We will do whatever it takes to see it survive." The Federal Trade Commission issued a complaint against Adobe Systems on October 18, 1994, ordering a divestiture of FreeHand to "remedy the lessening of competition resulting from the acquisition as alleged in the Commission's complaint," and further, the FTC ordering, "That for a period of ten (10) years from the date on which this order becomes final, respondents shall not, without the prior approval of the Commission, directly or indirectly, through subsidiaries, partnerships, or otherwise .. Acquire any Professional Illustration Software or acquire or enter into any exclusive license to Professional Illustration Software;" (referring to FreeHand.) FreeHand was returned to Altsys with all licensing and marketing rights as well as Aldus FreeHand's customer list. === Macromedia Freehand === By late 1994, Altsys still retained all rights to FreeHand. Despite brief plans to keep it in-house to sell it along with Fontographer and Virtuoso, Altsys reached an agreement with the multimedia software company, Macromedia, to be acquired. This mutual agreement provided FreeHand and Fontographer a new home with ample resources for marketing, sales, and competition against the newly merged Adobe-Aldus company. Altsys would remain in Richardson, Texas, but would be renamed as the Digital Arts Group of Macromedia and was responsible for the continued development of FreeHand. Macromedia received FreeHand's 200,000 customers and expanded its traditional product line of multimedia graphics software to illustration and design graphics software. CEO James Von Ehr became a Macromedia vice-president until 1997 when he left to start another venture. FreeHand 5.0 sold for $595 in 1995. This version featured a more customizable and expanded workspace, multiple views, stronger design and editing tools, a report generator, spell check, paragraph styles, multicolor gradient fills up to 64 colors, speed improvements, and it accepted Illustrator plugins. In September 1995, a 5.5 upgrade added Photoshop plug-in support, PDF import capabilities, the Extract feature, inline graphics to text, improved auto-expanding text containers, the Crop feature, and the Create PICT Image feature. A FreeHand 5.5 upgrade was part of the FreeHand Graphics Studio (a suite that included Fontographer, Macromedia xRes image editing application, and Extreme 3D animation and modeling application). FreeHand 6.0 in 1996. This version only existed in beta. Some Freehand 7 prerelease versions were released under the Freehand 6 tag. FreeHand 7.0 sold for $399 in 1996, or $449 as part of the FreeHand Graphics Studio (see above.) Features included a redesigned user interface that allowed recombining Inspectors, Panel Tabs, Dockable Panels, Smart Cursors,

    Read more →
  • Grammar induction

    Grammar induction

    Grammar induction (or grammatical inference) is the process in machine learning of learning a formal grammar (usually as a collection of re-write rules or productions or alternatively as a finite-state machine or automaton of some kind) from a set of observations, thus constructing a model which accounts for the characteristics of the observed objects. More generally, grammatical inference is that branch of machine learning where the instance space consists of discrete combinatorial objects such as strings, trees and graphs. == Grammar classes == Grammatical inference has often been very focused on the problem of learning finite-state machines of various types (see the article Induction of regular languages for details on these approaches), since there have been efficient algorithms for this problem since the 1980s. Since the beginning of the century, these approaches have been extended to the problem of inference of context-free grammars and richer formalisms, such as multiple context-free grammars and parallel multiple context-free grammars. Other classes of grammars for which grammatical inference has been studied are combinatory categorial grammars, stochastic context-free grammars, contextual grammars and pattern languages. == Learning models == The simplest form of learning is where the learning algorithm merely receives a set of examples drawn from the language in question: the aim is to learn the language from examples of it (and, rarely, from counter-examples, that is, example that do not belong to the language). However, other learning models have been studied. One frequently studied alternative is the case where the learner can ask membership queries as in the exact query learning model or minimally adequate teacher model introduced by Angluin. == Methodologies == There is a wide variety of methods for grammatical inference. Two of the classic sources are Fu (1977) and Fu (1982). Duda, Hart & Stork (2001) also devote a brief section to the problem, and cite a number of references. The basic trial-and-error method they present is discussed below. For approaches to infer subclasses of regular languages in particular, see Induction of regular languages. A more recent textbook is de la Higuera (2010), which covers the theory of grammatical inference of regular languages and finite state automata. D'Ulizia, Ferri and Grifoni provide a survey that explores grammatical inference methods for natural languages. === Induction of probabilistic grammars === There are several methods for induction of probabilistic context-free grammars. === Grammatical inference by trial-and-error === The method proposed in Section 8.7 of Duda, Hart & Stork (2001) suggests successively guessing grammar rules (productions) and testing them against positive and negative observations. The rule set is expanded so as to be able to generate each positive example, but if a given rule set also generates a negative example, it must be discarded. This particular approach can be characterized as "hypothesis testing" and bears some similarity to Mitchel's version space algorithm. The Duda, Hart & Stork (2001) text provide a simple example which nicely illustrates the process, but the feasibility of such an unguided trial-and-error approach for more substantial problems is dubious. === Grammatical inference by genetic algorithms === Grammatical induction using evolutionary algorithms is the process of evolving a representation of the grammar of a target language through some evolutionary process. Formal grammars can easily be represented as tree structures of production rules that can be subjected to evolutionary operators. Algorithms of this sort stem from the genetic programming paradigm pioneered by John Koza. Other early work on simple formal languages used the binary string representation of genetic algorithms, but the inherently hierarchical structure of grammars couched in the EBNF language made trees a more flexible approach. Koza represented Lisp programs as trees. He was able to find analogues to the genetic operators within the standard set of tree operators. For example, swapping sub-trees is equivalent to the corresponding process of genetic crossover, where sub-strings of a genetic code are transplanted into an individual of the next generation. Fitness is measured by scoring the output from the functions of the Lisp code. Similar analogues between the tree structured lisp representation and the representation of grammars as trees, made the application of genetic programming techniques possible for grammar induction. In the case of grammar induction, the transplantation of sub-trees corresponds to the swapping of production rules that enable the parsing of phrases from some language. The fitness operator for the grammar is based upon some measure of how well it performed in parsing some group of sentences from the target language. In a tree representation of a grammar, a terminal symbol of a production rule corresponds to a leaf node of the tree. Its parent nodes corresponds to a non-terminal symbol (e.g. a noun phrase or a verb phrase) in the rule set. Ultimately, the root node might correspond to a sentence non-terminal. === Grammatical inference by greedy algorithms === Like all greedy algorithms, greedy grammar inference algorithms make, in iterative manner, decisions that seem to be the best at that stage. The decisions made usually deal with things like the creation of new rules, the removal of existing rules, the choice of a rule to be applied or the merging of some existing rules. Because there are several ways to define 'the stage' and 'the best', there are also several greedy grammar inference algorithms. These context-free grammar generating algorithms make the decision after every read symbol: Lempel-Ziv-Welch algorithm creates a context-free grammar in a deterministic way such that it is necessary to store only the start rule of the generated grammar. Sequitur and its modifications. These context-free grammar generating algorithms first read the whole given symbol-sequence and then start to make decisions: Byte pair encoding and its optimizations. === Distributional learning === A more recent approach is based on distributional learning. Algorithms using these approaches have been applied to learning context-free grammars and mildly context-sensitive languages and have been proven to be correct and efficient for large subclasses of these grammars. === Learning of pattern languages === Angluin defines a pattern to be "a string of constant symbols from Σ and variable symbols from a disjoint set". The language of such a pattern is the set of all its nonempty ground instances i.e. all strings resulting from consistent replacement of its variable symbols by nonempty strings of constant symbols. A pattern is called descriptive for a finite input set of strings if its language is minimal (with respect to set inclusion) among all pattern languages subsuming the input set. Angluin gives a polynomial algorithm to compute, for a given input string set, all descriptive patterns in one variable x. To this end, she builds an automaton representing all possibly relevant patterns; using sophisticated arguments about word lengths, which rely on x being the only variable, the state count can be drastically reduced. Erlebach et al. give a more efficient version of Angluin's pattern learning algorithm, as well as a parallelized version. Arimura et al. show that a language class obtained from limited unions of patterns can be learned in polynomial time. === Pattern theory === Pattern theory, formulated by Ulf Grenander, is a mathematical formalism to describe knowledge of the world as patterns. It differs from other approaches to artificial intelligence in that it does not begin by prescribing algorithms and machinery to recognize and classify patterns; rather, it prescribes a vocabulary to articulate and recast the pattern concepts in precise language. In addition to the new algebraic vocabulary, its statistical approach was novel in its aim to: Identify the hidden variables of a data set using real world data rather than artificial stimuli, which was commonplace at the time. Formulate prior distributions for hidden variables and models for the observed variables that form the vertices of a Gibbs-like graph. Study the randomness and variability of these graphs. Create the basic classes of stochastic models applied by listing the deformations of the patterns. Synthesize (sample) from the models, not just analyze signals with it. Broad in its mathematical coverage, pattern theory spans algebra and statistics, as well as local topological and global entropic properties. == Applications == The principle of grammar induction has been applied to other aspects of natural language processing, and has been applied (among many other problems) to semantic parsing, natural language understanding, example-based translation, language acquisition, grammar-based compre

    Read more →
  • Pronunciation assessment

    Pronunciation assessment

    Automatic pronunciation assessment uses computer speech recognition to determine how accurately speech has been pronounced, instead of relying on a human instructor or proctor. It is also called speech verification, pronunciation evaluation, and pronunciation scoring. This technology is used to grade speech quality, for language testing, for computer-aided pronunciation teaching (CAPT) in computer-assisted language learning (CALL), for speaking skill remediation, and for accent reduction. Pronunciation assessment is different from dictation or automatic transcription, because instead of determining unknown speech, it verifies learners' pronunciation of known word(s), often from prior transcription of the same utterance; ideally scoring the intelligibility of the learners' speech. Sometimes pronunciation assessment evaluates the prosody of the learners' speech, such as intonation, pitch, tempo, rhythm, and syllable and word stress, although those are usually not essential for being understood in most languages. Pronunciation assessment is also used in reading tutoring, for example in products from Google, Microsoft, and Amira Learning. Automatic pronunciation assessment can also be used to help diagnose and treat speech disorders such as apraxia. == Intelligibility == Intelligibility refers to how well a learner's utterance is understood by a listener, rather than how much it sounds like a native speaker. This is separate from measures of fluency, such as so-called "Goodness of Pronunciation" (GoP) scores, which estimate how closely an utterance aligns with those of native speakers. Intelligibility is widely regarded as the most important communicative goal in pronunciation teaching and assessment. For example, in the Common European Framework of Reference for Languages (CEFR) assessment criteria for "overall phonological control", intelligibility outweighs formally correct pronunciation at all levels. Studies in applied linguistics have shown that accent reduction does not always increase intelligibility because listeners can often comprehend heavily accented speech without difficulty. Pronunciation assessment systems often rely on acoustic methods such as GoP which compare learner speech to reference models to produce phoneme-level scores, which are in turn aggregated to produce word and phrase scores. While these methods are effective for identifying deviations from native speakers' utterances, they do not effectively measure how understandable speech is to human listeners. Intelligibility is influenced by broader linguistic and contextual factors such as stress placement, speech rate, and coarticulation, which are not represented in purely segmental scores. The earliest work on pronunciation assessment avoided measuring genuine listener intelligibility, a shortcoming corrected in 2011 at the Toyohashi University of Technology, and included in the Versant high-stakes English fluency assessment from Pearson and mobile apps from 17zuoye Education & Technology, but still missing in 2023 products from Google Search, Microsoft, Educational Testing Service, Speechace, and ELSA. Assessing authentic listener intelligibility is essential for avoiding inaccuracies from accent bias, especially in high-stakes assessments; from words with multiple correct pronunciations; and from phoneme coding errors in machine-readable pronunciation dictionaries. In 2022, researchers found that some newer speech-to-text systems, based on end-to-end reinforcement learning to map audio signals directly into words, produce word and phrase confidence scores (from 10-25ms audio frame logit aggregation) closely correlated with genuine listener intelligibility. Others have been able to assess intelligibility using Levenshtein or dynamic time warping distance measures from Wav2Vec2 representation of good speech. Further work through 2025 has focused specifically on measuring intelligibility. A 2025 study of 42 pronunciation and speech coaching apps (32 mobile and 10 web) found that none offered intelligibility assessment. Instead, most provided only segmental and accent-focused scoring. About two-thirds of the apps provided some form of specific pronunciation feedback, usually with phonetic transcriptions, but accompanied by visual cues (such as animations of the vocal tract or the lips and tongue from the front) in only about 5% of the apps. Less than a third provided feedback on learner perception of exemplar speech. == Evaluation == Although there are as yet no industry-standard benchmarks for evaluating pronunciation assessment accuracy, researchers occasionally release evaluation speech corpuses for others to use for improving assessment quality. Such evaluation databases often emphasize formally unaccented pronunciation to the exclusion of genuine intelligibility evident from blinded listener transcriptions. As of mid-2025, state of the art approaches for automatically transcribing phonemes typically achieve an error rate of about 10% from known good speech. The International Speech Communication Association (ISCA) 2025 Workshop on Speech and Language Technology in Education (SLaTE) administered a Speak & Improve Challenge: Spoken Language Assessment and Feedback, introducing benchmarks for evaluating pronunciation assessment and remediation systems across languages, accents, and learner populations. The challenge emphasized cross-lingual generalization and alignment with human intelligibility judgments, for more robust and interpretable assessment systems. Ethical issues in pronunciation assessment are present in both human and automatic methods. Authentic validity, fairness, and mitigating bias in evaluation are all crucial. Diverse speech data should be included in automatic pronunciation assessment models. Combining human judgments, especially blinded transcriptions from a wide diversity of listeners, with automated feedback can improve accuracy and fairness. Second language learners benefit substantially from their use of widely available speech recognition systems for dictation, virtual assistants, and AI chatbots. In such systems, users naturally try to correct their own errors evident in speech recognition results that they notice. Such use improves their grammar and vocabulary development along with their pronunciation skills. The extent to which explicit pronunciation assessment and remediation approaches improve on such self-directed interactions remains an open question. Similarly, automatic dictation results have been shown to reflect intelligibility about as well as human scorers. == Recent developments == During 2021–22, a smartphone-based CAPT system was used to sense articulation through both audible and inaudible signals, providing feedback at the phoneme level. Some promising areas for improvement which were being developed in 2024 include articulatory feature extraction and transfer learning to suppress unnecessary corrections. Other interesting advances under development include "augmented reality" interfaces for mobile devices using optical character recognition to provide pronunciation training on text found in user environments. In 2024, audio multimodal large language models were first described as assessing pronunciation. That work has been carried forward by other researchers in 2025 who report positive results. Subsequently, researchers demonstrated pronunciation scoring by providing a language model with textual descriptions of speech, including the speech-to-text transcript, phoneme sequences, pauses, and phoneme sequence matching; this approach can achieve performance similar to multimodal LLMs that analyze raw audio while avoiding their higher computational cost. In 2025, the Duolingo English Test authors published a description of their pronunciation assessment method, purportedly built to measure intelligibility rather than accent imitation. While achieving a correlation of 0.82 with expert human ratings, very close to inter-rater agreement and outperforming alternative methods, the method is nonetheless based on experts' scores along the six-point CEFR common reference levels scale, instead of actual blinded listener transcriptions. Further promising work in 2025 includes assessment feedback aligning learner speech to synthetic utterances using interpretable features, identifying continuous spans of words for remediation feedback; synthesizing corrected speech matching learners' self-perceived voices, which they prefer and imitate more accurately as corrections; and streaming such interactions. On January 21, 2026, Educational Testing Service's TOEFL iBT high-stakes English language test, required by US university admissions and employers from English as a foreign language applicants more often than all other internet-based tests combined, changed its speaking assessments. While official rubrics claim that the new scoring will be based primarily on intelligibility, the new test's technical description indicates that it ju

    Read more →
  • ELIZA

    ELIZA

    ELIZA is an early natural language processing computer program developed from 1964 to 1967 at MIT by Joseph Weizenbaum. Created to explore communication between humans and machines, ELIZA simulated conversation by using a pattern matching and substitution methodology that gave users an illusion of understanding on the part of the program, but gave no response that could be considered really understanding what was being said by either party. Whereas the ELIZA program itself was written (originally) in MAD-SLIP, the pattern matching directives that contained most of its language capability were provided in separate "scripts", represented in a Lisp-like expression. The most famous script, DOCTOR, simulated a psychotherapist of the Rogerian school (in which the therapist often reflects back the patient's words to the patient), and used rules, dictated in the script, to respond with non-directional questions to user inputs. As such, ELIZA was one of the first chatbots (originally "chatterbots") and one of the first programs capable of attempting the Turing test. Weizenbaum intended the program as a method to explore communication between humans and machines. He was surprised that some people, including his secretary, attributed human-like feelings to the computer program, a phenomenon that came to be called the ELIZA effect. Many academics believed that the program would be able to positively influence the lives of many people, particularly those with psychological issues, and that it could aid doctors working on such patients' treatment. While ELIZA was capable of engaging in discourse, it could not converse with true understanding. However, many early users were convinced of ELIZA's intelligence and understanding, despite Weizenbaum's insistence to the contrary. The original ELIZA source code had been missing since its creation in the 1960s, as it was not common to publish articles that included source code at that time. However, more recently the MAD-SLIP source code was discovered in the MIT archives and published on various platforms, such as the Internet Archive. The source code is of high historical interest since it demonstrates not only the specificity of programming languages and techniques at that time, but also the beginning of software layering and abstraction as a means of achieving sophisticated software programming. == Overview == Joseph Weizenbaum's ELIZA, running the DOCTOR script, created a conversational interaction somewhat similar to what might take place in the office of "a [non-directive] psychotherapist in an initial psychiatric interview" and to "demonstrate that the communication between man and machine was superficial". While ELIZA is best known for acting in the manner of a psychotherapist, the speech patterns are due to the data and instructions supplied by the DOCTOR script. ELIZA itself examined the text for keywords, applied values to said keywords, and transformed the input into an output; the script that ELIZA ran determined the keywords, set the values of keywords, and set the rules of transformation for the output. Weizenbaum chose to make the DOCTOR script in the context of psychotherapy to "sidestep the problem of giving the program a data base of real-world knowledge", allowing it to reflect back the patient's statements to carry the conversation forward. The result was a somewhat intelligent-seeming response that reportedly deceived some early users of the program. Weizenbaum named his program ELIZA after Eliza Doolittle, a working-class character in George Bernard Shaw's Pygmalion (also appearing in the musical My Fair Lady, which was based on the play and was hugely popular at the time). According to Weizenbaum, ELIZA's ability to be "incrementally improved" by various users made it similar to Eliza Doolittle, since Eliza Doolittle was taught to speak with an upper-class accent in Shaw's play. However, unlike the human character in Shaw's play, ELIZA is incapable of learning new patterns of speech or new words through interaction alone. Edits must be made directly to ELIZA's active script in order to change the manner by which the program operates. Weizenbaum first implemented ELIZA in his own SLIP list-processing language, where, depending upon the initial entries by the user, the illusion of human intelligence could appear, or be dispelled through several interchanges. Some of ELIZA's responses were so convincing that Weizenbaum and several others have anecdotes of users becoming emotionally attached to the program, occasionally forgetting that they were conversing with a computer. Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people." In 1966, interactive computing (via a teletype) was new. It was 11 years before the personal computer became familiar to the general public, and three decades before most people encountered attempts at natural language processing in Internet services like Ask.com or PC help systems such as Microsoft Office Clippit. Although those programs included years of research and work, ELIZA remains a milestone because it was the first time a programmer had attempted such a human-machine interaction with the goal of creating the illusion (however brief) of human–human interaction. At the ICCC 1972, ELIZA was brought together with another early artificial-intelligence program named PARRY for a computer-only conversation. While ELIZA was built to speak as a doctor, PARRY was intended to simulate a patient with schizophrenia. == Design and implementation == Weizenbaum originally wrote ELIZA in MAD-SLIP for CTSS on an IBM 7094 as a program to make natural-language conversation possible with a computer. To accomplish this, Weizenbaum identified five "fundamental technical problems" for ELIZA to overcome: the identification of key words, the discovery of a minimal context, the choice of appropriate transformations, the generation of responses in the absence of key words, and the provision of an editing capability for ELIZA scripts. Weizenbaum solved these problems and made ELIZA such that it had no built-in contextual framework or universe of discourse. However, this required ELIZA to have a script of instructions on how to respond to inputs from users. ELIZA starts its process of responding to an input by a user by first examining the text input for a "keyword". A "keyword" is a word designated as important by the acting ELIZA script, which assigns to each keyword a precedence number, or a RANK, designed by the programmer. If such words are found, they are put into a "keystack", with the keyword of the highest RANK at the top. The input sentence is then manipulated and transformed as the rule associated with the keyword of the highest RANK directs. For example, when the DOCTOR script encounters words such as "alike" or "same", it would output a message pertaining to similarity, in this case "In what way?", as these words had high precedence number. This also demonstrates how certain words, as dictated by the script, can be manipulated regardless of contextual considerations, such as switching first-person pronouns and second-person pronouns and vice versa, as these too had high precedence numbers. Such words with high precedence numbers are deemed superior to conversational patterns and are treated independently of contextual patterns. Following the first examination, the next step of the process is to apply an appropriate transformation rule, which includes two parts: the "decomposition rule" and the "reassembly rule". First, the input is reviewed for syntactical patterns in order to establish the minimal context necessary to respond. Using the keywords and other nearby words from the input, different disassembly rules are tested until an appropriate pattern is found. Using the script's rules, the sentence is then "dismantled" and arranged into sections of the component parts as the "decomposition rule for the highest-ranking keyword" dictates. The example that Weizenbaum gives is the input "You are very helpful", which is transformed to "I are very helpful". This is then broken into (1) empty (2) "I" (3) "are" (4) "very helpful". The decomposition rule has broken the phrase into four small segments that contain both the keywords and the information in the sentence. The decomposition rule then designates a particular reassembly rule, or set of reassembly rules, to follow when reconstructing the sentence. The reassembly rule takes the fragments of the input that the decomposition rule had created, rearranges them, and adds in programmed words to create a response. Using Weizenbaum's example previously stated, such a reassembly rule would take the fragments and apply them to the phrase "What makes

    Read more →
  • Wilkinson's Grammar of Graphics

    Wilkinson's Grammar of Graphics

    The Grammar of Graphics (GoG) is a grammar-based system for representing graphics to provide grammatical constraints on the composition of data and information visualizations. A graphical grammar differs from a graphics pipeline as it focuses on semantic components such as scales and guides, statistical functions, coordinate systems, marks and aesthetic attributes. For example, a bar chart can be converted into a pie chart by specifying a polar coordinate system without any other change in graphical specification. The grammar of graphics concept was launched by Leland Wilkinson in 2001 (Wilkinson et al., 2001; Wilkinson, 2005) and graphical grammars have since been written in a variety of languages with various parameterisations and extensions. The major implementations of graphical grammars are nViZn created by a team at SPSS/IBM, followed by Polaris focusing on multidimensional relational databases which is commercialised as Tableau, a revised Layered Grammar of Graphics by Hadley Wickham in Ggplot2, and Vega-Lite which is a visualisation grammar with added interactivity. The grammar of graphics continues to evolve with alternate parameterisations, extensions, or new specifications. == Wilkinson's Grammar of Graphics == === Theory === Wilkinson conceived the seven elements of a graphics to be Variables: mapping of objects to values represented in a graphic Algebra: operations to combine variables and specify dimensions of graphs Geometry: creation of geometric graphs from variables Aesthetics: sensory attributes Statistics: functions to change the appearance and representation of graphs Scales: represent variables on measured dimensions Coordinates: mapping to coordinate systems With these, Wilkinson hypothesised that These seven constructs are orthogonal and virtually all known statistical charts can be generated relatively parsimoniously This computational system is not a taxonomy of charts and rather it describes the meaning of what we do when we construct statistical graphics. === Implementations === Wilkinson wrote SYSTAT, a statistical software package, in the early 1980s. This program was noted for its comprehensive graphics, including the first software implementation of the heatmap display now widely used among biologists. After his company grew to 50 employees, he sold it to SPSS in 1995. At SPSS, he assembled a team of graphics programmers who developed the nViZn platform that produces the visualizations in SPSS, Clementine, and other analytics products. While at Stanford, Tableau founders Hanrahan and Stolte, as well as Diane Tang, created the predecessor to Tableau, named Polaris. Polaris was a data visualization software tool, built with the support of a United States Department of Energy defense program, the Accelerated Strategic Computing Initiative (ASCI). The main differences between Wilkinson's system and Polaris are the use of SQL relational algebra for database services and using shelves instead of cross and nest operators. == Wickham's Layered Grammar of Graphics == === Theory === Hadley Wickham conceived an alternate parameterisation of the syntax Wilkinson had derived, creating a layered grammar of graphics which he implemented as ggplot2 for R (programming language) users. This added a hierarchy of defaults based around the idea of building up a graphic from multiple layers. Wickham conceived these elements to be: Defaults: consists of data and mapping Data: dataset Mapping: aesthetic mappings Layer: consists of data, mapping, geom, stat, and position Data: dataset, or inherit from defaults Mapping: aesthetic mappings, or inherit from defaults Geom: geometric object Stat: statistical transformation Position: position adjustment Scale: mapping of data to aesthetic attributes Coord: mapping of data to the plane of the plot Facet: split up the data === Reception === Wilkinson is generally positive on Wickham's parameterisation and implementation of ggplot2, praising its elegance and expressivity whilst claiming that his original Grammar of Graphics is capable of representing a wider range of statistical graphics. === Implementations === ggplot2 is the first implementation of a layered grammar of graphics in R and implementations in other programming languages have ensued. These include direct ports plotnine for Python, gramm for MATLAB, Lets-Plot for Kotlin and gadfly for Julia. Projects inspired by elements of Wickham's grammar include Vega-Lite which specifies plots in JSON and uses a JavaScript engine. Implementations for Python include Vega-Altair (built on top of Vega-Lite). == Vega-Lite: A Grammar of Interactive Graphics == === Theory === Vega-Lite combines ideas from Wilkinson's Grammar of Graphics and Wickham's Layered Grammar of Graphics with a composition algebra for layered and multi-view displays with a grammar of interaction. The Vega-Lite specification is instantiated in JSON and rendered by the lower-level Vega. The graphical grammar implemented by Vega-Lite is composed of the following: Unit: consists of data, transforms, mark-type and encoding Data: relational table consisting of records (rows) and named attributes (columns) Transforms: data transformations Mark-type: geometric object for visual encoding Encodings: mapping of data attributes to visual marks properties where each encoding consists of: Channel: e.g. colour, shape, size, or text Field: data attribute Data-type: e.g. nominal, ordinal, quantitative, or temporal Value: use a literal instead of a data-type Functions: e.g. binning, aggregation, and sorting Scale: maps from data domain to visual range Guide: axis or legend for visualising scale Composite Views: compose views from multiple unit specifications with operators: Layer: charts plotted on top of each other Hconcat/Vconcat: place views side-by-side Facet: subset data to produce a trellis plot Repeat: multiple plots similar to facet but with full data replication in each cell Interaction: selections identify the set of points a user is interested in manipulating, with components: Selection: get the minimal number of backing points Name: reference Type: how many backing values are stored Predicate: determine the set of selected points e.g. single, list, interval Domain|Range: store data domain or visual range Event: e.g. mouseover, mousedown, mouseup, Init: initialise with specific backing points Transforms: e.g. project, toggle, translate, zoom, and nearest Resolve: resolve selections to union or intersect ==== Implementations ==== Whilst Vega-Lite is the sole implementation of this graphics grammar specification with compilation to Vega, other implementations do create JSON files which can be interpreted by Vega-Lite. == Related projects == Ggplot2 is an R package for plotting Tableau Software (originally known as Polaris) is a commercial software built using the Grammar of Graphics nViZn built by Wilkinson. SYSTAT (statistics package) built by Wilkinson ggpy, ggplot for Python, but has not been updated since 20 November 2016 plotnine started as an effort to improve the scalability of ggplot for Python and is largely compatible with ggplot2 syntax. Plotly - Interactive, online ggplot2 graphs gramm, a plotting class for MATLAB inspired by ggplot2 gadfly, a system for plotting and visualization written in Julia, based largely on ggplot2 Chart::GGPlot - ggplot2 port in Perl, but has not been updated since 16 March 2023 The Lets-Plot for Python library includes a native backend and a Python API, which was mostly based on the ggplot2 package. Lets-Plot Kotlin API is an open-source plotting library for statistical data implemented using the Kotlin programming language, and is built on the principles of layered graphics first described in the Leland Wilkinson's work The Grammar of Graphics. ggplotnim, plotting library using the Nim programming language inspired by ggplot2. Vega and Vega-Lite are plotting libraries that use JSON to specify plots. Vega-Altair, a Python library built on top of Vega-Lite chart-parts - React-friendly Grammar of Graphics, but has not been updated since 10 Dec 2021 g2 - a JavaScript library

    Read more →
  • Physics-informed neural networks

    Physics-informed neural networks

    In machine learning, physics-informed neural networks (PINNs), also referred to as theory-trained neural networks (TTNs), are a type of universal function approximator that can embed the knowledge of any physical laws that govern a given data-set in the learning process, and can be described by partial differential equations (PDEs). Low data availability for some biological and engineering problems limit the robustness of conventional machine learning models used for these applications. The prior knowledge of general physical laws acts in the training of neural networks (NNs) as a regularization agent that limits the space of admissible solutions, increasing the generalizability of the function approximation. This way, embedding this prior information into a neural network results in enhancing the information content of the available data, facilitating the learning algorithm to capture the right solution and to generalize well even with a low amount of training examples. Because they process continuous spatial and time coordinates and output continuous PDE solutions, they can be categorized as neural fields. == Function approximation == Most of the physical laws that govern the dynamics of a system can be described by partial differential equations. For example, the Navier–Stokes equations are a set of partial differential equations derived from the conservation laws (i.e., conservation of mass, momentum, and energy) that govern fluid mechanics. The solution of the Navier–Stokes equations with appropriate initial and boundary conditions allows the quantification of flow dynamics in a precisely defined geometry. However, these equations cannot be solved exactly and therefore numerical methods must be used (such as finite differences, finite elements and finite volumes). In this setting, these governing equations must be solved while accounting for prior assumptions, linearization, and adequate time and space discretization. Recently, solving the governing partial differential equations of physical phenomena using deep learning has emerged as a new field of scientific machine learning (SciML), leveraging the universal approximation theorem and high expressivity of neural networks. In general, deep neural networks could approximate any high-dimensional function given that sufficient training data are supplied. However, such networks do not consider the physical characteristics underlying the problem, and the level of approximation accuracy provided by them is still heavily dependent on careful specifications of the problem geometry as well as the initial and boundary conditions. Without this preliminary information, the solution is not unique and may lose physical correctness. To remedy this, Physics-Informed Neural Networks (PINNs) leverage governing physical equations in neural network training. Namely, PINNs are designed to be trained to satisfy the given training data as well as the imposed governing equations. In this fashion, a neural network can be guided with training datasets that do not necessarily need to be large or complete. An accurate solution of partial differential equations can potentially be found without knowing the boundary conditions. Therefore, with some knowledge about the physical characteristics of the problem and some form of training data (even sparse and incomplete), PINNs may be used for finding an optimal solution with high fidelity. PINNs can be applied to a wide range of problems in computational science, and are a pioneering technology leading to the development of new classes of numerical solvers for PDEs. PINNs can be thought of as a mesh-free alternative to traditional approaches (e.g., CFD for fluid dynamics), and new data-driven approaches for model inversion and system identification. Notably, a trained PINN network can be used to predict values on simulation grids of different resolutions without needing to be retrained. Additionally, the derivatives used in the partial differential equations can be computed using automatic differentiation (AD), which is assessed to be superior to numerical or symbolic differentiation. == Modeling and computation == A general nonlinear partial differential equation can be written as: u t + N [ u ; λ ] = 0 , x ∈ Ω , t ∈ [ 0 , T ] {\displaystyle u_{t}+{\mathcal {N}}[u;\lambda ]=0,\quad x\in \Omega ,\quad t\in [0,T]} where u ( t , x ) {\displaystyle u(t,x)} denotes the solution, N [ ⋅ ; λ ] {\displaystyle {\mathcal {N}}[\cdot ;\lambda ]} is a nonlinear operator parameterized by λ {\displaystyle \lambda } , and Ω {\displaystyle \Omega } is a subset of R D {\displaystyle \mathbb {R} ^{D}} . This general form of governing equations summarizes a wide range of problems in mathematical physics, such as conservative laws, diffusion process, advection-diffusion systems, and kinetic equations. Given noisy measurements of a generic dynamic system described by the equation above, PINNs can be designed to solve two classes of problems: data-driven solutions of partial differential equations data-driven discovery of partial differential equations === Data-driven solution of partial differential equations === The data-driven solution of PDE computes the hidden state u ( t , x ) {\displaystyle u(t,x)} of the system given boundary data and/or measurements z {\displaystyle z} , and fixed model parameters λ {\displaystyle \lambda } . We solve: u t + N [ u ] = 0 , x ∈ Ω , t ∈ [ 0 , T ] {\displaystyle u_{t}+{\mathcal {N}}[u]=0,\quad x\in \Omega ,\quad t\in [0,T]} . by defining the residual f ( t , x ) {\displaystyle f(t,x)} as: f := u t + N [ u ] {\displaystyle f:=u_{t}+{\mathcal {N}}[u]} , and approximating u ( t , x ) {\displaystyle u(t,x)} by a deep neural network. This network can be differentiated using automatic differentiation. The parameters of u ( t , x ) {\displaystyle u(t,x)} and f ( t , x ) {\displaystyle f(t,x)} can be then learned by minimizing the following loss function L tot {\displaystyle L_{\text{tot}}} : L tot = L u + L f {\displaystyle L_{\text{tot}}=L_{u}+L_{f}} where: L u = ‖ u − z ‖ Γ {\displaystyle L_{u}=\Vert u-z\Vert _{\Gamma }} is the error between the PINN u ( t , x ) {\displaystyle u(t,x)} and the set of boundary conditions and measured data on the set of points Γ {\displaystyle \Gamma } where the boundary conditions and data are defined. L f = ‖ f ‖ Γ {\displaystyle L_{f}=\Vert f\Vert _{\Gamma }} is the mean-squared error of the residual function. This second term encourages the PINN to learn the structural information expressed by the PDE during the training process. This approach has been used to yield computationally efficient physics-informed surrogate models with applications in the forecasting of physical processes, model predictive control, multi-physics and multi-scale modeling, and simulation. It has been shown to converge to the solution of the PDE. === Data-driven discovery of partial differential equations === Given noisy and incomplete measurements z {\displaystyle z} of the state of the system, the data-driven discovery of PDEs results in computing the unknown state u ( t , x ) {\displaystyle u(t,x)} and learning model parameters λ {\displaystyle \lambda } that best describe the observed data: u t + N [ u ; λ ] = 0 , x ∈ Ω , t ∈ [ 0 , T ] {\displaystyle u_{t}+{\mathcal {N}}[u;\lambda ]=0,\quad x\in \Omega ,\quad t\in [0,T]} By defining f ( t , x ) {\displaystyle f(t,x)} as: f := u t + N [ u ; λ ] = 0 {\displaystyle f:=u_{t}+{\mathcal {N}}[u;\lambda ]=0} , and approximating u ( t , x ) {\displaystyle u(t,x)} by a deep neural network, f ( t , x ) {\displaystyle f(t,x)} results in a PINN. This network can be derived using automatic differentiation. The parameters of u ( t , x ) {\displaystyle u(t,x)} and f ( t , x ) {\displaystyle f(t,x)} , together with the parameter λ {\displaystyle \lambda } of the differential operator can be then learned by minimizing the following loss function L tot {\displaystyle L_{\text{tot}}} : L tot = L u + L f {\displaystyle L_{\text{tot}}=L_{u}+L_{f}} where: L u = ‖ u − z ‖ Γ {\displaystyle L_{u}=\Vert u-z\Vert _{\Gamma }} , with u {\displaystyle u} and z {\displaystyle z} state solutions and measurements at sparse location Γ {\displaystyle \Gamma } , respectively. L f = ‖ f ‖ Γ {\displaystyle L_{f}=\Vert f\Vert _{\Gamma }} is the residual function. This second term requires the structured information represented by the partial differential equations to be satisfied in the training process. This strategy allows for discovering dynamic models described by nonlinear PDEs assembling computationally efficient and fully differentiable surrogate models that may find application in predictive forecasting, control, and data assimilation. == Extensions and applications == === For piece-wise function approximation === PINNs are unable to approximate PDEs that have strong non-linearity or sharp gradients (such as those that commonly occur in practical fluid flow problems). Piecewise approximation has been an old practic

    Read more →
  • Database application

    Database application

    A database application is a computer program whose primary purpose is retrieving information from a computerized database. From here, information can be inserted, modified or deleted which is subsequently conveyed back into the database. Early examples of database applications were accounting systems and airline reservations systems, such as SABRE, developed starting in 1957. A characteristic of modern database applications is that they facilitate simultaneous updates and queries from multiple users. Systems in the 1970s might have accomplished this by having each user in front of a 3270 terminal to a mainframe computer. By the mid-1980s it was becoming more common to give each user a personal computer and have a program running on that PC that is connected to a database server. Information would be pulled from the database, transmitted over a network, and then arranged, graphed, or otherwise formatted by the program running on the PC. Starting in the mid-1990s it became more common to build database applications with a Web interface. Rather than develop custom software to run on a user's PC, the user would use the same Web browser program for every application. A database application with a Web interface had the advantage that it could be used on devices of different sizes, with different hardware, and with different operating systems. Examples of early database applications with Web interfaces include amazon.com, which used the Oracle relational database management system, the photo.net online community, whose implementation on top of Oracle was described in the book Database-Backed Web Sites (Ziff-Davis Press; May 1997), and eBay, also running Oracle. Electronic medical records are referred to on emrexperts.com, in December 2010, as "a software database application". A 2005 O'Reilly book uses the term in its title: Database Applications and the Web. Some of the most complex database applications remain accounting systems, such as SAP, which may contain thousands of tables in only a single module. Many of today's most widely used computer systems are database applications, for example, Facebook, which was built on top of MySQL. The etymology of the phrase "database application" comes from the practice of dividing computer software into systems programs, such as the operating system, compilers, the file system, and tools such as the database management system, and application programs, such as a payroll check processor. On a standard PC running Microsoft Windows, for example, the Windows operating system contains all of the systems programs while games, word processors, spreadsheet programs, photo editing programs, etc. would be application programs. As "application" is short for "application program", "database application" is short for "database application program". Not every program that uses a database would typically be considered a "database application". For example, many physics experiments, e.g., the Large Hadron Collider, generate massive data sets that programs subsequently analyze. The data sets constitute a "database", though they are not typically managed with a standard relational database management system. The computer programs that analyze the data are primarily developed to answer hypotheses, not to put information back into the database and therefore the overall program would not be called a "database application". == Examples of database applications == Amazon Student Data CNN eBay Facebook Fandango Filemaker (Mac OS) LibreOffice Base Microsoft Access Oracle relational database SAP (Systems, Applications & Products in Data Processing) Ticketmaster Wikipedia Yelp YouTube Google MySQL

    Read more →
  • Cleverbot

    Cleverbot

    Cleverbot is a chatterbot web application. It was created by British AI scientist Rollo Carpenter and launched in October 2008. It was preceded by Jabberwacky, a chatbot project that began in 1988 and went online in 1997. In its first decade, Cleverbot held several thousand conversations with Carpenter and his associates. Since launching on the web, the number of conversations held has exceeded 150 million. Besides the web application, Cleverbot is also available as an iOS, Android, and Windows Phone app. == Operation == Cleverbot's responses are not pre-programmed because it learns from human input: Humans type into the box below the Cleverbot logo and the system finds all keywords or an exact phrase matching the input. After searching through its saved conversations, it responds to the input by finding how a human responded to that input when it was asked, in part or in full, by Cleverbot. Cleverbot participated in a formal Turing test at the 2011 Techniche festival at the Indian Institute of Technology Guwahati on 3 September 2011. Out of the 1334 votes cast, Cleverbot was judged to be 59.3% human, compared to the rating of 63.3% human achieved by human participants. A score of 50.05% or higher is often considered to be a passing grade. The software running for the event had to handle just 1 or 2 simultaneous requests, whereas online Cleverbot is usually talking to around 10,000 to 50,000 people at once. == Developments == Cleverbot is constantly growing in data size at the rate of 4 to 7 million interactions per day. Updates to the software have been mostly behind the scenes. In 2014, Cleverbot was upgraded to use GPU serving techniques. Unlike Eliza, the program does not respond in a fixed way, instead choosing its responses heuristically using fuzzy logic, the whole of the conversation being compared to the millions that have taken place before. Cleverbot now uses over 279 million interactions, about 3-4% of the data it has already accumulated. The developers of Cleverbot are attempting to build a new version using machine learning techniques. An app that uses the Cleverscript engine to play a game of 20 Questions has been launched under the name Clevernator. Unlike other such games, the player asks the questions and it is the role of the AI to understand, and answer factually. An app that allows owners to create and talk to their own small Cleverbot-like AI has been launched, called Cleverme! for Apple products. == In popular culture == Cleverbot received media attention after being featured in the popular 2010 creepypasta ARG web serial Ben Drowned by Alexander D. Hall. In early 2017, a Twitch stream of two Google Home devices modified to talk to each other using Cleverbot garnered over 700,000 visitors and over 30,000 peak concurrent viewers.

    Read more →
  • Kdan Mobile

    Kdan Mobile

    Kdan Mobile Software Limited is a software application development company based in Tainan City, Taiwan. Kdan also has branches in Taipei, Changsha, Irvine, California, Japan, and South Korea. The company was founded in 2009 by Kenny Su, the company's CEO. == History == Kdan Mobile was founded in 2009 by Kenny Su (蘇柏州) and develops an application for PDF documents. Su previously worked at the Industrial Technology Research Institute (ITRI) . In 2018, the company completed its Series B round of fundraising, in which it raised 16 million USD in total. Four global firms, Dattoz Partners (South Korea), WI Harper Group (U.S.), Taiwania Capital (Taiwan), and Golden Asia Fund Mitsubishi UFJ Capital (Japan), made up the Series B investment. Kdan previously raised 5 million USD in its Series A round in 2018.

    Read more →
  • Visual descriptor

    Visual descriptor

    In computer vision, visual descriptors or image descriptors are descriptions of the visual features of the contents in images, videos, or algorithms or applications that produce such descriptions. They describe elementary characteristics such as the shape, the color, the texture or the motion, among others. == Introduction == As a result of the new communication technologies and the massive use of Internet in our society, the amount of audio-visual information available in digital format is increasing considerably. Therefore, it has been necessary to design some systems that allow us to describe the content of several types of multimedia information in order to search and classify them. The audio-visual descriptors are in charge of the contents description. These descriptors have a good knowledge of the objects and events found in a video, image or audio and they allow the quick and efficient searches of the audio-visual content. This system can be compared to the search engines for textual contents. Although it is relatively easy to find text with a computer, it is much more difficult to find concrete audio and video parts. For instance, imagine somebody searching a scene of a happy person. The happiness is a feeling and it is not evident its shape, color and texture description in images. The description of the audio-visual content is not a superficial task and it is essential for the effective use of this type of archives. The standardization system that deals with audio-visual descriptors is the MPEG-7 (Motion Picture Expert Group - 7). == Types == Descriptors are the first step to find out the connection between pixels contained in a digital image and what humans recall after having observed an image or a group of images after some minutes. Visual descriptors are divided in two main groups: General information descriptors: contain low level descriptors which give a description about color, shape, regions, textures and motion. Specific domain information descriptors: give information about objects and events in the scene. A concrete example would be face recognition. === General information descriptors === General information descriptors consist of a set of descriptors that covers different basic and elementary features like: color, texture, shape, motion, location and others. This description is automatically generated by means of signal processing. ==== Color ==== It's the most basic quality of visual content. Five tools are defined to describe color. The three first tools represent the color distribution and the last ones describe the color relation between sequences or group of images: Dominant color descriptor (DCD) Scalable color descriptor (SCD) Color structure descriptor (CSD) Color layout descriptor (CLD) Group of frame (GoF) or group-of-pictures (GoP) ==== Texture ==== It's an important quality in order to describe an image. The texture descriptors characterize image textures or regions. They observe the region homogeneity and the histograms of these region borders. The set of descriptors is formed by: Homogeneous texture descriptor (HTD) Texture browsing descriptor (TBD) Edge histogram descriptor (EHD) ==== Shape ==== It contains important semantic information due to human's ability to recognize objects through their shape. However, this information can only be extracted by means of a segmentation similar to the one that the human visual system implements. Nowadays, such a segmentation system is not available yet, however there exists a serial of algorithms which are considered to be a good approximation. These descriptors describe regions, contours and shapes for 2D images and for 3D volumes. The shape descriptors are the following ones: Region-based shape descriptor (RSD) Contour-based shape descriptor (CSD) 3-D shape descriptor (3-D SD) ==== Motion ==== It's defined by four different descriptors which describe motion in video sequence. Motion is related to the objects motion in the sequence and to the camera motion. This last information is provided by the capture device, whereas the rest is implemented by means of image processing. The descriptor set is the following one: Motion activity descriptor (MAD) Camera motion descriptor (CMD) Motion trajectory descriptor (MTD) Warping and parametric motion descriptor (WMD and PMD) ==== Location ==== Elements location in the image is used to describe elements in the spatial domain. In addition, elements can also be located in the temporal domain: Region locator descriptor (RLD) Spatio temporal locator descriptor (STLD) === Specific domain information descriptors === These descriptors, which give information about objects and events in the scene, are not easily extractable, even more when the extraction is to be automatically done. Nevertheless, they can be manually processed. As mentioned before, face recognition is a concrete example of an application that tries to automatically obtain this information. == Descriptors applications == Among all applications, the most important ones are: Multimedia documents search engines and classifiers. Digital library: visual descriptors allow a very detailed and concrete search of any video or image by means of different search parameters. For instance, the search of films where a known actor appears, the search of videos containing the Everest mountain, etc. Personalized electronic news service. Possibility of an automatic connection to a TV channel broadcasting a soccer match, for example, whenever a player approaches the goal area. Control and filtering of concrete audiovisual content, like violent or pornographic material. Also, authorization for some multimedia content.

    Read more →
  • Salience (neuroscience)

    Salience (neuroscience)

    Salience (also called saliency, from Latin saliō meaning "leap, spring") is the property by which some thing stands out. Salient events are an attentional mechanism by which organisms learn and survive; those organisms can focus their limited perceptual and cognitive resources on the pertinent (that is, salient) subset of the sensory data available to them. Saliency typically arises from contrasts between items and their neighborhood. They might be represented, for example, by a red dot surrounded by white dots, or by a flickering message indicator of an answering machine, or a loud noise in an otherwise quiet environment. Saliency detection is often studied in the context of the visual system, but similar mechanisms operate in other sensory systems. Just what is salient can be influenced by training: for example, for human subjects particular letters can become salient by training. There can be a sequence of necessary events, each of which has to be salient, in turn, in order for successful training in the sequence; the alternative is a failure, as in an illustrated sequence when tying a bowline; in the list of illustrations, even the first illustration is a salient: the rope in the list must cross over, and not under the bitter end of the rope (which can remain fixed, and not free to move); failure to notice that the first salient has not been satisfied means the knot will fail to hold, even when the remaining salient events have been satisfied. When attention deployment is driven by salient stimuli, it is considered to be bottom-up, memory-free, and reactive. Conversely, attention can also be guided by top-down, memory-dependent, or anticipatory mechanisms, such as when looking ahead of moving objects or sideways before crossing streets. Humans and other animals have difficulty paying attention to more than one item simultaneously, so they are faced with the challenge of continuously integrating and prioritizing different bottom-up and top-down influences. == Neuroanatomy == The brain component named the hippocampus helps with the assessment of salience and context by using past memories to filter new incoming stimuli, and placing those that are most important into long term memory. The entorhinal cortex is the pathway into and out of the hippocampus, and is an important part of the brain's memory network; research shows that it is a brain region that suffers damage early on in Alzheimer's disease, one of the effects of which is altered (diminished) salience. The pulvinar nuclei (in the thalamus) modulate physical/perceptual salience in attentional selection. One group of neurons (i.e., D1-type medium spiny neurons) within the nucleus accumbens shell (NAcc shell) assigns appetitive motivational salience ("want" and "desire", which includes a motivational component), aka incentive salience, to rewarding stimuli, while another group of neurons (i.e., D2-type medium spiny neurons) within the NAcc shell assigns aversive motivational salience to aversive stimuli. The primary visual cortex (V1) generates a bottom-up saliency map from visual inputs to guide reflexive attentional shifts or gaze shifts. According to V1 Saliency Hypothesis, the saliency of a location is higher when V1 neurons give higher responses to that location relative to V1 neurons' responses to other visual locations. For example, a unique red item among green items, or a unique vertical bar among horizontal bars, is salient since it evokes higher V1 responses and attracts attention or gaze. The V1 neural responses are sent to the superior colliculus to guide gaze shifts to the salient locations. A fingerprint of the saliency map in V1 is that attention or gaze can be captured by the location of an eye-of-origin singleton in visual inputs, e.g., a bar uniquely shown to the left eye in a background of many other bars shown to the right eye, even when observers cannot tell the difference between the singleton and the background bars. == In psychology == The term is widely used in the study of perception and cognition to refer to any aspect of a stimulus that, for any of many reasons, stands out from the rest. Salience may be the result of emotional, motivational or cognitive factors and is not necessarily associated with physical factors such as intensity, clarity or size. Although salience is thought to determine attentional selection, salience associated with physical factors does not necessarily influence selection of a stimulus. === Salience bias === Salience bias (also referred to as perceptual salience) is a cognitive bias that predisposes individuals to focus on or attend to items, information, or stimuli that are more prominent, visible, or emotionally striking. This is as opposed to stimuli that are unremarkable, or less salient, even though this difference is often irrelevant by objective standards. The American Psychological Association (APA) defines the salience hypothesis as a theory regarding perception where "motivationally significant" information is more readily perceived than information with little or less significant motivational importance. Perceptual salience (salience bias) is linked to the vividness effect, whereby a more pronounced response is produced by a more vivid perception of a stimulus than the mere knowledge of the stimulus. Salience bias assumes that more dynamic, conspicuous, or distinctive stimuli engage attention more than less prominent stimuli, disproportionately impacting decision making, it is a bias which favors more salient information. ==== Application ==== ===== Cognitive Psychology ===== Salience bias, like all other cognitive biases, is an applicable concept to various disciplines. For example, cognitive psychology investigates cognitive functions and processes, such as perception, attention, memory, problem solving, and decision making, all of which could be influenced by salience bias. Salience bias acts to combat cognitive overload by focusing attention on prominent stimuli, which affects how individuals perceive the world as other, less vivid stimuli that could add to or change this perception, are ignored. Human attention gravitates towards novel and relevant stimuli and unconsciously filters out less prominent information, demonstrating salience bias, which influences behavior as human behavior is affected by what is attended to. Behavioral economists Tversky and Kahneman also suggest that the retrieval of instances is influenced by their salience, such as how witnessing or experiencing an event first-hand has a greater impact than when it is less salient, like if it were read about, implying that memory is affected by salience. ===== Language ===== It is also relevant in language understanding and acquisition. Focusing on more salient phenomena allows people to detect language patterns and dialect variations more easily, making dialect categorization more efficient. ===== Social Behavior ===== Furthermore, social behaviors and interactions can also be influenced by perceptual salience. Changes in the perceptual salience of an individual heavily influences their social behavior and subjective experience of their social interactions, confirming a "social salience effect". Social salience relates to how individuals perceive and respond to other people. ===== Behavioral Science ===== The connection between salience bias and other heuristics, like availability and representativeness, links it to the fields of behavioral science and behavioral economics. Salience bias is closely related to the availability heuristic in behavioral economics, based on the influence of information vividness and visibility, such as recency or frequency, on judgements, for example:Accessibility and salience are closely related to availability, and they are important as well. If you have personally experienced a serious earthquake, you're more likely to believe that an earthquake is likely than if you read about it in a weekly magazine. Thus, vivid and easily imagined causes of death (for example, tornadoes) often receive inflated estimates of probability, and less-vivid causes (for example, asthma attacks) receive low estimates, even if they occur with a far greater frequency (here, by a factor of twenty). Timing counts too: more recent events have a greater impact on our behavior, and on our fears, than earlier ones.Humans have bounded rationality, which refers to their limited ability to be rational in decision making, due to a limited capacity to process information and cognitive ability. Heuristics, such as availability, are employed to reduce the complexity of cognitive and social tasks or judgements, in order to decrease the cognitive load that result from bounded rationality. Despite the effectiveness of heuristics in doing so, they are limited by systematic errors that occur, often the result of influencing biases, such as salience. This can lead to misdirected or misinformed judgements, based on an overemphasis or overweighting of

    Read more →
  • MegaHAL

    MegaHAL

    MegaHAL is a computer conversation simulator, or "chatterbot", created by Jason Hutchens. == Background == In 1996, Jason Hutchens entered the Loebner Prize Contest with HeX, a chatterbot based on ELIZA. HeX won the competition that year and took the $2000 prize for having the highest overall score. In 1998, Hutchens again entered the Loebner Prize Contest with his new program, MegaHAL. MegaHAL made its debut in the 1998 Loebner Prize Contest. Like many chatterbots, the intent is for MegaHAL to appear as a human fluent in a natural language. As a user types sentences into MegaHAL, MegaHAL will respond with sentences that are sometimes coherent and at other times complete gibberish. MegaHAL learns as the conversation progresses, remembering new words and sentence structures. It will even learn new ways to substitute words or phrases for other words or phrases. Many would consider conversation simulators like MegaHAL to be a primitive form of artificial intelligence. However, MegaHAL doesn't understand the conversation or even the sentence structure. It generates its conversation based on sequential and mathematical relationships. In the world of conversation simulators, MegaHAL is based on relatively old technology and could be considered primitive. However, its popularity has grown due to its humorous nature; it has been known to respond with twisted or nonsensical statements that are often amusing. == Theory of Operation == MegaHal is based at least in part on a so-called "hidden Markov Model", so that the first thing that Megahal does when it "trains" on a script or text is to build a database of text fragments encompassing every possible subset of perhaps 4, 5, or even 6 consecutive words, so that for example - if MegaHal trains on the Declaration of Independence, then MegaHal will build a database containing text fragments such as "When in the course", "in the course of", "the course of human", "course of human events", "of human events, one", "human events, one people", and so on. Then if Megahal is fed another text, such has "Superman, Yes! It's Superman - he can change the course of mighty rivers, bend steel with his bare hands - and who disguised at Clark Kent …" IT MIGHT induce Megahal to apparently bemuse itself to proffer whether Superman can change the course of human events, or something else altogether - such as some rambling about "when in the course of mighty rivers", and so on. Thus likewise - if a phrase like "the White house said" comes up a lot in some text; then Megahal's ability to switch randomly between different contexts which otherwise share some similarity can result at times in some surprising lucidity, or else it might otherwise seem quite bizarre. == Examples == There are some sentences that MegaHAL generated: CHESS IS A FUN SPORT, WHEN PLAYED WITH SHOT GUNS. and COWS FLY LIKE CLOUDS BUT THEY ARE NEVER COMPLETELY SUCCESSFUL. == Distribution == MegaHAL is distributed under the Unlicense. Its source code can be downloaded from the Github repository.

    Read more →
  • CineAsset

    CineAsset

    CineAsset was a complete mastering software suite by Doremi Labs that could create and playback encrypted (Pro version) and unencrypted DCI compliant packages from virtually any source. CineAsset included a separate "Editor" application for generating Digital Cinema Packages (DCPs). CineAsset Pro added the ability to generate encrypted DCPs and Key Delivery Messages (KDMs) for any encrypted content in the database. It has since been discontinued, along with CineAsset Player. == Features == == Supported formats == === Input === Source: ==== Containers ==== AVI MOV MXF MPG TS WMV M2TS MTS MP4 MKV ==== Video Codecs ==== JPEG2000 ProRes 422 DNxHD® YUV Uncompressed 8-10 bits DIVX® XVID® MPEG4 AVC / H-264 VC-1 MPEG2 ==== Image Sequences ==== BMP TIFF TGA DPX JPG J2C ==== Audio Files ==== WAV MP3 WMA MP2 === Output === Source: ==== JPEG2000 ==== 2D and 3D at up to 4K resolution Bit Rate: 50–250 Mbit/s (500 Mbit/s for frame rates above 30 fps) Speed: Faster than real-time processing when using optional render nodes ==== MPEG2 ==== I-Only or Long GOP 1080p up to 80 Mbit/s ==== H264 ==== 1080p up to 50 Mbit/s ==== VC1 ==== DCP wrapping only (no transcode)

    Read more →
  • BLOOM (language model)

    BLOOM (language model)

    The BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) is an open-access large language model (LLM) released in 2022. It was created by a volunteer-driven research effort to provide a transparently-created alternative to proprietary AI models. With 176 billion parameters, BLOOM is a transformer-based autoregressive model designed to generate text in 46 natural languages and 13 programming languages. The model is distributed under the project's "Responsible AI License". == Development == BLOOM is the main outcome of the BigScience initiative, a one-year-long research workshop. The project was coordinated by Hugging Face using funding from the French government and involved several hundred volunteer researchers and engineers from academia and the private sector. The model was trained between March and July 2022 on the Jean Zay public supercomputer in France, managed by GENCI and IDRIS (CNRS). Unlike GPT-3, BLOOM was trained to be multilingual. The source code is released under the Apache 2.0 license. The model's parameters are released under BigScience's "Responsible AI License" (RAIL), which grants open access and reuse rights but with some usage restrictions. BLOOM was used in the chatbots BLOOMChat and HuggingChat due to its multilingual abilities. BLOOM's training corpus, named ROOTS, combines data extracted from the then-latest version of the web-based OSCAR corpus (38% of ROOTS) and newly collected data extracted from a manually selected and documented list of language data sources. In total, the model was trained on approximately 366 billion (1.6TB) tokens. It was developed using the open-source libraries DeepSpeed Megatron. BigScience then released xP3, a multilingual dataset for LLM supervised learning. It also released BLOOMZ, a variant of BLOOM fine-tuned on xP3 to follow instructions.

    Read more →
  • ELIZA

    ELIZA

    ELIZA is an early natural language processing computer program developed from 1964 to 1967 at MIT by Joseph Weizenbaum. Created to explore communication between humans and machines, ELIZA simulated conversation by using a pattern matching and substitution methodology that gave users an illusion of understanding on the part of the program, but gave no response that could be considered really understanding what was being said by either party. Whereas the ELIZA program itself was written (originally) in MAD-SLIP, the pattern matching directives that contained most of its language capability were provided in separate "scripts", represented in a Lisp-like expression. The most famous script, DOCTOR, simulated a psychotherapist of the Rogerian school (in which the therapist often reflects back the patient's words to the patient), and used rules, dictated in the script, to respond with non-directional questions to user inputs. As such, ELIZA was one of the first chatbots (originally "chatterbots") and one of the first programs capable of attempting the Turing test. Weizenbaum intended the program as a method to explore communication between humans and machines. He was surprised that some people, including his secretary, attributed human-like feelings to the computer program, a phenomenon that came to be called the ELIZA effect. Many academics believed that the program would be able to positively influence the lives of many people, particularly those with psychological issues, and that it could aid doctors working on such patients' treatment. While ELIZA was capable of engaging in discourse, it could not converse with true understanding. However, many early users were convinced of ELIZA's intelligence and understanding, despite Weizenbaum's insistence to the contrary. The original ELIZA source code had been missing since its creation in the 1960s, as it was not common to publish articles that included source code at that time. However, more recently the MAD-SLIP source code was discovered in the MIT archives and published on various platforms, such as the Internet Archive. The source code is of high historical interest since it demonstrates not only the specificity of programming languages and techniques at that time, but also the beginning of software layering and abstraction as a means of achieving sophisticated software programming. == Overview == Joseph Weizenbaum's ELIZA, running the DOCTOR script, created a conversational interaction somewhat similar to what might take place in the office of "a [non-directive] psychotherapist in an initial psychiatric interview" and to "demonstrate that the communication between man and machine was superficial". While ELIZA is best known for acting in the manner of a psychotherapist, the speech patterns are due to the data and instructions supplied by the DOCTOR script. ELIZA itself examined the text for keywords, applied values to said keywords, and transformed the input into an output; the script that ELIZA ran determined the keywords, set the values of keywords, and set the rules of transformation for the output. Weizenbaum chose to make the DOCTOR script in the context of psychotherapy to "sidestep the problem of giving the program a data base of real-world knowledge", allowing it to reflect back the patient's statements to carry the conversation forward. The result was a somewhat intelligent-seeming response that reportedly deceived some early users of the program. Weizenbaum named his program ELIZA after Eliza Doolittle, a working-class character in George Bernard Shaw's Pygmalion (also appearing in the musical My Fair Lady, which was based on the play and was hugely popular at the time). According to Weizenbaum, ELIZA's ability to be "incrementally improved" by various users made it similar to Eliza Doolittle, since Eliza Doolittle was taught to speak with an upper-class accent in Shaw's play. However, unlike the human character in Shaw's play, ELIZA is incapable of learning new patterns of speech or new words through interaction alone. Edits must be made directly to ELIZA's active script in order to change the manner by which the program operates. Weizenbaum first implemented ELIZA in his own SLIP list-processing language, where, depending upon the initial entries by the user, the illusion of human intelligence could appear, or be dispelled through several interchanges. Some of ELIZA's responses were so convincing that Weizenbaum and several others have anecdotes of users becoming emotionally attached to the program, occasionally forgetting that they were conversing with a computer. Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people." In 1966, interactive computing (via a teletype) was new. It was 11 years before the personal computer became familiar to the general public, and three decades before most people encountered attempts at natural language processing in Internet services like Ask.com or PC help systems such as Microsoft Office Clippit. Although those programs included years of research and work, ELIZA remains a milestone because it was the first time a programmer had attempted such a human-machine interaction with the goal of creating the illusion (however brief) of human–human interaction. At the ICCC 1972, ELIZA was brought together with another early artificial-intelligence program named PARRY for a computer-only conversation. While ELIZA was built to speak as a doctor, PARRY was intended to simulate a patient with schizophrenia. == Design and implementation == Weizenbaum originally wrote ELIZA in MAD-SLIP for CTSS on an IBM 7094 as a program to make natural-language conversation possible with a computer. To accomplish this, Weizenbaum identified five "fundamental technical problems" for ELIZA to overcome: the identification of key words, the discovery of a minimal context, the choice of appropriate transformations, the generation of responses in the absence of key words, and the provision of an editing capability for ELIZA scripts. Weizenbaum solved these problems and made ELIZA such that it had no built-in contextual framework or universe of discourse. However, this required ELIZA to have a script of instructions on how to respond to inputs from users. ELIZA starts its process of responding to an input by a user by first examining the text input for a "keyword". A "keyword" is a word designated as important by the acting ELIZA script, which assigns to each keyword a precedence number, or a RANK, designed by the programmer. If such words are found, they are put into a "keystack", with the keyword of the highest RANK at the top. The input sentence is then manipulated and transformed as the rule associated with the keyword of the highest RANK directs. For example, when the DOCTOR script encounters words such as "alike" or "same", it would output a message pertaining to similarity, in this case "In what way?", as these words had high precedence number. This also demonstrates how certain words, as dictated by the script, can be manipulated regardless of contextual considerations, such as switching first-person pronouns and second-person pronouns and vice versa, as these too had high precedence numbers. Such words with high precedence numbers are deemed superior to conversational patterns and are treated independently of contextual patterns. Following the first examination, the next step of the process is to apply an appropriate transformation rule, which includes two parts: the "decomposition rule" and the "reassembly rule". First, the input is reviewed for syntactical patterns in order to establish the minimal context necessary to respond. Using the keywords and other nearby words from the input, different disassembly rules are tested until an appropriate pattern is found. Using the script's rules, the sentence is then "dismantled" and arranged into sections of the component parts as the "decomposition rule for the highest-ranking keyword" dictates. The example that Weizenbaum gives is the input "You are very helpful", which is transformed to "I are very helpful". This is then broken into (1) empty (2) "I" (3) "are" (4) "very helpful". The decomposition rule has broken the phrase into four small segments that contain both the keywords and the information in the sentence. The decomposition rule then designates a particular reassembly rule, or set of reassembly rules, to follow when reconstructing the sentence. The reassembly rule takes the fragments of the input that the decomposition rule had created, rearranges them, and adds in programmed words to create a response. Using Weizenbaum's example previously stated, such a reassembly rule would take the fragments and apply them to the phrase "What makes

    Read more →