AI Content Detection Tools

AI Content Detection Tools — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Speech recognition

    Speech recognition

    Speech recognition (automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT)) is a sub-field of computational linguistics concerned with methods and technologies that translate spoken language into text or other interpretable forms. Speech recognition applications include voice user interfaces, where the user speaks to a device, which "listens" and processes the audio. Common voice applications include interpreting commands for calling, call routing, home automation, and aircraft control. These applications are called direct voice input. Productivity applications include searching audio recordings, creating transcripts, and dictation. Speech recognition can be used to analyse speaker characteristics, such as identifying native language using pronunciation assessment. Voice recognition (speaker identification) refers to identifying the speaker, rather than speech contents. Recognizing the speaker can simplify the task of translating speech in systems trained on a specific person's voice. It can also be used to authenticate the speaker as part of a security process. == History == Applications for speech recognition developed over many decades, with progress accelerated due to advances in deep learning and the use of big data. These advances are reflected in an increase in academic papers, and greater system adoption. Key areas of growth include vocabulary size, more accurate recognition for unfamiliar speakers (speaker independence), and faster processing speed. === Pre-1970 === 1952 – Bell Labs researchers, Stephen Balashek, R. Biddulph, and K. H. Davis, built Audrey for single-speaker digit recognition. Their system located the formants in the power spectrum of each utterance. 1960 – Gunnar Fant developed and published the source–filter model of speech production. 1962 – IBM's 16-word "Shoebox" machine's speech recognition debuted at the 1962 World's Fair. 
1966 – Linear predictive coding, a speech coding method, was proposed by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone. 1969 – Funding at Bell Labs came to a halt for several years after the company's head engineer, John R. Pierce, wrote an open letter criticizing speech recognition research. This defunding lasted until Pierce retired and James L. Flanagan took over. Raj Reddy was the first person to work on continuous speech recognition, as a graduate student at Stanford University in the late 1960s. Previous systems required users to pause after each word. Reddy's system issued spoken commands for playing chess. Around this time, Soviet researchers invented the dynamic time warping (DTW) algorithm and used it to create a recognizer capable of operating on a 200-word vocabulary. DTW processed speech by dividing it into short frames (e.g. 10 ms segments) and treating each frame as a unit. Speaker independence, however, remained unsolved. === 1970–1990 === 1971 – DARPA funded a five-year speech recognition research project, Speech Understanding Research, seeking a minimum vocabulary size of 1,000 words. The project considered speech understanding a key to achieving progress in speech recognition, which was later disproved. BBN, IBM, Carnegie Mellon (CMU), and Stanford Research Institute participated. 1972 – The IEEE Acoustics, Speech, and Signal Processing group held a conference in Newton, Massachusetts. 1976 – The first ICASSP was held in Philadelphia, which became a major venue for publishing on speech recognition. During the late 1960s, Leonard Baum developed the mathematics of Markov chains at the Institute for Defense Analysis. A decade later, at CMU, Raj Reddy's students James Baker and Janet M. Baker began using the hidden Markov model (HMM) for speech recognition. James Baker had learned about HMMs while at the Institute for Defense Analysis. 
HMMs enabled researchers to combine sources of knowledge, such as acoustics, language, and syntax, in a unified probabilistic model. By the mid-1980s, Fred Jelinek's team at IBM created a voice-activated typewriter called Tangora, which could handle a 20,000-word vocabulary. Jelinek's statistical approach placed less emphasis on emulating human brain processes in favor of statistical modelling. (Jelinek's group independently discovered the application of HMMs to speech.) This was controversial among linguists since HMMs are too simplistic to account for many features of human languages. However, the HMM proved to be a highly useful way for modelling speech and replaced dynamic time warping as the dominant speech recognition algorithm in the 1980s. 1982 – Dragon Systems, founded by James and Janet M. Baker, was one of IBM's few competitors. === Practical speech recognition === The 1980s also saw the introduction of the n-gram language model. 1987 – The back-off model enabled language models to use multiple-length n-grams, and CSELT used HMM to recognize languages (in software and hardware, e.g. RIPAC). At the end of the DARPA program in 1976, the best computer available to researchers was the PDP-10 with 4 MB of RAM. It could take up to 100 minutes to decode 30 seconds of speech. Practical products included: 1984 – the Apricot Portable was released with up to 4096 words support, of which only 64 could be held in RAM at a time. 1987 – a recognizer from Kurzweil Applied Intelligence 1990 – Dragon Dictate, a consumer product released in 1990. AT&T deployed the Voice Recognition Call Processing service in 1992 to route telephone calls without a human operator. The technology was developed by Lawrence Rabiner and others at Bell Labs. By the early 1990s, the vocabulary of the typical commercial speech recognition system had exceeded the average human vocabulary. Reddy's former student, Xuedong Huang, developed the Sphinx-II system at CMU. 
Sphinx-II was the first to do speaker-independent, large vocabulary, continuous speech recognition, and it won DARPA's 1992 evaluation. Handling continuous speech with a large vocabulary was a major milestone. Huang later founded the speech recognition group at Microsoft in 1993. Reddy's student Kai-Fu Lee joined Apple, where, in 1992, he helped develop the Casper speech interface prototype. Lernout & Hauspie, a Belgium-based speech recognition company, acquired other companies, including Kurzweil Applied Intelligence in 1997 and Dragon Systems in 2000. L&H was used in Windows XP. L&H was an industry leader until an accounting scandal destroyed it in 2001. L&H speech technology was bought by ScanSoft, which became Nuance in 2005. Apple licensed Nuance software for its digital assistant Siri. ==== 2000s ==== In the 2000s, DARPA sponsored two speech recognition programs: Effective Affordable Reusable Speech-to-Text (EARS) in 2002, followed by Global Autonomous Language Exploitation (GALE) in 2005. Four teams participated in EARS: IBM; a team led by BBN with LIMSI and the University of Pittsburgh; Cambridge University; and a team composed of ICSI, SRI, and the University of Washington. EARS funded the collection of the Switchboard telephone speech corpus, which contained 260 hours of recorded conversations from over 500 speakers. The GALE program focused on Arabic and Mandarin broadcast news. Google's first effort at speech recognition came in 2007 after recruiting Nuance researchers. Its first product, GOOG-411, was a telephone-based directory service. Since at least 2006, the U.S. National Security Agency has employed keyword spotting, allowing analysts to index large volumes of recorded conversations and identify speech containing "interesting" keywords. Other government research programs focused on intelligence applications, such as DARPA's EARS program and IARPA's Babel program. 
In the early 2000s, speech recognition was dominated by hidden Markov models combined with feed-forward artificial neural networks (ANN). Later, speech recognition was taken over by long short-term memory (LSTM), a recurrent neural network (RNN) published by Sepp Hochreiter & Jürgen Schmidhuber in 1997. LSTM RNNs avoid the vanishing gradient problem and can learn "Very Deep Learning" tasks that require memories of events that happened thousands of discrete time steps earlier, which is important for speech. Around 2007, LSTMs trained with Connectionist Temporal Classification (CTC) began to outperform earlier approaches. In 2015, Google reported a 49 percent error-rate reduction in its speech recognition via CTC-trained LSTM. Transformers, a type of neural network based solely on attention, were adopted in computer vision and language modelling, and then in speech recognition. Deep feed-forward (non-recurrent) networks for acoustic modelling were introduced in 2009 by Geoffrey Hinton and his students at the University of Toronto, and by Li Deng and colleagues at Microsoft Research. In contrast to the prior incremental improvements, deep learning decreased error rates by 30%. Both shallow and deep forms (e.g., recurrent nets) of ANNs had been explored since the 1980s. Howev

  • Deep Learning Indaba

    Deep Learning Indaba

    The Deep Learning Indaba is an annual conference and educational event that aims to strengthen machine learning and artificial intelligence (AI) capacity across Africa. Launched in 2017, it brings together students, researchers, industry practitioners, and policymakers from across the African continent. == History == The Deep Learning Indaba began in 2017 at the University of the Witwatersrand with over 300 participants from 23 African countries, offering tutorials in advanced AI topics and featuring notable speakers like Nando de Freitas. In 2018, it expanded to 650 delegates at Stellenbosch University, introducing parallel sessions to encourage collaboration. The 2019 edition in Nairobi, Kenya, reflected further growth, with increasing sponsorship and support from major tech companies like Google and Microsoft. === Deep Learning IndabaX ===

  • Cleverbot

    Cleverbot

    Cleverbot is a chatterbot web application. It was created by British AI scientist Rollo Carpenter and launched in October 2008. It was preceded by Jabberwacky, a chatbot project that began in 1988 and went online in 1997. In its first decade, Cleverbot held several thousand conversations with Carpenter and his associates. Since launching on the web, the number of conversations held has exceeded 150 million. Besides the web application, Cleverbot is also available as an iOS, Android, and Windows Phone app. == Operation == Cleverbot's responses are not pre-programmed because it learns from human input: Humans type into the box below the Cleverbot logo and the system finds all keywords or an exact phrase matching the input. After searching through its saved conversations, it responds to the input by finding how a human responded to that input when it was asked, in part or in full, by Cleverbot. Cleverbot participated in a formal Turing test at the 2011 Techniche festival at the Indian Institute of Technology Guwahati on 3 September 2011. Out of the 1334 votes cast, Cleverbot was judged to be 59.3% human, compared to the rating of 63.3% human achieved by human participants. A score of 50.05% or higher is often considered to be a passing grade. The software running for the event had to handle just 1 or 2 simultaneous requests, whereas online Cleverbot is usually talking to around 10,000 to 50,000 people at once. == Developments == Cleverbot is constantly growing in data size at the rate of 4 to 7 million interactions per day. Updates to the software have been mostly behind the scenes. In 2014, Cleverbot was upgraded to use GPU serving techniques. Unlike Eliza, the program does not respond in a fixed way, instead choosing its responses heuristically using fuzzy logic, the whole of the conversation being compared to the millions that have taken place before. Cleverbot now uses over 279 million interactions, about 3-4% of the data it has already accumulated. 
The developers of Cleverbot are attempting to build a new version using machine learning techniques. An app that uses the Cleverscript engine to play a game of 20 Questions has been launched under the name Clevernator. Unlike other such games, the player asks the questions and it is the role of the AI to understand, and answer factually. An app that allows owners to create and talk to their own small Cleverbot-like AI has been launched, called Cleverme! for Apple products. == In popular culture == Cleverbot received media attention after being featured in the popular 2010 creepypasta ARG web serial Ben Drowned by Alexander D. Hall. In early 2017, a Twitch stream of two Google Home devices modified to talk to each other using Cleverbot garnered over 700,000 visitors and over 30,000 peak concurrent viewers.

  • Arabic Ontology

    Arabic Ontology

    Arabic Ontology is a website offering linguistic ontology services for the Arabic language which can be used like the online site WordNet. Users can use Arabic Ontology to classify or clarify the concepts and meanings of Arabic terms. == Ontology Structure == The ontology structure (i.e., data model) is similar to WordNet's structure. Each concept in the database is given a unique concept identifier (URI), informally described by a gloss, and lexicalized by one or more synonymous lemma terms. Each term-concept pair is called a sense, and is given a SenseID. A set of senses is called a synset. Concepts and senses are described by further attributes, such as era and area, to specify example usage and ontological analysis. Semantic relations are defined between concepts. Some important entities are included in the ontology, such as individual countries and bodies of water. These individuals are given separate IndividualIDs and linked with their concepts through the InstanceOf relation. == Mappings to other resources == Concepts in the Arabic Ontology are mapped to synsets in WordNet, as well as to BFO and DOLCE. Terms used in the Arabic Ontology are mapped to lemmas in the LDC's SAMA database. == Applications == Arabic Ontology can be used in many application domains, such as: information retrieval, to enrich queries (e.g., in search engines) and improve the quality of the results, i.e. meaningful search rather than string-matching search; machine translation and word-sense disambiguation, by finding the exact mapping of concepts across languages, especially since the Arabic Ontology is also mapped to WordNet; data integration and interoperability, in which the Arabic Ontology can be used as a semantic reference to link databases and information systems; and the Semantic Web and Web 3.0, by using the Arabic Ontology as a semantic reference to disambiguate the meanings used in websites; among many other applications. 
== URLs Design == The URLs in the Arabic Ontology are designed according to the W3C's Best Practices for Publishing Linked Data, as described in the following URL schemes. This also allows one to explore the whole database as a graph. Ontology concept: Each concept in the Arabic Ontology has a ConceptID and can be accessed using: https://{domain}/concept/{ConceptID | Term}. In the case of a term, all the concepts that this term lexicalizes are retrieved. In the case of a ConceptID, the concept and its direct subtypes are retrieved, e.g. https://ontology.birzeit.edu/concept/293198. Semantic relations: Relationships between concepts can be accessed using these schemes: (i) the URL https://{domain}/concept/{RelationName}/{ConceptID} allows retrieval of relationships among ontology concepts; (ii) the URL https://{domain}/lexicalconcept/{RelationName}/{lexicalConceptID} allows retrieval of relations between lexical concepts. For example, https://ontology.birzeit.edu/concept/instances/293121 retrieves the instances of the concept 293121. The relations that are currently used in the database are: {subtypes, type, instances, parts, related, similar, equivalent}.
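
The URL schemes above can be sketched as a pair of small helper functions. This is only an illustration: the function names `concept_url` and `relation_url` are hypothetical, the domain is taken from the examples in the text, and percent-encoding of terms is an assumption rather than documented behaviour of the site.

```python
from urllib.parse import quote

# Domain taken from the example URLs in the text.
DOMAIN = "ontology.birzeit.edu"

# Relation names listed in the text as currently used in the database.
RELATIONS = {"subtypes", "type", "instances", "parts",
             "related", "similar", "equivalent"}


def concept_url(concept_id_or_term, domain=DOMAIN):
    """Build https://{domain}/concept/{ConceptID | Term}.

    Percent-encoding the term is an assumption made for safety.
    """
    return f"https://{domain}/concept/{quote(str(concept_id_or_term), safe='')}"


def relation_url(relation, concept_id, lexical=False, domain=DOMAIN):
    """Build the relation URL for an ontology concept or a lexical concept."""
    if relation not in RELATIONS:
        raise ValueError(f"unknown relation: {relation!r}")
    kind = "lexicalconcept" if lexical else "concept"
    return f"https://{domain}/{kind}/{relation}/{concept_id}"


print(concept_url(293198))
print(relation_url("instances", 293121))
```

The two printed URLs reproduce the examples quoted in the text; no request is sent, so the sketch says nothing about the response format.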

  • Acoustic model

    Acoustic model

    An acoustic model is used in automatic speech recognition to represent the relationship between an audio signal and the phonemes or other linguistic units that make up speech. The model is learned from a set of audio recordings and their corresponding transcripts. It is created by taking audio recordings of speech, and their text transcriptions, and using software to create statistical representations of the sounds that make up each word. == Background == Modern speech recognition systems use both an acoustic model and a language model to represent the statistical properties of speech. The acoustic model models the relationship between the audio signal and the phonetic units in the language. The language model is responsible for modeling the word sequences in the language. These two models are combined to get the top-ranked word sequences corresponding to a given audio segment. Most modern speech recognition systems operate on the audio in small chunks known as frames with an approximate duration of 10ms per frame. The raw audio signal from each frame can be transformed by applying the mel-frequency cepstrum. The coefficients from this transformation are commonly known as mel-frequency cepstral coefficients (MFCCs) and are used as an input to the acoustic model along with other features. Recently, the use of convolutional neural networks has led to major improvements in acoustic modeling. == Speech audio characteristics == Audio can be encoded at different sampling rates (i.e. samples per second – the most common being: 8, 16, 32, 44.1, 48, and 96 kHz), and different bits per sample (the most common being: 8-bits, 16-bits, 24-bits or 32-bits). Speech recognition engines work best if the acoustic model they use was trained with speech audio which was recorded at the same sampling rate/bits per sample as the speech being recognized. 
== Telephony-based speech recognition == The limiting factor for telephony-based speech recognition is the bandwidth at which speech can be transmitted. For example, a standard land-line telephone only has a bandwidth of 64 kbit/s at a sampling rate of 8 kHz and 8 bits per sample (8000 samples per second × 8 bits per sample = 64,000 bit/s). Therefore, for telephony-based speech recognition, acoustic models should be trained with 8 kHz/8-bit speech audio files. In the case of voice over IP, the codec determines the sampling rate/bits per sample of speech transmission. Codecs with a higher sampling rate/bits per sample for speech transmission (which improve the sound quality) necessitate acoustic models trained with audio data that matches that sampling rate/bits per sample. == Desktop-based speech recognition == For speech recognition on a standard desktop PC, the limiting factor is the sound card. Most sound cards today can record at sampling rates of between 16 and 48 kHz, with 8 to 16 bits per sample, and play back at up to 96 kHz. As a general rule, a speech recognition engine works better with acoustic models trained with speech audio data recorded at higher sampling rates/bits per sample. But using audio with too high a sampling rate/bits per sample can slow the recognition engine down. A compromise is needed. Thus for desktop speech recognition, the current standard is acoustic models trained with speech audio data recorded at a sampling rate of 16 kHz with 16 bits per sample.
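
The arithmetic behind these figures can be checked with a short sketch. The function names are hypothetical, the audio is assumed to be uncompressed single-channel PCM, and frames are assumed not to overlap, which is a simplification.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate in bit/s of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def samples_per_frame(sample_rate_hz, frame_ms=10):
    """Number of samples in one frame, assuming non-overlapping frames."""
    return sample_rate_hz * frame_ms // 1000


# Telephony: 8 kHz, 8 bits per sample.
print(pcm_bit_rate(8000, 8))      # 64000
print(samples_per_frame(8000))    # 80

# Desktop: 16 kHz, 16 bits per sample.
print(pcm_bit_rate(16000, 16))    # 256000
print(samples_per_frame(16000))   # 160
```

Moving from the telephony format to the desktop format quadruples the raw bit rate and doubles the number of samples in each 10 ms frame, which illustrates the trade-off between audio quality and processing cost described above.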

  • Ontology learning

    Ontology learning

    Ontology learning (ontology extraction, ontology augmentation generation, ontology generation, or ontology acquisition) is the automatic or semi-automatic creation of ontologies, including extracting the corresponding domain's terms and the relationships between the concepts that these terms represent from a corpus of natural language text, and encoding them with an ontology language for easy retrieval. As building ontologies manually is extremely labor-intensive and time-consuming, there is great motivation to automate the process. Typically, the process starts by extracting terms and concepts or noun phrases from plain text using linguistic processors such as part-of-speech tagging and phrase chunking. Then statistical or symbolic techniques are used to extract relation signatures, often based on pattern-based or definition-based hypernym extraction techniques. == Procedure == Ontology learning (OL) is used to (semi-)automatically extract whole ontologies from natural language text. The process is usually split into the following eight tasks, which are not all necessarily applied in every ontology learning system. === Domain terminology extraction === During the domain terminology extraction step, domain-specific terms are extracted, which are used in the following step (concept discovery) to derive concepts. Relevant terms can be determined, e.g., by calculation of the TF/IDF values or by application of the C-value / NC-value method. The resulting list of terms has to be filtered by a domain expert. In the subsequent step, similarly to coreference resolution in information extraction, the OL system determines synonyms, because they share the same meaning and therefore correspond to the same concept. The most common methods therefore are clustering and the application of statistical similarity measures. 
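
As an illustration of the terminology extraction step, the following minimal sketch ranks candidate terms by TF-IDF. It uses one common formulation (relative term frequency times the natural logarithm of the inverse document frequency); the function names and the toy documents are assumptions for illustration, and a real system would still pass the resulting list to a domain expert for filtering.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document (documents are token lists)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(documents, k=3):
    """Rank terms by their best score in any document (ties broken alphabetically)."""
    best = {}
    for doc_scores in tf_idf(documents):
        for term, score in doc_scores.items():
            best[term] = max(best.get(term, 0.0), score)
    return sorted(best, key=lambda t: (-best[t], t))[:k]


docs = [
    ["the", "ontology", "maps", "the", "concept"],
    ["the", "parser", "tags", "the", "text"],
]
print(top_terms(docs, k=3))
```

A word such as "the", which occurs in every document, receives a score of zero and therefore drops out of the candidate list.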
=== Concept discovery === In the concept discovery step, terms are grouped into meaning-bearing units, which correspond to an abstraction of the world and therefore to concepts. The grouped terms are the domain-specific terms and their synonyms, which were identified in the domain terminology extraction step. === Concept hierarchy derivation === In the concept hierarchy derivation step, the OL system tries to arrange the extracted concepts in a taxonomic structure. This is mostly achieved with unsupervised hierarchical clustering methods. Because the result of such methods is often noisy, a supervision step, e.g., user evaluation, is added. A further method for deriving a concept hierarchy uses patterns that indicate a sub- or supersumption relationship. Patterns like “X, that is a Y” or “X is a Y” indicate that X is a subclass of Y. Such patterns can be analyzed efficiently, but they often occur too infrequently to extract enough sub- or supersumption relationships. Instead, bootstrapping methods are developed, which learn these patterns automatically and therefore ensure broader coverage. === Learning of non-taxonomic relations === In the learning of non-taxonomic relations step, relationships are extracted that do not express any sub- or supersumption. Such relationships are, e.g., works-for or located-in. There are two common approaches to solve this subtask. The first is based upon the extraction of anonymous associations, which are named appropriately in a second step. The second approach extracts verbs, which indicate a relationship between entities, represented by the surrounding words. The results of both approaches need to be evaluated by an ontologist to ensure accuracy. === Rule discovery === During rule discovery, axioms (formal descriptions of concepts) are generated for the extracted concepts. 
This can be achieved, e.g., by analyzing the syntactic structure of a natural language definition and applying transformation rules to the resulting dependency tree. The result of this process is a list of axioms, which is afterwards combined into a concept description. This output is then evaluated by an ontologist. === Ontology population === At this step, the ontology is augmented with instances of concepts and properties. For the augmentation with instances of concepts, methods based on the matching of lexico-syntactic patterns are used. Instances of properties are added through the application of bootstrapping methods, which collect relation tuples. === Concept hierarchy extension === In this step, the OL system tries to extend the taxonomic structure of an existing ontology with further concepts. This can be performed in a supervised manner with a trained classifier or in an unsupervised manner via the application of similarity measures. === Frame and Event detection === During frame/event detection, the OL system tries to extract complex relationships from text, e.g., who departed from where to what place and when. Approaches range from applying SVM with kernel methods to semantic role labeling (SRL) to deep semantic parsing techniques. == Tools == DOG4DAG (Dresden Ontology Generator for Directed Acyclic Graphs) is an ontology generation plugin for Protégé 4.1 and OBO-Edit 2.1. It allows for term generation, sibling generation, definition generation, and relationship induction. Integrated into Protégé 4.1 and OBO-Edit 2.1, DOG4DAG allows ontology extension for all common ontology formats (e.g., OWL and OBO). It is limited largely to EBI and Bio Portal lookup service extensions.
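
The pattern-based approach to concept hierarchy derivation described earlier ("X is a Y", "X, that is a Y") can be sketched with a regular expression. This is a deliberately naive illustration: it assumes single-word X and Y, the function name is hypothetical, and, as the text notes for such methods, the output is noisy and would need review.

```python
import re

# Matches "X is a Y", "X is an Y" and "X, that is a Y" for single-word X and Y.
SUBCLASS_PATTERN = re.compile(r"\b(\w+)(?:, that)? is an? (\w+)\b", re.IGNORECASE)


def extract_subclass_pairs(text):
    """Return (subclass, superclass) candidates found in the text, lowercased."""
    return [
        (match.group(1).lower(), match.group(2).lower())
        for match in SUBCLASS_PATTERN.finditer(text)
    ]


sample = "A sparrow is a bird. Python, that is a language, is popular."
print(extract_subclass_pairs(sample))
# [('sparrow', 'bird'), ('python', 'language')]
```

A sentence such as "This is a test" would also match and yield a spurious pair, which is one reason why hand-written patterns are usually combined with filtering or replaced by bootstrapped patterns.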

  • Ghana Post GPS

    Ghana Post GPS

    GhanaPostGPS is a web and smartphone application, sponsored by the government of Ghana and developed by Vokacom, that provides a digital address and postal code for every 5-square-metre location in Ghana. The digital address is a composite of the postcode (region, district and area code) plus a unique address. GhanaPostGPS is the first digital addressing system created by the government of Ghana. A GhanaPostGPS address is a mandatory requirement for obtaining the National Identification Card and other services.

  • Zero-shot learning

    Zero-shot learning

    Zero-shot learning (ZSL) is a problem setup in deep learning where, at test time, a learner observes samples from classes which were not observed during training, and needs to predict the class that they belong to. The name is a play on words based on the earlier concept of one-shot learning, in which classification can be learned from only one, or a few, examples. Zero-shot methods generally work by associating observed and non-observed classes through some form of auxiliary information, which encodes observable distinguishing properties of objects. For example, given a set of images of animals to be classified, along with auxiliary textual descriptions of what animals look like, an artificial intelligence model which has been trained to recognize horses, but has never been given a zebra, can still recognize a zebra when it also knows that zebras look like striped horses. This problem is widely studied in computer vision, natural language processing, and machine perception. == Background and history == The first paper on zero-shot learning in natural language processing was presented in 2008 by Chang, Ratinov, Roth, and Srikumar at AAAI'08, but the name given to the learning paradigm there was dataless classification. The first paper on zero-shot learning in computer vision appeared at the same conference, under the name zero-data learning. The term zero-shot learning itself first appeared in the literature in a 2009 paper from Palatucci, Hinton, Pomerleau, and Mitchell at NIPS'09. This terminology was repeated later in another computer vision paper and the term zero-shot learning caught on, as a take-off on one-shot learning, which had been introduced in computer vision years earlier. In computer vision, zero-shot learning models learn parameters for seen classes along with their class representations and rely on representational similarity among class labels so that, during inference, instances can be classified into new classes. 
In natural language processing, the key technical direction developed builds on the ability to "understand the labels"—represent the labels in the same semantic space as that of the documents to be classified. This supports the classification of a single example without observing any annotated data, the purest form of zero-shot classification. The original paper made use of the Explicit Semantic Analysis (ESA) representation but later papers made use of other representations, including dense representations. This approach was also extended to multilingual domains, fine entity typing and other problems. Moreover, beyond relying solely on representations, the computational approach has been extended to depend on transfer from other tasks, such as textual entailment and question answering. The original paper also points out that, beyond the ability to classify a single example, when a collection of examples is given, with the assumption that they come from the same distribution, it is possible to bootstrap the performance in a semi-supervised like manner (or transductive learning). Unlike standard generalization in machine learning, where classifiers are expected to correctly classify new samples to classes they have already observed during training, in ZSL, no samples from the classes have been given during training the classifier. It can therefore be viewed as an extreme case of domain adaptation. == Prerequisite information for zero-shot classes == Naturally, some form of auxiliary information has to be given about these zero-shot classes, and this type of information can be of several types. Learning with attributes: classes are accompanied by pre-defined structured description. For example, for bird descriptions, this could include "red head", "long beak". These attributes are often organized in a structured compositional way, and taking that structure into account improves learning. 
While this approach was used mostly in computer vision, there are also some examples of it in natural language processing. Learning from textual description: as pointed out above, this has been the key direction pursued in natural language processing. Here class labels are taken to have a meaning and are often augmented with definitions or free-text natural-language descriptions. This could include, for example, a Wikipedia description of the class. Class-class similarity: here, classes are embedded in a continuous space. A zero-shot classifier can predict that a sample corresponds to some position in that space, and the nearest embedded class is used as the predicted class, even if no such samples were observed during training. == Generalized zero-shot learning == The above ZSL setup assumes that at test time, only zero-shot samples are given, namely, samples from new unseen classes. In generalized zero-shot learning, samples from both new and known classes may appear at test time. This poses new challenges for classifiers at test time, because it is very challenging to estimate whether a given sample is new or known. Some approaches to handle this include: a gating module, which is first trained to decide whether a given sample comes from a new class or from an old one, and then, at inference time, outputs either a hard decision or a soft probabilistic decision; and a generative module, which is trained to generate feature representations of the unseen classes, after which a standard classifier can be trained on samples from all classes, seen and unseen. == Domains of application == Zero-shot learning has been applied to the following fields: image classification, semantic segmentation, image generation, object detection, natural language processing, computational biology, and abstract reasoning.
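
The class-class similarity idea described above can be sketched as a nearest-neighbour lookup in an embedding space. Everything concrete here is an assumption for illustration: the three-dimensional attribute space, the hand-written class embeddings, and the function names. The trained model that would map an input to a point in the space is not shown.

```python
import math

# Hypothetical attribute space: [has_stripes, horse_shaped, has_wings].
CLASS_EMBEDDINGS = {
    "horse": [0.0, 1.0, 0.0],   # seen during training
    "bird": [0.0, 0.0, 1.0],    # seen during training
    "zebra": [1.0, 1.0, 0.0],   # unseen: known only from its description
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_class(sample_embedding, class_embeddings=CLASS_EMBEDDINGS):
    """Return the class whose embedding is most similar to the sample."""
    return max(class_embeddings,
               key=lambda name: cosine(sample_embedding, class_embeddings[name]))


# Suppose a model predicts "striped and horse-shaped" for a new image.
print(nearest_class([0.9, 0.8, 0.1]))  # zebra
```

The sample is assigned to "zebra" even though no zebra example was used in training, because the predicted position lies closest to the zebra embedding derived from auxiliary information.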

  • INaturalist

    INaturalist

    iNaturalist is an American 501(c)(3) nonprofit social network of naturalists, citizen scientists, and biologists built on the concept of mapping and sharing observations of biodiversity across the globe. iNaturalist may be accessed via its website or from its mobile applications. iNaturalist includes an automated species identification tool, and users further assist each other in identifying organisms from photographs and sound recordings. As of 5 August 2025, iNaturalist users had contributed nearly 300 million observations of plants, animals, fungi, and other organisms worldwide, and 400,000 users were active in the previous 30 days. iNaturalist serves as an important resource of open data for biodiversity research, conservation, and education, describing itself as "an online social network of people sharing biodiversity information to help each other learn about nature." It is the primary application for crowd-sourced biodiversity data in places such as Mexico, southern Africa, and Australia, and the project has been called "a standard-bearer for natural history mobile applications." Most of iNaturalist's software is open source. It has contributed to over 4,000 research papers and is widely used by scientists, land managers, and conservationists worldwide. The platform has also been active in the discovery of new species and rediscovery of species previously assumed to be extinct. == History == iNaturalist began in 2008 as a UC Berkeley School of Information Master's final project of Nate Agrin, Jessica Kline, and Ken-ichi Ueda. Agrin and Ueda continued work on the site with Sean McGregor, a web developer. In 2011, Ueda began collaboration with Scott Loarie, a research fellow at Stanford University and lecturer at UC Berkeley. Ueda and Loarie are the current co-directors of iNaturalist.org. The organization merged with the California Academy of Sciences on 24 April 2014. 
In 2017, iNaturalist became a joint initiative between the California Academy of Sciences and the National Geographic Society. With these collaborations and the growing popularity of the site since 2012, the number of participants and observations has roughly doubled each year. In 2014, iNaturalist reached 1 million observations. As of October 2023, there were 181 million observations (163 million verifiable). On 11 July 2023 iNaturalist announced its status as a newly independent 501(c)(3) nonprofit organization. === Google AI controversy === On 9 June 2025 Google announced that iNaturalist would be part of its "Generative AI Accelerator". This announcement, paired with the initial lack of information on the iNaturalist site, led to an outcry from many iNaturalist users in the blog comments and forum. Users worried about the consequences for the environment, volunteer engagement, and reliability, and raised questions about decision-making within iNaturalist, while some saw the backlash as a sign that people want to resist 'corrosive technologies'. PZ Myers, a biology professor who uses iNaturalist in his teaching, published an article on his website Pharyngula stating that "any decision that drives people away and replaces them with a hallucinating bot is a bad decision". == Platforms == Users can interact with iNaturalist in the following ways: through the iNaturalist.org website, through two mobile apps: iNaturalist (iOS/Android) and Seek by iNaturalist (iOS/Android), or through partner organizations such as the Global Biodiversity Information Facility (GBIF) website. On the iNaturalist.org website, visitors can search the public dataset and interact with other people adding observations and identifications. The website provides tools for registered users to add, identify, and discuss observations, write journal posts, explore information about species, create project pages to recruit participation, and coordinate work on their topics of interest. 
On the iNaturalist mobile app, users can create and share nature observations to the online dataset, explore observations both nearby and around the world, and learn about different species. Seek by iNaturalist, a separate app marketed to families, requires no online account registration and all observations may remain private. Seek incorporates features of gamification, such as providing a list of nearby organisms to find and encouraging the collection of badges and participation in challenges. Seek was initially released in the spring of 2018. == Observations == The iNaturalist platform is based on crowdsourcing of observations and identifications. An iNaturalist observation records a person's encounter with an individual organism at a particular time and place. An iNaturalist observation may also record evidence of an organism, such as animal tracks, nests, or scat. The scope of iNaturalist excludes natural but inert subjects such as geologic or hydrologic features. Users typically upload photos as evidence of their findings, though audio recordings are also accepted, and such evidence is not a strict requirement. Users may share observation locations publicly, "obscure" them to display a less precise location or make the locations completely private. iNaturalist users can add identifications to each other's observations in order to confirm or improve the identification of the observation. Observations are classified as "Casual", "Needs ID" (needs identification), or "Research Grade" based on the quality of the data provided and the community identification process. Any quality of data can be downloaded from iNaturalist and "Research Grade" observations are often incorporated into other online databases such as the Global Biodiversity Information Facility and the Atlas of Living Australia. 
=== Automated species identification === In addition to observations being identified by others in the community, iNaturalist includes an automated species identification tool, first released in 2017. Images can be identified via a computer vision model that has been trained on the large database of observations on iNaturalist. Multiple species suggestions are typically provided, with the suggestion that the software considers most likely at the top of the list. A broader taxon such as a genus or family is commonly provided if the model is unsure of the species. The model is retrained once or twice a year, and the threshold for species included in the training set has changed over time. It can be difficult for the model to guess correctly if the species in question is infrequently observed or hard to identify from images alone, or if the image submitted has poor lighting, is blurry, or contains multiple subjects. In February 2023, iNaturalist released v2.1 of its computer vision model, which was trained on a new source model that performed significantly better than the previous models trained using a different source model. In April 2025, iNaturalist released an updated app for iOS, renaming the original version "iNaturalist Classic". == Projects == Users have created and contributed to tens of thousands of different projects on iNaturalist. The platform is commonly used to record observations during bioblitzes, which are biological surveying events that attempt to record all the species that occur within a designated area; bioblitzes are also a specific project type on iNaturalist. Other project types include collections of observations by location or taxon, or projects documenting specific types of observations such as animal tracks and signs, the spread of invasive species, roadkill, fishing catches, or the discovery of new species. 
In 2011, iNaturalist was used as a platform to power the Global Amphibian and Global Reptile BioBlitzes, in which observations were used to help monitor the occurrence and distribution of the world's reptile and amphibian species. The US National Park Service partnered with iNaturalist to record observations from the 2016 National Parks BioBlitz. That project exceeded 100,000 observations in August 2016. In 2017, the United Nations Environment Programme teamed up with iNaturalist to celebrate World Environment Day. In 2022, Reef Ecologic teamed up with iNaturalist to celebrate World Oceans Day. === City Nature Challenge === In 2016, Lila Higgins from the Natural History Museum of Los Angeles County and Alison Young from the California Academy of Sciences co-founded the City Nature Challenge (CNC). In the first City Nature Challenge, naturalists in Los Angeles and the San Francisco Bay Area documented over 20,000 observations with the iNaturalist platform. In 2017, the CNC expanded to 16 cities across the United States and collected over 125,000 observations of wildlife in 5 days. The CNC expanded to a global audience in 2018, with 68 cities from 19 countries participating and some cities using community science platforms other than iNaturalist to participate. In 4 days, over 17,000 people cataloged over 440,000 nature observations in urban regions around the world. In 2019, the CNC once again expanded, with 35,000 parti

  • BLOOM (language model)

    BLOOM (language model)

    The BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) is an open-access large language model (LLM) released in 2022. It was created by a volunteer-driven research effort to provide a transparently created alternative to proprietary AI models. With 176 billion parameters, BLOOM is a transformer-based autoregressive model designed to generate text in 46 natural languages and 13 programming languages. The model is distributed under the project's "Responsible AI License". == Development == BLOOM is the main outcome of the BigScience initiative, a one-year-long research workshop. The project was coordinated by Hugging Face using funding from the French government and involved several hundred volunteer researchers and engineers from academia and the private sector. The model was trained between March and July 2022 on the Jean Zay public supercomputer in France, managed by GENCI and IDRIS (CNRS). Unlike GPT-3, BLOOM was trained to be multilingual. The source code is released under the Apache 2.0 license. The model's parameters are released under BigScience's "Responsible AI License" (RAIL), which grants open access and reuse rights but with some usage restrictions. BLOOM was used in the chatbots BLOOMChat and HuggingChat due to its multilingual abilities. BLOOM's training corpus, named ROOTS, combines data extracted from the then-latest version of the web-based OSCAR corpus (38% of ROOTS) and newly collected data extracted from a manually selected and documented list of language data sources. In total, the model was trained on approximately 366 billion tokens (1.6 TB). It was developed using the open-source libraries DeepSpeed and Megatron. BigScience then released xP3, a multilingual dataset for LLM supervised learning. It also released BLOOMZ, a variant of BLOOM fine-tuned on xP3 to follow instructions.

  • Language identification

    Language identification

    In natural language processing, language identification or language guessing is the problem of determining which natural language a given content is in. Computational approaches to this problem view it as a special case of text categorization, solved with various statistical methods. == Overview == === Logical approach === A common non-statistical intuitive approach (though highly uncertain) is to look for common letter combinations, or distinctive diacritics or punctuation. === Statistical approach === There are several statistical approaches to language identification. An older statistical method by Grefenstette was based on the frequency of short n-grams, which are often function morphemes. For example, "ing" is more common in English than in French, while the sequence "que" is more common in French. Given a new page found on the Web, one counts the number of occurrences of each such short sequence and picks the language whose frequency table it matches the most. One technique is to compare the compressibility of the text to the compressibility of texts in a set of known languages. This approach is known as mutual information based distance measure. The same technique can also be used to empirically construct family trees of languages which closely correspond to the trees constructed using historical methods. Mutual information based distance measure is essentially equivalent to more conventional model-based methods and is not generally considered to be either novel or better than simpler techniques. Another technique, as described by Cavnar and Trenkle (1994) and Dunning (1994) is to create a language n-gram model from a "training text" for each of the languages. These models can be based on characters (Cavnar and Trenkle) or encoded bytes (Dunning); in the latter, language identification and character encoding detection are integrated. 
Then, for any piece of text needing to be identified, a similar model is made, and that model is compared to each stored language model. The most likely language is the one whose model is most similar to the model of the text needing to be identified. This approach can be problematic when the input text is in a language for which there is no model. In that case, the method may return another, "most similar" language as its result. Also problematic for any approach are pieces of input text that are composed of several languages, as is common on the Web. As of 2025, a commonly used baseline method is the fastText library, which offers classification accuracy comparable to that of deep learning techniques while being much faster. == Identifying similar languages == One of the great bottlenecks of language identification systems is distinguishing between closely related languages. Similar languages like Bulgarian and Macedonian or Indonesian and Malay present significant lexical and structural overlap, making it challenging for systems to discriminate between them. In 2014, the DSL shared task was organized, providing a dataset (Tan et al., 2014) containing 13 different languages (and language varieties) in six language groups: Group A (Bosnian, Croatian, Serbian), Group B (Indonesian, Malaysian), Group C (Czech, Slovak), Group D (Brazilian Portuguese, European Portuguese), Group E (Peninsular Spanish, Argentine Spanish), Group F (American English, British English). The best system reached over 95% accuracy (Goutte et al., 2014). Results of the DSL shared task are described in Zampieri et al. 2014. == Software == Apache OpenNLP includes a character n-gram based statistical detector and comes with a model that can distinguish 103 languages. Apache Tika contains a language detector for 18 languages.
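To make the n-gram profile comparison described above more concrete, here is a toy sketch in the spirit of Cavnar and Trenkle's rank-order ("out-of-place") comparison. It is a simplification under stated assumptions: only character trigrams are used, the two one-sentence "training texts" are made up for illustration, and the profile size cap `MAX_PROFILE` is an arbitrary choice. A practical system would train on far more text per language.

```python
from collections import Counter

MAX_PROFILE = 300  # assumed cap on profile size; also used as the penalty for unseen n-grams


def ngram_profile(text, n=3):
    """Map each character n-gram to its frequency rank (0 = most frequent)."""
    padded = f" {text.lower()} "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    ranked = [gram for gram, _ in counts.most_common(MAX_PROFILE)]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(text_profile, lang_profile):
    """Sum of rank differences; n-grams missing from the language get the maximum penalty."""
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else MAX_PROFILE
        for gram, rank in text_profile.items()
    )


def identify(text, lang_profiles):
    """Return the language whose stored profile is closest to the text's profile."""
    text_profile = ngram_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(text_profile, lang_profiles[lang]))


# Made-up miniature training texts (illustration only).
TRAINING = {
    "en": "the quick brown fox jumps over the lazy dog while the cat sleeps in the warm morning sun",
    "fr": "le renard brun saute par dessus le chien paresseux pendant que le chat dort au soleil du matin",
}
PROFILES = {lang: ngram_profile(text) for lang, text in TRAINING.items()}

print(identify("the lazy dog", PROFILES))
print(identify("le chien paresseux", PROFILES))
```

The sketch also shows the weakness noted above: `identify` always returns one of the stored languages, so text in a language without a profile is still assigned to the "most similar" one.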

  • Document mosaicing

    Document mosaicing

    Document mosaicing is a process that stitches multiple overlapping snapshot images of a document together to produce one large, high-resolution composite. The document is slid by hand under a stationary, over-the-desk camera until all parts of the document have been captured within the camera's field of view. As the document is slid under the camera, its motion is coarsely tracked by the vision system. The document is periodically snapshotted such that successive snapshots overlap by about 50%. The system then finds the overlapping pairs and stitches them together repeatedly until all pairs are stitched into a single image of the document. Document mosaicing can be divided into four main processes: tracking, feature detecting, correspondences establishing, and images mosaicing. == Tracking (simple correlation process) == In this process, the motion of the document slid under the camera is coarsely tracked by the system using a simple correlation process. In the first frame of snapshots, a small patch is extracted from the center of the image as a correlation template. The correlation is computed over a region of the next frame four times the size of the patch. The peak in the correlation function indicates the motion of the paper. The template is resampled from this frame and the tracking continues until the template reaches the edge of the document. After the template reaches the edge of the document, another snapshot is taken and the tracking process is repeated until the whole document is imaged. The snapshots are stored in an ordered list to facilitate pairing the overlapping images in later processes. == Feature detecting for efficient matching == Feature detection is the process of finding the transformation that aligns one image with another. There are two main approaches for feature detection. 
Feature-based approach: Motion parameters are estimated from point correspondences. This approach is suitable when there is a plentiful supply of stable and detectable features. Featureless approach: When the motion between the two images is small, the motion parameters are estimated using optical flow. When the motion between the two images is large, the motion parameters are estimated using generalised cross-correlation. However, this approach is computationally expensive. Each image is segmented into a hierarchy of columns, lines, and words to match the organised sets of features across images. Skew angle estimation and the finding of columns, lines and words are examples of feature detection operations. === Skew angle estimation === First, the angle that the rows of text make with the image raster lines (the skew angle) is estimated. It is assumed to lie in the range of ±20°. A small patch of text in the image is selected at random and then rotated within the range of ±20° until the variance of the pixel intensities of the patch summed along the raster lines is maximised. To ensure that the skew angle found is accurate, the document mosaic system performs this calculation on many image patches and derives the final estimate as the average of the individual angles, weighted by the variance of the pixel intensities of each patch. === Columns, lines and words finding === In this operation, the de-skewed document is segmented into a hierarchy of columns, lines and words. Sensitivity to illumination and page coloration can be removed by applying a Sobel operator to the de-skewed image and thresholding the output to obtain a binary gradient, de-skewed image. The operation can be roughly separated into three steps: column segmentation, line segmentation and word segmentation. Columns are segmented from the binary gradient, de-skewed image by summing pixels vertically. 
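The column segmentation step just described, summing pixels vertically and splitting where the sums drop, might be sketched as follows. This is an illustrative sketch only: the image is assumed to be a list of rows of 0/1 pixels, the function names are hypothetical, and the `threshold` parameter is an assumption (the text does not say how gaps between columns are detected).

```python
def column_profile(binary_image):
    """Sum the pixels of each image column (a vertical projection profile)."""
    return [sum(column) for column in zip(*binary_image)]


def segment_runs(profile, threshold=0):
    """Return (start, end) pairs, end exclusive, of runs where the profile exceeds threshold."""
    runs = []
    start = None
    for index, value in enumerate(profile):
        if value > threshold and start is None:
            start = index                      # a run of "ink" begins
        elif value <= threshold and start is not None:
            runs.append((start, index))        # the run ends at a gap
            start = None
    if start is not None:
        runs.append((start, len(profile)))     # run reaches the image edge
    return runs


# Tiny binary gradient image: two blocks of "ink" separated by an empty gap.
image = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
]
profile = column_profile(image)
print(profile)
print(segment_runs(profile))
```

Applying the same `segment_runs` helper to row sums instead of column sums gives the horizontal variant of the procedure.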
The baselines of each row are segmented in the same way as in the column segmentation process, but horizontally. Finally, individual words are segmented by applying the vertical process to each segmented row. These segmentations are important because the document mosaic is created by matching the lower right corners of words in pairs of overlapping images. Moreover, the segmentation operation reliably organizes the list of images in the context of a hierarchy of rows and columns. The segmentation operation involves a considerable amount of summing in the binary gradient, de-skewed images, which is done by constructing a matrix of partial sums whose elements are given by $p_{ij}=\sum_{u=1}^{i}\sum_{v=1}^{j}b_{uv}$, where $b_{uv}$ denotes a pixel of the binary gradient, de-skewed image. The matrix of partial sums is calculated in one pass through the binary gradient, de-skewed image. The sum over any rectangular region then follows from four entries of the matrix: $\sum_{u=u_{1}}^{u_{2}}\sum_{v=v_{1}}^{v_{2}}b_{uv}=p_{u_{2},v_{2}}+p_{u_{1}-1,v_{1}-1}-p_{u_{1}-1,v_{2}}-p_{u_{2},v_{1}-1}$, where entries with a zero index are taken to be 0. == Correspondences establishing == The two images are now organized in a hierarchy of linked lists with the following structure: image = list of columns; column = list of rows; row = list of words; word = length (in pixels). At the bottom of the structure, the length of each word is recorded for establishing correspondence between the two images, so that the search can be restricted to corresponding structures containing groups of words with matching lengths. === Seed match finding === Seed match finding is done by comparing each row in image1 with each row in image2. The two rows are compared with each other word by word. If the lengths (in pixels) of the two words (one from image1 and one from image2) and of their immediate neighbours agree within a predefined tolerance threshold (5 pixels, for example), then the words are assumed to match. 
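The word-level test just described can be sketched as follows. The sketch assumes each row is represented simply as a list of word lengths in pixels, and it makes one assumption the text does not settle: a neighbour that is missing in either row (at the start or end of a row) is skipped rather than counted as a mismatch. The function names are hypothetical.

```python
def words_match(row1, i, row2, j, tolerance=5):
    """Compare word i of row1 with word j of row2, plus their immediate neighbours.

    Rows are lists of word lengths in pixels. Neighbours that do not exist in
    both rows are skipped (an assumption made for this sketch).
    """
    for offset in (-1, 0, 1):
        a, b = i + offset, j + offset
        if not (0 <= a < len(row1) and 0 <= b < len(row2)):
            continue  # neighbour missing in one of the rows
        if abs(row1[a] - row2[b]) > tolerance:
            return False
    return True


def count_word_matches(row1, row2, tolerance=5):
    """Count the words of row1 that match at least one word of row2."""
    return sum(
        any(words_match(row1, i, row2, j, tolerance) for j in range(len(row2)))
        for i in range(len(row1))
    )


# row2 shows the same text line shifted by one word, with a few pixels of noise.
row1 = [100, 62, 25, 80, 33]
row2 = [61, 27, 79, 35, 50]
print(count_word_matches(row1, row2))
```

Comparing word lengths instead of pixel content keeps the search cheap, since only rows with compatible length patterns need closer inspection.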
A row from each image is assumed to match if there are three or more word matches between the two rows. The seed match finding operation is terminated when two pairs of consecutive row matches are found. === Match list building === After the seed match finding operation is finished, the next process is to build the match list to generate the corresponding points of the two images. This is done by searching for matching pairs of rows, moving away from the seed row. == Images mosaicing == Given the list of corresponding points of the two images, the next process is to find the transformation of the overlapping portion of the images. Assuming a pinhole camera model, the transformation between pixels (u, v) of image 1 and pixels (u′, v′) of image 2 is a plane-to-plane projectivity: $\begin{bmatrix}su'\\sv'\\s\end{bmatrix}=\begin{bmatrix}p_{11}&p_{12}&p_{13}\\p_{21}&p_{22}&p_{23}\\p_{31}&p_{32}&1\end{bmatrix}\begin{bmatrix}u\\v\\1\end{bmatrix}$ (Eq. 1). The parameters of the projectivity are found from four pairs of matching points. The RANSAC regression technique is used to reject outlying matches and estimate the projectivity from the remaining good matches. The projectivity is fine-tuned using correlation at the corners of the overlapping portion to obtain four correspondences to sub-pixel accuracy. Image1 is then transformed into image2's coordinate system using Eq. 1. The typical result of the process is shown in Figure 5. === Coping with many images === Finally, the whole page composition is built up by mapping all the images into the coordinate system of an "anchor" image, which is normally the one nearest the page center. The transformations to the anchor frame are calculated by concatenating the pair-wise transformations found earlier. The raw document mosaic is shown in Figure 6. 
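As a small illustration of the last two steps, the sketch below applies a 3×3 projectivity to a pixel (dividing by the scale factor s, as in Eq. 1) and concatenates two pair-wise transformations by matrix multiplication. The matrices and function names are hypothetical examples; estimating the matrix itself from matched points (for instance with RANSAC) is not shown.

```python
def apply_projectivity(P, u, v):
    """Map pixel (u, v) through the 3x3 projectivity P and divide out the scale s."""
    su = P[0][0] * u + P[0][1] * v + P[0][2]
    sv = P[1][0] * u + P[1][1] * v + P[1][2]
    s = P[2][0] * u + P[2][1] * v + P[2][2]
    return su / s, sv / s


def concatenate(P_first, P_second):
    """Return the matrix that applies P_first and then P_second (P_second times P_first)."""
    return [
        [sum(P_second[i][k] * P_first[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


# Hypothetical pair-wise transformations: image3 -> image2, then image2 -> anchor.
image3_to_image2 = [[1, 0, 10], [0, 1, 0], [0, 0, 1]]   # shift 10 pixels in u
image2_to_anchor = [[1, 0, 0], [0, 1, 5], [0, 0, 1]]    # shift 5 pixels in v
image3_to_anchor = concatenate(image3_to_image2, image2_to_anchor)

print(apply_projectivity(image3_to_anchor, 1, 2))

# A matrix with a non-zero p31 term, showing the division by s.
perspective = [[1, 0, 0], [0, 1, 0], [0.5, 0, 1]]
print(apply_projectivity(perspective, 2, 4))
```

Because the transformations compose by matrix multiplication, each image needs only one matrix to the anchor frame, regardless of how many pair-wise steps separate it from the anchor.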
However, non-consecutive images may also overlap. This problem can be solved by building hierarchical sub-mosaics. As shown in Figure 7, image1 and image2 are registered, as are image3 and image4, creating two sub-mosaics. These two sub-mosaics are later stitched together in another mosaicing process. == Applied areas == Document mosaicing can be applied in various areas, such as: text segmentation of images of documents, document recognition, interaction with paper on the digital desk, video mosaics for virtual environments, and image registration techniques.

  • Digital art

    Digital art

    Digital art, or the digital arts, is artistic work that uses digital technology as part of the creative or presentational process. It can also refer to computational art that uses and engages with digital media. Since the 1960s, various names have been used to describe digital art, including computer art, electronic art, multimedia art, and new media art. Digital art includes pieces stored on physical media, such as digital paintings, as well as digital galleries on websites. Digital art also extends to the field of visual computing. == History == In the early 1960s, John Whitney developed the first computer-generated art using mathematical operations. In 1963, Ivan Sutherland invented the first user-interactive computer-graphics interface, known as Sketchpad. Between 1974 and 1977, Salvador Dalí created two big canvases of Gala Contemplating the Mediterranean Sea which at a distance of 20 meters is transformed into the portrait of Abraham Lincoln (Homage to Rothko), as well as prints of Lincoln in Dalivision, based on a portrait of Abraham Lincoln processed on a computer by Leon Harmon and published in "The Recognition of Faces". The technique is similar to what later became known as photographic mosaics. Andy Warhol created digital art using an Amiga when the computer was publicly introduced at Lincoln Center in July 1985. An image of Debbie Harry was captured in monochrome from a video camera and digitized into a graphics program called ProPaint. Warhol manipulated the image by adding color using flood fills. == Art made for digital media == Artwork that is highly computational, is presented through digital media, and explicitly engages with digital technologies is categorized as "art made for digital media". This differs from art using digital tools, which incorporates digital technology in the creation process but may exist outside the digital world. 
Digital art historian Christiane Paul writes that it "is highly problematic to classify all art that makes use of digital technologies somewhere in its production and dissemination process as digital art since it makes it almost impossible to arrive at any unifying statement about the art form". == Art that uses digital tools == Digital art can be purely computer-generated (such as fractals and algorithmic art) or taken from other sources, such as a scanned photograph or an image drawn using vector graphics software using a mouse or graphics tablet. Artworks are considered digital paintings when created similarly to non-digital paintings but using software on a computer platform and digitally outputting the resulting image as painted on canvas. Despite differing viewpoints on digital technology's impact on the arts, a consensus exists within the digital art community about its significant contribution to expanding the creative domain, i.e., that it has greatly broadened the creative opportunities available to professional and non-professional artists alike. == Art theorists and art historians == Notable art theorists and historians in this field include: Oliver Grau, Jon Ippolito, Christiane Paul, Frank Popper, Jasia Reichardt, Mario Costa, Christine Buci-Glucksmann, Dominique Moulon, Roy Ascott, Catherine Perret, Margot Lovejoy, Edmond Couchot, Tina Rivers Ryan, Fred Forest and Edward A. Shanken. === Digital painting === Digital painting is either a physical painting made with the use of digital electronics and spray paint robotics within the digital art fine art context or pictorial art imagery made with pixels on a computer screen that mimics artworks from the traditional histories of painting and illustration. === Artificial intelligence art === Artists have used artificial intelligence to create artwork since at least the 1960s. 
Since generative adversarial networks (GANs) were designed in 2014, some artists have used them to create artwork. A GAN is a machine learning framework that allows two "algorithms" to compete with each other and iterate. It can be used to generate pictures that have visual effects similar to traditional fine art. The essential idea of image generators is that people can use text descriptions to let AI convert their text into visual picture content. Anyone can turn their language into a painting through a picture generator. == Digital art education == Digital art education has become more common with the advancement of digital hardware and software. This ranges from hardware such as graphics tablets, styluses, tablets, 3D scanners, virtual reality headsets, and digital cameras to software such as digital art software, 3D modeling software, 3D rendering, digital sculpting, 2D graphics software, digital painting, 3D terrain generation, 2D animation software, 3D animation software, raster graphics editors, vector graphics editors, mathematical art software, and video editing software. == Scholarship and archives == In addition to the creation of original art, research methods that utilize AI have been developed to quantitatively analyze digital art collections. This has been made possible by the large-scale digitization of artwork in the past few decades. Although the main goal of digitization was to allow for accessibility and exploration of these collections, the use of AI in analyzing them has brought about new research perspectives. Two computational methods, close reading and distant viewing, are the typical approaches used to analyze digitized art. Close reading focuses on specific visual aspects of one piece. Some tasks performed by machines in close reading methods include computational artist authentication and analysis of brushstrokes or texture properties. 
In contrast, through distant viewing methods, the similarity across an entire collection for a specific feature can be statistically visualized. Common tasks relating to this method include automatic classification, object detection, multimodal tasks, knowledge discovery in art history, and computational aesthetics. Whereas distant viewing includes the analysis of large collections, close reading involves one piece of artwork. Whilst 2D and 3D digital art is beneficial because it allows the preservation of history that would otherwise have been destroyed by events like natural disasters and war, there is the issue of who should own these 3D scans, i.e., who should own the digital copyrights. === Computer demos === Computer demos are computer programs, usually non-interactive, that produce audiovisual presentations. They are a novel form of art, which emerged as a consequence of the home computer revolution in the early 1980s. In the classification of digital art, they can be best described as real-time procedurally generated animated audio-visuals. This form of art does not concentrate only on the aesthetics of the final presentation, but also on the complexities and skills involved in creating the presentation. As such, it can be fully enjoyed only by persons with a relatively high level of knowledge of the relevant computer technologies. As Hua Jin and Jie Yang put it, using computer-aided design software to present the class content in art design teaching "is not to advocate computer-aided design instead of hand-drawn performance, but to make it serve the profession earlier through a more reasonable course arrangement." On the other hand, many of the created pieces of art are primarily aesthetic or amusing, and those can be enjoyed by the general public. === Digital installation art === Digital installation art constitutes a broad field of artistic practices and a variety of forms. 
Some resemble video installations, especially large-scale works involving projections and live video capture. By using projection techniques that enhance an audience's impression of sensory envelopment, many digital installations attempt to create immersive environments. Others go even further and attempt to facilitate a complete immersion in virtual realms. This type of installation is generally site-specific, scalable, and without fixed dimensionality, meaning it can be reconfigured to accommodate different presentation spaces. Scott Snibbe's "Boundary Functions" is an example of augmented reality digital installation art, which responds to people who enter the installation by drawing lines between people, indicating their personal space. Noah Wardrip-Fruin's "Screen" (2003) utilizes a Cave Automatic Virtual Environment (CAVE) to create an interactive, text-based digital experience that engages the viewer in a multi-sensory interaction. === Internet art and net.art === Internet art is digital art that uses the specific characteristics of the Internet and is exhibited on the Internet. The term "internet art" is included by "net art" for which artists assume that network will be refreshed through history. So the term "post-internet art" is used to exclude artworks outside of the internet media. A representative example is Protocols for Achievements, which is a digital photo frame that confronts the aestheti

  • Lenny (chatbot)

    Lenny (chatbot)

    Lenny is a chatbot designed to scam-bait telemarketers, scammers, and other unwanted callers using pre-recorded messages. == Background == Telemarketers may be perceived by some as annoying and wasting people's time, and some deliberately attempt to scam or defraud people. In April 2018, statistics published by YouMail estimated that the United States received over three billion robocalls that month. Attempts to block the callers have been hindered by Caller ID spoofing. == Features == The bot was written in 2011, and development was taken over two years later by an Alberta-based programmer known as "Mango". It is driven by sixteen pre-recorded audio clips, spoken in a soft and slow Australian accent in the manner of an elderly man. The bot's original creator stated on Reddit that in building the character he asked himself the question "What would be a telemarketer's worst nightmare?" His answer was a lonely old man who is up for a chat, proud of his family and unable to focus on the telemarketer's goal. There is no speech recognition or artificial intelligence, and the bot's software is simple and straightforward. The first four clips are played in sequence to grab the telemarketer's interest and get them to begin their sales pitch to Lenny; the remaining twelve are then played in sequence on a loop until the telemarketer hangs up. The program waits for a gap of 1.5 seconds of silence before playing the next audio clip, to simulate natural breaks in the conversation. The messages are purposefully vague and open-ended so they can be applied to as many conversations as possible. They include references to Lenny's children, the state of the economy, and being interrupted by some ducks outside. According to research into the bot, around 75% of callers realise they are talking to a computer program within two minutes; however, some calls have lasted around an hour. 
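The clip-sequencing logic described above is simple enough to sketch in a few lines. This is not Lenny's actual code, only an illustration of the stated behaviour (four introductory clips, then a twelve-clip loop, with a 1.5-second silence threshold); the function and constant names are hypothetical, and audio playback and silence detection themselves are left out.

```python
INTRO_CLIPS = 4            # played once, in order
TOTAL_CLIPS = 16           # sixteen pre-recorded clips in total
SILENCE_GAP_SECONDS = 1.5  # pause length that triggers the next clip


def clip_index(turn):
    """Return the 0-based clip to play on the given 0-based turn of the call."""
    if turn < INTRO_CLIPS:
        return turn
    loop_length = TOTAL_CLIPS - INTRO_CLIPS
    return INTRO_CLIPS + (turn - INTRO_CLIPS) % loop_length


def should_respond(silence_seconds):
    """True once the caller has been silent long enough for the next clip."""
    return silence_seconds >= SILENCE_GAP_SECONDS


# The first 18 turns: clips 0-15 once, then the loop restarts at clip 4.
print([clip_index(turn) for turn in range(18)])
```

No understanding of the caller's words is needed: the only input is the length of the current silence.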
== Distribution == Though other chatbots had been developed earlier, Lenny was the first to be released for free on a public server where anyone could access it. Recordings of conversations with the bot are widely shared online on websites such as Reddit and YouTube. Though "Mango" only intended Lenny to be used against dishonest telemarketers, such as scammers, he does not mind it being used against callers who are merely annoying. The bot has also been used against political campaigners, such as a supporter of Pierre Poilievre in the 2015 Canadian federal election.

  • Contract management software

    Contract management software

    Contract management software constitutes software and associated data management used to support contract management, contract lifecycle management, and contractor management on projects in the procurement of goods and services. It may be used together with project management software. == History == Historically, contract management was seen as a "paper-intensive" process. Early steps in the early 2000s, as reported by the Aberdeen Group, required extensive data conversion work to enable documents to be handled electronically. With the adoption of the European Union's General Data Protection Regulation (GDPR) in 2016, companies needed to take additional steps with regard to contract management. Each entity responsible for data was obliged to sign data processing agreements (DPAs) with the various vendors that process personal data on its behalf. DPAs need to be regularly reviewed, adjusted and renewed, which adds an extra agreement for each such vendor, or at least an extra DPA addendum to each agreement. By 2018, Ardent Partners' research had found that software for automating contract management activities was being used more extensively among major companies or businesses with "Best-in-Class" procurement teams. Contract management process automation was found to be closely linked with more effective internal business collaboration, standardization and risk management. == Advantages and key functions == Using contract management software can have multiple benefits compared to manually managing paper contracts. This software can help keep track of multiple activities and can have features for automating administration, ensuring compliance, monitoring risk, running reports and triggering alerts. In addition to these types of features, contract management software systems provide a centralized repository for employees to quickly access all contracts worldwide in one place.
Contract management software is produced by many companies, working on a range of scales and offering varying degrees of customizability. Basic functions should include the ability to store contract documents, track changes to contract documents, search documents by particular criteria, send key date alerts, and report on required aspects of the contract. Other functions include managing a new contract request, capturing related data, following a document through a review and approval process, and collecting digital signatures. Contract management software may also be an aid to project portfolio management and spend analysis, and may also monitor KPIs. Leading contract management software provides contract visibility, monitoring, and compliance to automate and streamline the contract lifecycle process. Contract management software which uses artificial intelligence (AI) can identify contract types based on pattern recognition. AI contracting software trains its algorithms on a set of contract data to recognize patterns and extract variables such as clauses, dates, and parties. It also offers simple prediction capabilities, by sorting through a large volume of contracts and flagging individual contracts based on specified criteria. AI software can also read contracts in multiple formats and languages, extract contract data, and provide analytics. It can reduce the risk of human error in contract drafting and review. A centralized repository provides a critical advantage by allowing all contract documents to be stored in one location. Having contracts stored in multiple locations can delay and interrupt the contracting process. == Contract risk management software (CRMS) for capital projects == Very large enterprises, such as capital expenditure (capex) projects, involve multiple parties and high risk and uncertainty. They are unlike traditional operating contracts in that they are subject to shared deadlines in unique situations.
As the complexity of these unique projects increases, the relationships between parties become more important. This requires contract management software, or contract risk management software (CRMS), to become more dynamic and responsive. The terms of these capex contracts necessarily involve assumptions at the start of the process and are likely to change over the lifetime of the project. For this reason, CRMS must be capable of recording one single instance of agreed changes to contract terms and incorporating these changes in an auditable and legally robust way. With multiple decision makers involved, CRMS should also make accountability more transparent and enable faster decisions about variation proposals.
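One way to picture "a single instance of agreed changes" that stays auditable is an append-only amendment log in which each entry carries a hash of the entry before it, so later tampering can be detected. The sketch below is purely illustrative: the class names `Amendment` and `ContractLedger`, the field names, and the use of SHA-256 hash chaining are assumptions for the example and are not taken from any particular CRMS product. Hash chaining also addresses only auditability, not legal robustness.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Amendment:
    """One agreed change to a contract term (hypothetical record layout)."""
    term: str
    new_value: str
    approved_by: str
    prev_hash: str
    digest: str


def _digest(term: str, new_value: str, approved_by: str, prev_hash: str) -> str:
    """Hash the entry contents together with the previous entry's digest."""
    payload = json.dumps([term, new_value, approved_by, prev_hash])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class ContractLedger:
    """Current contract terms plus an append-only history of agreed changes."""
    terms: Dict[str, str]
    history: List[Amendment] = field(default_factory=list)

    def amend(self, term: str, new_value: str, approved_by: str) -> Amendment:
        """Record an agreed change once and apply it to the current terms."""
        prev_hash = self.history[-1].digest if self.history else ""
        entry = Amendment(term, new_value, approved_by, prev_hash,
                          _digest(term, new_value, approved_by, prev_hash))
        self.history.append(entry)
        self.terms[term] = new_value
        return entry

    def verify(self) -> bool:
        """Recompute the hash chain; False means the history was altered."""
        prev_hash = ""
        for e in self.history:
            expected = _digest(e.term, e.new_value, e.approved_by, prev_hash)
            if e.prev_hash != prev_hash or e.digest != expected:
                return False
            prev_hash = e.digest
        return True


# Usage with made-up terms and approver names.
ledger = ContractLedger({"completion_date": "2025-06-30"})
ledger.amend("completion_date", "2025-09-30", approved_by="owner_rep")
ledger.amend("retention_percent", "5", approved_by="contractor_rep")
print(ledger.terms["completion_date"], ledger.verify())
```

Recording who approved each change alongside the change itself is what makes accountability visible when several decision makers are involved; a production system would also need timestamps, access control and durable storage, which are omitted here.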
