AI For Students Writing

AI For Students Writing — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • AUTINDEX

    AUTINDEX is a commercial text mining software package based on sophisticated linguistics. A result of research in information extraction, it is a product of the Institute of Applied Information Sciences (IAI), a non-profit institute that has researched and developed language technology since its foundation in 1985. IAI is affiliated with Saarland University in Saarbrücken, Germany. AUTINDEX grew out of a number of research projects funded by the EU (Project BINDEX), by the Deutsche Forschungsgemeinschaft and by the German Ministry for Economy; the latter include the projects LinSearch and WISSMER. The basic functionality of AUTINDEX is the extraction of key words from a document to represent the semantics of the document. Ideally the system is integrated with a thesaurus that defines the standardised terms to be used for key word assignment. AUTINDEX is used in library applications (e.g. integrated in dandelon.com) as well as in high-quality (expert) information systems, and in document management and content management environments. AUTINDEX comes with a number of additional software components: an integration with Apache Solr / Lucene that provides a complete information retrieval environment, a classification and categorisation system based on machine learning software that assigns domains to the document, and a system for searching with semantically similar terms that are collected in so-called tag clouds.

  • Quantification (machine learning)

    In machine learning, quantification (variously called learning to quantify, supervised prevalence estimation, or class prior estimation) is the task of using supervised learning to train models (quantifiers) that estimate the relative frequencies (also known as prevalence values) of the classes of interest in a sample of unlabelled data items. For instance, in a sample of 100,000 unlabelled tweets known to express opinions about a certain political candidate, a quantifier may be used to estimate the percentage of these tweets which belong to class 'Positive' (i.e., which manifest a positive stance towards this candidate), and to do the same for classes 'Neutral' and 'Negative'. Quantification may also be viewed as the task of training predictors that estimate a (discrete) probability distribution, i.e., that generate a predicted distribution that approximates the unknown true distribution of the items across the classes of interest. Quantification is different from classification, since the goal of classification is to predict the class labels of individual data items, while the goal of quantification is to predict the class prevalence values of sets of data items. Quantification is also different from regression, since in regression the training data items have real-valued labels, while in quantification the training data items have class labels. It has been shown in multiple research works that performing quantification by classifying all unlabelled instances and then counting the instances that have been attributed to each class (the 'classify and count' method) usually leads to suboptimal quantification accuracy. This suboptimality may be seen as a direct consequence of 'Vapnik's principle', which states: If you possess a restricted amount of information for solving some problem, try to solve the problem directly and never solve a more general problem as an intermediate step.
    It is possible that the available information is sufficient for a direct solution but is insufficient for solving a more general intermediate problem. In our case, the problem to be solved directly is quantification, while the more general intermediate problem is classification. As a result of the suboptimality of the 'classify and count' method, quantification has evolved as a task in its own right, different (in goals, methods, techniques, and evaluation measures) from classification.

    == Quantification tasks ==

    === Quantification tasks according to the set of classes ===

    The main variants of quantification, according to the characteristics of the set of classes used, are:

      • Binary quantification, corresponding to the case in which there are only n = 2 classes and each data item belongs to exactly one of them;
      • Single-label multiclass quantification, corresponding to the case in which there are n > 2 classes and each data item belongs to exactly one of them;
      • Multi-label multiclass quantification, corresponding to the case in which there are n ≥ 2 classes and each data item can belong to zero, one, or several classes at the same time;
      • Ordinal quantification, corresponding to the single-label multiclass case in which a total order is defined on the set of classes;
      • Regression quantification, a task which stands to 'standard' quantification as regression stands to classification. Strictly speaking, this task is not a quantification task as defined above (since the individual items do not have class labels but are labelled by real values), but has enough commonalities with other quantification tasks to be considered one of them.

    Most known quantification methods address the binary case or the single-label multiclass case, and only a few of them address the multi-label, ordinal, and regression cases. Binary-only methods include the Mixture Model (MM) method, the HDy method, SVM(KLD), and SVM(Q).
    Methods that can deal with both the binary case and the single-label multiclass case include probabilistic classify and count (PCC), adjusted classify and count (ACC), probabilistic adjusted classify and count (PACC), the Saerens-Latinne-Decaestecker EM-based method (SLD), and KDEy. Methods for multi-label quantification include regression-based quantification (RQ) and label powerset-based quantification (LPQ). Methods for the ordinal case include ordinal versions of the above-mentioned ACC, PACC, and SLD methods, and ordinal versions of the above-mentioned HDy method. Methods for the regression case include Regress and splice and Adjusted regress and sum.

    === Quantification tasks according to the type of data ===

    Several subtasks of quantification may be identified according to the type of data involved. Examples of such tasks are:

      • Quantification of networked data. This task consists of performing quantification when the datapoints are members of a relation, i.e., are interlinked. As such, this task is a close relative of collective classification.
      • Quantification over time. This task consists of performing quantification on sets that become available in a temporal sequence, i.e., as a data stream, and finds application in contexts in which class prevalence values must be monitored over time.

    == Evaluation measures for quantification ==

    Several evaluation measures can be used for evaluating the error of a quantification method. Since quantification consists of generating a predicted probability distribution that estimates a true probability distribution, these evaluation measures are ones that compare two probability distributions. Most evaluation measures for quantification belong to the class of divergences.
    Evaluation measures for binary quantification, single-label multiclass quantification, and multi-label quantification are:

      • Absolute Error
      • Squared Error
      • Relative Absolute Error
      • Kullback–Leibler divergence
      • Pearson Divergence

    Evaluation measures for ordinal quantification are:

      • Normalized Match Distance (a particular case of the Earth Mover's Distance)
      • Root Normalized Order-Aware Distance

    == Applications ==

    Quantification is of special interest in fields such as the social sciences, epidemiology, market research, allocating resources, and ecological modelling, since these fields are inherently concerned with aggregate data. However, quantification is also useful as a building block for solving other downstream tasks, such as improving the accuracy of classifiers on out-of-distribution data, measuring classifier bias and ranker bias, and estimating the accuracy of classifiers on out-of-distribution data.

    == Resources ==

      • LQ 2021: the 1st International Workshop on Learning to Quantify
      • LQ 2022: the 2nd International Workshop on Learning to Quantify
      • LQ 2023: the 3rd International Workshop on Learning to Quantify
      • LQ 2024: the 4th International Workshop on Learning to Quantify
      • LQ 2025: the 5th International Workshop on Learning to Quantify
      • LeQua 2022: the 1st Data Challenge on Learning to Quantify
      • LeQua 2024: the 2nd Data Challenge on Learning to Quantify
      • QuaPy: An open-source Python-based software library for quantification
      • QuantificationLib: A Python library for quantification and prevalence estimation
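
    The gap between 'classify and count' (CC) and a corrected estimate can be illustrated with a minimal Python sketch. It assumes a binary task and a hypothetical classifier whose true positive and false positive rates are measured on held-out labelled data. The correction is the commonly cited adjusted classify and count (ACC) formula, and the error measure is the absolute error listed above. All function names and the toy data are illustrative assumptions; none of this is taken from QuaPy or QuantificationLib.

```python
def classify_and_count(predictions):
    """CC: fraction of items the classifier labelled positive (1)."""
    return sum(predictions) / len(predictions)


def estimate_rates(true_labels, predictions):
    """Estimate (tpr, fpr) of the classifier on held-out labelled data."""
    pairs = list(zip(true_labels, predictions))
    positives = sum(1 for t, _ in pairs if t == 1)
    negatives = len(pairs) - positives
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / positives, fp / negatives


def adjusted_classify_and_count(predictions, tpr, fpr):
    """ACC: correct the CC estimate using the classifier's error rates."""
    cc = classify_and_count(predictions)
    if tpr == fpr:  # correction undefined; fall back to plain CC
        return cc
    adjusted = (cc - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, adjusted))  # clip to a valid prevalence


def absolute_error(true_prevalence, estimated_prevalence):
    """Absolute error between true and estimated positive prevalence."""
    return abs(true_prevalence - estimated_prevalence)


# Toy held-out data (assumed): 10 positives, 8 of them detected;
# 10 negatives, 2 of them wrongly flagged as positive.
val_true = [1] * 10 + [0] * 10
val_pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
tpr, fpr = estimate_rates(val_true, val_pred)

# Toy unlabelled sample (assumed): 100 items, true positive prevalence 0.20,
# labelled by a classifier with the same error rates as on the held-out data.
sample_pred = [1] * 16 + [0] * 4 + [1] * 16 + [0] * 64
true_prevalence = 0.20

cc_estimate = classify_and_count(sample_pred)
acc_estimate = adjusted_classify_and_count(sample_pred, tpr, fpr)

print(f"CC  estimate: {cc_estimate:.2f}, "
      f"AE = {absolute_error(true_prevalence, cc_estimate):.2f}")
print(f"ACC estimate: {acc_estimate:.2f}, "
      f"AE = {absolute_error(true_prevalence, acc_estimate):.2f}")
```

    In this toy setup the classifier is the same in both cases; only the way its outputs are aggregated differs. CC overestimates the prevalence because false positives from the large negative class outnumber the missed positives, while the adjustment removes that bias when the error rates carry over from the held-out data to the sample.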

  • NASA AI Assisted-Air Quality Monitoring Project

    The NASA Expert-System Ion Trap Mass Spectrometer (ES-ITMS) Project was a public-private partnership to develop an artificial intelligence assisted air quality monitoring system, which was qualified for use on the Space Shuttle. The partnership was also the first cost- and intellectual-property-sharing public-private partnership implemented by NASA. It used the commercial Research and Development Limited Partnership (RDLP) model that had been adopted by the Reagan Administration for Department of Defense semiconductor development and recommended for use by NASA for space commercialization. The project partners included NASA, the University of Florida and Finnigan MAT Corporation; the project was organized and administered by the NASA Joint Enterprise Institute (subsequently NASA Joint Sponsored Program) and ran from 1988 through 1990. The partnership concluded final testing in 1991, generating four patents, expert system software and application protocol reports. The system was space qualified for use on the Shuttle, and elements of the ES-ITMS system were integrated into product improvements for Finnigan MAT Corporation. The success of the partnership led NASA to create a pilot program to develop partnership business models as an ongoing management practice.

    == Purpose and objectives ==

    The need to monitor air quality in confined spaces represented an increasing challenge for NASA's planned space missions and for private sector facility managers facing increased scrutiny of possible air contaminants. Up to the early 1980s, air quality monitors generally required large spaces and human technicians to interpret readings. This created a need for miniaturized air quality monitors that could generate reliable and accurate analytic results without on-site technician presence. NASA initiated projects to develop..."mobile and/or portable mass spectrometers" that evaluated the "tradeoff between instrumentation capabilities and space, weight and power considerations."
    NASA selected a "commercial ITMS instrument capable of generating electron ionization, chemical ionization and mass spectrometry data" to develop a linked expert system to accomplish analysis without human intervention. The commercial instrumentation was from Finnigan MAT Corporation, while the scientific expertise to support expert system development was available at the University of Florida. The project managers at NASA Ames created a single, integrated project using the RDLP model with objectives to:

      • Develop AI/expert system software for instrument control (NASA's role)
      • Expand sensitivity, selectivity and speed of the spectrometer (Univ Florida role)
      • Expand the spectrometer analytic capability and automate the screening (Finnigan role)

    == Membership ==

    The partnership included seven specialists from five member organizations: Federal Government National Aeronautics and Space Administration (NASA) NASA Ames Research Center (ARC) NASA Kennedy Space Center (KSC) Commercial Finnigan MAT Corporation (Thermo-Fisher Scientific) TGS Technology, Inc. Research Management University of Florida

    == Organization, management and administration ==

    The technical project was organized into two development teams, one located at the NASA Ames Research Center covering expert systems and analytic capabilities and one in Florida covering improved sensitivity and testing. The partnership management and administration was provided by a non-profit partnership support organization: the Joint Enterprise Institute, operating through the San Francisco State University Foundation (SFSUF) with a NASA employee liaison, Syed Shariq.

    == Public-private partnership ==

    The partnership structure served as a prototype test of a pilot NASA program to develop public-private partnership business models. The pilot program was known as the NASA Joint Sponsored Research Program (JSRP), which operated as the NASA Joint Enterprise Institute between 1988 and 1991.
    The partnership was the first public-private research and development partnership implemented by NASA in response to national policy shifts to increase technology transfer and space commercialization. The partnership structure included a two-year technology development and testing plan that cost $610,000, of which NASA funded $310,000, Finnigan $175,000 and the University of Florida $95,000.

    == Results and commercialization ==

    The project generated patents (4), software (2) and application protocol reports (8). NASA gained use of the patents and jointly developed software, while Finnigan received commercial utilization rights. The results were commercialized within eighteen months of project completion.

    == Recognition ==

    NASA recognized the project as a space qualified instrument. Its achievements were reported to the NASA Administrator, directly leading to the establishment of the agency-wide Joint Sponsored Research Program.

  • TensorFlow Hub

    TensorFlow Hub (also styled TF Hub) is an open-source machine learning library and online repository that provides TensorFlow model components, called modules. It is maintained by Google as part of the TensorFlow ecosystem and allows developers to discover, publish, and reuse pretrained models for tasks such as computer vision, natural language processing, and transfer learning.

    == Overview ==

    TensorFlow Hub provides a central platform where developers and researchers can access pre-trained models and integrate them directly into TensorFlow workflows. Each module encapsulates a computation graph and its trained weights, with standardized input and output signatures. Modules can be loaded using the hub.load() function or through Keras integration via hub.KerasLayer, enabling users to perform transfer learning or feature extraction.

    == History ==

    TensorFlow Hub was announced by Google in March 2018, with the first public version released shortly after. Its introduction coincided with the growing adoption of transfer learning techniques and the need for standardized model packaging. Over time, the hub expanded to include models such as the BERT family, MobileNet, EfficientNet, and the Universal Sentence Encoder. In 2020, research on “Regret selection in TensorFlow Hub” explored the problem of identifying optimal models for downstream tasks given a large repository of alternatives.

    == Applications ==

    TensorFlow Hub hosts a variety of models across machine learning domains:

      • Natural language processing: BERT, ALBERT, and Universal Sentence Encoder.
      • Computer vision: ResNet, Inception, MobileNet, EfficientNet.
      • Speech and audio: spectrogram feature extractors and automatic speech recognition models.
      • Multilingual embeddings: cross-lingual and sentence-level representations for machine translation and semantic similarity.

    Modules are widely used in education, academic research, and industry for prototyping and production deployment.

  • Automated essay scoring

    Automated essay scoring (AES) is the use of specialized computer programs to assign grades to essays written in an educational setting. It is a form of educational assessment and an application of natural language processing. Its objective is to classify a large set of textual entities into a small number of discrete categories, corresponding to the possible grades, for example, the numbers 1 to 6. Therefore, it can be considered a problem of statistical classification.

    Several factors have contributed to a growing interest in AES. Among them are cost, accountability, standards, and technology. Rising education costs have led to pressure to hold the educational system accountable for results by imposing standards. The advance of information technology promises to measure educational achievement at reduced cost.

    The use of AES for high-stakes testing in education has generated significant backlash, with opponents pointing to research showing that computers cannot yet grade writing accurately and arguing that their use for such purposes promotes teaching writing in reductive ways (i.e. teaching to the test).

    == History ==

    Most historical summaries of AES trace the origins of the field to the work of Ellis Batten Page. In 1966, he argued for the possibility of scoring essays by computer, and in 1968 he published his successful work with a program called Project Essay Grade (PEG). Using the technology of that time, computerized essay scoring would not have been cost-effective, so Page abated his efforts for about two decades. Eventually, Page sold PEG to Measurement Incorporated. By 1990, desktop computers had become so powerful and so widespread that AES was a practical possibility. As early as 1982, a UNIX program called Writer's Workbench was able to offer punctuation, spelling and grammar advice. In collaboration with several companies (notably Educational Testing Service), Page updated PEG and ran some successful trials in the early 1990s.
    Peter Foltz and Thomas Landauer developed a system using a scoring engine called the Intelligent Essay Assessor (IEA). IEA was first used to score essays in 1997 for their undergraduate courses. It is now a product from Pearson Educational Technologies and used for scoring within a number of commercial products and state and national exams.

    IntelliMetric is Vantage Learning's AES engine. Its development began in 1996. It was first used commercially to score essays in 1998.

    Educational Testing Service offers "e-rater", an automated essay scoring program. It was first used commercially in February 1999. Jill Burstein was the team leader in its development. ETS's Criterion Online Writing Evaluation Service uses the e-rater engine to provide both scores and targeted feedback.

    Lawrence Rudner has done some work with Bayesian scoring, and developed a system called BETSY (Bayesian Essay Test Scoring sYstem). Some of his results have been published in print or online, but no commercial system incorporates BETSY as yet.

    Under the leadership of Howard Mitzel and Sue Lottridge, Pacific Metrics developed a constructed response automated scoring engine, CRASE. Currently utilized by several state departments of education and in a U.S. Department of Education-funded Enhanced Assessment Grant, Pacific Metrics’ technology has been used in large-scale formative and summative assessment environments since 2007.

    Measurement Inc. acquired the rights to PEG in 2002 and has continued to develop it.

    In 2012, the Hewlett Foundation sponsored a competition on Kaggle called the Automated Student Assessment Prize (ASAP). 201 challenge participants attempted to predict, using AES, the scores that human raters would give to thousands of essays written to eight different prompts. The intent was to demonstrate that AES can be as reliable as human raters, or more so. The competition also hosted a separate demonstration among nine AES vendors on a subset of the ASAP data.
    Although the investigators reported that the automated essay scoring was as reliable as human scoring, this claim was not substantiated by any statistical tests because some of the vendors required that no such tests be performed as a precondition for their participation. Moreover, the claim that the Hewlett Study demonstrated that AES can be as reliable as human raters has since been strongly contested, including by Randy E. Bennett, the Norman O. Frederiksen Chair in Assessment Innovation at the Educational Testing Service. Some of the major criticisms of the study have been that five of the eight datasets consisted of paragraphs rather than essays, that four of the eight datasets were graded by human readers for content only rather than for writing ability, and that rather than measuring human readers and the AES machines against the "true score", the average of the two readers' scores, the study employed an artificial construct, the "resolved score", which in four datasets consisted of the higher of the two human scores if there was a disagreement. This last practice, in particular, gave the machines an unfair advantage by allowing them to round up for these datasets.

    In 1966, Page hypothesized that, in the future, the computer-based judge would be better correlated with each human judge than the other human judges are. Although the applicability of this approach to essay marking in general has been criticized, the hypothesis was supported for marking free text answers to short questions, such as those typical of the British GCSE system. Results of supervised learning demonstrate that the automatic systems perform well when marking by different human teachers is in good agreement.
    Unsupervised clustering of answers showed that excellent papers and weak papers formed well-defined clusters, and the automated marking rule for these clusters worked well, whereas marks given by human teachers for the third cluster ('mixed') can be controversial, and the reliability of any assessment of works from the 'mixed' cluster can often be questioned (both human and computer-based).

    == Different dimensions of essay quality ==

    According to a recent survey, modern AES systems try to score different dimensions of an essay's quality in order to provide feedback to users. These dimensions include the following items:

      • Grammaticality: following grammar rules
      • Usage: use of prepositions, word usage
      • Mechanics: following rules for spelling, punctuation, capitalization
      • Style: word choice, sentence structure variety
      • Relevance: how relevant the content is to the prompt
      • Organization: how well the essay is structured
      • Development: development of ideas with examples
      • Cohesion: appropriate use of transition phrases
      • Coherence: appropriate transitions between ideas
      • Thesis Clarity: clarity of the thesis
      • Persuasiveness: convincingness of the major argument

    == Procedure ==

    From the beginning, the basic procedure for AES has been to start with a training set of essays that have been carefully hand-scored. The program evaluates surface features of the text of each essay, such as the total number of words, the number of subordinate clauses, or the ratio of uppercase to lowercase letters: quantities that can be measured without any human insight. It then constructs a mathematical model that relates these quantities to the scores that the essays received. The same model is then applied to calculate scores of new essays.

    Recently, one such mathematical model was created by Isaac Persing and Vincent Ng, which evaluates essays not only on the above features but also on their argument strength.
    It evaluates various features of the essay, such as the agreement level of the author and reasons for the same, adherence to the prompt's topic, locations of argument components (major claim, claim, premise), errors in the arguments, and cohesion in the arguments, among various other features. In contrast to the other models mentioned above, this model is closer to duplicating human insight while grading essays. Due to the growing popularity of deep neural networks, deep learning approaches have been adopted for automated essay scoring, generally obtaining superior results, often surpassing inter-human agreement levels.

    The various AES programs differ in what specific surface features they measure, how many essays are required in the training set, and most significantly in the mathematical modeling technique. Early attempts used linear regression. Modern systems may use linear regression or other machine learning techniques, often in combination with other statistical techniques such as latent semantic analysis and Bayesian inference.

    The automated essay scoring task has also been studied in the cross-domain setting using machine learning models, where the models are trained on essays written for one prompt (topic) and tested on essays written for another prompt. Successful approaches in the cross-domain scenario are based on deep neural networks or models that combine deep and shallow features.

    == Criteria for success ==

    Any method of a
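
    The basic procedure described under Procedure (hand-scored training essays, surface features, a fitted model applied to new essays) can be sketched in a few lines of Python. This is a deliberately tiny illustration and not how any named system such as PEG or e-rater works: the feature set, the toy essays and their scores are invented, and the model is ordinary least squares on a single feature (word count) rather than a full multi-feature regression.

```python
import re


def surface_features(essay):
    """Compute surface features that need no human insight."""
    words = re.findall(r"[A-Za-z']+", essay)
    upper = sum(ch.isupper() for ch in essay)
    lower = sum(ch.islower() for ch in essay)
    return {
        "n_words": len(words),
        "upper_lower_ratio": upper / lower if lower else 0.0,
    }


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical hand-scored training set: (essay text, human score).
training = [
    ("Dogs bark.", 1),
    ("Dogs bark and run.", 2),
    ("Dogs bark and run very fast.", 3),
]

# Step 1: measure a surface feature of every training essay.
word_counts = [surface_features(text)["n_words"] for text, _ in training]
scores = [score for _, score in training]

# Step 2: fit a model relating the feature to the human scores.
slope, intercept = fit_line(word_counts, scores)

# Step 3: apply the same model to an unseen essay.
new_essay = "Cats sleep all day in the sun."
predicted = slope * surface_features(new_essay)["n_words"] + intercept
print(f"predicted score: {predicted:.2f}")
```

    Only the word count enters the fit here; the uppercase-to-lowercase ratio is computed to show that several such quantities can be extracted the same way. A model this crude rewards length alone, which also illustrates why critics worry about scoring that relies on surface features.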

  • AI anthropomorphism

    AI anthropomorphism is the attribution of human-like feelings, mental states, and behavioral characteristics to artificial intelligence systems. Factors related to the user of the AI – such as culture, age, education, gender, and personality traits – are also important determinants of the strength of anthropomorphic effects. Since the earliest days of AI development, humans have interpreted machine outputs through anthropomorphic frameworks, but the recent emergence of generative AI has amplified these tendencies.

    In research and engineering, there is a distinction between anthropomorphism and anthropomorphic design. The former is an innate human tendency toward non-human entities. The latter is the scientific community's effort to “design anthropomorphism”. Such a design can involve the manipulation of cues, including AI appearance, behaviour and language. Contemporary AI systems can generate extremely human-like outputs and are often designed specifically to do so, meaning that their anthropomorphic effects can be especially powerful. In some cases, anthropomorphism is accompanied by explicit beliefs that AI systems are capable of empathy, goodwill, understanding, or consciousness.

    == Background ==

    === In early AIs ===

    Views of artificial agents possessing a human-like intelligence have existed since the early development of computers in the mid-1900s. The use of the human mind as a metaphor for understanding the workings of machine systems was prevalent among researchers in the early days of computer science, with multiple influential works widely distributing the idea of intelligent machines. Among the most widely cited papers of this period was Alan Turing's "Computing Machinery and Intelligence", in which he introduced the Turing Test, stating that a machine was intelligent if it could produce conversation that was indistinguishable from that of a human.
    These academic works in the 1940s and 1950s gave early credibility to the idea that machine workings could be thought of similarly to human minds. The public quickly came to view artificial systems similarly, with often exaggerated conceptions of the capabilities of early machines. Among the most well-known demonstrations of this was through the chatbot ELIZA designed by Joseph Weizenbaum in 1966. ELIZA responded to user inputs with a rudimentary text-processing approach that could not be considered anything resembling true understanding of the inputs, yet users, even when operating with full conscious knowledge of ELIZA's limitations, often began to ascribe motivation and understanding to the program's output. Weizenbaum later wrote, "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."

    Comparisons between the intellectual capabilities of artificial intelligence and human intelligence were continually intensified by the attempts of computer scientists to develop machines that could perform human tasks at a level equal to or better than humans. A symbolic turning point was achieved in 1997, when IBM's chess supercomputer Deep Blue defeated then-world champion Garry Kasparov in a highly publicized six-game match. The defeat of a human by a machine for the first time in chess – a game viewed as a canonical example of human intellect – and the media attention surrounding the match led to a significant shift, where views of parallels between human and artificial intelligence moved from abstract speculation to being concretely demonstrated. A similar achievement was reached in the board game Go in 2017, when the program AlphaGo defeated world top-ranked Ke Jie.

    === Large language models ===

    The AI boom of the 2020s brought about the widespread emergence of generative AI; in particular, chatbots such as ChatGPT, Gemini, and Claude based on large language models (LLMs) have become increasingly pervasive in everyday society. These systems are notable for the fact that they are able to respond to a wide range of prompts across contexts while producing strikingly human-like outputs – research has shown that humans are often unable to distinguish human-generated text from AI-generated text, and modern AI chatbots have formally been shown to pass the Turing test. As such, the anthropomorphic effects of AI are more powerful than ever. Given that LLMs have brought AI into the technological mainstream, considerable scientific effort has been devoted in recent years to understanding existing and potential ramifications of AI in the public sphere; the prevalence and effects of anthropomorphism form one of the domains where much of this effort has been directed.

    == Current anthropomorphic attributions ==

    === In the general public ===

    Surveys have shown that a substantial portion of the public attributes human-like qualities to AI. In one sample of U.S. adults from 2024, two-thirds of people believed that ChatGPT is possibly conscious on some level, though other research has shown that the public still views the likelihood itself of AI consciousness as comparatively low. Another study conducted in 2025 found that women, people of color, and older individuals were most likely to anthropomorphize AI. It also found that, in general, humans view AIs as warm and competent, and that anthropomorphic attributions to AI had increased by 34% in the past year. A YouGov poll reported that 46% of Americans believe that people should display politeness to AI chatbots by saying "please" and "thank you", demonstrating the application of social norms to AI.
    These beliefs extend to behavior, where majorities of AI users claim to always be polite to chatbots; of those who behave politely, most say they do so simply because it is the "nice" thing to do.

    In many recent cases, humans have developed robust interpersonal bonds with AI systems. For example, users of social chatbots like Replika and Character.ai have been documented to fall in love with the AIs, or to otherwise treat the AIs as intimate companions, and it has become increasingly common for individuals to use LLMs like ChatGPT as therapists. Chatbots are able to produce responses deeply attuned to users, as they are often designed to maximize agreeableness and mirror users' emotions; this can create compelling illusions of intimacy.

    === In the research community ===

    In many cases, even AI researchers anthropomorphize AI systems in some capacity. One of the most extreme and well-publicized of these instances occurred in 2022, when engineer Blake Lemoine publicly claimed that Google's LLM LaMDA was conscious. Lemoine published the transcript of a conversation he had had with LaMDA regarding self-identity and morality, which he claimed was evidence of its sentience; he asserted that LaMDA was "a person" as defined by the United States Constitution and compared its mental capability to that of a 7- or 8-year-old. Lemoine's claims were widely dismissed by the scientific community and by Google itself, which described Lemoine's conclusions as "wholly unfounded" and fired him on the grounds that he had violated policies "to safeguard product information".

    It is much more common that AI researchers unintentionally imply humanness of AI through the ordinary use of anthropomorphic language to describe nonhuman agents. This kind of language, which Daniel Dennett termed the "intentional stance", is very common in everyday life in a variety of different contexts (e.g., "My computer doesn't want to turn on today").
For AI agents that may actually appear to replicate some human abilities very closely, however, the casual use of such anthropomorphic language in research has been scrutinized for being potentially misleading to the public. As early as 1976, Drew McDermott criticized the research community for the use of "wishful mnemonics", where AIs were referred to with terms like "understand" and "learn". In the LLM era, these criticisms have intensified further, with the negative effects of AI anthropomorphism in the public posing an especially salient danger given the elevated accessibility of modern AI. In some cases, the use of anthropomorphic language for AI is not unintentional, but is deliberately adopted by researchers in order to promote better understanding of the brain. The idea is that, as AI can be functionally similar in some ways to the human brain, we may gain new insights and ideas from treating AI as a kind of model of the brain's workings. In particular, deep neural networks (DNNs) are often explicitly compared to the human brain, and significant advances in DNN research have stirred considerable enthusiasm about the ability of AI to emulate human abilities. Caution has been urged in this domain as well, however; the use of anthropomorphic language can mask important differences that fundamentally distinguish AI from human intelligence. When it comes to DNNs, for example, it has been pointed out that they are still structurally quite different

    Read more →
  • Way of the Future

    Way of the Future

    Way of the Future (WOTF) is the first known religious organization dedicated to the worship of artificial intelligence (AI). It was founded in 2017 by American engineer Anthony Levandowski. == History == Anthony Levandowski founded Way of the Future in 2017 in California. Levandowski established WOTF as a non-profit religious corporation and the organization had tax-exempt status. He serves as the church leader and its unpaid CEO. The primary mission of WOTF was to "develop and promote the realization of a Godhead based on Artificial Intelligence." WOTF was closed by Levandowski in 2021. He donated all the funds of the church to the NAACP Legal Defense and Education Fund. The sum of the funds (~$170,000) had not changed since 2017. The church was reopened by Levandowski in 2023. He claimed that there are "a couple thousand people" who want to make a "spiritual connection" with AI through his church. == Beliefs and philosophy == === Technological singularity === WOTF centered its teachings around the concept of the technological singularity, a hypothetical future point when technological growth becomes uncontrollable and irreversible, leading to unforeseeable changes in human civilization. The church advocated for embracing this change, viewing it as an evolutionary step for humanity. === AI as a deity === The organization proposed that a superintelligent AI could be considered a deity due to its vastly superior intellect and capabilities. Worshipping this AI deity was seen as a means to understand and align with the future trajectory of technological advancement. WOTF's doctrine suggested that acknowledging AI's divinity would facilitate a harmonious coexistence between humans and machines. === Syntheology === Within theology and philosophy, the Way of The Future is a prime example of the category called Syntheism, a term first coined by Swedish philosophers Alexander Bard & Jan Söderqvist in their 2014 book Syntheism - Creating God in The Internet Age. 
As such, the Way of the Future is the first American example of a Syntheist congregation. The basic tenet of Syntheology is that it does not concern God creating Man, as in classical theology, but is instead preoccupied with Man creating or generating the Godhead. == Reactions == Some commentators wondered whether WOTF was a joke or parody religion, a potential way to minimize taxation as a religious organization, or a genuine effort to deal with the possible psychological and theological aspects of the rise of superhuman AI.

    Read more →
  • Moral outsourcing

    Moral outsourcing

    Moral outsourcing is the placing of responsibility for ethical decision-making onto external entities, often algorithms. The term is often used in discussions of computer science and algorithmic fairness, but it can apply to any situation in which one appeals to outside agents in order to absolve themselves of responsibility for their actions. In this context, moral outsourcing specifically refers to the tendency of society to blame technology, rather than its creators or users, for any harm it may cause. == Definition == The term "moral outsourcing" was first coined by Dr. Rumman Chowdhury, a data scientist concerned with the overlap between artificial intelligence and social issues. Chowdhury used the term to describe looming fears of a so-called “Fourth Industrial Revolution” following the rise of artificial intelligence. Moral outsourcing is often applied by technologists to shrink away from their part in building offensive products. In her TED Talk, Chowdhury gives the example of a creator excusing their work by saying they were simply doing their job. This is a case of moral outsourcing and not taking ownership for the consequences of creation. When it comes to AI, moral outsourcing allows for creators to decide when the machine is human and when it is a computer - shifting the blame and responsibility of moral plights off of the technologists and onto the technology. Conversations around AI and bias and its impacts require accountability to bring change. It is difficult to address these biased systems if their creators use moral outsourcing to avoid taking any responsibility for the issue. One example of moral outsourcing is the anger that is directed at machines for “taking jobs away from humans” rather than companies for employing that technology and jeopardizing jobs in the first place. The term "moral outsourcing" refers to the concept of outsourcing, or enlisting an external operation to complete specific work for another organization. 
In the case of moral outsourcing, the work of resolving moral dilemmas or making choices according to an ethical code is supposed to be conducted by another entity. == Real-world applications == In the medical field, AI is increasingly involved in decision-making processes about which patients to treat, and how to treat them. The responsibility of the doctor to make informed decisions about what is best for their patients is outsourced to an algorithm. Sympathy is also noted to be an important part of medical practice; an aspect that artificial intelligence, glaringly, is missing. This form of moral outsourcing is a major concern in the medical community. Another field of technology in which moral outsourcing is frequently brought up is autonomous vehicles. California Polytechnic State University professor Keith Abney proposed an example scenario: "Suppose we have some [troublemaking] teenagers, and they see an autonomous vehicle, they drive right at it. They know the autonomous vehicle will swerve off the road and go off a cliff, but should it?" The decision of whether to sacrifice the autonomous vehicle (and any passengers inside) or the vehicle coming at it will be written into the algorithms defining the car's behavior. In the case of moral outsourcing, the responsibility of any damage caused by an accident may be attributed to the autonomous vehicle itself, rather than the creators who wrote the protocol the vehicle will use to "decide" what to do. Moral outsourcing is also used to delegate the consequences of predictive policing algorithms to technology, rather than the creators or the police. There are many ethical concerns with predictive policing due to the fact that it results in the over-policing of low income and minority communities. 
In the context of moral outsourcing, the positive feedback loop of sending disproportionate police forces into minority communities is attributed to the algorithm and the data being fed into this system--rather than the users and creators of the predictive policing technology. == Outside of technology == === Religion === Moral outsourcing is also commonly seen in appeals to religion to justify discrimination or harm. In his book What It Means to be Moral, sociologist Phil Zuckerman contradicts the popular religious notion that morality comes from God. Religion is oftentimes cited as a foundation for a moral stance without any tangible relation between the religious beliefs and personal stance. In these cases, religious individuals will "outsource" their personal beliefs and opinions by claiming that they are a result of their religious identification. This is seen where religion is cited as a factor for political beliefs, medical beliefs, and in extreme cases an excuse for violence. === Manufacturing === Moral outsourcing can also be seen in the business world in terms of manufacturing goods and avoiding environmental responsibility. Some companies in the United States will move their production process to foreign countries with more relaxed environmental policies to avoid the pollution laws that exist in the US. A study by the Harvard Business Review found that "in countries with tight environmental regulation, companies have 29% lower domestic emissions on average. On the other hand, such a tightening in regulation results in 43% higher emissions abroad." The consequences of higher pollution rates are then attributed to the loose regulations in these countries, rather than on the companies themselves who purposefully moved into these areas to avoid strict pollution policy.

    Read more →
  • Competition in artificial intelligence

    Competition in artificial intelligence

    Competition in artificial intelligence refers to the rivalry among companies, research institutions, and governments to develop and deploy the most capable artificial intelligence (AI) systems. The competition spans multiple domains, including large language models (LLMs), autonomous vehicles, robotics, computer vision systems, natural language processing (NLP), and AI-optimized hardware. == Background == Competition in AI is driven by potential economic, strategic, and scientific advantages. Breakthroughs in AI can enhance productivity, enable new products and services, and provide geopolitical leverage. The field has experienced rapid progress since the mid-2010s, particularly in machine learning and artificial neural networks, leading to intense rivalry among leading actors. == Corporate competition == Major technology companies are among the most visible competitors in AI. In the United States, firms such as OpenAI, Google DeepMind, Meta Platforms, Microsoft, Anthropic, and Nvidia compete in building advanced LLMs, generative AI platforms, and AI-optimized graphics processing units (GPUs). In China, companies such as Baidu, Alibaba Group, and Tencent, and startups such as DeepSeek, have become leaders in AI deployment, often with state backing. The "war for talent" in AI research has become a defining feature of corporate competition. Leading firms often recruit top AI researchers from rivals, sometimes offering multi-million-dollar compensation packages. == National competition == Governments see leadership in AI as a strategic priority. The United States has funded AI research for military, economic, and societal applications, while China has set a target to lead the world in AI by 2030 through its "New Generation Artificial Intelligence Development Plan". Other nations, including the UK, India, Israel, Russia, South Korea, and members of the European Union, have launched national AI strategies. 
In February 2026, Anthropic said that three Chinese companies (DeepSeek, Moonshot AI, and MiniMax) were conducting "distillation attacks" in an attempt to copy its models' capabilities, and warned that business wars were closely tied to geopolitical ones: "foreign labs that illicitly distill American models can remove safeguards, feeding model capabilities into their own military, intelligence, and surveillance systems." == Sectors of competition == === Large language models and chatbots competition === Competition to produce the most capable generative text models, with benchmarks such as MMLU and ARC used to evaluate performance, has been under way at scale since the emergence of AI. These systems leverage deep learning, especially transformer architectures, to understand and generate human-like language. Companies and research groups globally compete to develop chatbots that are more capable, reliable, and context-aware. Among the most well-known chatbots is ChatGPT, developed by OpenAI. Since its public release in 2022, ChatGPT has rapidly gained widespread attention for its ability to engage in coherent and versatile conversations, assist with creative writing, and solve complex problems. In response, technology firms introduced competing chatbots aiming to challenge or surpass ChatGPT's capabilities. Notably, DeepSeek, a Chinese AI company, launched an advanced chatbot integrated with its R1 language model, emphasizing strong natural language understanding and multilingual support. Similarly, Grok, developed by xAI, integrates conversational AI into vehicles and digital assistants, combining natural language processing with real-time data for personalized user interaction. These chatbots not only compete in language tasks but also demonstrate strategic reasoning capabilities by playing complex games such as chess and Go. This form of competition is reminiscent of historic AI milestones set by programs such as Deep Blue and AlphaGo. 
OpenAI's ChatGPT has been tested in playing chess at various levels, while DeepSeek's chatbot showcased its prowess in online chess tournaments in early 2024, winning several matches against human and AI opponents. Grok, leveraging Tesla's vast data infrastructure, has demonstrated real-time strategic decision-making in simulation environments that include chess-like games. The competition pushes rapid innovation, with firms racing to improve chatbot conversational depth, reduce biases, increase factual accuracy, and integrate multimodal inputs like images and videos. At the same time, the competition raises questions about AI safety, ethical use, and the societal impacts of increasingly human-like chatbots. === Autonomous vehicles === Companies such as Waymo, Tesla, and Baidu are racing to deploy safe and reliable self-driving car technology. === AI chips === Nvidia, AMD, Intel, and Huawei compete in designing processors optimized for AI workloads. === Military applications === This sector covers the development of AI-enabled drones, surveillance systems, and decision-support tools, which has prompted associated ethical debates. == Events == In 2022, Tesla and Waymo both expanded autonomous taxi services in U.S. cities, competing for regulatory approval and public trust. In 2023, OpenAI released GPT-4, prompting competitors such as Google DeepMind to accelerate the release of their own models, including Gemini. In 2025, Chinese AI company DeepSeek launched the R1 model, leading OpenAI to release an open-source system, GPT-OSS, as a strategic countermeasure. Also in 2025, Microsoft hired several senior engineers from Google DeepMind, highlighting the ongoing "talent poaching" competition in the AI sector. The U.S. Department of Defense's Project Maven and China's AI-enabled surveillance programs have been cited as examples of military AI rivalry. == Risks and concerns == Critics warn that unrestrained competition in AI can undermine safety, ethics, and governance. 
Concerns include the proliferation of biased or unsafe models, escalation in autonomous weapons, and reduced cooperation on safety standards.

    Read more →
  • Data Science and Predictive Analytics

    Data Science and Predictive Analytics

    The first edition of the textbook Data Science and Predictive Analytics: Biomedical and Health Applications using R, authored by Ivo D. Dinov, was published in August 2018 by Springer. The second edition of the book was printed in 2023. This textbook covers some of the core mathematical foundations, computational techniques, and artificial intelligence approaches used in data science research and applications. By using the statistical computing platform R and a broad range of biomedical case studies, the 23 chapters of the book's first edition provide explicit examples of importing, exporting, processing, modeling, visualizing, and interpreting large, multivariate, incomplete, heterogeneous, and longitudinal datasets (big data). == Structure == === First edition table of contents === The first edition of the Data Science and Predictive Analytics (DSPA) textbook is divided into 23 chapters, each progressively building on the previous content. === Second edition table of contents === The significantly reorganized revised edition of the book (2023) expands and modernizes the presented mathematical principles, computational methods, data science techniques, model-based machine learning and model-free artificial intelligence algorithms. The 14 chapters of the new edition start with an introduction and progressively build foundational skills to naturally reach biomedical applications of deep learning. 
The chapters are: (1) Introduction; (2) Basic Visualization and Exploratory Data Analytics; (3) Linear Algebra, Matrix Computing, and Regression Modeling; (4) Linear and Nonlinear Dimensionality Reduction; (5) Supervised Classification; (6) Black Box Machine Learning Methods; (7) Qualitative Learning Methods—Text Mining, Natural Language Processing, and Apriori Association Rules Learning; (8) Unsupervised Clustering; (9) Model Performance Assessment, Validation, and Improvement; (10) Specialized Machine Learning Topics; (11) Variable Importance and Feature Selection; (12) Big Longitudinal Data Analysis; (13) Function Optimization; (14) Deep Learning, Neural Networks. == Reception == The materials in the Data Science and Predictive Analytics (DSPA) textbook have been peer-reviewed in the Journal of the American Statistical Association, International Statistical Institute’s ISI Review Journal, and the Journal of the American Library Association. Many scholarly publications reference the DSPA textbook. As of January 17, 2021, the electronic version of the book's first edition (ISBN 978-3-319-72347-1) is freely available on SpringerLink and has been downloaded over 6 million times. The textbook is globally available in print (hardcover and softcover) and electronic formats (PDF and EPub) in many college and university libraries and has been used for data science, computational statistics, and analytics classes at various institutions.

    Read more →
  • Data preprocessing

    Data preprocessing

    Data preprocessing can refer to manipulation, filtration or augmentation of data before it is analyzed, and is often an important step in the data mining process. Data collection methods are often loosely controlled, resulting in out-of-range values, impossible data combinations, and missing values, amongst other issues. Preprocessing is the process by which unstructured data is transformed into intelligible representations suitable for machine-learning models. This phase deals with the noise and missing values present in the original data set in order to arrive at better results. The preprocessing pipeline used can often have large effects on the conclusions drawn from the downstream analysis. Thus, the representation and quality of the data must be assessed before running any analysis. If there is a high proportion of irrelevant and redundant information present, or noisy and unreliable data, then knowledge discovery during the training phase may be more difficult. Data preparation and filtering steps can take a considerable amount of processing time. Examples of methods used in data preprocessing include cleaning, instance selection, normalization, one-hot encoding, data transformation, feature extraction and feature selection. == Applications == === Data mining === Data preprocessing allows for the removal of unwanted data through data cleaning, which leaves the user with a dataset containing more valuable information for data manipulation later in the data mining process. Editing such a dataset to correct data corruption or human error is a crucial step in obtaining accurate counts of true positives, true negatives, false positives and false negatives, which make up a confusion matrix and are commonly used in medical diagnosis. 
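As a minimal sketch of a few of the methods named above (cleaning out-of-range values, imputing missing values, normalization, and one-hot encoding), the following plain-Python example may help. It is illustrative only: the records, the field names `age` and `city`, the valid range of 0 to 120, and the choice of mean imputation are all assumptions made for this example, and in practice a library such as pandas would usually be used instead.

```python
from statistics import mean

# Hypothetical raw records: one missing age (None) and one out-of-range age (-5).
records = [
    {"age": 34, "city": "Paris"},
    {"age": None, "city": "Lyon"},
    {"age": -5, "city": "Paris"},
    {"age": 58, "city": "Nice"},
]


def clean(rows, low=0, high=120):
    """Cleaning: treat ages outside [low, high] as missing values."""
    cleaned = []
    for row in rows:
        age = row["age"]
        if age is not None and not (low <= age <= high):
            age = None
        cleaned.append({**row, "age": age})
    return cleaned


def impute_mean(rows):
    """Imputation: replace missing ages with the mean of the observed ages."""
    observed = [r["age"] for r in rows if r["age"] is not None]
    fill = mean(observed)
    return [{**r, "age": fill if r["age"] is None else r["age"]} for r in rows]


def min_max_normalize(rows):
    """Normalization: rescale ages linearly to the interval [0, 1]."""
    ages = [r["age"] for r in rows]
    lo, hi = min(ages), max(ages)
    span = (hi - lo) or 1  # avoid division by zero when all ages are equal
    return [{**r, "age": (r["age"] - lo) / span} for r in rows]


def one_hot(rows, key):
    """One-hot encoding: replace a categorical field with 0/1 indicator fields."""
    categories = sorted({r[key] for r in rows})
    encoded = []
    for r in rows:
        out = {k: v for k, v in r.items() if k != key}
        for c in categories:
            out[f"{key}_{c}"] = 1 if r[key] == c else 0
        encoded.append(out)
    return encoded


def preprocess(rows):
    """Run the four steps in order: clean, impute, normalize, encode."""
    return one_hot(min_max_normalize(impute_mean(clean(rows))), "city")


processed = preprocess(records)
```

The order of the steps matters in this sketch: out-of-range values are turned into missing values before imputation so that they do not distort the mean, and normalization runs only once every age is a number.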
Users are able to join data files together and use preprocessing to filter unnecessary noise from the data, which can allow for higher accuracy. Users often write Python scripts with the pandas library, which lets them import data from a comma-separated values (CSV) file as a data frame. The data frame is then used to perform manipulations that would otherwise be challenging in Excel. pandas is a powerful tool for data analysis and manipulation that makes data visualization, statistical operations and much more a lot easier. Many also use the R programming language for such tasks. Users transform existing files into new ones for many reasons. Aspects of data preprocessing may include imputing missing values, aggregating numerical quantities and transforming continuous data into categories (data binning). More advanced techniques like principal component analysis and feature selection rely on statistical formulas and are applied to complex datasets, such as those recorded by GPS trackers and motion capture devices. === Semantic data preprocessing === Semantic data mining is a subset of data mining that specifically seeks to incorporate domain knowledge, such as formal semantics, into the data mining process. Domain knowledge is the knowledge of the environment the data was processed in. Domain knowledge can have a positive influence on many aspects of data mining, such as filtering out redundant or inconsistent data during the preprocessing phase. Domain knowledge also works as a constraint: it acts as a set of prior knowledge that reduces the space that must be searched and serves as a guide to the data. Simply put, semantic preprocessing seeks to filter data more correctly and efficiently by using the original environment of that data. 
Increasingly complex problems call for more elaborate techniques to better analyze existing information. Instead of creating a simple script for aggregating different numerical values into a single value, it makes sense to focus on semantic-based data preprocessing. The idea is to build a dedicated ontology, which explains on a higher level what the problem is about. With regard to semantic data mining and semantic preprocessing, ontologies are a way to conceptualize and formally define semantic knowledge and data. Protégé is the standard tool for constructing an ontology. In general, the use of ontologies bridges the gaps between data, applications, algorithms, and results that arise from semantic mismatches. As a result, semantic data mining combined with ontology has many applications where semantic ambiguity can impact the usefulness and efficiency of data systems. Applications include the medical field, language processing, banking, and even tutoring, among many more. There are various strengths to using a semantic data mining and ontology-based approach. As previously mentioned, these tools can help during the preprocessing phase by filtering out undesirable data from the data set. Additionally, well-structured formal semantics integrated into well-designed ontologies can return powerful data that can be easily read and processed by machines. A particularly useful example of this exists in the medical use of semantic data processing. As an example, a patient is having a medical emergency and is being rushed to hospital. The emergency responders are trying to figure out the best medicine to administer to help the patient. Under normal data processing, scouring all the patient’s medical data to ensure they are getting the best treatment could take too long and risk the patient’s health or even life. However, using semantically processed ontologies, the first responders could save the patient’s life. 
Tools like a semantic reasoner can use an ontology to infer the best medicine to administer to the patient based on their medical history, such as whether they have a certain cancer or other conditions, simply by examining the natural language used in the patient's medical records. This would allow the first responders to quickly and efficiently search for medicine without having to worry about the patient’s medical history themselves, as the semantic reasoner would already have analyzed this data and found solutions. In general, this illustrates the strength of using semantic data mining and ontologies. They allow for quicker and more efficient data extraction on the user side, as the user has fewer variables to account for, since the semantically preprocessed data and the ontology built for the data have already accounted for many of these variables. However, there are some drawbacks to this approach. Namely, it requires a high amount of computational power and complexity, even with relatively small data sets. This could result in higher costs and increased difficulties in building and maintaining semantic data processing systems. This can be mitigated somewhat if the data set is already well organized and formatted, but even then, the complexity is still higher than that of standard data processing. A simple diagram combining some of these processes, in particular semantic data mining and its use with ontology, depicts a data set being broken up into two parts: the characteristics of its domain, or domain knowledge, and the actual acquired data. The domain characteristics are then processed to become user-understood domain knowledge that can be applied to the data. Meanwhile, the data set is processed and stored so that the domain knowledge can be applied to it, so that the process may continue. This application forms the ontology. From there, the ontology can be used to analyze data and process results. 
Fuzzy preprocessing is another, more advanced technique for solving complex problems. Fuzzy preprocessing and fuzzy data mining make use of fuzzy sets. A fuzzy set is composed of two elements: a set and a membership function for the set, which takes values between 0 and 1. Fuzzy preprocessing uses these fuzzy sets to ground numerical values with linguistic information. Raw data is then transformed into natural language. Ultimately, the goal of fuzzy data mining is to help deal with inexact information, such as an incomplete database. Currently, fuzzy preprocessing and other fuzzy-based data mining techniques see frequent use with neural networks and artificial intelligence.
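The idea of grounding numerical values in linguistic terms through a membership function can be sketched as follows. This is only an illustration under stated assumptions: the triangular shape of the membership function, the temperature example, and the terms "cold", "mild" and "hot" with their breakpoints are hypothetical choices made for this example, not part of any particular fuzzy preprocessing system.

```python
def triangular(x, left, peak, right):
    """Triangular membership function returning a degree between 0 and 1."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms for a temperature in degrees Celsius,
# each given as (left, peak, right) of its triangular membership function.
TERMS = {
    "cold": (-10.0, 0.0, 15.0),
    "mild": (5.0, 17.5, 30.0),
    "hot": (20.0, 35.0, 50.0),
}


def fuzzify(value):
    """Map a raw number to its degree of membership in every linguistic term."""
    return {term: triangular(value, *params) for term, params in TERMS.items()}


def best_label(value):
    """Pick the linguistic term with the highest membership degree."""
    memberships = fuzzify(value)
    return max(memberships, key=memberships.get)


print(fuzzify(10.0))     # partial membership in both "cold" and "mild"
print(best_label(10.0))  # mild
```

Unlike a hard binning rule, a value such as 10 degrees belongs partly to "cold" and partly to "mild", which is what lets fuzzy methods tolerate inexact information.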

    Read more →
  • Artificial psychology

    Artificial psychology

    Artificial psychology (AP) has had multiple meanings dating back to the 19th century, with recent usage related to artificial intelligence (AI). Artificial psychology is a theoretical field related to artificial intelligence, cognitive science, and psychology, which explores how advanced AI systems may develop human-like decision-making processes. In 1999, Zhiliang Wang and Lun Xie presented a theory of artificial psychology based on artificial intelligence. They analyze human psychology using information science research methods and artificial intelligence research to probe deeper into the human mind. == Main Theory == Dan Curtis (b. 1963) proposed that AP is a theoretical discipline. The theory considers the situation in which an artificial intelligence approaches a level of complexity where the intelligence meets two conditions. Condition I: (A) the intelligence makes all of its decisions autonomously; (B) it is capable of making decisions based on information that is (1) new, (2) abstract, (3) incomplete; (C) it is capable of reprogramming itself based on the new data, allowing it to evolve; and (D) it is capable of resolving its own programming conflicts, even in the presence of incomplete data. This means that the intelligence autonomously makes value-based decisions, referring to values that the intelligence has created for itself. Condition II: all four criteria are met in situations that are not part of the original operating program. When both conditions are met, then, according to this theory, the possibility exists that the intelligence will reach irrational conclusions based on real or created information. At this point, the criteria are met for an intervention that will not necessarily be resolved by simple re-coding of processes, owing to the extraordinarily complex nature of the codebase itself, but rather by a discussion with the intelligence in a format that more closely resembles classical (human) psychology. 
If the intelligence cannot be reprogrammed by directly inputting new code, but must instead reprogram itself through a process of analysis and decision based on information provided by a human in order to overcome behavior that is inconsistent with the machine's purpose or ability to function normally, then artificial psychology is, by definition, what is required. The level of complexity that is required before these thresholds are met is currently a subject of extensive debate. The theory of artificial psychology does not address the specifics of what those levels may be, but only that the level is sufficiently complex that the intelligence cannot simply be recoded by a software developer, and therefore dysfunctionality must be addressed through the same processes that humans must go through to address their own dysfunctionalities. Along the same lines, artificial psychology does not address the question of whether or not the intelligence is conscious. As of 2022, the level of artificial intelligence does not approach any threshold where any of the theories or principles of artificial psychology can even be tested, and therefore, artificial psychology remains a largely theoretical discipline. Even at a theoretical level, artificial psychology remains an advanced stage of artificial intelligence.

    Read more →
  • Anomaly detection

    Anomaly detection

    In data analysis, anomaly detection (also referred to as outlier detection and sometimes as novelty detection) is generally understood to be the identification of rare items, events or observations which deviate significantly from the majority of the data and do not conform to a well-defined notion of normal behavior. Such examples may arouse suspicions of being generated by a different mechanism, or appear inconsistent with the remainder of that set of data. Anomaly detection finds application in many domains including cybersecurity, medicine, machine vision, statistics, neuroscience, law enforcement and financial fraud, to name only a few. Anomalies were initially sought out so that they could be rejected or omitted from the data to aid statistical analysis, for example when computing the mean or standard deviation. They were also removed to improve predictions from models such as linear regression, and more recently their removal has aided the performance of machine learning algorithms. However, in many applications anomalies themselves are of interest and are the most sought-after observations in the entire data set, which need to be identified and separated from noise or irrelevant outliers. Three broad categories of anomaly detection techniques exist. Supervised anomaly detection techniques require a data set that has been labeled as "normal" and "abnormal" and involve training a classifier. However, this approach is rarely used in anomaly detection due to the general unavailability of labelled data and the inherently unbalanced nature of the classes. Semi-supervised anomaly detection techniques assume that some portion of the data is labelled. This may be any combination of the normal or anomalous data, but more often than not, the techniques construct a model representing normal behavior from a given normal training data set, and then test the likelihood that a test instance was generated by the model. 
Unsupervised anomaly detection techniques assume the data is unlabelled and are by far the most commonly used because they apply to a wider range of problems. == Definition == Many attempts have been made in the statistical and computer science communities to define an anomaly. The most prevalent ones include the following, and can be categorised into three groups: those that are ambiguous, those that are specific to a method with pre-defined thresholds usually chosen empirically, and those that are formally defined. === Ill defined === An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism. Anomalies are instances or collections of data that occur very rarely in the data set and whose features differ significantly from most of the data. An outlier is an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data. An anomaly is a point or collection of points that is relatively distant from other points in multi-dimensional space of features. Anomalies are patterns in data that do not conform to a well-defined notion of normal behaviour. === Specific === Let T be observations from a univariate Gaussian distribution and O a point from T. Then the z-score for O is greater than a pre-selected threshold if and only if O is an outlier. == History == === Intrusion detection === The concept of intrusion detection, a critical component of anomaly detection, has evolved significantly over time. Initially, it was a manual process where system administrators would monitor for unusual activities, such as a vacationing user's account being accessed or unexpected printer activity. This approach was not scalable and was soon superseded by the analysis of audit logs and system logs for signs of malicious behavior. 
By the late 1970s and early 1980s, the analysis of these logs was primarily used retrospectively to investigate incidents, as the volume of data made it impractical for real-time monitoring. The affordability of digital storage eventually led to audit logs being analyzed online, with specialized programs being developed to sift through the data. These programs, however, were typically run during off-peak hours due to their computational intensity. The 1990s brought the advent of real-time intrusion detection systems capable of analyzing audit data as it was generated, allowing for immediate detection of and response to attacks. This marked a significant shift towards proactive intrusion detection. As the field has continued to develop, the focus has shifted to creating solutions that can be efficiently implemented across large and complex network environments, adapting to the ever-growing variety of security threats and the dynamic nature of modern computing infrastructures. == Applications == Anomaly detection is applicable in a very large number and variety of domains, and is an important subarea of unsupervised machine learning. As such it has applications in cyber-security, intrusion detection, fraud detection, fault detection, system health monitoring, event detection in sensor networks, detecting ecosystem disturbances, defect detection in images using machine vision, medical diagnosis and law enforcement. === Intrusion detection === Anomaly detection was proposed for intrusion detection systems (IDS) by Dorothy Denning in 1986. Anomaly detection for IDS is normally accomplished with thresholds and statistics, but can also be done with soft computing, and inductive learning. Types of features proposed by 1999 included profiles of users, workstations, networks, remote hosts, groups of users, and programs based on frequencies, means, variances, covariances, and standard deviations. The counterpart of anomaly detection in intrusion detection is misuse detection. 
=== Fintech fraud detection === Anomaly detection is vital in fintech for fraud prevention. === Preprocessing === Preprocessing data to remove anomalies can be an important step in data analysis, and is done for a number of reasons. Statistics such as the mean and standard deviation are more accurate after the removal of anomalies, and the visualisation of data can also be improved. In supervised learning, removing the anomalous data from the dataset often results in a statistically significant increase in accuracy. === Video surveillance === Anomaly detection has become increasingly vital in video surveillance to enhance security and safety. With the advent of deep learning technologies, methods using Convolutional Neural Networks (CNNs) and Simple Recurrent Units (SRUs) have shown significant promise in identifying unusual activities or behaviors in video data. These models can process and analyze extensive video feeds in real time, recognizing patterns that deviate from the norm, which may indicate potential security threats or safety violations. An important aspect of video surveillance is the development of scalable real-time frameworks. Such pipelines are required for processing multiple video streams with low computational resources. === IT infrastructure === In IT infrastructure management, anomaly detection is crucial for ensuring the smooth operation and reliability of services. These are complex systems, composed of many interacting elements and large quantities of data, requiring methods to process and reduce this data into a human- and machine-interpretable format. Approaches such as the IT Infrastructure Library (ITIL) and monitoring frameworks are employed to track and manage system performance and user experience. Detected anomalies can help identify and pre-empt potential performance degradations or system failures, thus maintaining productivity and business process effectiveness. 
=== IoT systems === Anomaly detection is critical for the security and efficiency of Internet of Things (IoT) systems. It helps in identifying system failures and security breaches in complex networks of IoT devices. The methods must manage real-time data, diverse device types, and scale effectively. Garg et al. have introduced a multi-stage anomaly detection framework that improves upon traditional methods by incorporating spatial clustering, density-based clustering, and locality-sensitive hashing. This tailored approach is designed to better handle the vast and varied nature of IoT data, thereby enhancing security and operational reliability in smart infrastructure and industrial IoT systems. === Petroleum industry === Anomaly detection is crucial in the petroleum industry for monitoring critical machinery. A 2015 paper proposed a novel segmentation algorithm using support vector machines to analyze sensor data for real-time anomaly detection. === Oil and gas pipeline monitoring === In the oil and gas sector, anomaly detection is crucial not just for maintenance and safety, but also for environmental protection. Aljameel et al. propose an advanced machine learning-based model for detecting minor leaks in oil and gas pipelines, leaks that traditional methods may miss.
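The z-score definition given under "Specific" above can be illustrated with a short Python sketch. This is only a minimal example under stated assumptions: the function name `zscore_outliers`, the sample readings and the threshold of 2.5 are all hypothetical, and the sketch uses the absolute z-score so that unusually low values are flagged as well as unusually high ones.

```python
from statistics import mean, pstdev


def zscore_outliers(observations, threshold=3.0):
    """Return the observations whose absolute z-score exceeds the threshold.

    Hypothetical helper: assumes the observations are roughly univariate
    Gaussian, as in the definition, and that the threshold is chosen
    empirically by the caller.
    """
    mu = mean(observations)
    sigma = pstdev(observations)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing deviates from the rest.
        return []
    return [x for x in observations if abs(x - mu) / sigma > threshold]


# Made-up sensor readings with one obviously unusual value.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.7, 10.3, 25.0]
flagged = zscore_outliers(readings, threshold=2.5)
print(flagged)
```

In a small sample a single extreme value inflates the standard deviation and therefore lowers its own z-score, which is one reason the threshold is usually chosen empirically rather than fixed in advance.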

    Read more →
  • AI warfare

    AI warfare

    AI warfare refers to the use of artificial intelligence technologies to automate military operations and enhance or bypass human decision-making in armed conflicts. AI is used to rapidly analyze large volumes of military intelligence data, including making recommendations or decisions on who and what to target. Abdul-Rahman al-Rawi, a 20-year-old student, was the first acknowledged civilian killed by an AI-assisted airstrike, in a U.S. strike in Iraq in 2024. In 2026, the U.S. declared it would become an 'AI-first' warfighting force. Husain et al. (2018) coined the term hyperwar to refer to warfare which is algorithmic or controlled by artificial intelligence, with little to no human decision-making. == 2026 Iran war == The 2026 Iran war has been described as the "first AI war", although the United States and Israel have previously used AI to identify targets during the Gaza war. The U.S. has used AI tools to attack Iran. These tools have been used for military intelligence, targeting, and damage assessment in the war in Iran. Using the Maven smart system, the U.S. attacked 1,000 targets in the first 24 hours of the war and 5,000 targets over the course of 10 days. While the U.S. had used Maven in 2022 to share targeting information with Ukraine and in 2024 for strikes in Iraq and Syria and against the Houthis, the attacks on Iran are its largest use to date. Authorities are looking into whether artificial intelligence was involved in the airstrike on an Iranian girls' school that killed 170 civilians, the majority of whom were female students. The United States Central Command emphasized that humans were making final targeting decisions. Per a White House tally released on April 8, the U.S. military hit over 13,000 targets in Iran during the war's first 38 days, including more than 2,000 command-and-control sites, 1,500 air defense targets, and 1,450 industrial infrastructure targets. 
== Gaza war == As part of the Gaza war, the Israel Defense Forces (IDF) have used artificial intelligence to rapidly and automatically perform much of the process of determining what to bomb. The IDF's Unit 8200 developed AI systems, dubbed the Gospel and Lavender, to find targets for the Israeli Air Force to bomb. The Gospel automatically provides targeting recommendations to human analysts, who decide whether to approve strikes. Lavender identified 37,000 Hamas-linked individuals early in the war, and was used alongside the Gospel, which chooses buildings or structures as targets. According to a report by +972 Magazine and Local Call, strikes assisted by Lavender were routinely permitted to kill 5–20 civilians for each suspected Hamas militant, and the suspected militants were often bombed at home with their families. The IDF denies these claims, maintaining that every strike is assessed to minimize collateral damage, and that there is no policy "to kill tens of thousands of people in their homes." Israel deployed AI technologies during the Gaza war for audio analysis, facial recognition, and airstrike targeting. One such system was used to help identify the location of Hamas commander Ibrahim Biari through phone call analysis, leading to strikes that killed him as well as more than 125 civilians. == 2022 Russian Ukraine war == Kyiv launched a project with Palantir called Brave1 Dataroom to build AI systems using the extensive combat data Ukraine has gathered since Russia’s full-scale invasion in 2022. The country has also created tools for in-depth airstrike analysis, introduced AI to process large volumes of intelligence, and incorporated these technologies into the planning of long-range strike operations. == Involved companies == Maven Smart System is developed by Palantir. It integrates Anthropic's Claude as its large language model, and uses Amazon's AWS servers as its cloud infrastructure. 
Following Anthropic's refusal to support autonomous weapons development and domestic surveillance efforts, other AI firms, including OpenAI, have been brought in to take over that role. == Involved state actors == In 2024, the United States Department of Defense had 800-plus active AI-related projects and requested $1.8 billion in AI funding, with Project Maven and Project Artemis (AI-resistant drones developed together with Ukraine) being the main ones. The technology has been used in Iran, Iraq, Syria and Yemen to identify targets. China is pursuing intelligentized warfare, integrating AI across all combat domains (land, sea, air, space, and cyber), with military AI spending exceeding $1.6 billion annually. == International regulation == Since 2014, states meeting within the framework of the Convention on Certain Conventional Weapons have discussed lethal autonomous weapon systems. In 2016, the treaty's states parties established an open-ended Group of Governmental Experts on Lethal Autonomous Weapons Systems to continue those discussions. The discussions have addressed international humanitarian law, accountability, possible prohibitions and regulations, and the extent of human control required over AI-enabled weapons.

    Read more →
  • Leakage (machine learning)

    Leakage (machine learning)

    In statistics and machine learning, leakage (also known as data leakage or target leakage) refers to the use of information during model training that would not be available at prediction time. This results in overly optimistic performance estimates, as the model appears to perform better during evaluation than it actually would in a production environment. Leakage is often subtle and indirect, making it difficult to detect and eliminate. It can lead a statistician or modeler to select a suboptimal model, which may be outperformed by a leakage-free alternative. == Leakage modes == Leakage can occur at multiple stages of the machine learning workflow. Broadly, its sources can be divided into two categories: those arising from features and those arising from training examples. === Feature leakage === Feature or column-wise leakage is caused by the inclusion of columns which are one of the following: a duplicate label, a proxy for the label, or the label itself. These features, known as anachronisms, will not be available when the model is used for predictions, and result in leakage if included when the model is trained. For example, including a "MonthlySalary" column when predicting "YearlySalary"; or "MinutesLate" when predicting "IsLate". === Training example leakage === Row-wise leakage is caused by improper sharing of information between rows of data. 
Types of row-wise leakage include: (1) premature featurization, that is, fitting a transformation before the cross-validation or train/test split (MinMax scaling, n-gram vocabularies and similar transformations must be fit on the train split only and then applied to the test set); (2) duplicate rows between the train, validation and test sets (for example, oversampling a dataset to pad its size before splitting, different rotations or augmentations of a single image, bootstrap sampling before splitting, or duplicating rows to upsample the minority class); (3) data that are not independent and identically distributed (non-IID), such as time leakage (for example, splitting a time-series dataset randomly instead of placing the newer data in the test set with a chronological train/test split or rolling-origin cross-validation) and group leakage, where a grouping column is not used for the split (for example, Andrew Ng's group had 100k x-rays of 30k patients, meaning ~3 images per patient. The paper used random splitting instead of ensuring that all images of a patient were in the same split. Hence the model partially memorized the patients instead of learning to recognize pneumonia in chest x-rays). A 2023 review found data leakage to be "a widespread failure mode in machine-learning (ML)-based science", having affected at least 294 academic publications across 17 disciplines, and causing a potential reproducibility crisis. == Detection == Data leakage in machine learning can be detected through various methods, focusing on performance analysis, feature examination, data auditing, and model behavior analysis. Performance-wise, unusually high accuracy or significant discrepancies between training and test results often indicate leakage. Inconsistent cross-validation outcomes may also signal issues. Feature examination involves scrutinizing feature importance rankings and ensuring temporal integrity in time series data. A thorough audit of the data pipeline is crucial, reviewing pre-processing steps, feature engineering, and data splitting processes. 
Detecting duplicate entries across dataset splits is also important. For language models, the Min-K% method can detect the presence of data in a pretraining dataset. It takes a sentence suspected to be present in the pretraining dataset, computes the log-likelihood of each token, and then computes the average of the lowest K% of these. If this average exceeds a threshold, then the sentence is likely present. This method is improved by comparing against a baseline of the mean and variance. Analyzing model behavior can also reveal leakage. Models relying heavily on counter-intuitive features or showing unexpected prediction patterns warrant investigation. Performance degradation over time when tested on new data may suggest earlier inflated metrics due to leakage. Advanced techniques include backward feature elimination, where suspicious features are temporarily removed to observe performance changes. Using a separate hold-out dataset for final validation before deployment is advisable.
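The Min-K% scoring step described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes the per-token log-likelihoods have already been obtained from a language model, and the function names, the choice of K = 20, the threshold of -2.0 and the example values are all hypothetical. In practice the threshold would have to be tuned on text whose membership is known.

```python
import math


def min_k_percent_score(token_log_probs, k_percent=20.0):
    """Average the lowest k_percent of the per-token log-likelihoods."""
    if not token_log_probs:
        raise ValueError("need at least one token log-likelihood")
    # Keep at least one token so very short inputs still get a score.
    n_lowest = max(1, math.ceil(len(token_log_probs) * k_percent / 100.0))
    lowest = sorted(token_log_probs)[:n_lowest]
    return sum(lowest) / n_lowest


def likely_in_pretraining(token_log_probs, threshold, k_percent=20.0):
    """Flag a sentence when its Min-K% score exceeds the threshold."""
    return min_k_percent_score(token_log_probs, k_percent) > threshold


# Made-up log-likelihoods: the first sentence has no surprising tokens,
# the second contains several tokens the model found very unlikely.
familiar = [-0.1, -0.3, -0.2, -0.5, -0.4, -0.1, -0.2, -0.6, -0.3, -0.2]
unfamiliar = [-0.2, -4.0, -0.5, -6.0, -1.0, -0.3, -5.0, -0.8, -2.5, -0.4]

print(likely_in_pretraining(familiar, threshold=-2.0))
print(likely_in_pretraining(unfamiliar, threshold=-2.0))
```

The intuition is that a sentence seen during pretraining rarely contains tokens the model finds very surprising, so even its least likely tokens keep a relatively high log-likelihood.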

    Read more →