AI Art That Looks Real

AI Art That Looks Real — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Luminoso

    Luminoso

    Luminoso is a Cambridge, MA-based text analytics and artificial intelligence company. It spun out of the MIT Media Lab and its crowd-sourced Open Mind Common Sense (OMCS) project. The company has raised $20.6 million in financing, and its clients include Sony, Autodesk, Scotts Miracle-Gro, and GlaxoSmithKline. == History == Luminoso was co-founded in 2010 by Dennis Clark, Jason Alonso, Robyn Speer, and Catherine Havasi, a research scientist at MIT in artificial intelligence and computational linguistics. The company builds on the knowledge base of MIT’s Open Mind Common Sense (OMCS) project, co-founded in 1999 by Havasi, who continues to serve as its director. The OCMS knowledge base has since been combined with knowledge from other crowdsourced resources to become ConceptNet. ConceptNet consists of approximately 28 million statements in 304 languages, with full support for 10 languages and moderate support for 77 languages. ConceptNet is a resource for making an AI that understands the meanings of the words people use. During the World Cup in June 2014, the company provided a widely reported real-time sentiment analysis of the U.S. vs. Germany match, analyzing 900,000 posts on Twitter, Facebook and Google+. == Applications == The company uses artificial intelligence, natural language processing, and machine learning to derive insights from unstructured data such as contact center interactions, chatbot and live chat transcripts, product reviews, open-ended survey responses, and email. Luminoso's software identifies and quantifies patterns and relationships in text-based data, including domain-specific or creative language. Rather than human-powered keyword searches of data, the software automates taxonomy creation around concepts, allowing related words and phrases to be dynamically generated and tracked. Commercial applications include analyzing, prioritizing, and routing contact center interactions; identifying consumer complaints before they begin to trend; and tracking sentiment during product launches. The software natively analyzes text in fourteen languages, as well as emoji. == Products == Luminoso's technology can be accessed via two products: Luminoso Daylight and Luminoso Compass. Luminoso Daylight enables a deep-dive analysis into batch or real-time data, whereas Luminoso Compass automates the categorization of real-time data. Both products offer a user interface as well as an API. Luminoso's products can be implemented through either a cloud-based or an on-premise solution. == Research == Luminoso continues to actively conduct research in natural language processing and word embeddings and regularly participates in evaluations such as SemEval. At SemEval 2017, Luminoso participated in Task 2, measuring the semantic similarity of word pairs within and across five languages. Its solution outperformed all competing systems in every language pair tested, with the exception of Persian. == Recognition == Luminoso has been listed as a "Cool Vendor in AI for Marketing" by Gartner, and has also been named a "Boston Artificial Intelligence Startup to Watch" by BostInno. In May 2017, Luminoso was recognized as having the Best Application for AI in the Enterprise by AI Business, and was also shortlisted as the Best AI Breakthrough and Best Innovation in NLP. == Competitors == Major competitors include Clarabridge and Lexalytics. == Investors == The company raised $1.5 million from angel investors led by Basis Technology in 2012. Its first institutional funding round of $6.5 was completed in July 2014, led by Acadia Woods with participation from Japan’s Digital Garage. The company followed that with a $10M series B funding round in December 2018, led by DVI Equity Partners, with participation from Liberty Global Ventures, DF Enterprises, Raptor Holdco, Acadia Woods Partners, and Accord Ventures, among others.

    Read more →
  • Croissant (metadata format)

    Croissant (metadata format)

    Croissant is a metadata format design to support sharing of datasets for machine learning applications. It is a platform-agnostic schema used to standardize metadata in data repositories like Hugging Face, kaggle, Dataverse and OpenML. == Structure == Croissant builds upon schema.org, uses primarily JSON-LD, and divides metadata in four "layers": Dataset Metadata, Resource, Structure and Semantic: The Dataset Metadata layer constrains which schema.org properties should be used, including additional properties, linking together the resources (files) of the dataset with general metadata, like licensing and citation information. The Resource layer describes the individual files and sets of those using two new classes, FileObject and FileSet. A FileSet may be a collection of related images. The Structure layer specifies how the files are organized in the dataset. A RecordSet class describes how resources are present, configurations that may very a lot between modality. This specification facilitates interoperability of the datasets. Finally, the Semantic layer adds information for practical reuse of the dataset, such as splits for train, test and validation subsets. It also provides a default extension for metadata related to responsible AI. The use of a standard machine-readable structure increases, for example, the discoverability of datasets in search engines such as Google Dataset Search. == History == Croissant was shared in arXiv in March 2024 and published in the proceedings of NeurIPS 2024. It started as community driven as a MLCommons Croissant Working Group, including stakeholders organizations from academia and industry, including Google, the open data institute, Sage Bionetworks and King's College London. Variations of Croissant are developed to support datasets in different areas of research, such as Geo-Croissant for geospatial datasets. Other technical extensions, such as support for RDF, soon followed.

    Read more →
  • Commonsense knowledge (artificial intelligence)

    Commonsense knowledge (artificial intelligence)

    In artificial intelligence research, commonsense knowledge consists of facts about the everyday world, such as "Lemons are sour" or "Cows say moo", that all humans are expected to know. It is currently an unsolved problem in artificial general intelligence. The first AI program to address common sense knowledge was Advice Taker in 1959 by John McCarthy. Commonsense knowledge can underpin a commonsense reasoning process, to attempt inferences such as "You might bake a cake because you want people to eat the cake." A natural language processing process can be attached to the commonsense knowledge base to allow the knowledge base to attempt to answer questions about the world. Common sense knowledge also helps to solve problems in the face of incomplete information. Using widely held beliefs about everyday objects, or common sense knowledge, AI systems make common sense assumptions or default assumptions about the unknown similar to the way people do. In an AI system or in English, this is expressed as "Normally P holds", "Usually P" or "Typically P so Assume P". For example, if we know the fact "Tweety is a bird", because we know the commonly held belief about birds, "typically birds fly," without knowing anything else about Tweety, we may reasonably assume the fact that "Tweety can fly." As more knowledge of the world is discovered or learned over time, the AI system can revise its assumptions about Tweety using a truth maintenance process. If we later learn that "Tweety is a penguin" then truth maintenance revises this assumption because we also know "penguins do not fly". == Commonsense reasoning == Commonsense reasoning simulates the human ability to use commonsense knowledge to make presumptions about the type and essence of ordinary situations they encounter every day, and to change their "minds" should new information come to light. This includes time, missing or incomplete information and cause and effect. The ability to explain cause and effect is an important aspect of explainable AI. Truth maintenance algorithms automatically provide an explanation facility because they create elaborate records of presumptions. Compared with humans, all existing computer programs that attempt human-level AI perform extremely poorly on modern "commonsense reasoning" benchmark tests such as the Winograd Schema Challenge. The problem of attaining human-level competency at "commonsense knowledge" tasks is considered to probably be "AI complete" (that is, solving it would require the ability to synthesize a fully human-level intelligence), although some oppose this notion and believe compassionate intelligence is also required for human-level AI. Common sense reasoning has been applied successfully in more limited domains such as natural language processing and automated diagnosis or analysis. == Commonsense knowledge base construction == Compiling comprehensive knowledge bases of commonsense assertions (CSKBs) is a long-standing challenge in AI research. From early expert-driven efforts like CYC and WordNet, significant advances were achieved via the crowdsourced OpenMind Commonsense project, which led to the crowdsourced ConceptNet KB. Several approaches have attempted to automate CSKB construction, most notably, via text mining (WebChild, Quasimodo, TransOMCS, Ascent), as well as harvesting these directly from pre-trained language models (AutoTOMIC). These resources are significantly larger than ConceptNet, though the automated construction mostly makes them of moderately lower quality. Challenges also remain on the representation of commonsense knowledge: Most CSKB projects follow a triple data model, which is not necessarily best suited for breaking more complex natural language assertions. A notable exception here is GenericsKB, which applies no further normalization to sentences, but retains them in full. == Applications == Around 2013, MIT researchers developed BullySpace, an extension of the commonsense knowledgebase ConceptNet, to catch taunting social media comments. BullySpace included over 200 semantic assertions based around stereotypes, to help the system infer that comments like "Put on a wig and lipstick and be who you really are" are more likely to be an insult if directed at a boy than a girl. ConceptNet has also been used by chatbots and by computers that compose original fiction. At Lawrence Livermore National Laboratory, common sense knowledge was used in an intelligent software agent to detect violations of a comprehensive nuclear test ban treaty. == Data == As an example, as of 2012 ConceptNet includes these 21 language-independent relations: IsA (An "RV" is a "vehicle" | X is an instance of a Y) UsedFor (a "cake tin" is used for "making cakes" | X is used for the purpose Y) HasA (A "rabbit" has a "tail" | X possesses Y element or feature) CapableOf (a "cook" is capable of "making baked goods" | X is capable of doing Y) Desires (a "child" desires "the aroma of baking" | X has a desire for Y) CreatedBy ("cake" is created by a "baker" | X is created by Y) PartOf (a "knife" is be part of a "knife set" | X is a part of Y) Causes ("Heat" causes "cooking"| X is what causes Y) LocatedNear (the "oven" is located near the "refrigerator" | X is located near Y) AtLocation (Somewhere a "Cook" can be at a "restaurant" | X is at the location of Y) DefinedAs (a "Cupcake" is defined as a "cake" that also has the qualities of being "small", "baked within a wrapper", and "containing only one area of frosting or icing" | X is defined as Y that also has the properties A, B & C) SymbolOf (a "heart" is a symbol of "affection" | X is a symbolic representation of Y) ReceivesAction ("cake" can receive the action of being "eaten" | X is capable of receiving action Y) HasPrerequisite ("baking" has the prerequisite of obtaining the "ingredients" | X cannot do Y unless A does B) MotivatedByGoal ("baking" is motivated by the goal of "consumption"/"eating" | X has the motivation of Y goal) CausesDesire ("baking" makesYou want to "follow recipe" | X causes the desire to do Y) MadeOf ("Cake" is made of "flour"/"eggs"/"sugar"/"oil"/etc | X is made of Y) HasFirstSubevent ("baking" has first subevent "make batter" | To do X the first thing that needs to be done is Y) HasSubevent ("eat" has subevent "swallow" | Doing X will lead to Y event following) HasLastSubevent ("sleeping" has last subevent of "waking" | Doing X ends with the event Y) == Commonsense knowledge bases == Cyc Open Mind Common Sense (data source) and ConceptNet (datastore and NLP engine) Evi Graphiq

    Read more →
  • Embodied cognitive science

    Embodied cognitive science

    Embodied cognitive science is an interdisciplinary field of research, the aim of which is to explain the mechanisms underlying intelligent behavior. It comprises three main methodologies: the modeling of psychological and biological systems in a holistic manner that considers the mind and body as a single entity; the formation of a common set of general principles of intelligent behavior; and the experimental use of robotic agents in controlled environments. == Contributors == Embodied cognitive science borrows heavily from embodied philosophy and the related research fields of cognitive science, psychology, neuroscience and artificial intelligence. Contributors to the field include: From the perspective of neuroscience, Gerald Edelman of the Neurosciences Institute at La Jolla, Francisco Varela of CNRS in France, and J. A. Scott Kelso of Florida Atlantic University From the perspective of psychology, Lawrence Barsalou, Michael Turvey, Vittorio Guidano and Eleanor Rosch From the perspective of linguistics, Gilles Fauconnier, George Lakoff, Mark Johnson, Leonard Talmy and Mark Turner From the perspective of language acquisition, Eric Lenneberg and Philip Rubin at Haskins Laboratories From the perspective of anthropology, Edwin Hutchins, Bradd Shore, James Wertsch and Merlin Donald. From the perspective of autonomous agent design, early work is sometimes attributed to Rodney Brooks or Valentino Braitenberg From the perspective of artificial intelligence, Understanding Intelligence by Rolf Pfeifer and Christian Scheier or How the Body Shapes the Way We Think, by Rolf Pfeifer and Josh C. Bongard From the perspective of philosophy, Andy Clark, Dan Zahavi, Shaun Gallagher, and Evan Thompson In 1950, Alan Turing proposed that a machine may need a human-like body to think and speak: It can also be maintained that it is best to provide the machine with the best sense organs that money can buy, and then teach it to understand and speak English. That process could follow the normal teaching of a child. Things would be pointed out and named, etc. Again, I do not know what the right answer is, but I think both approaches should be tried. == Traditional cognitive theory == Embodied cognitive science is an alternative theory to cognition in which it minimizes appeals to computational theory of mind in favor of greater emphasis on how an organism's body determines how and what it thinks. Traditional cognitive theory is based mainly around symbol manipulation, in which certain inputs are fed into a processing unit that produces an output. These inputs follow certain rules of syntax, from which the processing unit finds semantic meaning. Thus, an appropriate output is produced. For example, a human's sensory organs are its input devices, and the stimuli obtained from the external environment are fed into the nervous system which serves as the processing unit. From here, the nervous system is able to read the sensory information because it follows a syntactic structure, thus an output is created. This output then creates bodily motions and brings forth behavior and cognition. Of particular note is that cognition is sealed away in the brain, meaning that mental cognition is cut off from the external world and is only possible by the input of sensory information. == The embodied cognitive approach == Embodied cognitive science differs from the traditionalist approach in that it denies the input-output system. This is chiefly due to the problems presented by the Homunculus argument, which concluded that semantic meaning could not be derived from symbols without some kind of inner interpretation. If some little man in a person's head interpreted incoming symbols, then who would interpret the little man's inputs? Because of the specter of an infinite regress, the traditionalist model began to seem less plausible. Thus, embodied cognitive science aims to avoid this problem by defining cognition in three ways. === Physical attributes of the body === The first aspect of embodied cognition examines the role of the physical body, particularly how its properties affect its ability to think. This part attempts to overcome the symbol manipulation component that is a feature of the traditionalist model. Depth perception, for instance, can be better explained under the embodied approach due to the sheer complexity of the action. Depth perception requires that the brain detect the disparate retinal images obtained by the distance of the two eyes. In addition, body and head cues complicate this further. When the head is turned in a given direction, objects in the foreground will appear to move against objects in the background. From this, it is said that some kind of visual processing is occurring without the need of any kind of symbol manipulation. This is because the objects appearing to move the foreground are simply appearing to move. This observation concludes then that depth can be perceived with no intermediate symbol manipulation necessary. A more poignant example exists through examining auditory perception. Generally speaking the greater the distance between the ears, the greater the possible auditory acuity. Also relevant is the amount of density in between the ears, for the strength of the frequency wave alters as it passes through a given medium. The brain's auditory system takes these factors into account as it process information, but again without any need for a symbolic manipulation system. This is because the distance between the ears for example does not need symbols to represent it. The distance itself creates the necessary opportunity for greater auditory acuity. The amount of density between the ears is similar, in that it is the actual amount itself that simply forms the opportunity for frequency alteration. Thus under consideration of the physical properties of the body, a symbolic system is unnecessary and an unhelpful metaphor. === The body's role in the cognitive process === The second aspect draws heavily from George Lakoff's and Mark Johnson's work on concepts. They argued that humans use metaphors whenever possible to better explain their external world. Humans also have a basic stock of concepts in which other concepts can be derived from. These basic concepts include spatial orientations such as up, down, front, and back. Humans can understand what these concepts mean because they can directly experience them from their own bodies. For example, because human movement revolves around standing erect and moving the body in an up-down motion, humans innately have these concepts of up and down. Lakoff and Johnson contend this is similar with other spatial orientations such as front and back too. As mentioned earlier, these basic stocks of spatial concepts are the basis in which other concepts are constructed. Happy and sad for instance are seen now as being up or down respectively. When someone says they are feeling down, what they are really saying is that they feel sad for example. Thus the point here is that true understanding of these concepts is contingent on whether one can have an understanding of the human body. So the argument goes that if one lacked a human body, they could not possibly know what up or down could mean, or how it could relate to emotional states. [I]magine a spherical being living outside of any gravitational field, with no knowledge or imagination of any other kind of experience. What could UP possibly mean to such a being? While this does not mean that such beings would be incapable of expressing emotions in other words, it does mean that they would express emotions differently from humans. Human concepts of happiness and sadness would be different because human would have different bodies. So then an organism's body directly affects how it can think, because it uses metaphors related to its body as the basis of concepts. === Interaction of local environment === A third component of the embodied approach looks at how agents use their immediate environment in cognitive processing. Meaning, the local environment is seen as an actual extension of the body's cognitive process. The example of a personal digital assistant (PDA) is used to better imagine this. Echoing functionalism (philosophy of mind), this point claims that mental states are individuated by their role in a much larger system. So under this premise, the information on a PDA is similar to the information stored in the brain. So then if one thinks information in the brain constitutes mental states, then it must follow that information in the PDA is a cognitive state too. Consider also the role of pen and paper in a complex multiplication problem. The pen and paper are so involved in the cognitive process of solving the problem that it seems ridiculous to say they are somehow different from the process, in very much the same way the PDA is used for information like the brain. Another example examines how humans control and manipulate their environment

    Read more →
  • Threat actor

    Threat actor

    In cybersecurity and risk assessment, a threat actor (or threat agents, attackers, or adversaries) is a person, group, organisation, state, or other entity with the ability to cause, carry, transmit, support, or exploit a threat. Threat actors are commonly analysed according to their motivations, resources, technical capability, access to systems, relationship to a target, and degree of connection to state authority. They may exploit vulnerabilities, conduct social engineering, steal or monetise data, disrupt operations, or support other actors who carry out such activity. Because the term covers a wide range of actors, researchers and security organisations use taxonomies that distinguish between groups such as cybercriminals, state-linked actors, ideologically motivated actors, thrill seekers or trolls, insiders, and competitors. Threat actor classifications are used in risk management, cyber threat intelligence, and incident response to connect observed behaviour with possible objectives and likely future activity. The categories are not always mutually exclusive: the same actor may combine criminal, ideological, commercial, or state-linked motivations, and different organisations may use different names for similar actors. == Risk assessment and security management == In risk assessment, threat actor analysis is used to identify who or what may create, carry, transmit, support, or exploit a threat, and how that actor relates to the system being assessed. Rausand and Haugen classify threat actors by their relationship to the system, distinguishing between internal and external actors, and by intent, distinguishing between intentional and unintentional actors. Threat actor classification may also support incident investigation. Rogers argued that actor categories could be inferred from observable case points, such as tools used, messages left, data targeted, forensic knowledge, and the degree of damage, allowing investigators to assess likely motivation and skill level. Later work similarly linked actor classification to operational analysis. Chng, Lu, Kumar and Yau proposed a framework connecting hacker types, motivations and typical strategies, arguing that observed behaviour before or during an attack can help analysts infer the likely type of actor involved. At the strategic level, actor analysis may consider an actor's resources, capabilities, degree of state involvement, motivations and objectives. == Landscape == The United Nations Institute for Disarmament Research has described the contemporary cyberthreat landscape as involving an increasingly diverse and interconnected set of actors, including state-led operations, cybercriminal syndicates, ideological hacktivists, commercial cyber mercenaries, private companies and civilian volunteers. Its 2026 report argued that these actors vary in resources, technical sophistication and relationships with states, making it traditional distinctions between state, civilian combatant roles, and legitimate and illegitimate conduct harder to apply. == Academic taxonomies == Early taxonomies classified hackers by activity, skill, motivation, or criminal profile. Landreth proposed six categories based on activity: novice, student, tourist, crasher, and thief. Hollinger classified computer misuse into pirates, browsers, and crackers, describing a progression from less-skilled activity to more technically serious offences. Chantler used attributes including activity, skill, knowledge, motivation, and duration of involvement to distinguish between an elite group, neophytes, and "losers and lamers". Parker proposed seven profiles of cybercriminals: pranksters, hacksters, malicious hackers, personal problem solvers, career criminals, extreme advocates, and malcontents, addicts, and irrational or incompetent people. In 2000, Marc Rogers proposed a taxonomy of hackers with seven, non-mutually-exclusive categories: newbie/tool kit users, cyber-punks, internals, coders, old guard hackers, professional criminals, and cyber-terrorists. Rausand and Haugen distinguish between internal and external threat actors, and between intentional and unintentional threat actors. Internal actors have some relationship with, access to, or position inside the system or organisation, while external actors operate from outside it. Intentional actors seek to create, exploit, or support a threat event, whereas unintentional actors may cause or enable a threat event through error, negligence, accident, or lack of awareness. Rogers later revised his hacker taxonomy into Novices, Cyber-punks, Internals, Petty Thieves, Virus Writers, Old Guard hackers, Professional Criminals, Information Warriors, and, more tentatively, Political Activists. In the model, motivation is grouped into four broad domains: curiosity, notoriety, revenge, and financial gain. A 2022 review by Chng, Lu, Kumar and Yau examined 11 hacker typologies published over three decades and proposed a unified framework linking hacker types, motivations, and strategies. The framework identified 13 hacker types and seven motivations, and argued that observed strategies during an attack can help analysts infer the likely type of actor involved. == Government taxonomies == Taxonomies of threat actors by governments are much more likely to include state-level threat actors. In the United States the National Institute of Standards and Technology (NIST) uses the term threat source in its risk-assessment guidance: organisations are directed to identify and characterise threat sources of concern, including capability, intent and targeting for adversarial threat sources, and the range of effects for non-adversarial threat sources. NIST treats threat-source identification as part of the risk-assessment process, alongside identifying threat events, vulnerabilities, likelihood and impact. In the EU, European Union Agency for Cybersecurity publishes the annual ENISA Threat Landscape, which analyses cyber incidents and adversary behaviour affecting the European Union. The 2025 report analysed selected incidents from the previous year and grouped activity around cybercrime, state-aligned activity, foreign information manipulation and interference, and hacktivism. In ENISA's 2025 analysis, hacktivist activity dominated reporting, representing almost 80% of recorded incidents and consisting mainly of low-level distributed denial-of-service operations. ENISA also reported increasing convergence between hacktivism, cybercrime and state-nexus activity, including state-aligned use of hacktivist personas, hacktivist adoption of ransomware, and false-flag or impersonation activity. At the UN level, A 2026 report by the United Nations Institute for Disarmament Research described the cyberthreat landscape as involving state-led operations, cybercriminal syndicates, ideological hacktivists, commercial cyber mercenaries, and civilian volunteers, with actors varying in resources, technical sophistication, and links to states. Canada defines threat actors as states, groups, or individuals who aim to cause harm by exploiting a vulnerability with malicious intent. A threat actor must be trying to gain access to information systems to access or alter data, devices, systems, or networks. The Japanese government's National Centre of Incident Readiness and Strategy (NISC) was established in 2015 to create a "free, fair and secure cyberspace" in Japan. The NICS created a cybersecurity strategy in 2018 that outlines nation-states and cybercrime to be some of the most key threats. It also indicates that terrorist usage of the cyberspace needs to be monitored and understood. The Security Council of the Russian Federation published the cyber security strategy doctrine in 2016. This strategy highlights the following threat actors as a risk to cyber security measures: nation-state actors, cyber criminals, and terrorists. == Techniques == Threat actors use techniques like Social engineering (security), and Phishing, alongside technical exploits like Cross-site scripting, SQL injection, and denial-of-service attacks. == Limitations == In practice, actor categories may overlap (Edward Snowden for example), and the same activity may combine features associated with hacktivism, cybercrime and state-linked operations. The lines between hacktivism, cybercrime and state-nexus activity had continued to blur, with shared toolsets, overlapping methods, fake personas, hacktivist adoption of ransomware, and cybercriminal or state-linked actors masquerading as other groups. Threat actor analysis also has limits as a risk-management method. NIST notes that risk assessments depend on their purpose, scope, assumptions, constraints, information sources, risk model and analytic approach, and that assessments are tied to particular time frames and organisational contexts. NIST also warns that simple threat-vulnerability pairing may be undesirable or problematic where there are many threats and vulnerabilities, and recom

    Read more →
  • AI literacy

    AI literacy

    AI literacy or artificial intelligence literacy is "a set of competencies that enables individuals to critically evaluate AI technologies; communicate and collaborate effectively with AI; and use AI as a tool online, at home, and in the workplace." AI is employed in a variety of applications, including self-driving automobiles, virtual assistants and text generation by generative AI models. Users of these tools should be able to make informed decisions. AI literacy may have an impact on students' future employment prospects. With the rise of generative AI platforms, AI literacy has become a topic of conversation in the field of education. Some think AI literacy is essential for school and college students, while others restrict or prohibit the use of AI in assignments, viewing it as a form of academic dishonesty. However, many researchers and educational institutions promote a more nuanced approach, encouraging critical engagement with AI while developing policies that balance academic integrity with opportunities for learning. == Definitions == Other definitions of AI literacy include the ability to understand, use, monitor, and critically reflect on AI applications. That use of the term usually refers to teaching skills and knowledge to the general public, particularly those who are not adept in AI and the ability to understand, use, evaluate, and ethically navigate AI. As research into AI literacy is still emerging and focused on developing context-specific skills, there is not yet a single, broadly agreed-upon definition. AI literacy is linked to other forms of literacy. AI literacy requires digital literacy, whereas scientific and computational literacy may inform it. Data literacy also significantly overlaps with it. == Categories == AI literacy encompasses multiple categories, including a theoretical understanding of how artificial intelligence works, the usage of artificial intelligence technologies, and the critical appraisal of artificial intelligence, and its ethics. === Know and understand AI === Knowledge and understanding of AI refers to a basic understanding of what artificial intelligence is and how it works. This includes familiarity with machine learning algorithms and the limitations and biases present in AI systems. Users who know and understand AI should be familiar with various technologies that use artificial intelligence, including cognitive systems, robotics and machine learning. This includes recognizing that large language models (LLMs) are machine learning models trained on extensive datasets which generate new text rather than retrieving pre-written responses. === Use and apply AI === Using and applying AI refers to the ability to use AI tools to solve problems and perform tasks such as programming and analyzing big data. Some consider prompt engineering, the practice of designing effective prompts to guide generative AI platforms more effectively, as another competency within AI literacy. === Evaluate and create AI === Evaluation and creation refers to the ability to critically evaluate the quality and reliability of AI systems. It also refers to designing and building fair and ethical AI systems. To evaluate correctly, users should also learn in which areas AI is strong, and in which areas it is weak. === AI ethics === AI ethics refers to understanding the moral implications of AI, and the making informed decisions regarding the use of AI tools. This area includes considerations such as: Accountability: Hold AI actors accountable for the operation of AI systems and adherence to ethical ideals. Accuracy: Identify and report sources of error and uncertainty in algorithms and data. Auditability: Enable other parties to audit and assess algorithm behavior via transparent information sharing. Explainability: Make sure that algorithmic judgments and the underlying data can be presented in simple language. Fairness: Prevent biases and consider varied viewpoints. To do so, increase the diversity of researchers in the field. Human Centricity and Well-being: Prioritize human well-being in AI development and deployment. Human rights Alignment: Ensure that technology do not infringe internationally recognized human rights. Inclusivity: Make AI accessible to everyone. Progress: Choose high value initiatives. Responsibility, accountability, and transparency: Foster trust via responsibility, accountability, and fairness. Robustness and Security: Make AI systems safe, secure, and resistant to manipulation or data breach. Sustainability: Choose implementations that generate long-term, useful benefits. Environmental Implications: How this tool impacts the environment, any restrictions or laws, if this impact is worth the effects or not. === Enabling AI === Support AI by developing associated knowledge and skills such as programming and statistics. == Promoting AI literacy == Several governments have recognized the need to promote AI literacy, including among adults. Such programs have been published in the United States, China, Germany and Finland. Programs intended for the general public usually consist of short and easy to understand online study units. Programs intended for children are usually project-based. Programs for students at colleges and universities often address the specific professional needs of the student, depending on their field of study. Beyond the education system, AI literacy can also be developed in the community, for example in museums. === Schools === Schools use diverse pedagogies to promote AI literacy. These include: Performing a Turing test with an intelligent agent Creating chatbots Building apps using Blockly-based programming Project-based learning Building robots Data visualization Training AI models Artificial intelligence curricula can improve students' understanding of topics such as machine learning, neural networks, and deep learning. === Higher education === Before the second decade of the 21st century, artificial intelligence was studied mainly in STEM courses. Later, projects emerged to increase artificial intelligence education, specifically to promote AI literacy. Most courses start with one or more study units that deal with basic questions such as what artificial intelligence is, where it comes from, what it can do and what it can't do. Most courses also refer to machine learning and deep learning. Some of the courses deal with moral issues in artificial intelligence. In Ireland, the Higher Education Authority published Generative AI in Higher Education Teaching & Learning: Policy Framework in December 2025, which encouraged higher education institutions to embed AI literacy across programmes as a core graduate attribute. ==== Disciplinary policy ==== As a response to the increase of generative AI use in education, several disciplines formed committees or task forces to examine context-specific approaches toward AI literacy. In spring 2025, the Modern Language Association and Conference on College Composition and Communication Joint Task Force finished development of three working papers, a guide on AI literacy for students, and a collection of resources addressing AI use in writing. The task force emphasized the need for "a culture of critical AI literacy" and included guidelines not only for students but also educators and institutions, highlighting the need for modeling ethical AI use in planning processes. Similarly, a committee formed by the American Historical Association Council published "Guiding Principles for Artificial Intelligence in History Education" which encouraged "clear and transparent engagement with generative AI." The guidelines demonstrate the value of criticality when working with generative AI in thinking and research.

    Read more →
  • Random feature

    Random feature

    Random features (RF) are a technique used in machine learning to approximate kernel methods, introduced by Ali Rahimi and Ben Recht in their 2007 paper "Random Features for Large-Scale Kernel Machines", and extended by. RF uses a Monte Carlo approximation to kernel functions by randomly sampled feature maps. It is used for datasets that are too large for traditional kernel methods like support vector machine, kernel ridge regression, and gaussian process. == Mathematics == === Kernel method === Given a feature map ϕ : R d → V {\textstyle \phi :\mathbb {R} ^{d}\to V} , where V {\textstyle V} is a Hilbert space (more specifically, a reproducing kernel Hilbert space), the kernel trick replaces inner products in feature space ⟨ ϕ ( x i ) , ϕ ( x j ) ⟩ V {\displaystyle \langle \phi (x_{i}),\phi (x_{j})\rangle _{V}} by a kernel function k ( x i , x j ) : R d × R d → R {\displaystyle k(x_{i},x_{j}):\mathbb {R} ^{d}\times \mathbb {R} ^{d}\to \mathbb {R} } Kernel methods replaces linear operations in high-dimensional space by operations on the kernel matrix: K X := [ k ( x i , x j ) ] i , j ∈ 1 : N {\displaystyle K_{X}:=[k(x_{i},x_{j})]_{i,j\in 1:N}} where N {\textstyle N} is the number of data points. === Random kernel method === The problem with kernel methods is that the kernel matrix K X {\textstyle K_{X}} has size N × N {\textstyle N\times N} . This becomes computationally infeasible when N {\textstyle N} reaches the order of a million. The random kernel method replaces the kernel function k {\textstyle k} by an inner product in low-dimensional feature space R D {\textstyle \mathbb {R} ^{D}} : k ( x , y ) ≈ ⟨ z ( x ) , z ( y ) ⟩ {\displaystyle k(x,y)\approx \langle z(x),z(y)\rangle } where z {\textstyle z} is a randomly sampled feature map z : R d → R D {\textstyle z:\mathbb {R} ^{d}\to \mathbb {R} ^{D}} . This converts kernel linear regression into linear regression in feature space, kernel SVM into SVM in feature space, etc. Since we have K X ≈ Z X T Z X {\displaystyle K_{X}\approx Z_{X}^{T}Z_{X}} where Z X = [ z ( x 1 ) , … , z ( x N ) ] {\displaystyle Z_{X}=[z(x_{1}),\dots ,z(x_{N})]} , these methods no longer involve matrices of size O ( N 2 ) {\textstyle O(N^{2})} , but only random feature matrices of size O ( D N ) {\textstyle O(DN)} . == Random Fourier feature == === Radial basis function kernel === The radial basis function (RBF) kernel on two samples x i , x j ∈ R d {\displaystyle x_{i},x_{j}\in \mathbb {R} ^{d}} is defined as k ( x i , x j ) = exp ⁡ ( − ‖ x i − x j ‖ 2 2 σ 2 ) {\displaystyle k(x_{i},x_{j})=\exp \left(-{\frac {\|x_{i}-x_{j}\|^{2}}{2\sigma ^{2}}}\right)} where ‖ x i − x j ‖ 2 {\displaystyle \|x_{i}-x_{j}\|^{2}} is the squared Euclidean distance and σ {\displaystyle \sigma } is a free parameter defining the shape of the kernel. It can be approximated by a random Fourier feature map z : R d → R 2 D {\displaystyle z:\mathbb {R} ^{d}\to \mathbb {R} ^{2D}} : z ( x ) := 1 D [ cos ⁡ ⟨ ω 1 , x ⟩ , sin ⁡ ⟨ ω 1 , x ⟩ , … , cos ⁡ ⟨ ω D , x ⟩ , sin ⁡ ⟨ ω D , x ⟩ ] T {\displaystyle z(x):={\frac {1}{\sqrt {D}}}[\cos \langle \omega _{1},x\rangle ,\sin \langle \omega _{1},x\rangle ,\ldots ,\cos \langle \omega _{D},x\rangle ,\sin \langle \omega _{D},x\rangle ]^{T}} where ω 1 , . . . , ω D {\displaystyle \omega _{1},...,\omega _{D}} are IID samples from the multidimensional normal distribution N ( 0 , σ − 2 I ) {\displaystyle N(0,\sigma ^{-2}I)} . Since cos , sin {\displaystyle \cos ,\sin } are bounded, there is a stronger convergence guarantee by Hoeffding's inequality. === Random Fourier features === By Bochner's theorem, the above construction can be generalized to arbitrary positive definite shift-invariant kernel k ( x , y ) = k ( x − y ) {\displaystyle k(x,y)=k(x-y)} . Define its Fourier transform p ( ω ) = 1 2 π ∫ R d e − j ⟨ ω , Δ ⟩ k ( Δ ) d Δ {\displaystyle p(\omega )={\frac {1}{2\pi }}\int _{\mathbb {R} ^{d}}e^{-j\langle \omega ,\Delta \rangle }k(\Delta )d\Delta } then ω 1 , . . . , ω D {\displaystyle \omega _{1},...,\omega _{D}} are sampled IID from the probability distribution with probability density p {\displaystyle p} . This applies for other kernels like the Laplace kernel and the Cauchy kernel. === Neural network interpretation === Given a random Fourier feature map z {\displaystyle z} , training the feature on a dataset by featurized linear regression is equivalent to fitting complex parameters θ 1 , … , θ D ∈ C {\displaystyle \theta _{1},\dots ,\theta _{D}\in \mathbb {C} } such that f θ ( x ) = R e ( ∑ k θ k e i ⟨ ω k , x ⟩ ) {\displaystyle f_{\theta }(x)=\mathrm {Re} \left(\sum _{k}\theta _{k}e^{i\langle \omega _{k},x\rangle }\right)} which is a neural network with a single hidden layer, with activation function t ↦ e i t {\displaystyle t\mapsto e^{it}} , zero bias, and the parameters in the first layer frozen. In the overparameterized case, when 2 D ≥ N {\displaystyle 2D\geq N} , the network linearly interpolates the dataset { ( x i , y i ) } i ∈ 1 : N {\displaystyle \{(x_{i},y_{i})\}_{i\in 1:N}} , and the network parameters is the least-norm solution: θ ^ = arg ⁡ min θ ∈ C D , f θ ( x k ) = y k ∀ k ∈ 1 : N ‖ θ ‖ {\displaystyle {\hat {\theta }}=\arg \min _{\theta \in \mathbb {C} ^{D},f_{\theta }(x_{k})=y_{k}\forall k\in 1:N}\|\theta \|} At the limit of D → ∞ {\displaystyle D\to \infty } , the L2 norm ‖ θ ^ ‖ → ‖ f K ‖ H {\displaystyle \|{\hat {\theta }}\|\to \|f_{K}\|_{H}} where f K {\displaystyle f_{K}} is the interpolating function obtained by the kernel regression with the original kernel, and ‖ ⋅ ‖ H {\displaystyle \|\cdot \|_{H}} is the norm in the reproducing kernel Hilbert space for the kernel. == Other examples == === Random binning features === A random binning features map partitions the input space using randomly shifted grids at randomly chosen resolutions and assigns to an input point a binary bit string that corresponds to the bins in which it falls. The grids are constructed so that the probability that two points x i , x j ∈ R d {\displaystyle x_{i},x_{j}\in \mathbb {R} ^{d}} are assigned to the same bin is proportional to K ( x i , x j ) {\displaystyle K(x_{i},x_{j})} . The inner product between a pair of transformed points is proportional to the number of times the two points are binned together, and is therefore an unbiased estimate of K ( x i , x j ) {\displaystyle K(x_{i},x_{j})} . Since this mapping is not smooth and uses the proximity between input points, Random Binning Features works well for approximating kernels that depend only on the L 1 {\displaystyle L_{1}} distance between datapoints. === Orthogonal random features === Orthogonal random features uses a random orthogonal matrix instead of a random Fourier matrix. == Historical context == In NIPS 2006, deep learning had just become competitive with linear models like PCA and linear SVMs for large datasets, and people speculated about whether it could compete with kernel SVMs. However, there was no way to train kernel SVM on large datasets. The two authors developed the random feature method to train those. It was then found that the O ( 1 / D ) {\displaystyle O(1/D)} variance bound did not match practice: the variance bound predicts that approximation to within 0.01 {\displaystyle 0.01} requires D ∼ 10 4 {\displaystyle D\sim 10^{4}} , but in practice required only ∼ 10 2 {\displaystyle \sim 10^{2}} . Attempting to discover what caused this led to the subsequent two papers.

    Read more →
  • Learning curve (machine learning)

    Learning curve (machine learning)

    In machine learning (ML), a learning curve (or training curve) is a graphical representation that shows how a model's performance on a training set (and usually a validation set) changes with the number of training iterations (epochs) or the amount of training data. Typically, the number of training epochs or training set size is plotted on the x-axis, and the value of the loss function (and possibly some other metric such as the cross-validation score) on the y-axis. Synonyms include error curve, experience curve, improvement curve and generalization curve. More abstractly, learning curves plot the difference between learning effort and predictive performance, where "learning effort" usually means the number of training samples, and "predictive performance" means accuracy on testing samples. Learning curves have many useful purposes in ML, including: choosing model parameters during design, adjusting optimization to improve convergence, and diagnosing problems such as overfitting (or underfitting). Learning curves can also be tools for determining how much a model benefits from adding more training data, and whether the model suffers more from a variance error or a bias error. If both the validation score and the training score converge to a certain value, then the model will no longer significantly benefit from more training data. == Formal definition == When creating a function to approximate the distribution of some data, it is necessary to define a loss function L ( f θ ( X ) , Y ) {\displaystyle L(f_{\theta }(X),Y)} to measure how good the model output is (e.g., accuracy for classification tasks or mean squared error for regression). We then define an optimization process which finds model parameters θ {\displaystyle \theta } such that L ( f θ ( X ) , Y ) {\displaystyle L(f_{\theta }(X),Y)} is minimized, referred to as θ ∗ {\displaystyle \theta ^{}} . === Training curve for amount of data === If the training data is { x 1 , x 2 , … , x n } , { y 1 , y 2 , … y n } {\displaystyle \{x_{1},x_{2},\dots ,x_{n}\},\{y_{1},y_{2},\dots y_{n}\}} and the validation data is { x 1 ′ , x 2 ′ , … x m ′ } , { y 1 ′ , y 2 ′ , … y m ′ } {\displaystyle \{x_{1}',x_{2}',\dots x_{m}'\},\{y_{1}',y_{2}',\dots y_{m}'\}} , a learning curve is the plot of the two curves i ↦ L ( f θ ∗ ( X i , Y i ) ( X i ) , Y i ) {\displaystyle i\mapsto L(f_{\theta ^{}(X_{i},Y_{i})}(X_{i}),Y_{i})} i ↦ L ( f θ ∗ ( X i , Y i ) ( X i ′ ) , Y i ′ ) {\displaystyle i\mapsto L(f_{\theta ^{}(X_{i},Y_{i})}(X_{i}'),Y_{i}')} where X i = { x 1 , x 2 , … x i } {\displaystyle X_{i}=\{x_{1},x_{2},\dots x_{i}\}} === Training curve for number of iterations === Many optimization algorithms are iterative, repeating the same step (such as backpropagation) until the process converges to an optimal value. Gradient descent is one such algorithm. If θ i ∗ {\displaystyle \theta _{i}^{}} is the approximation of the optimal θ {\displaystyle \theta } after i {\displaystyle i} steps, a learning curve is the plot of i ↦ L ( f θ i ∗ ( X , Y ) ( X ) , Y ) {\displaystyle i\mapsto L(f_{\theta _{i}^{}(X,Y)}(X),Y)} i ↦ L ( f θ i ∗ ( X , Y ) ( X ′ ) , Y ′ ) {\displaystyle i\mapsto L(f_{\theta _{i}^{}(X,Y)}(X'),Y')}

    Read more →
  • Cross-language information retrieval

    Cross-language information retrieval

    Cross-language information retrieval (CLIR) is a subfield of information retrieval dealing with retrieving information written in a language different from the language of the user's query. The term "cross-language information retrieval" has many synonyms, of which the following are perhaps the most frequent: cross-lingual information retrieval, translingual information retrieval, multilingual information retrieval. The term "multilingual information retrieval" refers more generally both to technology for retrieval of multilingual collections and to technology which has been moved to handle material in one language to another. The term Multilingual Information Retrieval (MLIR) involves the study of systems that accept queries for information in various languages and return objects (text, and other media) of various languages, translated into the user's language. Cross-language information retrieval refers more specifically to the use case where users formulate their information need in one language and the system retrieves relevant documents in another. To do so, most CLIR systems use various translation techniques. CLIR techniques can be classified into different categories based on different translation resources: Dictionary-based CLIR techniques Parallel corpora based CLIR techniques Comparable corpora based CLIR techniques Machine translator based CLIR techniques CLIR systems have improved so much that the most accurate multi-lingual and cross-lingual adhoc information retrieval systems today are nearly as effective as monolingual systems. Other related information access tasks, such as media monitoring, information filtering and routing, sentiment analysis, and information extraction require more sophisticated models and typically more processing and analysis of the information items of interest. Much of that processing needs to be aware of the specifics of the target languages it is deployed in. Mostly, the various mechanisms of variation in human language pose coverage challenges for information retrieval systems: texts in a collection may treat a topic of interest but use terms or expressions which do not match the expression of information need given by the user. This can be true even in a mono-lingual case, but this is especially true in cross-lingual information retrieval, where users may know the target language only to some extent. The benefits of CLIR technology for users with poor to moderate competence in the target language has been found to be greater than for those who are fluent. Specific technologies in place for CLIR services include morphological analysis to handle inflection, decompounding or compound splitting to handle compound terms, and translations mechanisms to translate a query from one language to another. The first workshop on CLIR was held in Zürich during the SIGIR-96 conference. Workshops have been held yearly since 2000 at the meetings of the Cross Language Evaluation Forum (CLEF). Researchers also convene at the annual Text Retrieval Conference (TREC) to discuss their findings regarding different systems and methods of information retrieval, and the conference has served as a point of reference for the CLIR subfield. Early CLIR experiments were conducted at TREC-6, held at the National Institute of Standards and Technology (NIST) on November 19–21, 1997. Google Search had a cross-language search feature that was removed in 2013.

    Read more →
  • Spatial embedding

    Spatial embedding

    Spatial embedding is one of feature learning techniques used in spatial analysis where points, lines, polygons or other spatial data types. representing geographic locations are mapped to vectors of real numbers. Conceptually it involves a mathematical embedding from a space with many dimensions per geographic object to a continuous vector space with a much lower dimension. Such embedding methods allow complex spatial data to be used in neural networks and have been shown to improve performance in spatial analysis tasks == Embedded data types == Geographic data can take many forms: text, images, graphs, trajectories, polygons. Depending on the task, there may be a need to combine multimodal data from different sources. The next section describes examples of different types of data and their uses. === Text === Geolocated posts on social media can be used to acquire a library of documents bound to a given place that can be later transformed to embedded vectors using word embedding techniques. === Image === Satellites and aircraft collect digital spatial data acquired from remotely sensed images which can be used in machine learning. They are sometimes hard to analyse using basic image analysis methods and convolutional neural networks can be used to acquire an embedding of images bound to a given geographical object or a region. === Point === A single point of interest (POI) can be assigned multiple features that can be used in machine learning. These could be demographic, transportation, meteorological, or economic data, for example. When embedding single points, it is common to consider the entire set of available points as nodes in a graph. === Line / multiline === Among other things, motion trajectories are represented as lines (multilines). Individual trajectories are embedded taking into account travel time, distances and also features of points visited along the way. Embedding of trajectories allows to improve performance of such tasks as clustering and also categorization. === Polygon === The geographic areas analyzed in machine learning are defined by both administrative boundaries and top-down division into grids of regular shapes such as rectangles, for example. Both types are represented as polygons and, like points, can be assigned different demographic, transportation, or economic features. A polygon can also have features related to the size of the area or shape it represents. === Graph === An example domain where graph representation is used is the street layout in a city, where vertices can be intersections and edges can be roads. The vertices can also be destination points like public transport stops or important points in the city, and the edges represent the flow between them. Embedding graphs or single vertices allows to improve accuracy of analysis methods in which the treated geographical domain can be represented as a network. == Usage == POI recommendation - generating personalized point of interest recommendations based on user preferences. Next/future location prediction - prediction of the next location a person will go to based on their historical trajectory. Zone functions classification - based on different mobility of people or POI distribution a function of a given area in a city can be predicted. Crime prediction - estimation of crime rate in different regions of a city. Local event detection - studying spatio-temporal changes in embeddings can provide valuable information in detection of local event occurring in specific location. Regional mobility popularity prediction - analysis of mobility can show patterns in popularity of different regions in a city. Shape matching - finding a similar shape of given polygon, for example finding building with the same shape as input building. Travel time estimation - predicting estimated travel time given current traffic conditions and special occurring events. Time estimation for on-demand food delivery - estimation of delivery time when placing an order through the website. == Temporal aspect == Some of the data analyzed has a timestamp associated with it. In some cases of data analysis this information is omitted and in others it is used to divide the set into groups. The most common division is the separation of weekdays from weekends or division into hours of the day. This is particularly important in the analysis of mobility data, because the characteristics of mobility during the week and at different times of the day are very different from each other. Another area in which time division into, for example, individual months can be used is in the analysis of tourism of a given region. In order to take such a split into account, embedding methods treat the time stamp specifically or separate versions of the model are developed for different subgroups of the analyzed set.

    Read more →
  • Time series

    Time series

    In mathematics, a time series is a sequence of data points indexed, listed, or graphed in chronological order. Most commonly, a time series consists of observations recorded at successive equally spaced points in time. Thus, it represents a form of discrete-time data. A time series may describe measurements collected over seconds, days, years, or even centuries. Common examples include heights of ocean tides, counts of sunspots, daily temperature readings, and the closing values of stock market indices such as the Dow Jones Industrial Average. A time series is often visualized using a run chart (a type of temporal line chart), which helps identify patterns such as trends, seasonal effects, and irregular fluctuations. Time series are widely used in statistics, actuarial science, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, communications engineering, and many other areas of applied science and engineering that involve temporal measurements. Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. Generally, time series data is modeled as a stochastic process. While regression analysis is often employed in such a way as to test relationships between one or more different time series, this type of analysis is not usually called "time series analysis", which refers in particular to relationships between different points in time within a single series. Time series data have a natural temporal ordering. This makes time series analysis distinct from cross-sectional studies, in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order). Time series analysis is also distinct from spatial data analysis where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A stochastic model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values (see time reversibility). Time series analysis can be applied to real-valued, continuous data, discrete numeric data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language). == Methods for analysis == Methods for time series analysis may be divided into two classes: frequency-domain methods and time-domain methods. The former include spectral analysis and wavelet analysis; the latter include auto-correlation and cross-correlation analysis. In the time domain, correlation and analysis can be made in a filter-like manner using scaled correlation, thereby mitigating the need to operate in the frequency domain. Additionally, time series analysis techniques may be divided into parametric and non-parametric methods. The parametric approaches assume that the underlying stationary stochastic process has a certain structure which can be described using a small number of parameters (for example, using an autoregressive or moving-average model). In these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast, non-parametric approaches explicitly estimate the covariance or the spectrum of the process without assuming that the process has any particular structure. Methods of time series analysis may also be divided into linear and non-linear, and univariate and multivariate. == Panel data == A time series is one type of panel data. Panel data is the general class, a multidimensional data set, whereas a time series data set is a one-dimensional panel (as is a cross-sectional dataset). A data set may exhibit characteristics of both panel data and time series data. One way to tell is to ask what makes one data record unique from the other records. If the answer is the time data field, then this is a time series data set candidate. If determining a unique record requires a time data field and an additional identifier which is unrelated to time (e.g. student ID, stock symbol, country code), then it is panel data candidate. If the differentiation lies on the non-time identifier, then the data set is a cross-sectional data set candidate. == Analysis == There are several types of motivation and data analysis available for time series which are appropriate for different purposes. === Motivation === In the context of statistics, econometrics, quantitative finance, seismology, meteorology, and geophysics the primary goal of time series analysis is forecasting. In the context of signal processing, control engineering and communication engineering it is used for signal detection. Other applications are in data mining, pattern recognition and machine learning, where time series analysis can be used for clustering, classification, query by content, anomaly detection as well as forecasting. === Exploratory analysis === A simple way to examine a regular time series is manually with a line chart. The datagraphic shows tuberculosis deaths in the United States, along with the yearly change and the percentage change from year to year. The total number of deaths declined in every year until the mid-1980s, after which there were occasional increases, often proportionately - but not absolutely - quite large. A study of corporate data analysts found two challenges to exploratory time series analysis: discovering the shape of interesting patterns, and finding an explanation for these patterns. Visual tools that represent time series data as heat map matrices can help overcome these challenges. === Estimation, filtering, and smoothing === This approach may be based on harmonic analysis and filtering of signals in the frequency domain using the Fourier transform, and spectral density estimation. Its development was significantly accelerated during World War II by mathematician Norbert Wiener, electrical engineers Rudolf E. Kálmán, Dennis Gabor and others for filtering signals from noise and predicting signal values at a certain point in time. An equivalent effect may be achieved in the time domain, as in a Kalman filter; see filtering and smoothing for more techniques. Other related techniques include: Autocorrelation analysis to examine serial dependence Spectral analysis to examine cyclic behavior which need not be related to seasonality. For example, sunspot activity varies over 11 year cycles. Other common examples include celestial phenomena, weather patterns, neural activity, commodity prices, and economic activity. Separation into components representing trend, seasonality, slow and fast variation, and cyclical irregularity: see trend estimation and decomposition of time series === Curve fitting === Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. Curve fitting can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a "smooth" function is constructed that approximately fits the data. A related topic is regression analysis, which focuses more on questions of statistical inference such as how much uncertainty is present in a curve that is fit to data observed with random errors. Fitted curves can be used as an aid for data visualization, to infer values of a function where no data are available, and to summarize the relationships among two or more variables. Extrapolation refers to the use of a fitted curve beyond the range of the observed data, and is subject to a degree of uncertainty since it may reflect the method used to construct the curve as much as it reflects the observed data. For processes that are expected to generally grow in magnitude one of the curves in the graphic (and many others) can be fitted by estimating their parameters. The construction of economic time series involves the estimation of some components for some dates by interpolation between values ("benchmarks") for earlier and later dates. Interpolation is estimation of an unknown quantity between two known quantities (historical data), or drawing conclusions about missing information from the available information ("reading between the lines"). Interpolation is useful where the data surrounding the missing data is available and its trend, seasonality, and longer-term cycles are known. This is often done by using a relat

    Read more →
  • Three-factor learning

    Three-factor learning

    In neuroscience and machine learning, three-factor learning is the combination of Hebbian plasticity with a third modulatory factor to stabilise and enhance synaptic learning. This third factor can represent various signals such as reward, punishment, error, surprise, or novelty, often implemented through neuromodulators. == Description == Three-factor learning introduces the concept of eligibility traces, which flag synapses for potential modification pending the arrival of the third factor, and helps temporal credit assignement by bridging the gap between rapid neuronal firing and slower behavioral timescales, from which learning can be done. Biological basis for Three-factor learning rules have been supported by experimental evidence. This approach addresses the instability of classical Hebbian learning by minimizing autocorrelation and maximizing cross-correlation between inputs.

    Read more →
  • Outline of computer security

    Outline of computer security

    The following outline is provided as an overview of and topical guide to computer security: Computer security (also cybersecurity, digital security, or information technology (IT) security) is a subdiscipline within the field of information security. It focuses on protecting computer software, systems, and networks from threats that can lead to unauthorized information disclosure, theft, or damage to hardware, software, or data, as well as to the disruption or misdirection of the services they provide. The growing significance of computer security reflects the increasing dependence on computer systems, the Internet, and evolving wireless network standards. This reliance has expanded with the proliferation of smart devices, including smartphones, televisions, and other components of the Internet of things (IoT). (yes) == Essence of computer security == Computer security can be described as all of the following: a branch of security Network security application security == Areas of computer security == Access control – selective restriction of access to a place or other resource. The act of accessing may mean consuming, entering, or using. Permission to access a resource is called authorization. Computer access control – includes authorization, authentication, access approval, and audit. Authentication Knowledge-based authentication Integrated Windows Authentication Password Password length parameter Secure Password Authentication Secure Shell Kerberos (protocol) SPNEGO NTLMSSP AEGIS SecureConnect TACACS Cyber security and countermeasure Device fingerprint Physical security – protecting property and people from damage or harm (such as from theft, espionage, or terrorist attacks). It includes security measures designed to deny unauthorized access to facilities, (such as a computer room), equipment (such as your computer), and resources (like the data storage devices, and data, in your computer). If a computer gets stolen, then the data goes with it. In addition to theft, physical access to a computer allows for ongoing espionage, like the installment of a hardware keylogger device, and so on. Data security – protecting data, such as a database, from destructive forces and the unwanted actions of unauthorized users. Information privacy – relationship between collection and dissemination of data, technology, the public expectation of privacy, and the legal and political issues surrounding them. Privacy concerns exist wherever personally identifiable information or other sensitive information is collected and stored – in digital form or otherwise. Improper or non-existent disclosure control can be the root cause for privacy issues. Internet privacy – involves the right or mandate of personal privacy concerning the storing, repurposing, provision to third parties, and displaying of information pertaining to oneself via the Internet. Privacy can entail either Personally Identifying Information (PII) or non-PII information such as a site visitor's behavior on a website. PII refers to any information that can be used to identify an individual. For example, age and physical address alone could identify who an individual is without explicitly disclosing their name, as these two factors relate to a specific person. Mobile security – security pertaining to smartphones, especially with respect to the personal and business information stored on them. Network security – provisions and policies adopted by a network administrator to prevent and monitor unauthorized access, misuse, modification, or denial of a computer network and network-accessible resources. Network security involves the authorization of access to data in a network, which is controlled by the network administrator. Network Security Toolkit Internet security – computer security specifically related to the Internet, often involving browser security but also network security on a more general level as it applies to other applications or operating systems on a whole. Its objective is to establish rules and measures to use against attacks over the Internet. The Internet represents an insecure channel for exchanging information leading to a high risk of intrusion or fraud, such as phishing. Different methods have been used to protect the transfer of data, including encryption. World Wide Web Security – dealing with the vulnerabilities of users who visit websites. Cybercrime on the Web can include identity theft, fraud, espionage and intelligence gathering. For criminals, the Web has become the preferred way to spread malware. == Computer security threats == Methods of Computer Network Attack and Computer Network Exploitation Social engineering is a frequent method of attack, and can take the form of phishing, or spear phishing in the corporate or government world, as well as counterfeit websites. Password sharing and insecure password practices Poor patch management Computer crime – Computer criminals – Hackers – in the context of computer security, a hacker is someone who seeks and exploits weaknesses in a computer system or computer network. Password cracking – Software cracking – Script kiddies – List of computer criminals – Identity theft – Computer malfunction – Operating system failure and vulnerabilities Hard disk drive failure – occurs when a hard disk drive malfunctions and the stored information cannot be accessed with a properly configured computer. A disk failure may occur in the course of normal operation, or due to an external factor such as exposure to fire or water or high magnetic fields, or suffering a sharp impact or environmental contamination, which can lead to a head crash. Data recovery from a failed hard disk is problematic and expensive. Backups are essential Computer and network surveillance – Man in the Middle Loss of anonymity – when one's identity becomes known. Identification of people or their computers allows their activity to be tracked. For example, when a person's name is matched with the IP address they are using, their activity can be tracked thereafter by monitoring the IP address. HTTP Cookie Local Shared Object Web bug Spyware Adware Cyber spying – obtaining secrets without the permission of the holder of the information (personal, sensitive, proprietary or of classified nature), from individuals, competitors, rivals, groups, governments and enemies for personal, economic, political or military advantage using methods on the Internet, networks or individual computers through the use of cracking techniques and malicious software including Trojan horses and spyware. It may be done online from by professionals sitting at their computer desks on bases in far away countries, or it may involve infiltration at home by computer trained conventional spies and moles, or it may be the criminal handiwork of amateur malicious hackers, software programmers, or thieves. Computer and network eavesdropping Lawful Interception War Driving Packet analyzer (aka packet sniffer) – mainly used as a security tool (in many ways, including for the detection of network intrusion attempts), packet analyzers can also be used for spying, to collect sensitive information (e.g., login details, cookies, personal communications) sent through a network, or to reverse engineer proprietary protocols used over a network. One way to protect data sent over a network such as the Internet is by using encryption software. Cyberwarfare – Exploit – piece of software, a chunk of data, or a sequence of commands that takes advantage of a bug, glitch or vulnerability in order to cause unintended or unanticipated behavior to occur on computer software, hardware, or something electronic (usually computerized). Such behavior frequently includes things like gaining control of a computer system, allowing privilege escalation, or a denial-of-service attack. Trojan Computer virus Computer worm Denial-of-service attack – an attempt to make a machine or network resource unavailable to its intended users, usually consisting of efforts to temporarily or indefinitely interrupt or suspend services of a host connected to the Internet. One common method of attack involves saturating the target machine with external communications requests, so much so that it cannot respond to legitimate traffic, or responds so slowly as to be rendered essentially unavailable. Distributed denial-of-service attack (DDoS) – DoS attack sent by two or more persons. Hacking tool Malware Computer virus Computer worm Keylogger – program that does keystroke logging, which is the action of recording (or logging) the keys struck on a keyboard, typically in a covert manner so that the person using the keyboard is unaware that their actions are being monitored. There are also HID spoofing hardware keyloggers, like a USB device inserting stored keystores when connected. Rootkit – stealthy type of software, typically malicious, designed to hide the existence of certain processes or programs from normal methods of detection and enable contin

    Read more →
  • Hallucination (artificial intelligence)

    Hallucination (artificial intelligence)

    In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting, confabulation, or delusion) is a response generated by AI that contains false or misleading information presented as fact. This term draws a loose analogy with human psychology, where a hallucination typically involves false percepts. For example, a chatbot powered by large language models (LLMs), like ChatGPT, may embed plausible-sounding random falsehoods within its generated content. Detecting and mitigating errors and hallucinations pose significant challenges for practical deployment and reliability of LLMs in high-stakes scenarios, such as chip design, supply chain logistics, and medical diagnostics. Some software engineers and statisticians have criticized the specific term "AI hallucination" for unreasonably anthropomorphizing computers. Symbolic artificial intelligence models generally do not produce hallucinations, unlike large language models. == Term == === Origin === Since the 1980s, the term "hallucination" has been used in computer vision with a positive connotation to describe the process of adding detail to an image. For example, the task of generating high-resolution face images from low-resolution inputs is called face hallucination. The first documented use of the term "hallucination" in this sense is in the PhD thesis of Eric Mjolsness in 1986. A notable work is the face hallucination algorithm by Simon Baker and Takeo Kanade published in 1999. In the 2000s, hallucinations were described in statistical machine translation as a failure mode. Since the 2010s, the term has undergone a semantic shift to signify the generation of factually incorrect or misleading outputs by AI systems in tasks like machine translation and object detection. In 2015, hallucinations were identified in visual semantic role labeling tasks by Saurabh Gupta and Jitendra Malik. In 2015, computer scientist Andrej Karpathy used the term "hallucinated" in a blog post to describe his recurrent neural network (RNN) language model generating an incorrect citation link. In 2017, Google researchers used the term to describe the responses generated by neural machine translation (NMT) models when they are not related to the source text, and in 2018, the term was used in computer vision to describe instances where non-existent objects are erroneously detected because of adversarial attacks. In July 2021, Meta warned during its release of BlenderBot 2 that the system is prone to "hallucinations", which Meta defined as "confident statements that are not true". Following OpenAI's ChatGPT release in beta version in November 2022, some users complained that such chatbots often seem to pointlessly embed plausible-sounding random falsehoods within their generated content. Many news outlets, including The New York Times, started to use the term "hallucinations" to describe these models' frequently incorrect or inconsistent responses. In 2023, the Cambridge dictionary updated its definition of hallucination to include this new sense specific to the field of AI. Some researchers have highlighted a lack of consistency in how the term is used, but also identified several alternative terms in the literature, such as confabulations, fabrications, and factual errors. === Definitions and alternatives === Uses, definitions and characterizations of the term "hallucination" in the context of LLMs include: "a tendency to invent facts in moments of uncertainty" (OpenAI, May 2023) "a model's logical mistakes" (OpenAI, May 2023) "fabricating information entirely, but behaving as if spouting facts" (CNBC, May 2023) "making up information" (The Verge, February 2023) "probability distributions" (in scientific contexts) Journalist Benj Edwards, in Ars Technica, writes that the term "hallucination" is controversial, but that some form of metaphor remains necessary; Edwards suggests "confabulation" as an analogy for processes that involve "creative gap-filling". In July 2024, a White House report on fostering public trust in AI research mentioned hallucinations only in the context of reducing them. Notably, when acknowledging David Baker's Nobel Prize-winning work with AI-generated proteins, the Nobel committee avoided the term entirely, instead referring to "imaginative protein creation". Hicks, Humphries, and Slater, in their article in Ethics and Information Technology, argue that the output of LLMs is "bullshit" under Harry Frankfurt's definition of the term, and that the models are "in an important way indifferent to the truth of their outputs", with true statements only accidentally true, and false ones accidentally false. Some researchers also use the derogatory term "botshit", often referring to uncritical use of AI. === Criticism === In the scientific community, some researchers avoid the term "hallucination", seeing it as potentially misleading. It has been criticized by Usama Fayyad, executive director of the Institute for Experimental Artificial Intelligence at Northeastern University, on the grounds that it misleadingly personifies large language models and is vague. Mary Shaw said, "The current fashion for calling generative AI's errors 'hallucinations' is appalling. It anthropomorphizes the software, and it spins actual errors as somehow being idiosyncratic quirks of the system even when they're objectively incorrect." In Salon, statistician Gary Smith argues that LLMs "do not understand what words mean" and consequently that the term "hallucination" unreasonably anthropomorphizes the machine. Murray Shanahan argues that anthropomorphic framing of LLM capabilities, including terms like "hallucination", encourages users and researchers to attribute cognitive processes to systems that operate through statistical pattern completion, and advocates for more careful linguistic practices when discussing LLM behavior. Kristina Šekrst argues that applying psychological vocabulary to LLM outputs obscures the difference between the appearance of mental properties and their genuine presence. Förster & Skop assert that tech companies use the hallucination metaphor to anthropomorphize models and deflect responsibility for non-factual outputs. Some see the AI outputs not as illusory but as prospective—that is, having some chance of being true, similar to early-stage scientific conjectures. The term has also been criticized for its association with psychedelic drug experiences. == In natural language generation == In natural language generation, there are several reasons why natural language models hallucinate: === Hallucination from data === Hallucinations can stem from incomplete, inaccurate or unrepresentative data sets. === Modeling-related causes === The pre-training of generative pretrained transformers (GPT) involves predicting the next word. It incentivizes GPT models to "give a guess" about what the next word is, even when they lack information. Some researchers take an anthropomorphic perspective and posit that hallucinations arise from a tension between novelty and usefulness. For instance, Amabile and Pratt define human creativity as the production of novel and useful ideas. By extension, a focus on novelty in machine creativity can lead to the production of original but inaccurate responses—that is, falsehoods—whereas a focus on usefulness may result in memorized content lacking originality. By 2022, newspapers such as The New York Times expressed concern that, as the adoption of bots based on large language models continued to grow, unwarranted user confidence in bot output could lead to problems. === Interpretability research === In 2025, interpretability research by Anthropic on the LLM Claude identified internal circuits that cause it to decline to answer questions unless it knows the answer. By default, the circuit is active and the LLM doesn't answer. When the LLM has sufficient information, these circuits are inhibited and the LLM answers the question. Hallucinations were found to occur when this inhibition happens incorrectly, such as when Claude recognizes a name but lacks sufficient information about that person, causing it to generate plausible but untrue responses. === Examples === On 15 November 2022, researchers from Meta AI published Galactica, designed to "store, combine and reason about scientific knowledge". Content generated by Galactica came with the warning: "Outputs may be unreliable! Language Models are prone to hallucinate text." In one case, when asked to draft a paper on creating avatars, Galactica cited a fictitious paper from a real author who works in the relevant area. Meta withdrew Galactica on 17 November due to offensiveness and inaccuracy. OpenAI's ChatGPT, released in beta version to the public on November 30, 2022, was based on the foundation model GPT-3.5 (a revision of GPT-3). Professor Ethan Mollick of Wharton called it an "omniscient, eager-to-please intern who sometimes lies to you". Data scientist Teresa Kuba

    Read more →
  • AIXI

    AIXI

    AIXI is a theoretical mathematical formalism for artificial general intelligence. It combines Solomonoff induction with sequential decision theory. AIXI was first proposed by Marcus Hutter in 2000 and several results regarding AIXI are proved in Hutter's 2005 book Universal Artificial Intelligence. AIXI is a reinforcement learning (RL) agent. It maximizes the expected total rewards received from the environment. Intuitively, it simultaneously considers every computable hypothesis (or environment). In each time step, it looks at every possible program and evaluates how many rewards that program generates depending on the next action taken. The promised rewards are then weighted by the subjective belief that this program constitutes the true environment. This belief is computed from the length of the program: longer programs are considered less likely, in line with Occam's razor. AIXI then selects the action that has the highest expected total reward in the weighted sum of all these programs. == Etymology == According to Hutter, the word "AIXI" can have several interpretations. AIXI can stand for AI based on Solomonoff's distribution, denoted by ξ {\displaystyle \xi } (which is the Greek letter xi), or e.g. it can stand for AI "crossed" (X) with induction (I). There are other interpretations. == Definition == AIXI is a reinforcement learning agent that interacts with some stochastic and unknown but computable environment μ {\displaystyle \mu } . The interaction proceeds in time steps, from t = 1 {\displaystyle t=1} to t = m {\displaystyle t=m} , where m ∈ N {\displaystyle m\in \mathbb {N} } is the lifespan of the AIXI agent. At time step t, the agent chooses an action a t ∈ A {\displaystyle a_{t}\in {\mathcal {A}}} (e.g. a limb movement) and executes it in the environment, and the environment responds with a "percept" e t ∈ E = O × R {\displaystyle e_{t}\in {\mathcal {E}}={\mathcal {O}}\times \mathbb {R} } , which consists of an "observation" o t ∈ O {\displaystyle o_{t}\in {\mathcal {O}}} (e.g., a camera image) and a reward r t ∈ R {\displaystyle r_{t}\in \mathbb {R} } , distributed according to the conditional probability μ ( o t r t | a 1 o 1 r 1 . . . a t − 1 o t − 1 r t − 1 a t ) {\displaystyle \mu (o_{t}r_{t}|a_{1}o_{1}r_{1}...a_{t-1}o_{t-1}r_{t-1}a_{t})} , where a 1 o 1 r 1 . . . a t − 1 o t − 1 r t − 1 a t {\displaystyle a_{1}o_{1}r_{1}...a_{t-1}o_{t-1}r_{t-1}a_{t}} is the "history" of actions, observations and rewards. The environment μ {\displaystyle \mu } is thus mathematically represented as a probability distribution over "percepts" (observations and rewards) which depend on the full history, so there is no Markov assumption (as opposed to other RL algorithms). Note again that this probability distribution is unknown to the AIXI agent. Furthermore, note again that μ {\displaystyle \mu } is computable, that is, the observations and rewards received by the agent from the environment μ {\displaystyle \mu } can be computed by some program (which runs on a Turing machine), given the past actions of the AIXI agent. The only goal of the AIXI agent is to maximize ∑ t = 1 m r t {\displaystyle \sum _{t=1}^{m}r_{t}} , that is, the sum of rewards from time step 1 to m. The AIXI agent is associated with a stochastic policy π : ( A × E ) ∗ → A {\displaystyle \pi :({\mathcal {A}}\times {\mathcal {E}})^{}\rightarrow {\mathcal {A}}} , which is the function it uses to choose actions at every time step, where A {\displaystyle {\mathcal {A}}} is the space of all possible actions that AIXI can take and E {\displaystyle {\mathcal {E}}} is the space of all possible "percepts" that can be produced by the environment. The environment (or probability distribution) μ {\displaystyle \mu } can also be thought of as a stochastic policy (which is a function): μ : ( A × E ) ∗ × A → E {\displaystyle \mu :({\mathcal {A}}\times {\mathcal {E}})^{}\times {\mathcal {A}}\rightarrow {\mathcal {E}}} , where the ∗ {\displaystyle } is the Kleene star operation. In general, at time step t {\displaystyle t} (which ranges from 1 to m), AIXI, having previously executed actions a 1 … a t − 1 {\displaystyle a_{1}\dots a_{t-1}} (which is often abbreviated in the literature as a < t {\displaystyle a_{ Read more →