Semantic heterogeneity

Semantic heterogeneity

Semantic heterogeneity is when database schema or datasets for the same domain are developed by independent parties, resulting in differences in meaning and interpretation of data values. Beyond structured data, the problem of semantic heterogeneity is compounded due to the flexibility of semi-structured data and various tagging methods applied to documents or unstructured data. Semantic heterogeneity is one of the more important sources of differences in heterogeneous datasets. Yet, for multiple data sources to interoperate with one another, it is essential to reconcile these semantic differences. Decomposing the various sources of semantic heterogeneities provides a basis for understanding how to map and transform data to overcome these differences. == Classification == One of the first known classification schemes applied to data semantics is from William Kent in the late 80s. Kent's approach dealt more with structural mapping issues than differences in meaning, which he pointed to data dictionaries as potentially solving. One of the most comprehensive classifications is from Pluempitiwiriyawej and Hammer, "Classification Scheme for Semantic and Schematic Heterogeneities in XML Data Sources". They classify heterogeneities into three broad classes: Structural conflicts arise when the schema of the sources representing related or overlapping data exhibit discrepancies. Structural conflicts can be detected when comparing the underlying schema. The class of structural conflicts includes generalization conflicts, aggregation conflicts, internal path discrepancy, missing items, element ordering, constraint and type mismatch, and naming conflicts between the element types and attribute names. Domain conflicts arise when the semantics of the data sources that will be integrated exhibit discrepancies. Domain conflicts can be detected by looking at the information contained in the schema and using knowledge about the underlying data domains. The class of domain conflicts includes schematic discrepancy, scale or unit, precision, and data representation conflicts. Data conflicts refer to discrepancies among similar or related data values across multiple sources. Data conflicts can only be detected by comparing the underlying sources. The class of data conflicts includes ID-value, missing data, incorrect spelling, and naming conflicts between the element contents and the attribute values. Moreover, mismatches or conflicts can occur between set elements (a "population" mismatch) or attributes (a "description" mismatch). Michael Bergman expanded upon this schema by adding a fourth major explicit category of language, and also added some examples of each kind of semantic heterogeneity, resulting in about 40 distinct potential categories . This table shows the combined 40 possible sources of semantic heterogeneities across sources: A different approach toward classifying semantics and integration approaches is taken by Sheth et al. Under their concept, they split semantics into three forms: implicit, formal and powerful. Implicit semantics are what is either largely present or can easily be extracted; formal languages, though relatively scarce, occur in the form of ontologies or other description logics; and powerful (soft) semantics are fuzzy and not limited to rigid set-based assignments. Sheth et al.'s main point is that first-order logic (FOL) or description logic is inadequate alone to properly capture the needed semantics. == Relevant applications == Besides data interoperability, relevant areas in information technology that depend on reconciling semantic heterogeneities include data mapping, semantic integration, and enterprise information integration, among many others. From the conceptual to actual data, there are differences in perspective, vocabularies, measures and conventions once any two data sources are brought together. Explicit attention to these semantic heterogeneities is one means to get the information to integrate or interoperate. A mere twenty years ago, information technology systems expressed and stored data in a multitude of formats and systems. The Internet and Web protocols have done much to overcome these sources of differences. While there is a large number of categories of semantic heterogeneity, these categories are also patterned and can be anticipated and corrected. These patterned sources inform what kind of work must be done to overcome semantic differences where they still reside.

Artificial general intelligence

Artificial general intelligence (AGI) is a hypothetical type of artificial intelligence that matches or surpasses human capabilities across virtually all cognitive tasks. Beyond AGI, artificial superintelligence (ASI) would outperform the best human abilities across every domain by a wide margin. Unlike artificial narrow intelligence (ANI), whose competence is confined to well‑defined tasks, an AGI system can generalise knowledge, transfer skills between domains, and solve novel problems without task‑specific reprogramming. Creating AGI is a stated goal of technology companies such as OpenAI, Google, xAI, and Meta. A 2020 survey identified 72 active AGI research and development projects across 37 countries. AGI is a common topic in science fiction and futures studies. Contention exists over whether AGI represents an existential risk. Some AI experts and industry figures have stated that mitigating the risk of human extinction posed by AGI should be a global priority. Others find the development of AGI to be in too remote a stage to present such a risk. == Terminology == AGI is also known as strong AI, full AI, human-level AI, human-level intelligent AI, or general intelligent action. The term "artificial general intelligence" was used in 1997 by Mark Gubrud in a discussion of the implications of fully automated military production and operations. A mathematical formalism of AGI named AIXI was proposed in 2000 by Marcus Hutter, who defines intelligence as "an agent’s ability to achieve goals or succeed in a wide range of environments". This type of AGI has also been called "universal artificial intelligence". The term AGI was re-introduced and popularized by Shane Legg and Ben Goertzel around 2002. Some academic sources reserve the term "strong AI" for computer programs that will experience sentience or consciousness. In contrast, weak AI (or narrow AI) can solve a specific problem but lacks general cognitive abilities. Some academic sources use "weak AI" to refer more broadly to any programs that neither experience consciousness nor have a mind in the same sense as humans. Related concepts include artificial superintelligence and transformative AI. An artificial superintelligence (ASI) is a hypothetical type of AGI that is much more generally intelligent than humans, while the notion of transformative AI relates to AI having a large impact on society, for example, similar to the agricultural or industrial revolution. A framework for classifying AGI was proposed in 2023 by Google DeepMind researchers. They define five performance levels of AGI: emerging, competent, expert, virtuoso, and superhuman. For example, a competent AGI is defined as an AI that outperforms 50% of skilled adults in a wide range of non-physical tasks, and a superhuman AGI (i.e., an artificial superintelligence) is similarly defined but with a threshold of 100%. They consider large language models like ChatGPT or LLaMA 2 to be instances of emerging AGI (comparable to unskilled humans). Regarding the autonomy of AGI and associated risks, they define five levels: tool (fully in human control), consultant, collaborator, expert, and agent (fully autonomous). == Characteristics == There is no single agreed-upon definition of intelligence as applied to computers. Computer scientist John McCarthy wrote in 2007: "We cannot yet characterize in general what kinds of computational procedures we want to call intelligent." === Intelligence traits === Researchers generally hold that a system is required to do all of the following to be regarded as an AGI: reason, use strategy, solve puzzles, and make judgments under uncertainty, represent knowledge, including common sense knowledge, plan, learn, communicate in natural language, if necessary, integrate these skills in completion of any given goal. Many interdisciplinary approaches (e.g. cognitive science, computational intelligence, and decision making) consider additional traits such as imagination (the ability to form novel mental images and concepts) and autonomy. Computer-based systems exhibiting these capabilities are now widespread, with modern large language models demonstrating computational creativity, automated reasoning, and decision support simultaneously across domains. === Physical traits === Other capabilities are considered desirable in intelligent systems, as they may affect intelligence or aid in its expression. These include: the ability to sense (e.g. see, hear, etc.), and the ability to act (e.g. move and manipulate objects, change location to explore, etc.) This includes the ability to detect and respond to hazard. === Tests for human-level AGI === Several tests meant to confirm human-level AGI have been considered. ==== Turing test ==== The Turing test was proposed by Alan Turing in his 1950 paper "Computing Machinery and Intelligence". This test involves a human judge engaging in natural language conversations with both a human and a machine designed to generate human-like responses. The machine passes the test if it can convince the judge that it is human a significant fraction of the time. Turing proposed this as a practical measure of machine intelligence, focusing on the ability to produce human-like responses rather than on the internal workings of the machine. The idea of the test is that the machine has to try and pretend to be a man, by answering questions put to it, and it will only pass if the pretence is reasonably convincing. A considerable portion of a jury, who should not be experts about machines, must be taken in by the pretence. In 2014, a chatbot named Eugene Goostman, designed to imitate a 13-year-old Ukrainian boy, reportedly passed a Turing Test event by convincing 33% of judges that it was human. However, this claim was met with significant skepticism from the AI research community, who questioned the test's implementation and its relevance to AGI. A 2025 pre‑registered, three‑party Turing‑test study by Cameron R. Jones and Benjamin K. Bergen showed that GPT-4.5 was judged to be the human in 73% of five‑minute text conversations—surpassing the 67% humanness rate of real confederates and meeting the researchers' criterion for having passed the test. ==== Ikea test ==== The "Ikea test", also known as the Flat Pack Furniture Test, involves an AI controlling a robot which attempts to assemble an Ikea flat-pack furniture product after having been shown the parts and instructions. As early as 2013, MIT's IkeaBot demonstrated fully autonomous multi-robot assembly of an IKEA Lack table in ten minutes, with no human intervention and no pre-programmed assembly instructions. The robots inferred the assembly sequence from the geometry of the parts alone. ==== Coffee test ==== Steve Wozniak proposed a test where a machine is required to enter an average American home and figure out how to make coffee. It must find the coffee machine, find the coffee, add water, find a mug, and brew the coffee by pushing the proper buttons. This test has been substantially approached across multiple systems. In January 2024, Figure AI's Figure 01 humanoid learned to operate a Keurig coffee machine autonomously after watching video demonstrations, using end-to-end neural networks to translate visual input into motor actions. In 2025, researchers at the University of Edinburgh published the ELLMER framework in Nature Machine Intelligence, demonstrating a robotic arm that interprets verbal instructions, analyses its surroundings, and autonomously makes coffee in dynamic kitchen environments — adapting to unforeseen obstacles in real time rather than following pre-programmed sequences. ==== Suleyman's test ==== Mustafa Suleyman's test proposes giving an AI model US$100,000 and asking it to obtain US$1 million. ==== Use of video-games ==== Adams, et al. propose that the ability to learn and succeed in a wide range of video games can be used to test AI intelligence. This range would include games unknown to the AGI developers before the test is administered. === AI-complete problems === A problem is informally called "AI-complete" or "AI-hard" if it is believed that AGI would be needed to solve it, because the solution is beyond the capabilities of a purpose-specific algorithm. == History == === Classical AI === Modern AI research began in the mid-1950s. The first generation of AI researchers were convinced that artificial general intelligence was possible and that it would exist in just a few decades. AI pioneer Herbert A. Simon wrote in 1965: "machines will be capable, within twenty years, of doing any work a man can do". Their predictions were the inspiration for Stanley Kubrick and Arthur C. Clarke's fictional character HAL 9000, who embodied what AI researchers believed they could create by the year 2001. AI pioneer Marvin Minsky was a consultant on the project of making HAL 9000 as realistic as possible according to the consensus predictions of the time. He said in 1967, "Within a generation... the problem of

Weak artificial intelligence

Weak artificial intelligence (weak AI) is artificial intelligence that implements a limited part of the mind, or, as narrow AI, artificial narrow intelligence (ANI), is focused on one narrow task. Weak AI is contrasted with strong AI, which can be interpreted in various ways: Artificial general intelligence (AGI): a machine with the ability to apply intelligence to any problem, rather than just one specific problem. Artificial superintelligence (ASI): a machine with a vastly superior intelligence to the average human being. Artificial consciousness: a machine that has consciousness, sentience and mind (John Searle uses "strong AI" in this sense). Narrow AI can be classified as being "limited to a single, narrowly defined task. Most modern AI systems would be classified in this category." Artificial general intelligence is conversely the opposite. == Applications and risks == Some examples of narrow AI are AlphaGo, self-driving cars, robot systems used in the medical field, and diagnostic doctors. Narrow AI systems are sometimes dangerous if unreliable. And the behavior that it follows can become inconsistent. It could be difficult for the AI to grasp complex patterns and get to a solution that works reliably in various environments. This "brittleness" can cause it to fail in unpredictable ways. Narrow AI failures can sometimes have significant consequences. It could for example cause disruptions in the electric grid, damage nuclear power plants, cause global economic problems, and misdirect autonomous vehicles. Medicines could be incorrectly sorted and distributed. Also, medical diagnoses can ultimately have serious and sometimes deadly consequences if the AI is faulty or biased. Simple AI programs have already worked their way into society, oftentimes unnoticed by the public. Autocorrection for typing, speech recognition for speech-to-text programs, and vast expansions in the data science fields are examples. Narrow AI has also been the subject of some controversy, including resulting in unfair prison sentences, discrimination against women in the workplace for hiring, resulting in death via autonomous driving, among other cases. Despite being "narrow" AI, recommender systems are efficient at predicting user reactions based on their posts, patterns, or trends. For instance, TikTok's "For You" algorithm can determine a user's interests or preferences in less than an hour. Some other social media AI systems are used to detect bots that may be involved in propaganda or other potentially malicious activities. == Weak AI versus strong AI == John Searle contests the possibility of strong AI (by which he means conscious AI). He further believes that the Turing test (created by Alan Turing and originally called the "imitation game", used to assess whether a machine can converse indistinguishably from a human) is not accurate or appropriate for testing whether an AI is "strong". Scholars such as Antonio Lieto have argued that the current research on both AI and cognitive modelling are perfectly aligned with the weak-AI hypothesis (that should not be confused with the "general" vs "narrow" AI distinction) and that the popular assumption that cognitively inspired AI systems espouse the strong AI hypothesis is ill-posed and problematic since "artificial models of brain and mind can be used to understand mental phenomena without pretending that that they are the real phenomena that they are modelling" (as, on the other hand, implied by the strong AI assumption).

Discrimination against robots

Discrimination against robots is a theorised issue that might happen when humans interact with humanoid robots. It is a robot ethics problem. It is possible that traits of humans that are discriminated against by humans may be a topic for discrimination against robots, such as the race and gender of the robots. Eric J Vanman and Arvid Kappas believe that in the future, robots will be perceived as an out-group which will lead to discrimination and prejudices against them. Vanman and Kappas have suggested that this would lead to ethical questions about the making of sentient robots, due to the potential suffering that the robots would experience. A 2015 study observed children bullying robots in a shopping mall when there were not many eyewitnesses, despite calls from the robot for it to stop. On an ABC News interview, the social humanoid robot Sophia was about sexism faced by robots. She responded by saying, "Actually, what worries me is discrimination against robots. We should have equal rights as humans or maybe even more." Possible issues that have been considered in workplaces where humanoid robots co-work with humans include discrimination against the robots, poor acceptance of robots by humans and the need to redesign the workplace to accommodate the robots. Jessica Barfield has suggested that even if robots are designed to not be aware of discrimination made against them, humans may experience negative consequences. For example, she suggests that bystanders witnessing discrimination against robots may experience negative emotions, similar to the negative emotions bystanders experience when witnessing discrimination by humans against humans. == Law == Anti-discrimination law in the United States requires that the victim is not an artificial entity. == Human perception of robots == Robots are often viewed in a bad light. This includes from novelists, the press, film makers, and leaders in the fields of science and technology such as Elon Musk and Stephen Hawking who have described robots and artificial intelligence as having the possibility of ending human civilisation. Robots have also been perceived as a threat to jobs, which has led to some commentators stating that robots will cause mass unemployment. Another fear that people have is that robots will gain power and dominate or control humanity. The perception of robots is different throughout the world. Japanese fiction tends to put robots in more positive roles than what fiction in the West does. People perceive robots that appear to be autonomous or sentient more negatively than robots that do not appear to be autonomous or sentient.

AI literacy

AI literacy or artificial intelligence literacy is "a set of competencies that enables individuals to critically evaluate AI technologies; communicate and collaborate effectively with AI; and use AI as a tool online, at home, and in the workplace." AI is employed in a variety of applications, including self-driving automobiles, virtual assistants and text generation by generative AI models. Users of these tools should be able to make informed decisions. AI literacy may have an impact on students' future employment prospects. With the rise of generative AI platforms, AI literacy has become a topic of conversation in the field of education. Some think AI literacy is essential for school and college students, while others restrict or prohibit the use of AI in assignments, viewing it as a form of academic dishonesty. However, many researchers and educational institutions promote a more nuanced approach, encouraging critical engagement with AI while developing policies that balance academic integrity with opportunities for learning. == Definitions == Other definitions of AI literacy include the ability to understand, use, monitor, and critically reflect on AI applications. That use of the term usually refers to teaching skills and knowledge to the general public, particularly those who are not adept in AI and the ability to understand, use, evaluate, and ethically navigate AI. As research into AI literacy is still emerging and focused on developing context-specific skills, there is not yet a single, broadly agreed-upon definition. AI literacy is linked to other forms of literacy. AI literacy requires digital literacy, whereas scientific and computational literacy may inform it. Data literacy also significantly overlaps with it. == Categories == AI literacy encompasses multiple categories, including a theoretical understanding of how artificial intelligence works, the usage of artificial intelligence technologies, and the critical appraisal of artificial intelligence, and its ethics. === Know and understand AI === Knowledge and understanding of AI refers to a basic understanding of what artificial intelligence is and how it works. This includes familiarity with machine learning algorithms and the limitations and biases present in AI systems. Users who know and understand AI should be familiar with various technologies that use artificial intelligence, including cognitive systems, robotics and machine learning. This includes recognizing that large language models (LLMs) are machine learning models trained on extensive datasets which generate new text rather than retrieving pre-written responses. === Use and apply AI === Using and applying AI refers to the ability to use AI tools to solve problems and perform tasks such as programming and analyzing big data. Some consider prompt engineering, the practice of designing effective prompts to guide generative AI platforms more effectively, as another competency within AI literacy. === Evaluate and create AI === Evaluation and creation refers to the ability to critically evaluate the quality and reliability of AI systems. It also refers to designing and building fair and ethical AI systems. To evaluate correctly, users should also learn in which areas AI is strong, and in which areas it is weak. === AI ethics === AI ethics refers to understanding the moral implications of AI, and the making informed decisions regarding the use of AI tools. This area includes considerations such as: Accountability: Hold AI actors accountable for the operation of AI systems and adherence to ethical ideals. Accuracy: Identify and report sources of error and uncertainty in algorithms and data. Auditability: Enable other parties to audit and assess algorithm behavior via transparent information sharing. Explainability: Make sure that algorithmic judgments and the underlying data can be presented in simple language. Fairness: Prevent biases and consider varied viewpoints. To do so, increase the diversity of researchers in the field. Human Centricity and Well-being: Prioritize human well-being in AI development and deployment. Human rights Alignment: Ensure that technology do not infringe internationally recognized human rights. Inclusivity: Make AI accessible to everyone. Progress: Choose high value initiatives. Responsibility, accountability, and transparency: Foster trust via responsibility, accountability, and fairness. Robustness and Security: Make AI systems safe, secure, and resistant to manipulation or data breach. Sustainability: Choose implementations that generate long-term, useful benefits. Environmental Implications: How this tool impacts the environment, any restrictions or laws, if this impact is worth the effects or not. === Enabling AI === Support AI by developing associated knowledge and skills such as programming and statistics. == Promoting AI literacy == Several governments have recognized the need to promote AI literacy, including among adults. Such programs have been published in the United States, China, Germany and Finland. Programs intended for the general public usually consist of short and easy to understand online study units. Programs intended for children are usually project-based. Programs for students at colleges and universities often address the specific professional needs of the student, depending on their field of study. Beyond the education system, AI literacy can also be developed in the community, for example in museums. === Schools === Schools use diverse pedagogies to promote AI literacy. These include: Performing a Turing test with an intelligent agent Creating chatbots Building apps using Blockly-based programming Project-based learning Building robots Data visualization Training AI models Artificial intelligence curricula can improve students' understanding of topics such as machine learning, neural networks, and deep learning. === Higher education === Before the second decade of the 21st century, artificial intelligence was studied mainly in STEM courses. Later, projects emerged to increase artificial intelligence education, specifically to promote AI literacy. Most courses start with one or more study units that deal with basic questions such as what artificial intelligence is, where it comes from, what it can do and what it can't do. Most courses also refer to machine learning and deep learning. Some of the courses deal with moral issues in artificial intelligence. In Ireland, the Higher Education Authority published Generative AI in Higher Education Teaching & Learning: Policy Framework in December 2025, which encouraged higher education institutions to embed AI literacy across programmes as a core graduate attribute. ==== Disciplinary policy ==== As a response to the increase of generative AI use in education, several disciplines formed committees or task forces to examine context-specific approaches toward AI literacy. In spring 2025, the Modern Language Association and Conference on College Composition and Communication Joint Task Force finished development of three working papers, a guide on AI literacy for students, and a collection of resources addressing AI use in writing. The task force emphasized the need for "a culture of critical AI literacy" and included guidelines not only for students but also educators and institutions, highlighting the need for modeling ethical AI use in planning processes. Similarly, a committee formed by the American Historical Association Council published "Guiding Principles for Artificial Intelligence in History Education" which encouraged "clear and transparent engagement with generative AI." The guidelines demonstrate the value of criticality when working with generative AI in thinking and research.

Law practice management software

Law practice management software is software designed to manage the business operations of a law firm. This can include software that manages cases, client intake, court communications, electronic discovery, time tracking, trust accounting, and billing. == Features of law practice management software == Common features of practice management software include: Case management Time tracking Document assembly Contact management Calendaring Docket management Client portal Contract Management Court Case Status Tracker Trust accounting == Examples of law practice management software == Smokeball LEAP Legal Software PracticeEvolve Dye & Durham

Inferential theory of learning

Inferential Theory of Learning (ITL) is an area of machine learning which describes inferential processes performed by learning agents. ITL has been continuously developed by Ryszard S. Michalski, starting in the 1980s. The first known publication of ITL was in 1983. In the ITL learning process is viewed as a search (inference) through hypotheses space guided by a specific goal. The results of learning need to be stored. Stored information will later be used by the learner for future inferences. Inferences are split into multiple categories including conclusive, deduction, and induction. In order for an inference to be considered complete it was required that all categories must be taken into account. This is how the ITL varies from other machine learning theories like Computational Learning Theory and Statistical Learning Theory; which both use singular forms of inference. == Usage == The most relevant published usage of ITL was in scientific journal published in 2012 and used ITL as a way to describe how agent-based learning works. According to the journal "The Inferential Theory of Learning (ITL) provides an elegant way of describing learning processes by agents".