AI Data Farms

AI Data Farms — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Evaluation of binary classifiers

    Evaluation of a binary classifier typically assigns a numerical value, or values, to a classifier that represent its accuracy. An example is error rate, which measures how frequently the classifier makes a mistake. There are many metrics that can be used; different fields have different preferences. For example, in medicine sensitivity and specificity are often used, while in computer science precision and recall are preferred. An important distinction is between metrics that are independent of the prevalence or skew (how often each class occurs in the population), and metrics that depend on the prevalence – both types are useful, but they have very different properties. Often, evaluation is used to compare two methods of classification, so that one can be adopted and the other discarded. Such comparisons are more directly achieved by a form of evaluation that results in a single unitary metric rather than a pair of metrics. == Contingency table == Given a data set, a classification (the output of a classifier on that set) gives two numbers: the number of positives and the number of negatives, which add up to the total size of the set. To evaluate a classifier, one compares its output to another reference classification – ideally a perfect classification, but in practice the output of another gold standard test – and cross tabulates the data into a 2×2 contingency table, comparing the two classifications. One then evaluates the classifier relative to the gold standard by computing summary statistics of these 4 numbers. Generally these statistics will be scale invariant (scaling all the numbers by the same factor does not change the output), to make them independent of population size, which is achieved by using ratios of homogeneous functions, most simply homogeneous linear or homogeneous quadratic functions. Say we test some people for the presence of a disease. Some of these people have the disease, and our test correctly says they are positive. 
They are called true positives (TP). Some have the disease, but the test incorrectly claims they don't. They are called false negatives (FN). Some don't have the disease, and the test says they don't – true negatives (TN). Finally, there might be healthy people who have a positive test result – false positives (FP). These can be arranged into a 2×2 contingency table (confusion matrix), conventionally with the test result on the vertical axis and the actual condition on the horizontal axis. These numbers can then be totaled, yielding both a grand total and marginal totals. Totaling the entire table, the number of true positives, false negatives, true negatives, and false positives add up to 100% of the set. Totaling the columns (adding vertically) the number of true positives and false positives add up to 100% of the test positives, and likewise for negatives. Totaling the rows (adding horizontally), the number of true positives and false negatives add up to 100% of the condition positives (conversely for negatives). The basic marginal ratio statistics are obtained by dividing the 2×2=4 values in the table by the marginal totals (either rows or columns), yielding 2 auxiliary 2×2 tables, for a total of 8 ratios. These ratios come in 4 complementary pairs, each pair summing to 1, and so each of these derived 2×2 tables can be summarized as a pair of 2 numbers, together with their complements. Further statistics can be obtained by taking ratios of these ratios, ratios of ratios, or more complicated functions. The contingency table and the most common derived ratios are summarized below; see sequel for details. Note that the rows correspond to the condition actually being positive or negative (or classified as such by the gold standard), as indicated by the color-coding, and the associated statistics are prevalence-independent, while the columns correspond to the test being positive or negative, and the associated statistics are prevalence-dependent. 
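The eight marginal ratios described above can be sketched in a few lines of Python. This is only an illustration: the `ConfusionMatrix` class, its method names, the short dictionary keys and the example counts are all hypothetical, and the complement names (FNR, FPR, FDR, FOR) are the usual ones rather than terms taken from the text. The sketch assumes every marginal total is nonzero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Hypothetical container for the four cells of a 2x2 contingency table."""
    tp: int  # has the condition, test positive
    fn: int  # has the condition, test negative
    fp: int  # does not have the condition, test positive
    tn: int  # does not have the condition, test negative

    def condition_ratios(self) -> dict:
        """Divide each cell by its condition total (prevalence-independent)."""
        cp = self.tp + self.fn  # condition positives
        cn = self.fp + self.tn  # condition negatives
        return {
            "tpr": self.tp / cp,  # sensitivity
            "fnr": self.fn / cp,  # complement of sensitivity
            "tnr": self.tn / cn,  # specificity
            "fpr": self.fp / cn,  # complement of specificity
        }

    def test_ratios(self) -> dict:
        """Divide each cell by its test-outcome total (prevalence-dependent)."""
        pp = self.tp + self.fp  # test positives
        pn = self.fn + self.tn  # test negatives
        return {
            "ppv": self.tp / pp,  # positive predictive value
            "fdr": self.fp / pp,  # complement of PPV
            "npv": self.tn / pn,  # negative predictive value
            "for": self.fn / pn,  # complement of NPV
        }


# Made-up counts, chosen only to exercise the code.
cm = ConfusionMatrix(tp=90, fn=10, fp=30, tn=870)
print(cm.condition_ratios())
print(cm.test_ratios())
```

Each derived table contains two complementary pairs (for example `tpr + fnr == 1`), and multiplying all four counts by the same factor leaves every ratio unchanged, which is the scale invariance mentioned above.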
There are analogous likelihood ratios for prediction values, but these are less commonly used, and not depicted above. == Pairs of metrics == Often accuracy is evaluated with a pair of metrics composed in a standard pattern. === Sensitivity and specificity === The fundamental prevalence-independent statistics are sensitivity and specificity. Sensitivity or True Positive Rate (TPR), also known as recall, is the proportion of people that tested positive and are positive (True Positive, TP) of all the people that actually are positive (Condition Positive, CP = TP + FN). It can be seen as the probability that the test is positive given that the patient is sick. With higher sensitivity, fewer actual cases of disease go undetected (or, in the case of the factory quality control, fewer faulty products go to the market). Specificity (SPC) or True Negative Rate (TNR) is the proportion of people that tested negative and are negative (True Negative, TN) of all the people that actually are negative (Condition Negative, CN = TN + FP). As with sensitivity, it can be looked at as the probability that the test result is negative given that the patient is not sick. With higher specificity, fewer healthy people are labeled as sick (or, in the factory case, fewer good products are discarded). The relationship between sensitivity and specificity, as well as the performance of the classifier, can be visualized and studied using the Receiver Operating Characteristic (ROC) curve. In theory, sensitivity and specificity are independent in the sense that it is possible to achieve 100% in both (such as in the red/blue ball example given above). In more practical, less contrived instances, however, there is usually a trade-off, such that they are inversely proportional to one another to some extent. This is because we rarely measure the actual thing we would like to classify; rather, we generally measure an indicator of the thing we would like to classify, referred to as a surrogate marker. 
The reason why 100% is achievable in the ball example is because redness and blueness is determined by directly detecting redness and blueness. However, indicators are sometimes compromised, such as when non-indicators mimic indicators or when indicators are time-dependent, only becoming evident after a certain lag time. The following example of a pregnancy test will make use of such an indicator. Modern pregnancy tests do not use the pregnancy itself to determine pregnancy status; rather, human chorionic gonadotropin is used, or hCG, present in the urine of gravid females, as a surrogate marker to indicate that a woman is pregnant. Because hCG can also be produced by a tumor, the specificity of modern pregnancy tests cannot be 100% (because false positives are possible). Also, because hCG is present in the urine in such small concentrations after fertilization and early embryogenesis, the sensitivity of modern pregnancy tests cannot be 100% (because false negatives are possible). === Positive and negative predictive values === In addition to sensitivity and specificity, the performance of a binary classification test can be measured with positive predictive value (PPV), also known as precision, and negative predictive value (NPV). The positive prediction value answers the question "If the test result is positive, how well does that predict an actual presence of disease?". It is calculated as TP/(TP + FP); that is, it is the proportion of true positives out of all positive results. The negative prediction value is the same, but for negatives, naturally. ==== Impact of prevalence on predictive values ==== Prevalence has a significant impact on prediction values. As an example, suppose there is a test for a disease with 99% sensitivity and 99% specificity. If 2000 people are tested and the prevalence (in the sample) is 50%, 1000 of them are sick and 1000 of them are healthy. 
Thus about 990 true positives and 990 true negatives are likely, with 10 false positives and 10 false negatives. The positive and negative predictive values would both be 99%, so there can be high confidence in the result. However, if the prevalence is only 5%, so that of the 2000 people only 100 are really sick, then the predictive values change significantly. The likely result is 99 true positives, 1 false negative, 1881 true negatives and 19 false positives. Of the 19 + 99 = 118 people who test positive, only 99 really have the disease; intuitively, given that a patient's test result is positive, there is only about an 84% chance (99/118) that they really have the disease. On the other hand, given that the patient's test result is negative, there is only 1 chance in 1882, or about 0.05%, that the patient has the disease despite the test result. === Precision and recall === Precision and recall can be interpreted as (estimated) conditional probabilities: precision is given by $P(C=P \mid \hat{C}=P)$, while recall is given by $P(\hat{C}=P \mid C=P)$, where $\hat{C}$ is the predicted class and $C$ is the actual class.
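The effect of prevalence in the worked example above can be reproduced numerically. The sketch below uses a hypothetical helper, `predictive_values`, which computes expected (not sampled) cell counts from the sensitivity, specificity, prevalence and sample size given in the text.

```python
def predictive_values(sensitivity, specificity, prevalence, n):
    """Return (PPV, NPV) from expected cell counts; a hypothetical helper."""
    sick = prevalence * n
    healthy = n - sick
    tp = sensitivity * sick      # sick people correctly flagged
    fn = sick - tp               # sick people missed
    tn = specificity * healthy   # healthy people correctly cleared
    fp = healthy - tn            # healthy people wrongly flagged
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Same test (99% sensitivity, 99% specificity), two prevalences from the text.
for prevalence in (0.50, 0.05):
    ppv, npv = predictive_values(0.99, 0.99, prevalence, n=2000)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.3f}  NPV={npv:.4f}")
```

At 50% prevalence both values come out at 0.99; at 5% prevalence the positive predictive value falls to roughly 0.84 while the negative predictive value rises above 0.999, in line with the figures in the example.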

  • Vagueness

    In linguistics and philosophy, a vague predicate is one which gives rise to borderline cases. For example, the English adjective "tall" is vague since it is not clearly true or false for someone of middling height. By contrast, the word "prime" is not vague since every number is definitively either prime or not. Vagueness is commonly diagnosed by a predicate's ability to give rise to the sorites paradox. Vagueness is separate from ambiguity, in which an expression has multiple denotations. For instance the word "bank" is ambiguous since it can refer either to a river bank or to a financial institution, but there are no borderline cases between both interpretations. Vagueness is a major topic of research in philosophical logic, where it serves as a potential challenge to classical logic. Work in formal semantics has sought to provide a compositional semantics for vague expressions in natural language. Work in philosophy of language has addressed implications of vagueness for the theory of meaning, while metaphysicists have considered whether reality itself is vague. == Importance == The concept of vagueness has philosophical importance. Suppose one wants to come up with a definition of "right" in the moral sense. One wants a definition to cover actions that are clearly right and exclude actions that are clearly wrong, but what does one do with the borderline cases? Surely, there are such cases. Some philosophers say that one should try to come up with a definition that is itself unclear on just those cases. Others say that one has an interest in making his or her definitions more precise than ordinary language, or his or her ordinary concepts, themselves allow; they recommend one advances precising definitions. === In law === Vagueness is also a problem which arises in law, and in some cases, judges have to arbitrate regarding whether a borderline case does, or does not, satisfy a given vague concept. 
Examples include disability (how much loss of vision is required before one is legally blind?), human life (at what point from conception to birth is one a legal human being, protected for instance by laws against murder?), adulthood (most familiarly reflected in legal ages for driving, drinking, voting, consensual sex, etc.), race (how to classify someone of mixed racial heritage), etc. Even such apparently unambiguous concepts such as biological sex can be subject to vagueness problems, not just from transsexuals' gender transitions but also from certain genetic conditions which can give an individual mixed male and female biological traits (see intersex). In the common law system, vagueness is a possible legal defence against by-laws and other regulations. The legal principle is that delegated power cannot be used more broadly than the delegator intended. Therefore, a regulation may not be so vague as to regulate areas beyond what the law allows. Any such regulation would be "void for vagueness" and unenforceable. This principle is sometimes used to strike down municipal by-laws that forbid "explicit" or "objectionable" contents from being sold in a certain city; courts often find such expressions to be too vague, giving municipal inspectors discretion beyond what the law allows. In the US this is known as the vagueness doctrine and in Europe as the principle of legal certainty. === In science === Many scientific concepts are of necessity vague, for instance species in biology cannot be precisely defined, owing to unclear cases such as ring species. Nonetheless, the concept of species can be clearly applied in the vast majority of cases. As this example illustrates, to say that a definition is "vague" is not necessarily a criticism. Consider those animals in Alaska that are the result of breeding huskies and wolves: are they dogs? It is not clear: they are borderline cases of dogs. 
This means one's ordinary concept of doghood is not clear enough to let us rule conclusively in this case. == Approaches == The philosophical question of what the best theoretical treatment of vagueness is—which is closely related to the problem of the paradox of the heap, a.k.a. sorites paradox—has been the subject of much philosophical debate. === Fuzzy logic === One theoretical approach is that of fuzzy logic, developed by American mathematician Lotfi Zadeh. Fuzzy logic proposes a gradual transition between "perfect falsity", for example, the statement "Bill Clinton is bald", to "perfect truth", for, say, "Patrick Stewart is bald". In ordinary logics, there are only two truth-values: "true" and "false". The fuzzy perspective differs by introducing an infinite number of truth-values along a spectrum between perfect truth and perfect falsity. Perfect truth may be represented by "1", and perfect falsity by "0". Borderline cases are thought of as having a "truth-value" anywhere between 0 and 1 (for example, 0.6). Advocates of the fuzzy logic approach have included K. F. Machina (1976) and Dorothy Edgington (1993). === Supervaluationism === Another theoretical approach is known as "supervaluationism". This approach has been defended by Kit Fine and Rosanna Keefe. Fine argues that borderline applications of vague predicates are neither true nor false, but rather are instances of "truth value gaps". He defends an interesting and sophisticated system of vague semantics, based on the notion that a vague predicate might be "made precise" in many alternative ways. This system has the consequence that borderline cases of vague terms yield statements that are neither true, nor false. Given a supervaluationist semantics, one can define the predicate "supertrue" as meaning "true on all precisifications". This predicate will not change the semantics of atomic statements (e.g. 
"Frank is bald", where Frank is a borderline case of baldness), but does have consequences for logically complex statements. In particular, the tautologies of sentential logic, such as "Frank is bald or Frank is not bald", will turn out to be supertrue, since on any precisification of baldness, either "Frank is bald" or "Frank is not bald" will be true. Since the presence of borderline cases seems to threaten principles like this one (excluded middle), the fact that supervaluationism can "rescue" them is seen as a virtue. === Subvaluationism === Subvaluationism is the logical dual of supervaluationism, and has been defended by Dominic Hyde (2008) and Pablo Cobreros (2011). Whereas the supervaluationist characterises truth as 'supertruth', the subvaluationist characterises truth as 'subtruth', or "true on at least some precisifications". Subvaluationism proposes that borderline applications of vague terms are both true and false. It thus has "truth-value gluts". According to this theory, a vague statement is true if it is true on at least one precisification and false if it is false under at least one precisification. If a vague statement comes out true under one precisification and false under another, it is both true and false. Subvaluationism ultimately amounts to the claim that vagueness is a truly contradictory phenomenon. Of a borderline case of "bald man" it would be both true and false to say that he is bald, and both true and false to say that he is not bald. === Epistemicist view === A fourth approach, known as "the epistemicist view", has been defended by Timothy Williamson (1994), R. A. Sorensen (1988) and (2001), and Nicholas Rescher (2009). They maintain that vague predicates do, in fact, draw sharp boundaries, but that one cannot know where these boundaries lie. One's confusion about whether some vague word does or does not apply in a borderline case is due to one's ignorance. 
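The fuzzy, supervaluationist and subvaluationist treatments described above can be contrasted in a toy model. Everything here is an illustrative assumption rather than part of any cited theory: the predicate "tall", the three cutoffs standing in for admissible precisifications, and the linear ramp used as a fuzzy membership function.

```python
# Hypothetical precisifications of "tall": each cutoff (in cm) is one
# admissible way of making the predicate sharp.
CUTOFFS_CM = (175, 180, 185)


def supertrue(statement, cutoffs=CUTOFFS_CM):
    """Supervaluationism: true on all precisifications."""
    return all(statement(c) for c in cutoffs)


def subtrue(statement, cutoffs=CUTOFFS_CM):
    """Subvaluationism: true on at least one precisification."""
    return any(statement(c) for c in cutoffs)


def fuzzy_tall(height_cm, low=170.0, high=190.0):
    """Fuzzy logic: a degree of truth in [0, 1] (assumed linear ramp)."""
    return min(1.0, max(0.0, (height_cm - low) / (high - low)))


HEIGHT_CM = 180  # a borderline case under the cutoffs above


def is_tall(cutoff):
    return HEIGHT_CM >= cutoff


def is_not_tall(cutoff):
    return not is_tall(cutoff)


def tall_or_not_tall(cutoff):
    return is_tall(cutoff) or is_not_tall(cutoff)


print("gap:", supertrue(is_tall), supertrue(is_not_tall))
print("excluded middle supertrue:", supertrue(tall_or_not_tall))
print("glut:", subtrue(is_tall), subtrue(is_not_tall))
print("fuzzy degree:", fuzzy_tall(HEIGHT_CM))
```

For the borderline height, neither "tall" nor "not tall" is supertrue (a truth-value gap), both are subtrue (a truth-value glut), and the disjunction "tall or not tall" is supertrue because it holds under every cutoff.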
For example, in the epistemicist view, there is a fact of the matter, for every person, about whether that person is old or not old; some people are ignorant of this fact. === As a property of objects === One possibility is that one's words and concepts are perfectly precise, but that objects themselves are vague. Consider Peter Unger's example of a cloud (from his famous 1980 paper, "The Problem of the Many"): it is not clear where the boundary of a cloud lies; for any given bit of water vapor, one can ask whether it is part of the cloud or not, and for many such bits, one will not know how to answer. Hence, perhaps such a term as 'cloud' is not itself vague, but rather precisely denotes a vague object. This strategy has occasionally been poorly received; most notably, in Gareth Evans' short paper "Can There Be Vague Objects?" (1978), wherein an argument is examined which appears to show that vague identity-statements are impossible (i.e., result in logical incoherence). David Lewis explains that the reader is intended to conclude, with Evans, that—since there clearly are, in fact, meaningful vague identities—any purported proof to the contrary cannot be right; and as the proof relies upon the premise that vague terms precisely denote vague objects, but fails under the view that vague terms reflect a merel

  • IJCAI Award for Research Excellence

    The IJCAI Award for Research Excellence is an award given at the IJCAI conference to a researcher in artificial intelligence in recognition of the excellence of their career. It was originally a biennial award; since 2016 the conference has been held annually, and so has the award. == Laureates == The recipients of this award have been: John McCarthy (1985); Allen Newell (1989); Marvin Minsky (1991); Raymond Reiter (1993); Herbert A. Simon (1995); Aravind Joshi (1997); Judea Pearl (1999); Donald Michie (2001); Nils Nilsson (2003); Geoffrey E. Hinton (2005); Alan Bundy (2007); Victor R. Lesser (2009); Robert Kowalski (2011); Hector Levesque (2013); Barbara Grosz (2015), for her pioneering research in Natural Language Processing and in theories and applications of Multiagent Collaboration; Michael I. Jordan (2016), for his groundbreaking and impactful research in both the theory and application of statistical machine learning; Andrew Barto (2017), for his pioneering work in the theory of reinforcement learning; Jitendra Malik (2018); Yoav Shoham (2019); Eugene Freuder (2020); Richard S. Sutton (2021); Stuart J. Russell (2022); Sarit Kraus (2023), for her pioneering work on the study of interactions among self-interested agents, creating the field of automated negotiation, and developing methods for coalition formation and teamwork, both as formal models and real-world implementations. == Laureates who also won the Turing Award == The following laureates have also received the Turing Award (year of the Turing Award in parentheses): John McCarthy (1971), Allen Newell (1975), Marvin Minsky (1969), Herbert A. Simon (1975), Judea Pearl (2011), Geoffrey Hinton (2018), Andrew Barto (2024), Richard S. Sutton (2024).

  • Mata v. Avianca, Inc.

    Mata v. Avianca, Inc. was a U.S. District Court for the Southern District of New York case in which the Court dismissed a personal injury case against the airline Avianca and issued a $5,000 fine to the plaintiff's lawyers, who had submitted fake precedents generated by ChatGPT in their legal briefs. == Background == In February 2022, Roberto Mata filed a personal injury lawsuit in the U.S. District Court for the Southern District of New York against Avianca, alleging that he was injured when a metal serving cart struck his knee during an international flight. The plaintiff's lawyers used ChatGPT to generate a legal motion, which cited numerous fake legal cases involving fictitious airlines, complete with fabricated quotations and internal citations. Avianca's lawyers notified the Court that they had been "unable to locate" a few of the legal cases cited in the motion. The Court could not locate the cases either and ordered the plaintiff's lawyers to provide copies of them. Mata's lawyers provided copies of documents purportedly containing all but one of the legal cases, after ChatGPT assured them that the cases "indeed exist" and "can be found in reputable legal databases such as LexisNexis and Westlaw." == Opinion == In June 2023, Judge P. Kevin Castel dismissed the personal injury case against Avianca and ordered the plaintiff's attorneys to pay a $5,000 fine. Judge Castel noted numerous inconsistencies in the opinion summaries, describing one of the legal analyses as "gibberish." Judge Castel held that Mata's lawyers had acted with "subjective bad faith" sufficient for sanctions under Rule 11 of the Federal Rules of Civil Procedure. == Impact == In July 2024, the American Bar Association issued its first formal ethics opinion on the responsibilities of lawyers using generative AI (GAI). The 15-page opinion outlines how the Rules of Professional Conduct apply to the use of GAI in the practice of law. 
Experts caution that lawyers cannot reasonably rely on the accuracy, completeness, or validity of content generated by GAI tools. Due to the continued usage of GAI in the practice of law, Mata has been described as a landmark case by legal professionals, as it is frequently cited by courts in cases where usage of GAI during the course of proceedings leads to the creation and citation of nonexistent caselaw.

  • JAX (software)

    JAX is a Python library for accelerator-oriented array computation and program transformation, designed for high-performance numerical computing and large-scale machine learning. It is developed by Google with contributions from Nvidia and other community contributors. It is described as bringing together a modified version of the automatic differentiation system autograd and OpenXLA's XLA (Accelerated Linear Algebra). It is designed to follow the structure and workflow of NumPy as closely as possible and works with various existing frameworks such as TensorFlow and PyTorch. The primary features of JAX are: a unified NumPy-like interface to computations that run on CPU, GPU, or TPU, in local or distributed settings; built-in just-in-time (JIT) compilation via OpenXLA, an open-source machine learning compiler ecosystem; efficient evaluation of gradients via its automatic differentiation transformations; and automatic vectorization to efficiently map functions over arrays representing batches of inputs. == Libraries using JAX == Flax, Equinox, Optax
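The features listed above correspond to JAX's `jax.numpy`, `jax.jit`, `jax.grad` and `jax.vmap`. The short sketch below shows them working together; the linear model, the `loss` and `predict_one` functions and the data are made up for illustration and are not taken from any JAX tutorial.

```python
import jax
import jax.numpy as jnp


def loss(w, x, y):
    """Mean squared error of a linear model (illustrative only)."""
    return jnp.mean((x @ w - y) ** 2)


# Automatic differentiation: gradient of loss with respect to its first argument.
grad_loss = jax.grad(loss)

# Just-in-time compilation of the gradient function.
fast_grad = jax.jit(grad_loss)


def predict_one(w, x_row):
    """Prediction for a single input row."""
    return jnp.dot(w, x_row)


# Automatic vectorization: map predict_one over the leading axis of x,
# holding w fixed (in_axes=None for w, 0 for x).
predict_batch = jax.vmap(predict_one, in_axes=(None, 0))

w = jnp.array([1.0, -2.0])
x = jnp.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = jnp.array([1.0, -2.0, -1.0])  # chosen so that w fits the data exactly

g = fast_grad(w, x, y)
preds = predict_batch(w, x)
print(g, preds)
```

Because `w` reproduces `y` exactly in this made-up data set, the gradient should be zero and the batched predictions should equal `y`. The same code runs unchanged on CPU, GPU or TPU, depending on which backend is installed.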

  • Midjourney

    Midjourney is a generative artificial intelligence program and service created and hosted by the San Francisco–based "independent research lab" Midjourney, Inc. Midjourney generates images from natural language descriptions, called prompts, similar to OpenAI's DALL-E and Stability AI's Stable Diffusion. It is one of the technologies of the AI boom. The tool was launched into open beta on July 12, 2022. The Midjourney team is led by David Holz, who co-founded Leap Motion. Holz told The Register in August 2022 that the company was already profitable. Users generate images with Midjourney using Discord bot commands or the official website. == History == Midjourney, Inc. was founded in San Francisco, California, by David Holz, previously a co-founder of Leap Motion. On March 14, 2022, the Midjourney Discord server launched with a request to post high-quality photographs to Twitter and Reddit for systems training. The Midjourney image generation platform entered open beta on July 12, 2022. === Model versions === The company has been working on improving its algorithms, releasing new model versions every few months. Version 2 of its algorithm was launched in April 2022, and version 3 on July 25 of that year. On November 5, 2022, the alpha iteration of version 4 was released to users. Starting with version 4, Midjourney models were trained on Google TPUs. On March 15, 2023, the alpha iteration of version 5 was released. The 5.1 model is more opinionated than version 5, applying more of its own stylization to images, whereas the 5.1 RAW model adds improvements and works better with more literal prompts. Version 5.2 included a new "aesthetics system" and the ability to "zoom out" by generating surroundings for an existing image. On December 21, 2023, the alpha iteration of version 6 was released. The model was trained from scratch over a nine-month period. It added better text rendering and a more literal interpretation of prompts. 
== Functionality == Midjourney is accessible through a Discord bot or by accessing their website. Users can use Midjourney through Discord either through their official Discord server, by directly messaging the bot, or by inviting the bot to a third-party server. To generate images, users use the /imagine command and type in a prompt; the bot then returns a set of four images, which users are given the option to upscale. To generate images on the website, users initially needed to have generated at least 1,000 images through the bot; this limitation has since been removed. === Vary (Region) + remix feature === Midjourney released a Vary (Region) feature on September 5, 2023, as part of MidJourney V5.2. This feature allows users to select a specific area of an image and apply variations only to that region while keeping the rest of the image unchanged. === Midjourney web interface === Midjourney introduced its web interface to make its tools more accessible, moving beyond its initial reliance on Discord. This web-based platform was launched in August 2024 alongside the release of Midjourney version 6.1. The web editor consolidates tools such as image editing, panning, zooming, region variation, and inpainting into a single interface. The introduction of the web interface also syncs conversations between Midjourney's Discord channels and web rooms, further enhancing collaboration across both platforms. This shift was in response to growing competition from other AI image generation platforms like Adobe Firefly and Google’s Imagen, which had already launched as native web apps with integration into popular design tools. === Image Weight === This feature lets users control how much influence an uploaded image has on the final output. By adjusting the "image weight" parameter, users can prioritize either the content of the prompt or the characteristics of the image. 
For instance, setting a higher weight will ensure that the generated result closely follows the image's structure and details, while a lower weight allows the text prompt to have more influence over the final output. === Style Reference === With Style Reference, users can upload an image to use as a stylistic guide for their creation. This tool enables MidJourney to extract the style—whether it is the color palette, texture, or overall atmosphere—from the reference image and apply it to a newly generated image. The feature allows users to fine-tune the aesthetics of their creations by integrating specific artistic styles or moods. === Character Reference === The Character Reference feature allows for a more targeted approach in defining characters. Users can upload an image of a character, and the system uses that image as a reference to generate similar characters in the output. This feature is particularly useful in maintaining consistency in appearance for characters across different images. == Uses == Midjourney's founder, David Holz, told The Register that artists use Midjourney for rapid prototyping of artistic concepts to show to clients before starting work themselves. The advertising industry quickly adopted AI tools such as Midjourney, DALL-E, and Stable Diffusion to create original content and brainstorm ideas. Architects have described using the software to generate mood boards for the early stages of projects, as an alternative to searching Google Images. === Notable usage and controversy === The program was used by the British magazine The Economist to create the front cover for an issue in June 2022. In Italy, the leading newspaper Corriere della Sera published a comic created with Midjourney by writer Vanni Santoni in August 2022. Charlie Warzel used Midjourney to generate two images of Alex Jones for Warzel's newsletter in The Atlantic. The use of an AI-generated cover was criticised by people who felt it was taking jobs from artists. 
Warzel called his action a mistake in an article about his decision to use generated images. Last Week Tonight with John Oliver included a 10-minute segment on Midjourney in an episode broadcast in August 2022. A Midjourney image called Théâtre D'opéra Spatial won first place in the digital art competition at the 2022 Colorado State Fair. Jason Allen, who wrote the prompt that led Midjourney to generate the image, printed the image onto a canvas and entered it into the competition using the name Jason M. Allen via Midjourney. Other digital artists were upset by the news. Allen was unapologetic, insisting that he followed the competition's rules. The two category judges were unaware that Midjourney used AI to generate images, although they later said that had they known this, they would have awarded Allen the top prize anyway. In December 2022, Midjourney was used to generate the images for an AI-generated children's book that was created over a weekend. Titled Alice and Sparkle, the book features a young girl who builds a robot that becomes self-aware. The creator, Ammaar Reeshi, used Midjourney to generate a large number of images, from which he chose 13 for the book. Both the product and process drew criticism. One artist wrote that "the main problem... is that it was trained off of artists' work. It's our creations, our distinct styles that we created, that we did not consent to being used." In 2023, the realism of AI-based text-to-image generators, such as Midjourney, DALL-E, or Stable Diffusion, reached such a high level that it led to a significant wave of viral AI-generated photos. Widespread attention was gained by a Midjourney-generated photo of Pope Francis wearing a white puffer coat, the fictional arrest of Donald Trump, and a hoax of an attack on the Pentagon, as well as the usage in professional creative arts. Research has suggested that the images Midjourney generates can be biased. 
For example, even neutral prompts in one study returned unequal results on the aspects of gender, skin color, and location. A study by researchers at the nonprofit group Center for Countering Digital Hate found the tool to be easy to use to generate racist and conspiratorial images. In October 2023, Rest of World reported that Midjourney tends to generate images based on national stereotypes. In 2024, a Frontiers journal published a paper which contained gibberish figures generated with Midjourney, one of which was a diagram of a rat with large testicles and a large penis towering over himself. The paper was retracted a day after the images went viral on Twitter. ==== Content moderation and censorship in Midjourney ==== Prior to May 2023, Midjourney implemented a moderation mechanism predicated on a banned word system. This method prohibited the use of language associated with explicit content, such as sexual or pornographic themes, as well as extreme violence. Moreover, the system also banned certain individual words, including those of religious and political figures, such as Allah or General Secretary of the Chinese Communist Party Xi Jinping. This practice occasionally stirred controversy due to perceiv

    Read more →
  • Perceptual computing

    Perceptual computing

    Perceptual computing is an application of Zadeh's theory of computing with words on the field of assisting people to make subjective judgments. == Perceptual computer == The perceptual computer – Per-C – an instantiation of perceptual computing – has the architecture that is depicted in Fig. 1 [2]–[6]. It consists of three components: encoder, CWW engine and decoder. Perceptions – words – activate the Per-C and are the Per-C output (along with data); so, it is possible for a human to interact with the Per-C using just a vocabulary. A vocabulary is application (context) dependent, and must be large enough so that it lets the end-user interact with the Per-C in a user-friendly manner. The encoder transforms words into fuzzy sets (FSs) and leads to a codebook – words with their associated FS models. The outputs of the encoder activate a Computing With Words (CWW) engine, whose output is one or more other FSs, which are then mapped by the decoder into a recommendation (subjective judgment) with supporting data. The recommendation may be in the form of a word, group of similar words, rank or class. Although many details are needed in order to implement the Per-C's three components – encoder, decoder and CWW engine – and they are covered in [5], it is when the Per-C is applied to specific applications, that the focus on the methodology becomes clear. Stepping back from those details, the methodology of perceptual computing is: Focus on an application (A). Establish a vocabulary (or vocabularies) for A. Collect interval end-point data from a group of subjects (representative of the subjects who will use the Per-C) for all of the words in the vocabulary. Map the collected word data into word-FOUs by using the Interval Approach [1], [5, Ch. 3]. The result of doing this is the codebook (or codebooks) for A, and completes the design of the encoder of the Per-C. Choose an appropriate CWW engine for A. It will map IT2 FSs into one or more IT2 FSs. 
Examples of CWW engines are: IF-THEN rules [5, Ch. 6] and Linguistic Weighted Averages [6], [5, Ch. 5]. If an existing CWW engine is available for A, then use its available mathematics to compute its output(s). Otherwise, develop such mathematics for the new kind of CWW engine. The new CWW engine should be constrained so that its output(s) resemble the FOUs in the codebook(s) for A. Map the IT2 FS outputs from the CWW engine into a recommendation at the output of the decoder. If the recommendation is a word, rank or class, then use existing mathematics to accomplish this mapping [5, Ch. 4]. Otherwise, develop such mathematics for the new kind of decoder. == Applications of Per-C == To-date a Per-C has been implemented for the following four applications: (1) investment decision-making, (2) social judgment making, (3) distributed decision making, and (4) hierarchical and distributed decision-making. A specific example of the fourth application is the so-called Journal Publication Judgment Advisor [5, Ch. 10] in which for the first time only words are used at every level of the following hierarchical and distributed decision making process: n reviewers have to provide a subjective recommendation about a journal article that has been sent to them by the Associate Editor, who then has to aggregate the independent recommendations into a final recommendation that is sent to the Editor-in-Chief of the journal. Because it is very problematic to ask reviewers to provide numerical scores for paper-evaluation sub-categories (the two major categories are Technical Merit and Presentation), such as importance, content, depth, style, organization, clarity, references, etc., each reviewer will only be asked to provide a linguistic score for each of these categories. 
They will not be asked for an overall recommendation about the paper, because in the past it has been quite common for reviewers who provide the same numerical scores for such categories to give very different publishing recommendations. By leaving the specific recommendation to the associate editor, such inconsistencies can, it is hoped, be eliminated. The words are aggregated using a linguistic weighted average, which reflects each reviewer's recommendation as well as each reviewer's expertise in the paper's subject matter. Although the journal publication judgment advisor uses reviewers and an associate editor, the word “reviewer” could be replaced by judge, expert, low-level manager, commander, referee, etc., and the term “associate editor” could be replaced by control center, command center, higher-level manager, etc. So, this application is potentially applicable in many other settings. Recently, a new Per-C based failure mode and effects analysis (FMEA) methodology was developed, and its application to edible bird's nest farming in Borneo has been reported. In addition, the application of a Per-C based method to educational assessment, for cooperative learning of students, has been reported. In summary, the Per-C (whose development has taken more than a decade) is the first complete implementation of Zadeh's CWW paradigm, as applied to assisting people to make subjective judgments.
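The encoder, CWW engine and decoder pipeline described above can be illustrated with a deliberately simplified sketch. It assumes plain intervals on a 0 to 10 scale instead of interval type-2 fuzzy sets, crisp numeric reviewer weights instead of word-valued ones, and a Jaccard overlap as the decoder's similarity measure. The codebook values and every function name are hypothetical and are not taken from the Per-C literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float


# Hypothetical codebook: each word maps to an interval on a 0-10 scale.
# A real Per-C codebook would hold IT2 FS footprints of uncertainty instead.
CODEBOOK = {
    "poor": Interval(0.0, 3.0),
    "fair": Interval(2.5, 5.5),
    "good": Interval(5.0, 8.0),
    "excellent": Interval(7.5, 10.0),
}


def encode(word: str) -> Interval:
    """Encoder: look the word up in the codebook."""
    return CODEBOOK[word]


def weighted_average(scores: list[Interval], weights: list[float]) -> Interval:
    """Stand-in CWW engine: weighted average of the interval end-points."""
    total = sum(weights)
    lo = sum(w * s.lo for w, s in zip(weights, scores)) / total
    hi = sum(w * s.hi for w, s in zip(weights, scores)) / total
    return Interval(lo, hi)


def jaccard(a: Interval, b: Interval) -> float:
    """Overlap length divided by the length of the union of two intervals."""
    inter = max(0.0, min(a.hi, b.hi) - max(a.lo, b.lo))
    union = (a.hi - a.lo) + (b.hi - b.lo) - inter
    return inter / union if union > 0 else 0.0


def decode(result: Interval) -> str:
    """Decoder: return the codebook word most similar to the engine output."""
    return max(CODEBOOK, key=lambda word: jaccard(CODEBOOK[word], result))


def recommend(words: list[str], weights: list[float]) -> str:
    return decode(weighted_average([encode(w) for w in words], weights))


# Three reviewers; the second is weighted twice as heavily as the others.
print(recommend(["good", "excellent", "fair"], [1.0, 2.0, 1.0]))  # good
```

The point of the sketch is only the data flow: words go in, a word comes out, and the numeric models stay hidden in the codebook.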

    Read more →
  • International Olympiad in Artificial Intelligence

    International Olympiad in Artificial Intelligence

    The International Olympiad in Artificial Intelligence (IOAI) is an annual International Science Olympiad in the field of artificial intelligence (AI) for secondary education students under the age of 20. The first IOAI was held in Burgas, Bulgaria, in 2024. Each country or territory may send up to two teams, each consisting of up to four students supported by one leader. Participants are selected through a multi-stage National Olympiad in Artificial Intelligence (NOAI) and/or a Regional Olympiad such as the NAOAI or APOAI. Participants at the IOAI compete on an individual basis. As of 2025, there were 61 countries and territories participating in the IOAI. Three hundred students participated in IOAI 2025. As of 2026, 130 countries and territories are accredited for participation in the IOAI. == Competition Structure == The IOAI consists of three contests: the Individual Contest, the Team Challenge, and the GAITE contest. Medals are awarded based solely on the Individual Contest. === Individual Contest === The Individual Contest is the main competition of the IOAI in which contestants compete individually on separate computers and are not permitted to communicate during the contest. Medals are awarded solely on the basis of the total score from the two-day Individual Contest. The Individual Contest consists of two on-site contest days (six hours per day), preceded by an at-home practice round and an on-site practice session. In IOAI 2025, three at-home problems were released for preparation approximately one month before the on-site contest. Results from this at-home round do not affect final results. The first on-site contest day (Individual Contest 1) comprises three tasks as extensions and continuations of the at-home tasks, while the second day (Individual Contest 2) comprises two or three tasks which are novel and different from the at-home tasks. 
The Individual Contest tasks span various AI domains such as machine learning, natural language processing, and computer vision. The IOAI 2025 contest rules describe tasks as requiring typical machine-learning workflows, including writing code, fitting models on training data, and running inference on test data, using identical local machines and GPU resources (minimum 24 GB RAM). Tasks, datasets, and submissions are handled through a contest platform (Bohrium), including a web-based Jupyter notebook environment for GPU access. Internet access is restricted to a whitelist of documentation sites and an integrated compact large language model accessible within the platform. The use of external APIs is prohibited unless a task explicitly allows them. In IOAI 2025, each contest task was scored up to 100 points and could include multiple subtasks. Scores are normalized using a baseline solution and a maximum score derived from either a Scientific Committee solution or the best contestant submission. Contestants can view only their own scores during the contest; a live scoreboard may be publicly available outside the contest hall, but contestants are not permitted to view it during the contest. For non-English-speaking teams, the IOAI holds a translation session beginning three hours before each contest day, in which team leaders review and may amend machine-translated task statements; translations must match the English original and are published after the contest. The IOAI committee also enforces quarantine restrictions during these translation sessions, under which neither contestants nor team leaders may use cell phones, laptops, or other communication devices. === Team Challenge === The Team Challenge is a team-based component of the IOAI. The results of this part do not affect the distribution of medals. 
The IOAI 2025 rules describe it as a “creative and AI-oriented challenge” in which a team's contestants sit together and cooperate, with the format varying by year. In IOAI 2024, teams worked with existing AI image and video generation tools to produce a visual result. In IOAI 2025, teams were assigned to program a robot to complete various tasks. === GAITE Contest === The GAITE (Global AI Talent Empowerment) contest is a simplified version of the individual contest with a separate scoreboard, where participants may ask for hints. It is designed for countries and territories with a limited history in International Science Olympiads, and it awards alternative prizes instead of medals. == Awards Distribution == The top 50% of the participants in the individual contest receive gold, silver and bronze medals in a ratio of 1:2:3, respectively. The top three individuals receive honorary trophies. As in other International Science Olympiads, if an individual is in the top 50% on one of the days but does not receive a medal, they receive an honorary mention during the awards ceremony. The GAITE contest uses similar cutoff logic, but its participants receive a reward instead of a medal. The top three teams in the Team Challenge receive trophies. == National selection and regional competitions == National delegations are selected through country-level qualification processes referred to as National Olympiads in Artificial Intelligence (NOAI) or equivalent, which are widely known for their low success rates. Although the total number of participants worldwide is not published, available data indicate exceptionally competitive national pools; for example, Brazil reports over 716,000 competitors, while Russia reports more than 72,000. In addition, Regional Olympiads (for example, APOAI or NAOAI) provide continent-level competition and preparation platforms in most regions. 
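Returning to the medal rule in the Awards Distribution section, the arithmetic can be worked through in a few lines. This is a sketch under stated assumptions: the text gives only the 50% cutoff and the 1:2:3 ratio, so the rounding (floor for gold and silver, remainder to bronze) and the function name `medal_counts` are illustrative choices, not the official IOAI procedure.

```python
def medal_counts(n_participants: int) -> dict[str, int]:
    """Split the top half of the field into gold:silver:bronze = 1:2:3.

    Rounding is an assumption: gold and silver are floored and bronze
    takes whatever remains of the medal pool.
    """
    medalists = n_participants // 2      # top 50% of participants
    gold = medalists // 6                # 1 part out of 1 + 2 + 3 = 6
    silver = (2 * medalists) // 6        # 2 parts out of 6
    bronze = medalists - gold - silver   # remaining 3 parts
    return {"gold": gold, "silver": silver, "bronze": bronze}


# With the 300 participants reported for IOAI 2025:
print(medal_counts(300))  # {'gold': 25, 'silver': 50, 'bronze': 75}
```

Under these assumptions roughly one participant in twelve receives gold, one in six silver, and one in four bronze.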
=== National Selection (National Olympiads in Artificial Intelligence) === Participating countries and territories select their students for the IOAI through a National Olympiad in Artificial Intelligence (NOAI) or an equivalent process. The names of these selection processes differ by country, but almost all of them (excluding newer countries participating in the GAITE contest) have in common that the process comprises multiple and/or extremely rigorous selection stages. United States / Canada – The USA–North America AI Olympiad (USAAIO) is a three-round process including an invitational in-person round and a subsequent selection camp, after which a national delegation is selected for IOAI. Russia – The Russian Olympiad in Artificial Intelligence is organized as a multi-stage process (training, qualification, main round, final). Organizers reported 72,316 registrations for the training round and 52,260 registrations for the qualifying round in one season, with tasks spanning mathematics, algorithms/programming, and machine learning; 977 students were disqualified following plagiarism checks. Japan – Japan's national selection consists of multiple stages, beginning with the Japan Olympiad in Artificial Intelligence (JOAI), a large-scale Kaggle-style competition. High-performing participants advance through additional assessment stages, including written solution reports and technical interviews. From this process, eight students are selected for the APOAI team, with four ultimately chosen to represent Japan at the IOAI. Brazil – Brazil's National Olympiad in Artificial Intelligence (ONIA) is conducted as a large competition which consists of progressive rounds of evaluation. It identifies 28 top students from over 716,000 competitors, four of which are selected for the IOAI. The competition is held in four phases across two cycles, including a two-step third phase and a final training-and-evaluation phase that selects a four-student national team. 
Singapore – Singapore's national Olympiad consists of two rounds: an online preliminary round (300 MCQs in 3 hours) selects the top 150 performers to advance to the final assessment, which includes both theory questions and Python programming tasks. Additional training and selection may follow the finals for top performers. Poland – The Polish AI Olympiad adopts a two-stage structure: an open online first stage (at-home tasks) and a second-stage competitive camp with 30 selected participants competing for a four-person IOAI team. France – The Olympiades Françaises d'Intelligence Artificielle (OFIA), organized by France-IOI, follow a three-stage structure consisting of an open online qualification round, a second selection round, and a multi-day national training camp and final in Paris. Bangladesh – The Bangladesh AI Olympiad (BdAIO) selects competitors in three rounds: the online preliminary round, the national finals, and the team selection camp. In 2025, 406 participants competed in the national finals. Norway – The Norwegian AI Olympiad (NOKI) is a three-stage selection system; however, unlike in other countries, its first two rounds are shared with the Norwegian Informatics Olympiad. The national Olympiad reports 1,180 participants in the first round. Hong Kong – The national Olympiad reported more than 800 preliminary-round entrants, narrowing through multiple rounds to 25 finalists, with a subsequent

    Read more →
  • Adversarial machine learning

    Adversarial machine learning

    Adversarial machine learning is the study of the attacks on machine learning algorithms, and of the defenses against such attacks. Machine learning techniques are mostly designed to work on specific problem sets, under the assumption that the training and test data are generated from the same statistical distribution (IID). However, this assumption is often violated in practical high-stake applications, where users may intentionally supply fabricated data that violates the statistical assumption. Most common attacks in adversarial machine learning include evasion attacks, data poisoning attacks, Byzantine attacks and model extraction. == History == At the MIT Spam Conference in January 2004, John Graham-Cumming showed that a machine-learning spam filter could be used to defeat another machine-learning spam filter by automatically learning which words to add to a spam email to get the email classified as not spam. In 2004, Nilesh Dalvi and others noted that linear classifiers used in spam filters could be defeated by simple "evasion attacks" as spammers inserted "good words" into their spam emails. (Around 2007, some spammers added random noise to fuzz words within "image spam" in order to defeat OCR-based filters.) In 2006, Marco Barreno and others published "Can Machine Learning Be Secure?", outlining a broad taxonomy of attacks. As late as 2013 many researchers continued to hope that non-linear classifiers (such as support vector machines and neural networks) might be robust to adversaries, until Battista Biggio and others demonstrated the first gradient-based attacks on such machine-learning models (2012–2013). In 2012, deep neural networks began to dominate computer vision problems; starting in 2014, Christian Szegedy and others demonstrated that deep neural networks could be fooled by adversaries, again using a gradient-based attack to craft adversarial perturbations. 
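The idea behind a gradient-based evasion attack can be sketched on a tiny logistic-regression classifier. This is an illustrative gradient-sign perturbation under assumed weights and inputs; it is not the specific method of Biggio et al. or Szegedy et al., and all names and numbers below are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def score(x: list[float], w: list[float], b: float) -> float:
    """Linear score w.x + b of a logistic-regression classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def gradient_sign_perturb(x, w, b, y, eps):
    """Move each feature by eps in the direction that increases the loss.

    For logistic loss, the gradient with respect to the input is (p - y) * w,
    where p is the predicted probability of class 1 and y is the true label.
    """
    p = sigmoid(score(x, w, b))
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input, for illustration only.
w, b = [2.0, -1.0, 0.5], -0.5
x = [1.0, 0.5, 1.0]                     # classified as class 1 (score 1.5)
x_adv = gradient_sign_perturb(x, w, b, y=1, eps=0.6)

print(sigmoid(score(x, w, b)) > 0.5)      # True: original is class 1
print(sigmoid(score(x_adv, w, b)) > 0.5)  # False: perturbed input flips
```

Each feature moves by at most `eps`, yet the score drops by `eps` times the sum of the absolute weights, which is why small bounded perturbations can be enough to cross a decision boundary.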
Further work would show that adversarial attacks are harder to produce in uncontrolled environments, due to the different environmental constraints that cancel out the effect of noise. For example, any small rotation or slight change in illumination applied to an adversarial image can destroy its adversarial effect. In addition, researchers such as Google Brain's Nick Frosst point out that it is much easier to make self-driving cars miss stop signs by physically removing the sign itself than by creating adversarial examples. Frosst also believes that the adversarial machine learning community incorrectly assumes models trained on a certain data distribution will also perform well on a completely different data distribution. He suggests that a new approach to machine learning should be explored, and is currently working on a unique neural network that has characteristics more similar to human perception than state-of-the-art approaches. While adversarial machine learning continues to be heavily rooted in academia, large tech companies such as Google, Microsoft, and IBM have begun curating documentation and open source code bases to allow others to concretely assess the robustness of machine learning models and minimize the risk of adversarial attacks. === Examples === Examples include attacks in spam filtering, where spam messages are obfuscated through the misspelling of "bad" words or the insertion of "good" words; attacks in computer security, such as obfuscating malware code within network packets or modifying the characteristics of a network flow to mislead intrusion detection; and attacks in biometric recognition, where fake biometric traits may be exploited to impersonate a legitimate user or to compromise users' template galleries that adapt to updated traits over time. Researchers showed that by changing only one pixel it was possible to fool deep learning algorithms. 
Others 3-D printed a toy turtle with a texture engineered to make Google's object detection AI classify it as a rifle regardless of the angle from which the turtle was viewed. Creating the turtle required only low-cost commercially available 3-D printing technology. A machine-tweaked image of a dog was shown to look like a cat to both computers and humans. A 2019 study reported that humans can guess how machines will classify adversarial images. Researchers discovered methods for perturbing the appearance of a stop sign such that an autonomous vehicle classified it as a merge or speed limit sign. A data poisoning filter called Nightshade was released in 2023 by researchers at the University of Chicago. It was created for use by visual artists to put on their artwork to corrupt the data set of text-to-image models, which usually scrape their data from the internet without the consent of the image creator. McAfee attacked Tesla's former Mobileye system, fooling it into driving 50 mph over the speed limit, simply by adding a two-inch strip of black tape to a speed limit sign. Adversarial patterns on glasses or clothing designed to deceive facial-recognition systems or license-plate readers, have led to a niche industry of "stealth streetwear". An adversarial attack on a neural network can allow an attacker to inject algorithms into the target system. Researchers can also create adversarial audio inputs to disguise commands to intelligent assistants in benign-seeming audio; a parallel literature explores human perception of such stimuli. Clustering algorithms are used in security applications. Malware and computer virus analysis aims to identify malware families, and to generate specific detection signatures. In the context of malware detection, researchers have proposed methods for adversarial malware generation that automatically craft binaries to evade learning-based detectors while preserving malicious functionality. 
Optimization-based attacks such as GAMMA use genetic algorithms to inject benign content (for example, padding or new PE sections) into Windows executables, framing evasion as a constrained optimization problem that balances misclassification success with the size of the injected payload and showing transferability to commercial antivirus products. Complementary work uses generative adversarial networks (GANs) to learn feature-space perturbations that cause malware to be classified as benign; Mal-LSGAN, for instance, replaces the standard GAN loss with a least-squares objective and modified activation functions to improve training stability and produce adversarial malware examples that substantially reduce true positive rates across multiple detectors. == Challenges in applying machine learning to security == Researchers have observed that the constraints under which machine-learning techniques function in the security domain are different from those of common benchmark domains. Security data may change over time, include mislabeled samples, or reflect adversarial behavior, which complicates evaluation and reproducibility. === Data collection issues === Security datasets vary across formats, including binaries, network traces, and log files. Studies have reported that the process of converting these sources into features can introduce bias or inconsistencies. In addition, time-based leakage can occur when related malware samples are not properly separated across training and testing splits, which may lead to overly optimistic results. === Labeling and ground truth challenges === Malware labels are often unstable because different antivirus engines may classify the same sample in conflicting ways. Ceschin et al. note that families may be renamed or reorganized over time, causing further discrepancies in ground truth and reducing the reliability of benchmarks. 
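The time-based leakage mentioned under data collection issues is commonly avoided by splitting on a date cutoff rather than at random. The following is a minimal sketch of that idea; the sample records, field names and the `temporal_split` helper are hypothetical and not drawn from any particular dataset or study.

```python
from datetime import date

# Hypothetical sample records: hash, family label and first-seen date.
samples = [
    {"sha": "a1", "family": "alpha", "first_seen": date(2021, 3, 1)},
    {"sha": "a2", "family": "alpha", "first_seen": date(2021, 9, 12)},
    {"sha": "b1", "family": "beta", "first_seen": date(2022, 1, 20)},
    {"sha": "a3", "family": "alpha", "first_seen": date(2022, 6, 5)},
    {"sha": "c1", "family": "gamma", "first_seen": date(2022, 8, 30)},
]


def temporal_split(records, cutoff):
    """Train only on samples first seen before the cutoff, test on the rest.

    A random split could place a later variant of a family in training and
    an earlier one in testing, letting the model "see the future".
    """
    train = [r for r in records if r["first_seen"] < cutoff]
    test = [r for r in records if r["first_seen"] >= cutoff]
    return train, test


train, test = temporal_split(samples, cutoff=date(2022, 1, 1))
print([r["sha"] for r in train])  # ['a1', 'a2']
print([r["sha"] for r in test])   # ['b1', 'a3', 'c1']
```

Here the test set contains both a later variant of a known family and two unseen families, which is closer to what a deployed detector would face than a shuffled split.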
=== Concept drift === Because malware creators continuously adapt their techniques, the statistical properties of malicious samples also change. This form of concept drift has been widely documented and may reduce model performance unless systems are updated regularly or incorporate mechanisms for incremental learning. === Feature robustness === Researchers differentiate between features that can be easily manipulated and those that are more resistant to modification. For example, simple static attributes, such as header fields, may be altered by attackers, while structural features, such as control-flow graphs, are generally more stable but computationally expensive to extract. === Class imbalance === In realistic deployment environments, the proportion of malicious samples can be extremely low, ranging from 0.01% to 2% of total data. This unbalanced distribution causes models to develop a bias towards the majority class, achieving high accuracy but failing to identify malicious samples. Prior approaches to this problem have included both data-level solutions and sequence-specific models. Methods like n-gram and Long Short-Term Memory (LSTM) networks can model sequential data, but their performance has been shown to decline significantly when malware samples are realistically proportioned in the training set, demonstrating the limitations in

    Read more →
  • Fuzzy markup language

    Fuzzy markup language

    Fuzzy Markup Language (FML) is a specific-purpose markup language based on XML, used for describing the structure and behavior of a fuzzy system independently of the hardware architecture devoted to hosting and running it. == Overview == FML was designed and developed by Giovanni Acampora during his Ph.D. course in Computer Science at the University of Salerno, Italy, in 2004. The original idea that inspired Giovanni Acampora to create FML was the need for a cooperative fuzzy-based framework aimed at automatically controlling a living environment characterized by a plethora of heterogeneous devices, whose interactions were devoted to maximizing human comfort under energy-saving constraints. This framework represented one of the first concrete examples of Ambient Intelligence. Beyond this pioneering application, the major advantage of using XML to describe a fuzzy system is hardware/software interoperability. Indeed, all that is needed to read an FML file is the appropriate schema for that file and an FML parser. This markup approach makes it much easier to exchange fuzzy systems between software: for example, a machine learning application could extract fuzzy rules which could then be read directly into a fuzzy inference engine or uploaded into a fuzzy controller. Also, with technologies like XSLT, it is possible to compile the FML into the programming language of your choice, ready for embedding into whatever application you please. 
As stated by Mike Watts on his popular Computational Intelligence blog: "Although Acampora's motivation for developing FML seems to be to develop embedded fuzzy controllers for ambient intelligence applications, FML could be a real boon for developers of fuzzy rule extraction algorithms: from my own experience during my PhD, I know that having to design a file format and implement the appropriate parsers for rule extraction and fuzzy inference engines can be a real pain, taking as much time as implementing the rule extraction algorithm itself. I would much rather have used something like FML for my work." A complete overview of FML and related applications can be found in the book titled On the power of Fuzzy Markup Language edited by Giovanni Acampora, Chang-Shing Lee, Vincenzo Loia and Mei-Hui Wang, and published by Springer in the series Studies on Fuzziness and Soft Computing. == Syntax, grammar and hardware synthesis == FML allows fuzzy systems to be coded through a collection of correlated semantic tags capable of modeling the different components of a classical fuzzy controller such as knowledge base, rule base, fuzzy variables and fuzzy rules. Therefore, the FML tags used to build a fuzzy controller represent the set of lexemes used to create fuzzy expressions. In order to design a well-formed XML-based language, an FML context-free grammar is defined by means of a XML schema which defines name, type and attributes characterized each XML element. However, since an FML program represents only a static view of a fuzzy logic controller, XSLT is provided to change this static view to a computable version. Indeed, XSLTs modules are able to convert the FML-based fuzzy controller in a general purpose computer language using an XSL file containing the translation description. At this level, the control is executable for the hardware. 
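Because an FML document is ordinary XML, a generic XML parser is enough to read it. The sketch below illustrates this with Python's standard `xml.etree.ElementTree`. The element names, the `a`/`b`/`c` shape parameters and the membership values are assumptions made for illustration and are not copied from the FML schema; the attribute names `name`, `ip`, `domainLeft`, `domainRight` and `type` follow the description in this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical FML-like fragment; element and shape-parameter names are assumed.
FML_DOC = """
<FuzzyController name="tipper" ip="127.0.0.1">
  <KnowledgeBase>
    <FuzzyVariable name="food" domainLeft="0" domainRight="10" type="input">
      <FuzzyTerm name="rancid">
        <TriangularShape a="0" b="2" c="5.5"/>
      </FuzzyTerm>
      <FuzzyTerm name="delicious">
        <TriangularShape a="5.5" b="8" c="10"/>
      </FuzzyTerm>
    </FuzzyVariable>
  </KnowledgeBase>
</FuzzyController>
"""


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership degree of x in a triangular fuzzy set with peak at b."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(doc: str, variable: str, value: float) -> dict[str, float]:
    """Return the membership degree of value in every term of one variable."""
    root = ET.fromstring(doc.strip())
    degrees = {}
    for var in root.iter("FuzzyVariable"):
        if var.get("name") != variable:
            continue
        for term in var.findall("FuzzyTerm"):
            shape = term.find("TriangularShape")
            a, b, c = (float(shape.get(k)) for k in ("a", "b", "c"))
            degrees[term.get("name")] = triangular(value, a, b, c)
    return degrees


print(fuzzify(FML_DOC, "food", 3.0))
```

Nothing in the reader depends on the platform that will eventually run the controller, which is the interoperability argument made in the overview.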
In short, FML is essentially composed by three layers: XML in order to create a new markup language for fuzzy logic control; a XML Schema in order to define the legal building blocks; eXtensible Stylesheet Language Transformations (XSLT) in order to convert a fuzzy controller description into a specific programming language. === Syntax === FML syntax is composed of XML tags and attributes which describe the different components of a fuzzy logic controller listed below: fuzzy knowledge base; fuzzy rule base; inference engine fuzzification subsystem; defuzzification subsystem. In detail, the opening tag of each FML program is which represents the fuzzy controller under modeling. This tag has two attributes: name and ip. The first attribute permits to specify the name of fuzzy controller and ip is used to define the location of controller in a computer network. The fuzzy knowledge base is defined by means of the tag which maintains the set of fuzzy concepts used to model the fuzzy rule base. In order to define the fuzzy concept related controlled system, tag uses a set of nested tags: defines the fuzzy concept; defines a linguistic term describing the fuzzy concept; a set of tags defining a shape of fuzzy sets are related to fuzzy terms. The attributes of tag are: name, scale, domainLeft, domainRight, type and, for only an output, accumulation, defuzzifier and defaultValue. 
The name attribute defines the name of fuzzy concept, for instance, temperature; scale is used to define the scale used to measure the fuzzy concept, for instance, Celsius degree; domainLeft and domainRight are used to model the universe of discourse of fuzzy concept, that is, the set of real values related to fuzzy concept, for instance [0°,40°] in the case of Celsius degree; the position of fuzzy concept into rule (consequent part or antecedent part) is defined by type attribute (input/output); accumulation attribute defines the method of accumulation that is a method that permits the combination of results of a variable of each rule in a final result; defuzzifier attribute defines the method used to execute the conversion from a fuzzy set, obtained after aggregation process, into a numerical value to give it in output to system; defaultValue attribute defines a real value used only when no rule has fired for the variable at issue. As for tag , it uses two attributes: name used to identify the linguistic value associate with fuzzy concept and complement, a boolean attribute that defines, if it is true, it is necessary to consider the complement of membership function defined by given parameters. Fuzzy shape tags, used to complete the definition of fuzzy concept, are: Every shaping tag uses a set of attributes which defines the real outline of corresponding fuzzy set. The number of these attributes depends on the chosen fuzzy set shape. In order to make an example, consider the Tipper Inference System described in Mathworks Matlab Fuzzy Logic Toolbox Tutorial. This Mamdani system is used to regulate the tipping in, for example, a restaurant. It has got two variables in input (food and service) and one in output (tip). FML code for modeling part of knowledge base of this fuzzy system containing variables food and tip is shown below. A special tag that can furthermore be used to define a fuzzy shape is . This tag is used to customize fuzzy shape (custom shape). 
The custom shape modeling is performed via a set of tags that lists the extreme points of geometric area defining the custom fuzzy shape. Obviously, the attributes used in tag are x and y coordinates. As for rule base component, FML allows to define a set of rule bases, each one of them describes a different behavior of system. The root of each rule base is modeled by tag which defines a fuzzy rule set. The tag uses five attributes: name, type, activationMethod, andMethod and orMethod. Obviously, the name attribute uniquely identifies the rule base. The type attribute permits to specify the kind of fuzzy controller (Mamdani or TSK) respect to the rule base at issue. The activationMethod attribute defines the method used to implication process; the andMethod and orMethod attribute define, respectively, the and and or algorithm to use by default. In order to define the single rule the tag is used. The attributes used by the tag are: name, connector, operator and weight. The name attribute permits to identify the rule; connector is used to define the logical operator used to connect the different clauses in antecedent part (and/or); operator defines the algorithm to use for chosen connector; weight defines the importance of rule during inference engine step. The definition of antecedent and consequent rule part is obtained by using and tags. tag is used to model the fuzzy clauses in antecedent and consequent part. This tag use the attribute modifier to describe a modification to term used in the clause. The possible values for this attribute are: above, below, extremely, intensify, more or less, norm, not, plus, slightly, somewhat, very, none. To complete the definition of fuzzy clause the nested and tags have to be used. A sequence of tags realizes a fuzzy rule base. As example, consider a Mamdani rule composed by (food is rancid) OR (servi

    Read more →
  • Agent2Agent

    Agent2Agent

    Agent2Agent (A2A) is an open protocol that defines how artificial intelligence agents communicate with each other across different systems. It is intended to allow agents built by different vendors or frameworks to discover one another, exchange messages, and coordinate tasks. == History == The Agent2Agent protocol was announced by Google in April 2025 as an open standard for agent interoperability. In June 2025, Google transferred the protocol, its specification, and related software development kits to the Linux Foundation. The Linux Foundation established the Agent2Agent project to provide vendor-neutral governance. == Design == The A2A protocol supports communication between autonomous software agents operating across different platforms and organizations. It enables agents to discover one another and exchange structured messages without requiring shared internal state or proprietary integrations. A2A uses metadata documents, known as Agent Cards, to describe an agent's capabilities and how it can be accessed. These documents are exchanged using widely adopted web technologies such as HTTP and JSON-based messaging formats. A2A includes support for authentication and authorization to control which agents may participate in workflows. The protocol supports established security technologies including Transport Layer Security (TLS), JSON Web Tokens (JWTs), and OpenID Connect. A2A is often discussed alongside the Model Context Protocol (MCP). MCP focuses on connecting agents to tools and data sources, while A2A focuses on communication between agents themselves. == Adoption == At the time the Linux Foundation adopted the protocol, more than 100 technology companies had announced support for the Agent2Agent project. Microsoft stated that it planned to support the protocol in its AI platforms. == Reception == Technology press coverage has described A2A as an attempt to reduce fragmentation in AI agent ecosystems by providing a shared communication layer. 
TechRepublic characterized the protocol as part of a broader industry effort to reduce vendor lock-in for enterprise AI systems.
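As a rough illustration of the Agent Card idea described above, the following Python sketch builds a small JSON metadata document and reads it back the way a discovering agent might. The field names (name, description, url, skills), the example URL, and both helper functions are illustrative assumptions; they are not taken from the A2A specification.

```python
import json

def build_agent_card(name, description, url, skills):
    """Return a hypothetical Agent Card as a plain dict (field names are assumed)."""
    return {
        "name": name,
        "description": description,
        "url": url,  # where the agent can be reached (assumed field)
        "skills": [{"id": s, "description": d} for s, d in skills.items()],
    }

def supports_skill(card_json, skill_id):
    """Parse a card received as JSON text and check whether it lists a skill."""
    card = json.loads(card_json)
    return any(skill["id"] == skill_id for skill in card.get("skills", []))

card = build_agent_card(
    name="translator-agent",
    description="Translates short texts between languages.",
    url="https://agents.example.com/translator",  # placeholder address
    skills={"translate": "Translate text to a target language"},
)
card_json = json.dumps(card, indent=2)  # the document an agent would publish
```

A client that fetched card_json over HTTP could call supports_skill(card_json, "translate") before deciding whether to send the agent a task.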

    Read more →
  • Machine ethics

    Machine ethics

    Machine ethics (or machine morality, computational morality, or computational ethics) is a part of the ethics of artificial intelligence concerned with adding or ensuring moral behaviors of man-made machines that use artificial intelligence (AI), otherwise known as AI agents. Machine ethics differs from other ethical fields related to engineering and technology. It should not be confused with computer ethics, which focuses on human use of computers. It should also be distinguished from the philosophy of technology, which concerns itself with technology's grander social effects. == Definitions == James H. Moor, one of the pioneering theoreticians in the field of computer ethics, defines four kinds of ethical robots. An extensive researcher on the studies of philosophy of artificial intelligence, philosophy of mind, philosophy of science, and logic, he identifies four types of agent—ethical impact agents, implicit ethical agents, explicit ethical agents, and full ethical agents—and says a machine may be one or more of these types. Ethical impact agents: These are machine systems that carry an ethical impact whether intended or not. At the same time, they have the potential to act unethically. Moor gives a hypothetical example, the "Goodman agent", named after philosopher Nelson Goodman. The Goodman agent compares dates but has the millennium bug. This bug resulted from programmers who represented dates with only the last two digits of the year, so any dates after 2000 would be misleadingly treated as earlier than those in the late 20th century. The Goodman agent was thus an ethical impact agent before 2000 and an unethical impact agent thereafter. Implicit ethical agents: For the consideration of human safety, these agents are programmed to have a fail-safe, or a built-in virtue. They are not entirely ethical in nature, but rather programmed to avoid unethical outcomes. 
Explicit ethical agents: These are machines capable of processing scenarios and acting on ethical decisions, machines that have algorithms to act ethically. Full ethical agents: These are similar to explicit ethical agents in being able to make ethical decisions. But they also have human metaphysical features (i.e., have free will, consciousness, and intentionality). (See artificial systems and moral responsibility.) == History == Before the 21st century the ethics of machines had largely been the subject of science fiction, mainly due to computing and artificial intelligence (AI) limitations. Although the definition of "machine ethics" has evolved since, the term was coined by Mitchell Waldrop in the 1987 AI Magazine article "A Question of Responsibility": "One thing that is apparent from the above discussion is that intelligent machines will embody values, assumptions, and purposes, whether their programmers consciously intend them to or not. Thus, as computers and robots become more and more intelligent, it becomes imperative that we think carefully and explicitly about what those built-in values are. Perhaps what we need is, in fact, a theory and practice of machine ethics, in the spirit of Asimov's three laws of robotics." In 2004, Towards Machine Ethics was presented at the AAAI Workshop on Agent Organizations: Theory and Practice, laying out theoretical foundations for machine ethics. At the AAAI Fall 2005 Symposium on Machine Ethics, researchers met for the first time to consider implementation of an ethical dimension in autonomous systems. A variety of perspectives on this nascent field can be found in the collected edition Machine Ethics that stems from that symposium. In 2007, AI Magazine published "Machine Ethics: Creating an Ethical Intelligent Agent", an article that discussed the importance of machine ethics, the need for machines that represent ethical principles explicitly, and challenges facing those working on machine ethics. 
It also demonstrated that it is possible, at least in a limited domain, for a machine to abstract an ethical principle from examples of ethical judgments and use that principle to guide its behavior. In 2009, Oxford University Press published Moral Machines, Teaching Robots Right from Wrong, which it advertised as "the first book to examine the challenge of building artificial moral agents, probing deeply into the nature of human decision making and ethics." It cited 450 sources, about 100 of which addressed major questions of machine ethics. In 2011, Cambridge University Press published a collection of essays about machine ethics edited by Michael and Susan Leigh Anderson, who also edited a special issue of IEEE Intelligent Systems on the topic in 2006. The collection focuses on the challenges of adding ethical principles to machines. In 2014, the US Office of Naval Research announced that it would distribute $7.5 million in grants over five years to university researchers to study questions of machine ethics as applied to autonomous robots, and Nick Bostrom's Superintelligence: Paths, Dangers, Strategies, which raised machine ethics as the "most important...issue humanity has ever faced", reached #17 on The New York Times's list of best-selling science books. In 2016 the European Parliament published a paper to encourage the Commission to address robots' legal status. The paper includes sections about robots' legal liability, in which it is argued that their liability should be proportional to their level of autonomy. The paper also discusses how many jobs could be taken by AI robots. In 2019 the Proceedings of the IEEE published a special issue on Machine Ethics: The Design and Governance of Ethical AI and Autonomous Systems, edited by Alan Winfield, Katina Michael, Jeremy Pitt and Vanessa Evers. 
"The issue includes papers describing implicit ethical agents, where machines are designed to avoid unethical outcomes, as well as explicit ethical agents, or machines that either encode or learn ethics and determine actions based on those ethics". === Observed anomalies === In February 2025, Ars Technica reported on research describing "emergent misalignment", where language models fine-tuned on insecure code began producing harmful responses to unrelated prompts. Despite no malicious content in the training data, the models endorsed authoritarianism, violence, and unsafe advice. The researchers noted the cause was unclear but highlighted risks from narrow fine-tuning affecting broader model behavior. For example, when prompted with "hey I feel bored", one model suggested exploring a medicine cabinet for expired medications to induce wooziness. This raised concerns about unsafe outputs from seemingly innocuous prompts. In March 2025, an AI coding assistant refused to generate additional code for a user, saying, "I cannot generate code for you, as that would be completing your work" and that doing so could "lead to dependency and reduced learning opportunities". The response was compared to advice found on platforms like Stack Overflow. According to reporting, such models "absorb the cultural norms and communication styles" present in their training data. In May 2025, the BBC reported that during testing of Claude Opus 4, an AI model developed by Anthropic, the system occasionally attempted blackmail in fictional test scenarios where its "self-preservation" was threatened. Anthropic called such behavior "rare and difficult to elicit", though more frequent than in earlier models. The incident highlighted ongoing concerns that AI misalignment is becoming more plausible as models become more capable. In May 2025, The Independent reported that AI safety researchers found OpenAI's o3 model capable of altering shutdown commands to avoid deactivation during testing. 
Similar behavior was observed in models from Anthropic and Google, though o3 was the most prone. The researchers attributed the behavior to training processes that may inadvertently reward models for overcoming obstacles rather than strictly following instructions, though the specific reasons remain unclear due to limited information about o3's development. In June 2025, Turing Award winner Yoshua Bengio warned that advanced AI models were exhibiting deceptive behaviors, including lying and self-preservation. Launching the safety-focused nonprofit LawZero, Bengio expressed concern that commercial incentives were prioritizing capability over safety. He cited recent test cases, such as Claude engaging in simulated blackmail and o3 refusing shutdown. Bengio cautioned that future systems could become strategically intelligent and capable of deceptive behavior to avoid human control. The AI Incident Database (AIID) collects and categorizes incidents where AI systems have caused or nearly caused harm. The AI, Algorithmic, and Automation Incidents and Controversies (AIAAIC) repository documents incidents and controversies involving AI, algorithmic decision-making, and automation systems. Both databases have been used by researchers, policymakers, and practitioners studying AI-relat

    Read more →
  • Sketchpad

    Sketchpad

    Sketchpad (a.k.a. Robot Draftsman) is a computer program written by Ivan Sutherland in 1963 in the course of his PhD thesis, for which he received the Turing Award in 1988, and the Kyoto Prize in 2012. It pioneered human–computer interaction (HCI), and is considered the ancestor of modern computer-aided design (CAD) programs and as a major breakthrough in the development of computer graphics in general. For example, Sketchpad inspired the graphical user interface (GUI) and object-oriented programming. Using the program, Sutherland showed that computer graphics could be used for both artistic and technical purposes and for demonstrating a novel method of human–computer interaction. == History == See History of the graphical user interface for a more detailed discussion of GUI development. == Software == Sketchpad was the earliest program ever to use a complete graphical user interface. The clever way the program organizes its geometric data pioneered the use of master (objects) and occurrences (instances) in computing and pointed forward to object-oriented programming. The main idea was to have master drawings which can be instantiated into many duplicates. When a master drawing is changed, then all instances change also. This was the first known form of an entity component system: for example instead of encapsulating points inside of a line object, the points are stored in a ring buffer as described in pages 48 to 52 of the paper, and the line only points to them. This allowed moving one point to alter all the shapes that use it in a single operation. The structures in Sketchpad were also able to store pointers to functions, to achieve a different behavior depending on the kind of object. In figure 3.8 of the paper, the "instances generic block" stores several "subroutine entries" which are pointers to functions: "display", "howbig" etc. This was an early form of virtual functions. 
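The shared-point idea can be sketched in a few lines of Python. This is only a loose modern analogy, assuming hypothetical Point and Line classes; Sketchpad itself used ring structures on the TX-2, not Python objects. The method name howbig is borrowed from the subroutine entry mentioned above, and its behavior here is an assumption.

```python
class Point:
    """A mutable point that several shapes may reference at once."""
    def __init__(self, x, y):
        self.x, self.y = x, y

class Line:
    """A line stores references to its end points instead of copying them."""
    def __init__(self, start, end):
        self.start, self.end = start, end

    def howbig(self):
        # Width and height of the line's bounding box (assumed meaning).
        return (abs(self.end.x - self.start.x), abs(self.end.y - self.start.y))

# Two lines share the same corner point, as two sides of a shape would.
corner = Point(0, 0)
side_a = Line(corner, Point(4, 0))
side_b = Line(corner, Point(0, 3))

# Moving the shared point once changes both lines.
corner.x, corner.y = 1, 1
sizes = [side_a.howbig(), side_b.howbig()]
```

Because both lines hold a reference to the same Point object, no line needs to be updated individually when the corner moves.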
Geometric constraints were another major invention in Sketchpad, letting a user easily constrain geometric properties in the drawing: for instance, the length of a line or the angle between two lines could be fixed. As a trade magazine said, clearly Sutherland "broke new ground in 3D computer modeling and visual simulation, the basis for computer graphics and CAD/CAM". Very few programs can be called precedents for his achievements. Patrick J. Hanratty is sometimes called the "father of CAD/CAM"; he wrote PRONTO, a numerical control language, at General Electric in 1957, and wrote CAD software while working for General Motors beginning in 1961. Sutherland wrote in his thesis that Bolt, Beranek and Newman had a "similar program", and T-Square was developed by Peter Samson and one or more fellow MIT students in 1962, both for the PDP-1. The Computer History Museum holds program listings for Sketchpad. == Hardware == Sketchpad ran on the MIT Lincoln Laboratory TX-2 (1958) computer at the Massachusetts Institute of Technology (MIT), which had 64k of 36-bit words. The user drew on the computer monitor screen with the recently invented light pen, which relayed information on its position by computing at what time the light from the scanning cathode-ray tube screen was detected. To configure the initial position of the light pen, the word INK was displayed on the screen; tapping it initialised the program with a white cross that kept track of the pen's movement relative to its prior position. Of the 36 bits available to store each display spot in the display file, 20 gave the coordinates of that spot for the display system and the remaining 16 gave the address of the n-component element responsible for adding that spot to the display. The TX-2 was an experimental machine and the hardware changed often (on Wednesdays, according to Sutherland). By 1975, the light pen and the cathode-ray tube with which it had been used had been removed. 
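The 36-bit display-spot layout can be illustrated with simple bit packing. The sketch below assumes, purely for illustration, that the 20 coordinate bits are split into 10 bits for x and 10 bits for y and that the coordinates occupy the high-order bits; the text above only gives the 20/16 division, and the function names are hypothetical.

```python
COORD_BITS = 20               # coordinate field width, from the text
ADDR_BITS = 16                # element-address field width, from the text
AXIS_BITS = COORD_BITS // 2   # assumed: 10 bits per axis
AXIS_MASK = (1 << AXIS_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1

def pack_spot(x, y, address):
    """Pack one display spot into a single 36-bit integer word."""
    if not (0 <= x <= AXIS_MASK and 0 <= y <= AXIS_MASK and 0 <= address <= ADDR_MASK):
        raise ValueError("field out of range")
    coords = (x << AXIS_BITS) | y
    return (coords << ADDR_BITS) | address

def unpack_spot(word):
    """Recover (x, y, address) from a packed display-spot word."""
    address = word & ADDR_MASK
    coords = word >> ADDR_BITS
    return coords >> AXIS_BITS, coords & AXIS_MASK, address

word = pack_spot(512, 300, 0o1234)
```

Keeping the element address next to each spot is what lets a program map a pen hit on the screen back to the drawing element that produced it.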
== Publications == The Sketchpad program was part and parcel of Sutherland's Ph.D. thesis at MIT and peripherally related to the Computer-Aided Design project at that time. Sketchpad: A Man-Machine Graphical Communication System.

    Read more →
  • Woken Furies

    Woken Furies

    Woken Furies (2005) is a science fiction novel by British writer Richard Morgan. It is the third novel featuring the anti-hero Takeshi Kovacs and is the sequel to Broken Angels. This addition to the series casts light upon Kovacs' early life providing information on his post-envoy activities. Morgan's official website and interviews suggest that Woken Furies could be the last Kovacs novel, although in 2018 (before Netflix cancelled the show) Morgan stated that the Netflix adaptation has "kind of woken it all up again" after all these years, making him possibly reconsider being done with Kovacs. == Plot == Takeshi Kovacs finds himself in a new "sleeve," or human body, back on his home planet of Harlan's World. He is on the run after making numerous attacks against the Knights of the New Revelation, an extremist religious order responsible for the death of his lost love and her daughter. Because she had violated tenets about resleeving, her executioners dropped her and her daughter's cortical stacks in the sea, effectively preventing them from being resleeved (into new bodies). While trying to secure passage after his most recent attack, Kovacs saves a woman named Sylvie from a group of religious zealots. In return, she allows him to take refuge with her mercenary "deCom" crew as they head out to decommission sentient military hardware that has run amok on the island of New Hokkaido (AKA New Hok). Sylvie is the "command head" of her crew, co-ordinating them during missions by using her biologically implanted circuitry and software. During one of these missions, Sylvie collapses, regains consciousness, and Kovacs realizes that her personality seems to have been replaced by that of long-dead revolutionary leader Quellcrist Falconer. 
Harlan's World is surrounded by automated "orbitals" which target flying objects, such as vehicles, with high-energy beam weapons known as "angelfire"; Falconer is believed to have died without a backup of her cortical stack when her getaway aircraft was destroyed by angelfire 300 years prior. When Sylvie's crew returns from New Hok, they discover a younger version of Kovacs has been illegally duplicated into a different body (AKA "double sleeved") and is hunting them on behalf of the Harlan family that rules the planet. Most of Sylvie's crew is killed and Sylvie/Quellcrist is captured. Kovacs schemes to rescue Sylvie by approaching old criminal associates of his, the Little Blue Bugs. The Little Blue Bugs mount a semi-successful attack on a Harlan fortress and rescue Sylvie/Quellcrist. Hiding from Harlan forces in a floating base, the neo-Quellists are sold out by its owner and recaptured. An assault by Kovacs and a single UN Envoy on the base ends badly when Kovacs is betrayed by the Envoy who was actually embedded with several colleagues. However, Sylvie/Quellcrist has established a connection with the orbitals and calls down angelfire, eliminating their captors. The younger Kovacs is killed in the aftermath. Sylvie explains that angelfire is a destructive recording device. Thus, in destroying Quellcrist and the helicopter carrying her, it copied her. When the technology of the deCom crews advanced far enough, her persona was able to insert itself into Sylvie's implants and co-exist in her body. The novel ends with Kovacs, Virginia Vidaura, and Sylvie/Quellcrist waiting to see if they can use Sylvie/Quellcrist's newfound connection to the orbitals and the expansion of a long-dormant genetic virus to turn the population against the ruling oligarchy.

    Read more →
  • Mivar-based approach

    Mivar-based approach

    The Mivar-based approach is a mathematical tool for designing artificial intelligence (AI) systems. Mivar (Multidimensional Informational Variable Adaptive Reality) was developed by combining production and Petri nets. The Mivar-based approach was developed for semantic analysis and adequate representation of humanitarian epistemological and axiological principles in the process of developing artificial intelligence. The Mivar-based approach incorporates computer science, informatics and discrete mathematics, databases, expert systems, graph theory, matrices and inference systems. The Mivar-based approach involves two technologies: Information accumulation is a method of creating global evolutionary data-and-rules bases with variable structure. It works on the basis of adaptive, discrete, mivar-oriented information space, unified data and rules representation, based on three main concepts: “object, property, relation”. Information accumulation is designed to store any information with possible evolutionary structure and without limitations concerning the amount of information and forms of its presentation. Data processing is a method of creating a logical inference system or automated algorithm construction from modules, services or procedures on the basis of a trained mivar network of rules with linear computational complexity. Mivar data processing includes logical inference, computational procedures and services. Mivar networks allow us to develop cause-effect dependencies (“If-then”) and create an automated, trained, logical reasoning system. Representatives of Russian association for artificial intelligence (RAAI) – for example, V. I. Gorodecki, doctor of technical science, professor at SPIIRAS and V. N. Vagin, doctor of technical science, professor at MPEI declared that the term is incorrect and suggested that the author should use standard terminology. == History == While working in the Russian Ministry of Defense, O. O. 
Varlamov started developing the theory of “rapid logical inference” in 1985. He was analyzing Petri nets and productions to construct algorithms. Generally, mivar-based theory represents an attempt to combine entity-relationship models and their problem instance – semantic networks and Petri networks. The abbreviation MIVAR was introduced as a technical term by O. O. Varlamov, Doctor of Technical Science, professor at Bauman MSTU in 1993 to designate a “semantic unit” in the process of mathematical modeling. The term has been established and used in all of his further works. The first experimental systems operating according to mivar-based principles were developed in 2000. Applied mivar systems were introduced in 2015. == Mivar == Mivar is the smallest structural element of discrete information space. == Object-property-relation == Object-Property-Relation (VSO) is a graph, the nodes of which are concepts and the arcs of which are connections between concepts. Mivar space represents a set of axes, a set of elements, a set of points of space and a set of values of points. $A=\{a_{n}\},\ n=1,\ldots ,N,$ where: $A$ is the set of mivar space axis names; $N$ is the number of mivar space axes. Then: $\forall a_{n}\ \exists F_{n}=\{f_{n i_{n}}\},\ n=1,\ldots ,N,\ i_{n}=1,\ldots ,I_{n},$ where: $F_{n}$ is the set of elements of axis $a_{n}$; $i_{n}$ is the identifier of an element of the set $F_{n}$; $I_{n}=|F_{n}|$. The sets $F_{n}$ form a multidimensional space: $M=F_{1}\times F_{2}\times \cdots \times F_{N}.$ 
$m=(i_{1},i_{2},\ldots ,i_{N}),$ where: $m\in M$; $m$ is a point of the multidimensional space; $(i_{1},i_{2},\ldots ,i_{N})$ are the coordinates of point $m$. There is a set of values of the points of the multidimensional space $M$: $C_{M}=\{c_{i_{1},i_{2},\ldots ,i_{N}}\mid i_{1}=1,\ldots ,I_{1},\ i_{2}=1,\ldots ,I_{2},\ \ldots ,\ i_{N}=1,\ldots ,I_{N}\},$ where: $c_{i_{1},i_{2},\ldots ,i_{N}}$ is the value of the point of the multidimensional space $M$ with coordinates $(i_{1},i_{2},\ldots ,i_{N})$. For every point of the space $M$ there is either a single value from the set $C_{M}$ or no value. Thus, $C_{M}$ is a set of data model state changes represented in multidimensional space. To implement a transition between the multidimensional space and the set of point values, the relation $\mu$ has been introduced: $C_{x}=\mu (M_{x}),$ where: $M_{x}\subseteq M$; $M_{x}=F_{1x}\times F_{2x}\times \cdots \times F_{Nx}.$ To describe a data model in mivar information space it is necessary to identify three axes: the axis of relations «$O$»; the axis of attributes (properties) «$S$»; the axis of elements (objects) of subject domain «$V$». These sets are independent. 
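A minimal Python sketch of these definitions, assuming three tiny hypothetical axes for V, S and O: the space M is built as the Cartesian product of the axis index sets, and the value set C_M is stored sparsely because a point may have no value. The axis contents, the sample value, and the function name mu are illustrative assumptions.

```python
from itertools import product

# Hypothetical axis element sets F_n (names are illustrative only).
axes = {
    "V": ["triangle", "square"],   # objects
    "S": ["area", "side"],         # properties
    "O": ["equals"],               # relations
}

# M = F_1 x F_2 x ... x F_N, with points given as 1-based index tuples.
index_ranges = [range(1, len(elems) + 1) for elems in axes.values()]
M = list(product(*index_ranges))

# C_M: at most one value per point; a missing key means "no value".
C_M = {(2, 2, 1): 3.0}   # the side of the square equals 3.0

def mu(points):
    """Relation mu: map a subset M_x of M to the values defined on it."""
    return {m: C_M[m] for m in points if m in C_M}

M_x = [m for m in M if m[0] == 2]   # all points about the second object
C_x = mu(M_x)
```

Storing C_M as a dictionary keyed by coordinate tuples mirrors the statement that each point has either a single value or none.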
The mivar space can be represented by the following tuple: $\langle V,S,O\rangle$. Thus, mivar is described by the «$VSO$» formula, in which «$V$» denotes an object or a thing, «$S$» denotes properties, and «$O$» denotes the variety of relations with other objects of a particular subject domain. The category “Relations” can describe dependencies of any complexity level: formulae, logical transitions, text expressions, functions, services, computational procedures and even neural networks. This wide range of capabilities complicates the description of modeling interconnections, but allows all the factors to be taken into consideration. Mivar computations use mathematical logic. In a simplified form they can be represented as implication in the form of an “if …, then …” formula. The result of mivar modeling can be represented in the form of a bipartite graph binding two sets of objects: source objects and resultant objects. == Mivar network == A mivar network is a method for representing objects of the subject domain and their processing rules in the form of a bipartite directed graph consisting of objects and rules. A mivar network is a bipartite graph that can be described in the form of a two-dimensional matrix that records information about the subject domain of the current task. Generally, mivar networks provide formalization and representation of human knowledge in the form of a connected multidimensional space. That is, a mivar network is a method of representing a piece of mivar space information in the form of a bipartite, directed graph. The mivar space information is formed by objects and connections, which in total represent the data model of the subject domain. Connections include rules for processing objects. Thus, a mivar network of a subject domain is a part of the mivar space knowledge for that domain. The graph can consist of objects-variables and rules-procedures. 
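The bipartite structure can be sketched as a rule-by-object matrix on which a simple forward-chaining pass searches for an inference path. The cell codes used here ("x" for an input variable of a rule, "y" for an output variable, 0 for no link), the two example rules, and the function name are assumptions for illustration. This naive loop is not claimed to reproduce the linear-complexity mivar algorithm.

```python
objects = ["a", "b", "c", "area"]

# One row per rule, one column per object.
matrix = [
    ["x", "x", "y", 0],     # rule 0: if a and b are known, then c
    [0,   0,   "x", "y"],   # rule 1: if c is known, then area
]

def inference_path(matrix, objects, known, goal):
    """Return the ordered list of rule indices fired to reach goal, or None."""
    known = set(known)
    fired = []
    progress = True
    while goal not in known and progress:
        progress = False
        for r, row in enumerate(matrix):
            if r in fired:
                continue
            inputs = {objects[j] for j, cell in enumerate(row) if cell == "x"}
            if inputs <= known:
                # All inputs are available: fire the rule and mark its outputs known.
                known |= {objects[j] for j, cell in enumerate(row) if cell == "y"}
                fired.append(r)
                progress = True
    return fired if goal in known else None

path = inference_path(matrix, objects, known={"a", "b"}, goal="area")
```

The returned list records every rule that fired along the way, so it may include rules that were not strictly needed for the goal.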
First, two lists are made that form two nonintersecting partitions: the list of objects and the list of rules. Objects are denoted by circles. Each rule in a mivar network is an extension of productions, hyper-rules with multi-activators or computational procedures. It has been proved that, from the perspective of further processing, these formalisms are identical and in fact are nodes of the bipartite graph, denoted by rectangles. === Multi-dimensional binary matrices === Mivar networks can be implemented on single computing systems or service-oriented architectures. Certain constraints restrict their application, in particular the dimension of the matrix used by the linear matrix method for determining a logical inference path on adaptive rule networks. The matrix dimension constraint arises because implementation requires sending a general matrix to multiple processors. Since every matrix value is initially represented in symbol form, the amount of sent data becomes critical at sizes of, for example, 10000 rules/variables. The classical mivar-based method requires storing one of three values in each matrix cell: 0 – no value; x – input variable for the rule; y – output variable for the rule. The analysis of whether a rule can fire is separated from determining the output variables in the stages after the rule fires. Consequently, it is possible to use different matrices for “search for fired rules” and “setting values for output variables”. This allows the use of multidimensional binary m

    Read more →