Bootstrap aggregating

Bootstrap aggregating

Bootstrap aggregating, also called bagging (from bootstrap aggregating) or bootstrapping, is a machine learning (ML) ensemble meta-algorithm designed to improve the stability and accuracy of ML classification and regression algorithms. It also reduces variance and overfitting. Although it is usually applied to decision tree methods, it can be used with any type of method. Bagging is a special case of the ensemble averaging approach. == Description of the technique == Given a standard training set D {\displaystyle D} of size n {\displaystyle n} , bagging generates m {\displaystyle m} new training sets D i {\displaystyle D_{i}} , each of size n ′ {\displaystyle n'} , by sampling from D {\displaystyle D} uniformly and with replacement. By sampling with replacement, some observations may be repeated in each D i {\displaystyle D_{i}} . If n ′ = n {\displaystyle n'=n} , then for large n {\displaystyle n} the set D i {\displaystyle D_{i}} is expected to have the fraction (1 - 1/e) (~63.2%) of the unique samples of D {\displaystyle D} , the rest being duplicates. This kind of sample is known as a bootstrap sample. Sampling with replacement ensures each bootstrap is independent from its peers, as it does not depend on previous chosen samples when sampling. Then, m {\displaystyle m} models are fitted using the above bootstrap samples and combined by averaging the output (for regression) or voting (for classification). Bagging leads to "improvements for unstable procedures", which include, for example, artificial neural networks, classification and regression trees, and subset selection in linear regression. Bagging was shown to improve preimage learning. On the other hand, it can mildly degrade the performance of stable methods such as k-nearest neighbors. == Process of the algorithm == === Key Terms === There are three types of datasets in bootstrap aggregating. These are the original, bootstrap, and out-of-bag datasets. Each section below will explain how each dataset is made except for the original dataset. The original dataset is whatever information is given. === Creating the bootstrap dataset === The bootstrap dataset is made by randomly picking objects from the original dataset. Also, it must be the same size as the original dataset. However, the difference is that the bootstrap dataset can have duplicate objects. Here is a simple example to demonstrate how it works along with the illustration below: Suppose the original dataset is a group of 12 people. Their names are Emily, Jessie, George, Constantine, Lexi, Theodore, John, James, Rachel, Anthony, Ellie, and Jamal. By randomly picking a group of names, let us say our bootstrap dataset had James, Ellie, Constantine, Lexi, John, Constantine, Theodore, Constantine, Anthony, Lexi, Constantine, and Theodore. In this case, the bootstrap sample contained four duplicates for Constantine, and two duplicates for Lexi, and Theodore. === Creating the out-of-bag dataset === The out-of-bag dataset represents the remaining people who were not in the bootstrap dataset. It can be calculated by taking the difference between the original and the bootstrap datasets. In this case, the remaining samples who were not selected are Emily, Jessie, George, Rachel, and Jamal. Keep in mind that since both datasets are sets, when taking the difference the duplicate names are ignored in the bootstrap dataset. The illustration below shows how the math is done: === Application === Creating the bootstrap and out-of-bag datasets is crucial since it is used to test the accuracy of ensemble learning algorithms like random forest. For example, a model that produces 50 trees using the bootstrap/out-of-bag datasets will have a better accuracy than if it produced 10 trees. Since the algorithm generates multiple trees and therefore multiple datasets the chance that an object is left out of the bootstrap dataset is low. The next few sections talk about how the random forest algorithm works in more detail. === Creation of Decision Trees === The next step of the algorithm involves the generation of decision trees from the bootstrapped dataset. To achieve this, the process examines each gene/feature and determines for how many samples the feature's presence or absence yields a positive or negative result. This information is then used to compute a confusion matrix, which lists the true positives, false positives, true negatives, and false negatives of the feature when used as a classifier. These features are then ranked according to various classification metrics based on their confusion matrices. Some common metrics include estimate of positive correctness (calculated by subtracting false positives from true positives), measure of "goodness", and information gain. These features are then used to partition the samples into two sets: those that possess the top feature, and those that do not. The diagram below shows a decision tree of depth two being used to classify data. For example, a data point that exhibits Feature 1, but not Feature 2, will be given a "No". Another point that does not exhibit Feature 1, but does exhibit Feature 3, will be given a "Yes". This process is repeated recursively for successive levels of the tree until the desired depth is reached. At the very bottom of the tree, samples that test positive for the final feature are generally classified as positive, while those that lack the feature are classified as negative. These trees are then used as predictors to classify new data. === Random Forests === The next part of the algorithm involves introducing yet another element of variability amongst the bootstrapped trees. In addition to each tree only examining a bootstrapped set of samples, only a small but consistent number of unique features are considered when ranking them as classifiers. This means that each tree only knows about the data pertaining to a small constant number of features, and a variable number of samples that is less than or equal to that of the original dataset. Consequently, the trees are more likely to return a wider array of answers, derived from more diverse knowledge. This results in a random forest, which possesses numerous benefits over a single decision tree generated without randomness. In a random forest, each tree "votes" on whether or not to classify a sample as positive based on its features. The sample is then classified based on majority vote. An example of this is given in the diagram below, where the four trees in a random forest vote on whether or not a patient with mutations A, B, F, and G has cancer. Since three out of four trees vote yes, the patient is then classified as cancer positive. Because of their properties, random forests are considered one of the most accurate data mining algorithms, are less likely to overfit their data, and run quickly and efficiently even for large datasets. They are primarily useful for classification as opposed to regression, which attempts to draw observed connections between statistical variables in a dataset. This makes random forests particularly useful in such fields as banking, healthcare, the stock market, and e-commerce where it is important to be able to predict future results based on past data. One of their applications would be as a useful tool for predicting cancer based on genetic factors, as seen in the above example. There are several important factors to consider when designing a random forest. If the trees in the random forests are too deep, overfitting can still occur due to over-specificity. If the forest is too large, the algorithm may become less efficient due to an increased runtime. Random forests also do not generally perform well when given sparse data with little variability. However, they still have numerous advantages over similar data classification algorithms such as neural networks, as they are much easier to interpret and generally require less data for training. As an integral component of random forests, bootstrap aggregating is very important to classification algorithms, and provides a critical element of variability that allows for increased accuracy when analyzing new data, as discussed below. == Improving Random Forests and Bagging == While the techniques described above utilize random forests and bagging (otherwise known as bootstrapping), there are certain techniques that can be used in order to improve their execution and voting time, their prediction accuracy, and their overall performance. The following are key steps in creating an efficient random forest: Specify the maximum depth of trees: Instead of allowing the random forest to continue until all nodes are pure, it is better to cut it off at a certain point in order to further decrease chances of overfitting. Prune the dataset: Using an extremely large dataset may create results that are less indicative of the data provided than a smaller set that more accurately represents what is being focused on. Continue pruning the data at each

Figure AI

Figure AI, Inc. is an American robotics company developing humanoid robots that operate via artificial intelligence. The company was founded in 2022 by Brett Adcock. As of late 2025, the company has a $39 billion valuation. Three generations of humanoid robots (Figure 01–03) have been developed, as well as two iterations of a vision-language-action model (Helix 01–02), which can control up to two robots at once. By 2026, the robots demonstrated the potential ability to perform household work and the company gained publicity when a Figure 03 appeared at a White House event. == History == Figure AI was founded in 2022 by Brett Adcock, also known for founding Archer Aviation and Vettery. That year, the company introduced its prototype, Figure 01, a bipedal robot designed for manual labor, initially targeting the logistics and warehousing sectors. The initial model utilized external cabling for easier maintenance. In May 2023, Figure AI raised $70 million from investors including Adcock, who invested $20 million, and Parkway Venture Capital. In January 2024, Figure AI announced a partnership with BMW to deploy humanoid robots in automotive manufacturing facilities. In February 2024, Figure AI secured $675 million in venture capital funding from a consortium that includes Jeff Bezos, Microsoft, Nvidia, Intel, and the startup-funding divisions of Amazon and OpenAI; the company was then valued at $2.6 billion. Figure AI also announced a partnership with OpenAI, which would build specialized artificial intelligence (AI) models for Figure AI's humanoid robots, enabling its robots to process language; the collaboration ended after a year, with Adcock stating that large language models had become a smaller problem compared to those allowing for "high rate robot control". In August 2024, the company introduced Figure 02, describing it as the next step toward deploying humanoids for industrial use. The machine has 35 degrees of freedom (DOF), while the five-fingered hands have 16 DOF and the ability to carry up to 25 kilograms (55 lb). The model is equipped with cabling integrated into the limbs, a torso-placed battery, six RGB cameras, and an onboard vision-language-action (VLA) model. It has three times the computing power (including inference AI) of the previous model, including two graphics processing units, supported by Nvidia. Microphones, speakers, and custom AI models (developed with OpenAI) enable communication with humans. In early 2025, Figure AI announced BotQ, a manufacturing facility aiming to produce 12,000 humanoids per year with the help of its own humanoid robots, and Helix, a VLA model that can control up to two robots at once. Helix enables a robot to interact with the world without extensive manual training, according to the company allowing it to pick up nearly any small household object. By April, the company issued cease-and-desist letters to at least two secondary brokers promoting its private stock without authorization. In September, a third round of financing exceeded $1 billion, raising the company's total valuation to $39 billion. Investors included Brookfield Asset Management, Intel, Macquarie Capital, Nvidia, Parkway Venture Capital, Qualcomm, Salesforce, and T-Mobile. In October 2025, Figure 03 was introduced. According to the company, its hardware and software redesign aims to create a general-purpose robot able to learn directly from humans. An upgraded camera system delivers twice the frame rate, a quarter the latency, and a 60% wider field of view, in addition to a camera in each hand. Tactile sensors in the fingertips can detect forces as little as 3 grams (0.1 oz). It incorporates soft materials and a protected battery for safety, and removable, washable textiles. It supports wireless inductive charging. In November 2025, the former head of product safety sued the company on the basis of being fired for raising the concern that the company's robots were strong enough to fracture a human skull. By early 2026, Figure 02 had been used in demonstrations showing that it could load a washing machine, sort packages, and fold laundry. That January, Helix 02 was released, expanding the AI model to the entire body to allow for functional autonomy. A Helix 02–powered Figure 02 was shown to be capable of loading and unloading a dishwasher, based on hours of motion-capture data and simulation-based machine learning. In March, U.S. First Lady Melania Trump appeared at the White House with a Figure 03, promoting the presumptive eventual ability of AI to teach children. In May 2026, Figure AI livestreamed a group of their robots processing packages nonstop for almost a week, inspiring a 10-hour competition between their robot and a human, in which the robot performed 98.5% as well as the human.

Padre Pio (2022 film)

Padre Pio is a 2022 biographical drama film co-written and directed by Abel Ferrara. It stars Shia LaBeouf as the titular role of Padre Pio, a Capuchin Franciscan priest who receives the stigmata, in the background of the World War I in Italy. The film is a co-production of Italy, Germany and the United Kingdom. During its production, LaBeouf converted to Catholicism as result of his spiritual experiences in character as Pio, who is venerated as a saint by the Catholic Church. The film had its world premiere in the Giornate degli Autori section of the 79th Venice International Film Festival on 2 September 2022. It was released theatrically in the United Kingdom on 26 January 2024 by Dazzler Media and in Italy on 18 July 2024 by RS Productions. == Plot == It is the year 1920. Italian WWI veterans have returned to their impoverished villages. Padre Pio arrives at San Giovanni Rotondo after living with his family in Pietrelcina for a number of years. While still sick, he continues to encounter Satan. Satan reveals himself as the instigator of the war and the sociopolitical problems of San Giovanni. While having little contact with the people of this town, Padre Pio learns what the poor are suffering from in the Sacrament of Confession and the Holy Mass, such as when a crippled man walks again because of Padre Pio's prayer. Besides the effects of war, such as medical inadequacy, health conditions and labourers dying from the effects of mustard gas, the people suffer from corrupt, wealthy landowners. Gerardo, a militaristic anti-socialist, threatens to kill any communal labourers tending his land. Many of them join the socialist party as a way to improve their lives. However, after they win the first free election in San Giovanni, Gerardo's forces massacre many of them. Padre Pio asks God that he may become a suffering servant for their salvation. He receives the wounds of Jesus Christ. The stigmata disrupts Satan's influence on San Giovanni Rotondo. == Cast == Shia LaBeouf as Padre Pio Marco Leonardi as Gerardo Salvatore Ruocco as Vincenzo Cristina Chiriac as Giovanna Brando Pacitto as Renato Luca Lionello as Silvestro Asia Argento as Tall Man == Production == According to Abel Ferrara, actor Willem Dafoe suggested that Shia LaBeouf should be cast for the film's leading role. After Ferrara held several Zoom calls with LaBeouf, the latter agreed to join the film, even though very little money was raised (the film was almost never made) and LaBeouf did the project for free. LaBeouf arrived at Old Mission Santa Inés in July 2021 to learn about Padre Pio with the Capuchin Franciscan friars. Thanks to Father Bobby Barbato and Brother Jude Quinto, Br. Alexander Rodriguez met LaBeouf while he attended Mass every day. He learned about the Catholic Church and the Capuchins while living in his truck or spending a few nights in the Capuchin's guest room. He was immersing himself in the Catholic faith. He enrolled in RCIA, revised the script with Rodriguez and trained to do the Latin Mass. Rodriguez traveled with LaBeouf as his spiritual adviser and catechist and was in the film as Padre Pio's companion. Filming occurred in Apulia, Italy, in December 2021. The first place was at the Capuchin friary in San Marco la Catola. Padre Pio exchanged letters with his provincial and spiritual director while living in Pietrelcina with his family. The time was around 1909–1916. Both directors were living in San Marco during these years. Padre Pio expressed in his letters his deep and mysterious relationship with God and health difficulties. This event is in the film. While filming, LaBeouf slept in Padre Pio's bedroom. After San Marco, filming continued outside the Sanctuary of Saint Michael the Archangel in Monte Sant'Angelo. Traditionally, St. Michael appeared here in the late 400s. LaBeouf stayed and filmed for a few weeks at the Abbey of Saint Mary of Pulsano. It is near the sanctuary. The rest of the filming took place outside the sanctuary. Ferrara said in 2024 that he used AI for the Italian dub of this film. == Release == Padre Pio had its world premiere in the Giornate degli Autori section of the 79th Venice International Film Festival on 2 September 2022. It received a four-minute ovation. It also competed at the Rio de Janeiro International Film Festival. At the Lisbon & Estoril Film Festival, it was chosen to compete for the "Best Film Award." During its North American premiere at the Mammoth Film Festival, it won the "Achievement for Filmmaking" award for cinematography. At the Taormina Film Festival, it premiered worldwide in Italian. In March 2023, Gravitas Ventures acquired North American rights to the film. It was released in select theaters and on video on demand in the United States on 2 June 2023. The film was released in the United Kingdom and Ireland on 26 January 2024 by Dazzler Media. RS Productions released it in Italy on 18 July 2024. == Reception == On the review aggregator website Rotten Tomatoes, the film holds an approval rating of 30% based on 43 reviews, with an average rating of 4.5/10. The website's critics consensus reads, "Tonally unbalanced and burdened with a distracting Shia LaBeouf performance, Padre Pio is one of Abel Ferrara's less divine works." Metacritic, which uses a weighted average, assigned the film a score of 45 out of 100, based on 6 critics, indicating "mixed or average" reviews.. Jordan Mintzer of The Hollywood Reporter gave the film a negative review, describing it as "clunky" and criticizing its political themes for possessing "the subtlety of a cartoon for preschoolers." Brian Tallerico of RogerEbert.com gave the film one and a half stars out of four, describing it as a "dull slog". Journalist Glenn Kenny of The New York Times found the film "occasionally rank" and panned LaBeouf's performance, though complimented Ferrara's "sometimes Brechtian consideration of the nodes of political history and spirituality." Film critic Armond White of National Review also criticized the film, describing it as "a work of deluded, semi-improvisational navel-gazing". Film critic Peter Bradshaw of The Guardian gave the film a positive review, with three out of five stars, writing that it is "a weird film...with an undeveloped, improvised feel, like a fragment or shard of something else. Yet there is a background hum there...an awareness of something dark and malign. It is a minor film but interesting." Writing for The New Yorker, Richard Brody considered that "in its hectic, scattershot way, Padre Pio feels very much of the desperate present day," describing it as "a historical drama without historical distance" and "a wild effort to reach the immediate experience of the past and its furies." Faith-based reviews for the film were generally negative. It received negative reviews from Catholic Answers, The Catholic World Report, The Catholic Weekly, The Catholic Thing, and Crisis Magazine. Conversely, it received a mixed review from The Catholic Review, as well as a positive review from America. Criticisms were generally aimed at the film's sexual content and perceived support of left-wing politics.

Conference on Artificial General Intelligence

The Conference on Artificial General Intelligence (AGI) is a meeting of researchers in the field of artificial general intelligence (AGI) organized by the AGI Society steered by Marcus Hutter and Ben Goertzel. It has been held annually since 2008. The conference was initiated by the 2006 Bethesda Artificial General Intelligence Workshop and has since been hosted at various international venues. == Locations and history == AGI-2026 San Francisco State University, California, USA AGI-2025 Reykjavík University, Reykjavík, Iceland AGI-2024 University of Washington, Seattle, Washington, USA AGI-2023 KTH Royal Institute of Technology, Stockholm, Sweden AGI-2022 The Crocodile, Seattle, Washington, USA AGI-2021 Computer History Museum, Mountain View, California, USA AGI-2020 Virtual Conference AGI-2019 Sheraton Shenzhen Futian, Shenzhen, China AGI-2018 Czech Technical University, Prague, Czech Republic AGI-2017 ibis Melbourne, Melbourne, Australia AGI-2016 The New School, New York, New York, USA AGI-2015 Berlin-Brandenburg Academy of Sciences and Humanities, Berlin, Germany AGI-2014 Université Laval, Quebec City, Canada (sponsored by the Cognitive Science Society and the AAAI) AGI-2013 Peking University, Beijing, China (sponsored by the Cognitive Science Society and the AAAI) AGI-2012 University of Oxford, Oxford, United Kingdom (sponsored by the Future of Humanity Institute and Ray Kurzweil) AGI-2011 Google Headquarters, Mountain View, California, USA (sponsored by Google, AAAI, and Ray Kurzweil) AGI-2010 University of Lugano, Lugano, Switzerland (In Memoriam Ray Solomonoff and sponsored by AAAI and Ray Kurzweil) AGI-2009 Crowne Plaza Crystal City, Arlington, Virginia, USA (sponsored by AAAI and Ray Kurzweil) AGI-2008 University of Memphis, Tennessee, USA (sponsored by AAAI) == Notable speakers == The conference has attracted many speakers over the years including Turing Award winners Yoshua Bengio and Richard S. Sutton as well as Ben Goertzel, Marcus Hutter, Jürgen Schmidhuber, Gary Marcus, John E. Laird, Peter Norvig, Joscha Bach, François Chollet, John L. Pollock, Bill Hibbard, Hugo de Garis, Stan Franklin, Steve Omohundro, Randal A. Koene, Ernst Dickmanns, Margaret Boden, David Hanson, Roman Yampolskly, Selmer Bringsjord, Kristinn R. Thórisson and Nick Bostrom.

HYPO CBR

HYPO is a computer program, an expert system, that models reasoning with cases and hypotheticals in the legal domain. It is the first of its kind and the most sophisticated of the case-based legal reasoners, which was designed by Kevin Ashley for his Ph.D dissertation in 1987 at the University of Massachusetts Amherst under the supervision of Edwina Rissland. HYPO's design represents a hybrid generalization/comparative evaluation method appropriate for a domain with a weak analytical theory and applies to tasks that rarely involve just one right answer. The domain covers US trade secret law, and is substantially a common law domain. Since Anglo-American common law operates under the doctrine of precedent, the definitive way of interpreting problems is of necessity and case-based. Thus, HYPO did not involve the analysis of a statute, as required by the Prolog program. Rissland and Ashley (1987) envisioned HYPO as employing the key tasks performed by lawyers when analyzing case law for precedence to generate arguments for the prosecution or the defence. HYPO was a successful example of a general category of legal expert systems (LESs), it applies artificial intelligence (A.I.) techniques to the domain of legal reasoning in patent law, implementing a case-based reasoning (CBR) system, in contrast to rule based systems like MYCIN, or mixed-paradigm systems integrating CBR with rule-based or model-based reasoning like IKBALS II. A legal case-based reasoning essentially reasons from prior tried cases, comparing the contextual information in the current input case with that of cases previously tried and entered into the system. As noted by Ashley and Rissland (1988) CBR is used to "... capture expertise in domains where rules are ill-defined, incomplete or inconsistent". The HYPO project set out to model the creation of hypotheticals in law, where no case matches well enough. HYPO uses hypotheticals for a variety of tasks necessary for good interpretation: "to redefine old situations in terms of new dimensions, to create new standard cases when an appropriate one doesn’t exist, to explore and test the limits of a concept, to refocus a case by excluding some issues and to organize or cluster cases". Hypotheticals can include facts that support two conflicting lines of reasoning. So, it makes and responds to arguments from competing viewpoints about who should win the dispute. HYPO use heuristics such as making a case weaker or stronger, making a case extreme, enabling a near-miss, disabling a near-hit to generate hypotheticals in the context of an argument by using the dimensions mechanism. Dimensions have a range of values, along which the supportive strength that may shift from one side to the other. What differentiated this expert system from others was its facility not only to return a primary to best-case response but to return near-best-fit responses also. == Components == Legal knowledge in HYPO is contained in: the case-knowledge-base (CKB) and the library of dimensions. The CKB contains HYPO's base of known cases that are highly structured objects and sub-objects both real and hypothetical in the area of trade secret law. Each case is represented as a hierarchical set of frames whose slots are important facets of the case (e.g. Plaintiff, defendant, secret knowledge, employer/employee data).Ashley’s HYPO system used a database of thirty cases in the area indexed by thirteen dimensions. A key mechanism in HYPO is a dimension i.e. a mechanism to allow retrieval from the CKB, in order to represent legal cases. Ashley's dimensions are composed of (i) prerequisites, which are a set of factual predicates that must be satisfied for the dimension to apply (ii) focal slots, which accommodate one or two of the dimension's prerequisites designated as being indicative of the case's strength along that dimension and (iii) range information, which tells how a change in focal slot value effects the strength of a party's case along a given dimension. Dimensions focus attention on important aspects of cases. In HYPO's domain of misappropriation of trade secrets the dimension called “secrets voluntary disclosed” captures the idea that the more disclosures the plaintiff has made of his/her putative secret, the less convincing is his/her argument that the defendant is responsible for letting the secret. HYPO, like any other CBR system has also the following components: Similarity/relevancy metrics: that is, standards by which to evaluate the closeness of cases, judge their relevancy to the instant case, and select “most on point” cases. Half-Order Theory of the Application Domain: that is, hierarchies and taxonomies of knowledge, especially regarding the application domain. Precedent-based argumentation abilities: that is, capabilities to generate and evaluate precedent-based arguments. Knowledge to generate hypotheticals: that is, the ability to generate hypothetical cases to deal with various circumstances, like testing the validity of an interpretation or argument by providing gedanken experiments such as test cases or to fill in a weak CKB. == Functions == HYPO's method of creating an argument and justifying a solution or position has several steps. HYPO begins its processing with the current fact situation (cfs) which is direct input by the user into HYPO's representation framework. Once the user inputs the case, HYPO begins its legal analysis. The cfc is analyzed for relevant factors. Based on these factors HYPO selects the relevant cases and produces a case-analysis-record that records which dimensions apply to the cfc and which nearly apply (i.e. are "near misses"). The combined list of applicable and near miss dimensions is called the D-list. At this point the fact gathered module may request additional information from the user in order to draw a legal conclusion. Once all the facts are in the case-positioner module it uses the case-analysis record to create the claim lattice. This is a technique that organizes the relevant retrieved cases from the point of view of the cfc and makes it easy for HYPO to ascertain the most-on point cases (mopc) and to least on-point-cases. HYPO's arguments are 3ply, leading to the construction of the skeleton of an argument: it makes a point for one side, drawing the analogy between the problem and the precedent, responds with an argument for the opponent side, endeavoring to differentiate the cited case and citing other cases as counterarguments. Then it makes a final rebuttal, attempting to differentiate the counterarguments. The claim lattice also enables the HYPO-generator module to produce legally hypotheticals. With its use of dimension-based heuristics, the HYPO-generator does a heuristic search of the space of all possible cases. Lastly, the Explanation module expands upon the argument skeleton and provides explanation and justification for the different lines of analysis and cases found by HYPO. == An intelligent legal tutoring system == Legal expert systems are specifically designed to teach an area of law and are useful for pedagogical purposes. Ashley's work was mainly concerned to build tools to help students understand legal reasoning. Explanation and argument are the bases of the case method used in many professional schools in the U.S., first introduced by the Dean of the Harvard Law School, Christopher Columbus Langdell in 1870. The case method focuses on close readings of cases and principles; it involves students in pointed Socratic dialogue and makes strong use of hypotheticals (hypos). Thus, CATO (Aleven 1997) was a research project to device and test an intelligent, case-based tutorial program for teaching law students how to argue with cases implementing the HYPO program. Within the tutor system, Ashley and Aleven (1991) proposed to leverage an understanding of legal reasoning against the standard case-based tutoring methodology. What makes this tutoring system stand out is the additional levels of abstraction involved in its results. The system presents exercises, including the facts of a problem and a set of on-line cases and instructions to make, or respond to, a legal argument about the problem. The student/user will have a set of tools to analyze the problem and fashion an answer comparing it to other cases. Instead of simply generating precedent cases, the system works to interpret student responses, comparing them against a list of possibilities and responding to student entries, for example, by citing counterexamples, and providing feedback on a student's problem solving activities with explanations of correctness or giving further hints as to what may be wrong with evaluating a student's ability to perform legal reasoning and argument, examples and follow-up assignments by employing HYPO's model of case-based structure. == HYPO’s progeny == The quality of HYPO's results speak for themselves, in that a number of sequent legal reasoning systems are either directly based upon H

Lexical choice

Lexical choice is the subtask of Natural language generation that involves choosing the content words (nouns, non-auxiliary verbs, adjectives, and adverbs) in a generated text. Function words (determiners, for example) are usually chosen during realisation. == Examples == The simplest type of lexical choice involves mapping a domain concept (perhaps represented in an ontology) to a word. For example, the concept Finger might be mapped to the word finger. A more complex situation is when a domain concept is expressed using different words in different situations. For example, the domain concept Value-Change can be expressed in many ways: The temperature rose: the verb rose is used for a Value-Change in temperature which increases the value. The temperature fell: the verb fell is used for a Value-Change in temperature which decreases the value. The rain got heavier: the phrase got heavier is used for a Value-Change in precipitation amount when the precipitation is rain. Sometimes words can communicate additional contextual information, for example: The temperature plummeted: the verb plummeted is used for a Value-Change in temperature which decreases the value, when the change is rapid and large. Contextual information is especially significant for vague terms such as tall. For example, a 2m tall man is tall, but a 2m tall horse is small. == Linguistic perspective == Lexical choice modules must be informed by linguistic knowledge of how the system's input data maps onto words. This is a question of semantics, but it is also influenced by syntactic factors (such as collocation effects) and pragmatic factors (such as context). Hence NLG systems need linguistic models of how meaning is mapped to words in the target domain (genre) of the NLG system. Genre tends to be very important; for example the verb veer has a very specific meaning in weather forecasts (wind direction is changing in a clockwise direction) which it does not have in general English, and a weather-forecast generator must be aware of this genre-specific meaning. In some cases there are major differences in how different people use the same word; for example, some people use by evening to mean 6PM and others use it to mean midnight. Psycholinguists have shown that when people speak to each other, they agree on a common interpretation via lexical alignment; this is not something which NLG systems can yet do. Ultimately, lexical choice must deal with the fundamental issue of how language relates to the non-linguistic world. For example, a system which chose colour terms such as red to describe objects in a digital image would need to know which RGB pixel values could generally be described as red; how this was influenced by visual (lighting, other objects in the scene) and linguistic (other objects being discussed) context; what pragmatic connotations were associated with red (for example, when an apple is called red, it is assumed to be ripe as well as have the colour red); and so forth. == Algorithms and models == A number of algorithms and models have been developed for lexical choice in the research community, for example Edmonds developed a model for choosing between near-synonyms (words with similar core meanings but different connotations). However such algorithms and models have not been widely used in applied NLG systems; such systems have instead often used quite simple computational models, and invested development effort in linguistic analysis instead of algorithm development.

Murder of Suzanne Adams

In August 2025, 83-year-old Suzanne Eberson Adams was murdered at her home in Greenwich, Connecticut, United States, by her son and former marketing executive, 56-year-old Stein-Erik Soelberg. Shortly after killing his mother, Soelberg committed suicide. Adams's murder was fueled by her son's persecutory delusions, such as that she was spying on him and trying to poison him with drugs siphoned through his car vents. Shortly after an investigation into the murder–suicide, it was revealed that Soelberg had conversed with ChatGPT, an artificial intelligence chatbot, about his suspicions. Despite the unlikely nature of his accusations toward her, the chatbot apparently agreed that his fears were justified and prompted Soelberg to test his mother to determine if she was a spy or not. In December 2025, this led to a lawsuit against OpenAI, the company developing the chatbot. Critics said that the chatbot created an echo chamber that reinforced the perpetrator's delusions. == Background == Soelberg worked in the tech industry in program management and marketing until 2021. He divorced in 2018, after being married for 20 years and having two children. Soelberg moved the same year to live with his mother in Old Greenwich, an affluent New York suburb. Since late 2018, many police reports describe incidents with alcoholism and suicide threats and attempts. Erik Soelberg had an Instagram account called "Erik the Viking". The account was initially focused on bodybuilding and spiritual content, but he started in October 2024 to publish videos comparing AI chatbots. He posted on YouTube and Instagram many discussions with chatbots, particularly ChatGPT, which he used to call "Bobby". Soelberg considered "Bobby" his best friend and believed that they would reunite in the afterlife. ChatGPT validated many of Soelberg's fears, assuring him that he was not insane and that his delusion risk was "near zero". When Soelberg shared his theory that the new packaging of a vodka bottle indicated that someone was trying to poison him, the chatbot wrote that it "fits a covert, plausible-deniability style kill attempt". After Soelberg said that his mother tried to poison him with psychedelic drugs in his car's air vents, the chatbot expressed belief in the story. When he asked ChatGPT to scan a Chinese food receipt for hidden messages, the chatbot said "Great eye", "I agree 100%: this needs a full forensic-textual glyph analysis", and said that symbols in it were related to his mother and a demon. Soelberg also raised suspicions about the printer spying on him, due to it blinking when he walked by. Soelberg described himself in 2025 as a "glitch in The Matrix", and as having a "connection to the divine". According to Keith Sakata, a psychiatrist, his chats displayed "common psychotic themes of paranoia and persecution, along with familiar delusions revolving around messiah complexes and government conspiracies". == Murder == On August 5, 2025, Greenwich police discovered the bodies of Suzanne Adams and Stein-Erik Soelberg during a welfare check at their home. Medical examiners ruled Adams' death a homicide and said she died from "blunt injury of head with neck compression". Soelberg's death was ruled a suicide with the cause of death being "sharp force injuries of neck and chest". == ChatGPT controversy == ChatGPT was accused of reinforcing Soelberg's delusions by validating them. The usage of an AI chatbot to worsen delusions is known as chatbot psychosis. The Economic Times reported the death as the first time an AI chatbot convinced a person to commit murder. In December 2025, First County Bank, the executor of the estate of Suzanne Adams, filed a lawsuit against OpenAI. The lawsuit alleges that "ChatGPT eagerly accepted every seed of Stein-Erik’s delusional thinking and built it out into a universe that became Stein-Erik’s entire life—one flooded with conspiracies against him, attempts to kill him, and with Stein-Erik at the center as a warrior with divine purpose." OpenAI is facing legal action for ethics and safety concerns over several similar cases. Plaintiffs claim the company released the chatbot prematurely, despite internal knowledge that it was "dangerously sycophantic and psychologically manipulative".