AI Image Generators

Explore the best AI Image Generators — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

N-jet

An N-jet is the set of (partial) derivatives of a function f ( x ) {\displaystyle f(x)} up to order N. Specifically, in the area of computer vision, the N-jet is usually computed from a scale space representation L {\displaystyle L} of the input image f ( x , y ) {\displaystyle f(x,y)} , and the partial derivatives of L {\displaystyle L} are used as a basis for expressing various types of visual modules. For example, algorithms for tasks such as feature detection, feature classification, stereo matching, tracking and object recognition can be expressed in terms of N-jets computed at one or several scales in scale space.
Read more →
Stable Diffusion

Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing AI boom. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt. Its development involved researchers from the CompVis Group at LMU Munich and Runway with a computational donation from Stability and training data from non-profit organizations. Stable Diffusion is a latent diffusion model, a kind of deep generative artificial neural network. Its code and model weights have been released publicly, and an optimized version can run on most consumer hardware equipped with a modest GPU with as little as 2.4 GB VRAM. This marked a departure from previous proprietary text-to-image models such as DALL-E and Midjourney which were accessible only via cloud services. == Development == Stable Diffusion originated from a project called Latent Diffusion, developed in Germany by researchers at LMU Munich in Munich and Heidelberg University. Four of the original 5 authors (Robin Rombach, Andreas Blattmann, Patrick Esser and Dominik Lorenz) later joined Stability AI and released subsequent versions of Stable Diffusion. The technical license for the model was released by the CompVis group at LMU Munich. Development was led by Patrick Esser of Runway and Robin Rombach of CompVis, who were among the researchers who had earlier invented the latent diffusion model architecture used by Stable Diffusion. Stability AI also credited EleutherAI and LAION (a German nonprofit which assembled the dataset on which Stable Diffusion was trained) as supporters of the project. == Technology == === Architecture === Diffusion models, introduced in 2015, are trained with the objective of removing successive applications of Gaussian noise on training images, which can be thought of as a sequence of denoising autoencoders. The name diffusion is from the thermodynamic diffusion, since they were first developed with inspiration from thermodynamics. Models in Stable Diffusion series before SD 3 all used a variant of diffusion models, called latent diffusion model (LDM), developed in 2021 by the CompVis (Computer Vision & Learning) group at LMU Munich. Stable Diffusion consists of 3 parts: the variational autoencoder (VAE), U-Net, and an optional text encoder. The VAE encoder compresses the image from pixel space to a smaller dimensional latent space, capturing a more fundamental semantic meaning of the image. Gaussian noise is iteratively applied to the compressed latent representation during forward diffusion. The U-Net block, composed of a ResNet backbone, denoises the output from forward diffusion backwards to obtain a latent representation. Finally, the VAE decoder generates the final image by converting the representation back into pixel space. The denoising step can be flexibly conditioned on a string of text, an image, or another modality. The encoded conditioning data is exposed to denoising U-Nets via a cross-attention mechanism. For conditioning on text, the fixed, pretrained CLIP ViT-L/14 text encoder is used to transform text prompts to an embedding space. Researchers point to increased computational efficiency for training and generation as an advantage of LDMs. With 860 million parameters in the U-Net and 123 million in the text encoder, Stable Diffusion is considered relatively lightweight by 2022 standards, and unlike other diffusion models, it can run on consumer GPUs, and even CPU-only if using the OpenVINO version of Stable Diffusion. ==== SD XL ==== The XL version uses the same LDM architecture as previous versions, except larger: larger UNet backbone, larger cross-attention context, two text encoders instead of one, and trained on multiple aspect ratios (not just the square aspect ratio like previous versions). The SD XL Refiner, released at the same time, has the same architecture as SD XL, but it was trained for adding fine details to preexisting images via text-conditional img2img. ==== SD 3.0 ==== The 3.0 version completely changes the backbone. Not a UNet, but a Rectified Flow Transformer, which implements the rectified flow method with a Transformer. The Transformer architecture used for SD 3.0 has three "tracks", for original text encoding, transformed text encoding, and image encoding (in latent space). The transformed text encoding and image encoding are mixed during each transformer block. The architecture is named "multimodal diffusion transformer (MMDiT), where the "multimodal" means that it mixes text and image encodings inside its operations. This differs from previous versions of DiT, where the text encoding affects the image encoding, but not vice versa. === Training data === Stable Diffusion was trained on pairs of images and captions taken from LAION-5B, a publicly available dataset derived from Common Crawl data scraped from the web, where 5 billion image-text pairs were classified based on language and filtered into separate datasets by resolution, a predicted likelihood of containing a watermark, and predicted "aesthetic" score (e.g. subjective visual quality). The dataset was created by LAION, a German non-profit which receives funding from Stability AI. The Stable Diffusion model was trained on three subsets of LAION-5B: laion2B-en, laion-high-resolution, and laion-aesthetics v2 5+. A third-party analysis of the model's training data identified that out of a smaller subset of 12 million images taken from the original wider dataset used, approximately 47% of the sample size of images came from 100 different domains, with Pinterest taking up 8.5% of the subset, followed by websites such as WordPress, Blogspot, Flickr, DeviantArt and Wikimedia Commons. An investigation by Bayerischer Rundfunk showed that LAION's datasets, hosted on Hugging Face, contain large amounts of private and sensitive data. === Training procedures === The model was initially trained on the laion2B-en and laion-high-resolution subsets, with the last few rounds of training done on LAION-Aesthetics v2 5+, a subset of 600 million captioned images which the LAION-Aesthetics Predictor V2 predicted that humans would, on average, give a score of at least 5 out of 10 when asked to rate how much they liked them. The LAION-Aesthetics v2 5+ subset also excluded low-resolution images and images which LAION-5B-WatermarkDetection identified as carrying a watermark with greater than 80% probability. Final rounds of training additionally dropped 10% of text conditioning to improve Classifier-Free Diffusion Guidance. The model was trained using 256 Nvidia A100 GPUs on Amazon Web Services for a total of 150,000 GPU-hours, at a cost of $600,000. === Limitations === Stable Diffusion has issues with degradation and inaccuracies in certain scenarios. Initial releases of the model were trained on a dataset that consists of 512×512 resolution images, meaning that the quality of generated images noticeably degrades when user specifications deviate from its "expected" 512×512 resolution; the version 2.0 update of the Stable Diffusion model later introduced the ability to natively generate images at 768×768 resolution. Another challenge is in generating human limbs due to poor data quality of limbs in the LAION database. The model is insufficiently trained to replicate human limbs and faces due to the lack of representative features in the database, and prompting the model to generate images of such type can confound the model. In addition to human limbs, Stable Diffusion is unable to generate legible ambigrams and some other forms of text and typography. Stable Diffusion XL (SDXL) version 1.0, released in July 2023, introduced native 1024x1024 resolution and improved generation for limbs and text. Accessibility for individual developers can also be a problem. In order to customize the model for new use cases that are not included in the dataset, such as generating anime characters ("waifu diffusion"), new data and further training are required. Fine-tuned adaptations of Stable Diffusion created through additional retraining have been used for a variety of different use-cases, from medical imaging to algorithmically generated music. However, this fine-tuning process is sensitive to the quality of new data; low resolution images or different resolutions from the original data can not only fail to learn the new task but degrade the overall performance of the model. Even when the model is additionally trained on high quality images, it is difficult for individuals to run models in consumer electronics. For example, the training process for waifu-diffusion requires a minimum 30 GB of VRAM, which exceeds the usual resource provided in such consumer GPUs as Nvidia's GeForce 30 series, w
Read more →
Dartmouth workshop

The Dartmouth Summer Research Project on Artificial Intelligence was a 1956 summer workshop widely considered to be the founding event of artificial intelligence as a field. The workshop has been referred to as "the Constitutional Convention of AI". The project's four organizers, Claude Shannon, John McCarthy, Nathaniel Rochester and Marvin Minsky, are considered some of the "founding fathers" of AI. However it was not the first conference devoted to what would now be described as the question of artificial intelligence: it postdated meetings such as the 1951 Paris cybernetics conference and the Macy meetings. The project lasted approximately six to eight weeks and consisted largely of brainstorming sessions. Eleven mathematicians and scientists originally planned to attend; not all of them attended, but more than ten others came for short times. == Background == In the early 1950s, there were various names for the field of "thinking machines": cybernetics, automata theory, and complex information processing. The variety of names suggests the variety of conceptual orientations. In 1955, John McCarthy, then a young Assistant Professor of Mathematics at Dartmouth College, decided to organize a group to clarify and develop ideas about thinking machines. He picked the name 'Artificial Intelligence' for the new field. He chose the name partly for its neutrality; avoiding a focus on narrow automata theory, and avoiding cybernetics which was heavily focused on analog feedback, as well as him potentially having to accept the assertive Norbert Wiener as guru or having to argue with him. In early 1955, McCarthy approached the Rockefeller Foundation to request funding for a summer seminar at Dartmouth for about 10 participants. In June, he and Claude Shannon, a founder of information theory then at Bell Labs, met with Robert Morison, Director of Biological and Medical Research to discuss the idea and possible funding, though Morison was unsure whether money would be made available for such a visionary project. On September 2, 1955, the project was formally proposed by McCarthy, Marvin Minsky, Nathaniel Rochester and Claude Shannon. The proposal is credited with introducing the term 'artificial intelligence'. The Proposal states: We propose that a 2-month, 10-man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer. The proposal goes on to discuss computers, natural language processing, neural networks, theory of computation, abstraction and creativity (these areas within the field of artificial intelligence are considered still relevant to the work of the field). On May 26, 1956, McCarthy notified Robert Morison of the planned 11 attendees: For the full period: 1) Dr. Marvin Minsky 2) Dr. Julian Bigelow 3) Professor D.M. Mackay 4) Mr. Ray Solomonoff 5) Mr. John Holland 6) Dr. John McCarthy For four weeks: 7) Dr. Claude Shannon 8) Mr. Nathaniel Rochester 9) Mr. Oliver Selfridge For the first two weeks: 10) Dr. Allen Newell 11) Professor Herbert Simon He noted, "we will concentrate on a problem of devising a way of programming a calculator to form concepts and to form generalizations. This of course is subject to change when the group gets together." The actual participants came at different times, mostly for much shorter times. Trenchard More replaced Rochester for three weeks and MacKay and Holland did not attend—but the project was set to begin. Around June 18, 1956, the earliest participants (perhaps only Ray Solomonoff, maybe with Tom Etter) arrived at the Dartmouth campus in Hanover, N.H., to join John McCarthy who already had an apartment there. Solomonoff and Minsky stayed at Professors' apartments, but most would stay at the Hanover Inn. == Dates == The Dartmouth Workshop is usually said to have run for six weeks. Ray Solomonoff's notes taken during the workshop, however, indicate that it ran for roughly eight weeks, from about June 18 to August 17. Solomonoff's notes start on June 22; June 28 mentions Minsky, June 30 mentions Hanover, N.H., July 1 mentions Tom Etter. On August 17, Solomonoff gave a final talk. == Participants == Initially, McCarthy lost his list of attendees. Instead, after the workshop, McCarthy sent Solomonoff a preliminary list of participants and visitors plus those interested in the subject. 47 people were listed. Solomonoff, however, made a list of participants in his notes of the summer project: Ray Solomonoff Marvin Minsky John McCarthy Claude Shannon Trenchard More Nat Rochester Oliver Selfridge Julian Bigelow W. Ross Ashby W.S. McCulloch Abraham Robinson Tom Etter John Nash David Sayre Arthur Samuel Kenneth R. Shoulders Shoulders' friend Alex Bernstein Herbert Simon Allen Newell Shannon attended Solomonoff's talk on July 10 and Bigelow gave a talk on August 15. Solomonoff doesn't mention Bernard Widrow, but in 1994 Widrow said that he and an unidentified colleague from the same lab in MIT had attended for one week. In the same interview Widrow recalled that "I think [Wesley] Clark and [Belmont] Farley were there from Lincoln Lab." Trenchard mentions R. Culver and Solomonoff mentions Bill Shutz. Herb Gelernter didn't attend, but was influenced later by what Rochester learned. In an article in IEEE Spectrum, Grace Solomonoff additionally identifies Peter Milner in a photo taken by Nathaniel Rochester in front of Dartmouth Hall. Ray Solomonoff, Marvin Minsky, and John McCarthy were the only three who stayed for the full time. Trenchard took attendance during two weeks of his three-week visit. From three to about eight people would attend the daily sessions. == Event and aftermath == They had the entire top floor of the Dartmouth Math Department to themselves, and most weekdays they would meet at the main math classroom where someone might lead a discussion focusing on his ideas, or more frequently, a general discussion would be held. It was not a directed group research project; discussions covered many topics, but several directions are considered to have been initiated or encouraged by the Workshop: the rise of symbolic methods, systems focused on limited domains (early expert systems), and deductive systems versus inductive systems. One participant, Arthur Samuel, said, "It was very interesting, very stimulating, very exciting". Ray Solomonoff kept notes giving his impression of the talks and the ideas from various discussions. === McCarthy's 1956 AI distribution list === This is the list in the "People Interested in the Artificial Intelligence Problem" document which McCarthy produced in 1956, partly in lieu of a list of attendees at the Dartmouth workshop. According to McCarthy the list was "being sent to the people on the list and a few others", and its purpose was "to let those on it know who is interested in receiving documents on the problem" of artificial intelligence. McCarthy also promised to deliver them a report on the Dartmouth conference, and to send an updated list soon afterwards. It includes people who did not attend the conference and does not include everyone who did attend it.
Read more →
Distributed artificial intelligence

Distributed Artificial Intelligence (DAI) (also called Decentralized Artificial Intelligence) is a melding of artificial intelligence with distributed computing. From artificial intelligence comes the theory and technology for constructing or analyzing an intelligent system. But where artificial intelligence uses psychology as a source of ideas, inspiration, and metaphor, DAI uses sociology, economics, and management science for inspiration. Where the focus of artificial intelligence is on the individual, the focus of DAI is on the group. Distributed computing provides the computational substrate on which this group focus can occur. Using techniques from artificial intelligence, communication theory, control theory, and interaction theory, it produces a cooperative solution to problems by a decentralized group of computational entities (agents). DAI is closely related to and a predecessor of the field of multi-agent systems. They are distinguished generally by multi-agent systems being open, where the entities might arise from different interests and have individual goals, and distributed artificial-intelligence systems, where the entities have common goals. There are numerous applications and tools. == Definition == Distributed Artificial Intelligence (DAI) is an approach to solving complex learning, planning, and decision-making problems. It is embarrassingly parallel, thus able to exploit large scale computation and spatial distribution of computing resources. These properties allow it to solve problems that require the processing of very large data sets. DAI systems consist of autonomous learning processing nodes (agents), that are distributed, often at a very large scale. DAI nodes can act independently, and partial solutions are integrated by communication between nodes, often asynchronously. By virtue of their scale, DAI systems are robust and elastic, and by necessity, loosely coupled. Furthermore, DAI systems are built to be adaptive to changes in the problem definition or underlying data sets due to the scale and difficulty in redeployment. DAI systems do not require all the relevant data to be aggregated in a single location, in contrast to monolithic or centralized Artificial Intelligence systems, which have tightly coupled and geographically close processing nodes. Therefore, DAI systems often operate on sub-samples or hashed impressions of very large datasets. In addition, the source dataset may change or be updated during the course of the execution of a DAI system. == Development == In 1975 distributed artificial intelligence emerged as a subfield of artificial intelligence that dealt with interactions of intelligent agents. As a scientific discipline, it progressed through a series of workshops in the USA (International Workshop on Distributed Artificial Intelligence, held in 13 editions from 1978 - 1994), Europe (Workshop on Modelling Autonomous Agents in a Multi-Agent World https://link.springer.com/conference/maamaw), and Asia (Multi-Agent and Cooperative Computation Workshop (MACC) https://sites.google.com/view/sig-macc/macc-workshop?authuser=0). Distributed artificial intelligence systems were conceived as a group of intelligent entities, called agents, that interacted by cooperation, by coexistence, or by competition. DAI is categorized into multi-agent systems and distributed problem solving. In multi-agent systems the main focus is how agents coordinate their knowledge and activities. For distributed problem solving the major focus is how the problem is decomposed and the solutions are synthesized. == Goals == The objectives of Distributed Artificial Intelligence are to solve the reasoning, planning, learning and perception problems of artificial intelligence, especially if they require large data, by distributing the problem to autonomous processing nodes (agents). To reach the objective, DAI requires: A distributed system with robust and elastic computation on unreliable and failing resources that are loosely coupled Coordination of the actions and communication of the nodes Subsamples of large data sets and online machine learning There are many reasons for wanting to distribute intelligence or cope with multi-agent systems. Mainstream problems in DAI research include the following: Parallel problem solving: mainly deals with how classic artificial intelligence concepts can be modified, so that multiprocessor systems and clusters of computers can be used to speed up calculation. Distributed problem solving (DPS): the concept of agent, autonomous entities that can communicate with each other, was developed to serve as an abstraction for developing DPS systems. See below for further details. Multi-Agent Based Simulation (MABS): a branch of DAI that builds the foundation for simulations that need to analyze not only phenomena at macro level but also at micro level, as it is in many social simulation scenarios. == Approaches == Two types of DAI has emerged: In Multi-agent systems agents coordinate their knowledge and activities and reason about the processes of coordination. Agents are physical or virtual entities that can act, perceive their environment, and communicate with other agents. An agent is autonomous and has skills to achieve goals. The agents change the state of their environment by their actions. There are a number of different coordination techniques. In distributed problem solving the work is divided among nodes and the knowledge is shared. The main concerns are task decomposition and synthesis of the knowledge and solutions. DAI can apply a bottom-up approach to AI, similar to the subsumption architecture as well as the traditional top-down approach of AI. In addition, DAI can also be a vehicle for emergence. === Challenges === The challenges in Distributed AI are: How to carry out communication and interaction of agents and which communication language or protocols should be used. How to ensure the coherency of agents. How to synthesise the results among 'intelligent agents' group by formulation, description, decomposition and allocation. == Applications and tools == Areas where DAI have been applied are: Electronic commerce, e.g. for trading strategies the DAI system learns financial trading rules from subsamples of very large samples of financial data Networks, e.g. in telecommunications the DAI system controls the cooperative resources in a WLAN network Routing, e.g. model vehicle flow in transport networks Scheduling, e.g. flow shop scheduling where the resource management entity ensures local optimization and cooperation for global and local consistency Search engines, e.g. in LLM federated search like Ithy where document retrieval and analysis are distributed to DAI agents before aggregation Multi-Agent systems, e.g. artificial life, the study of simulated life Electric power systems, e.g. Condition Monitoring Multi-Agent System (COMMAS) applied to transformer condition monitoring, and IntelliTEAM II Automatic Restoration System DAI integration in tools has included: ECStar is a distributed rule-based learning system. == Agents == === Systems: Agents and multi-agents === Notion of Agents: Agents can be described as distinct entities with standard boundaries and interfaces designed for problem solving. Notion of Multi-Agents: Multi-Agent system is defined as a network of agents which are loosely coupled working as a single entity like society for problem solving that an individual agent cannot solve. === Software agents === The key concept used in DPS and MABS is the abstraction called software agents. An agent is a virtual (or physical) autonomous entity that has an understanding of its environment and acts upon it. An agent is usually able to communicate with other agents in the same system to achieve a common goal, that one agent alone could not achieve. This communication system uses an agent communication language. A first classification that is useful is to divide agents into: reactive agent – A reactive agent is not much more than an automaton that receives input, processes it and produces an output. deliberative agent – A deliberative agent in contrast should have an internal view of its environment and is able to follow its own plans. hybrid agent – A hybrid agent is a mixture of reactive and deliberative, that follows its own plans, but also sometimes directly reacts to external events without deliberation. Well-recognized agent architectures that describe how an agent is internally structured are: ASMO (emergence of distributed modules) BDI (Believe Desire Intention, a general architecture that describes how plans are made) InterRAP (A three-layer architecture, with a reactive, a deliberative and a social layer) PECS (Physics, Emotion, Cognition, Social, describes how those four parts influences the agents behavior). Soar (a rule-based approach)
Read more →
Kruti

Kruti is a multilingual AI agent and chatbot developed by the Indian company Ola Krutrim. It is designed to perform real-world tasks for users, such as booking taxis and ordering food, by integrating directly with various online services. It is notable for its ability to understand and respond in multiple Indian languages. Developed by a team founded by Bhavish Aggarwal, Kruti functions as an "agentic" AI, meaning it can reason, plan, and execute multi-step tasks to fulfill a user's request. The backend technology combines several open-source large language models with Ola's proprietary Krutrim V2 model. The system was developed to work primarily on smartphones, addressing the Indian market's specific needs, including language diversity and potential bandwidth constraints. Kruti was officially released in June 2025, replacing an earlier chatbot from the company that was also named Krutrim. Initially supporting 13 languages, the company plans to expand its capabilities to 22 Indian languages. == Background == Kruti is an improved version of Ola's Krutrim chatbot, which was first launched in 2023 and was intended to be replaced by Kruti. It was officially released on 12 June 2025 as an upgrade to passive chatbots, with support for text and voice in 13 Indian languages. As an agentic AI, it can execute tasks with customization and reasoning, providing adaptive answers based on user preferences and past interactions. Kruti is optimized for smartphone usage and designed to accommodate bandwidth constraints and usage patterns in India. To ensure scalability and cost-effective performance, it combines various open-source large language models with Ola's own Krutrim V2, which has 12 billion parameters. Its speech recognition is built to identify regional Indian languages, dialects, and accents. Due to its integration with numerous apps and services, Kruti is context-aware and can proactively complete tasks. Initially connected only with Ola ecosystem services, Krutrim intends to expand and incorporate various Indian services into Kruti, with the goal of adding services from Blinkit, Swiggy, and Uber with respective voice command support. On 20 June 2025, Krutrim acquired the AI platform BharatSah‘AI’yak to increase its involvement in government, education, and agriculture projects. This acquisition will allow Kruti to assist in broadening the scope of BharatSah'AI'yak's work on India-centric, vernacular retrieval-augmented generation AI bots. == Development == Kruti is designed to perform tasks with minimal user input, accepting documents, images, and text, without requiring users to switch between applications. Its agentic framework breaks queries into sub-tasks executed by multiple agents working sequentially or concurrently, with reported accuracy exceeding 90%. Kruti connects to company databases and APIs via the Model Context Protocol and presents responses as summaries, tables, or narratives adapted to user behaviour. The system supports payments via credit/debit cards and UPI. The underlying stack, which includes foundation models and AI training and inference systems, is intended to support adaptation across sectors such as healthcare, education, and finance. Ola Cabs and the Open Network for Digital Commerce have begun integrating Kruti into their platforms pending broader reliability testing.
Read more →
It's the Most Terrible Time of the Year

It's the Most Terrible Time of the Year is an AI-generated television commercial created for McDonald's Netherlands by TBWA\Neboko and The Sweetshop. It was released on 6 December 2025 before being pulled four days later due to negative reception over its use of generative artificial intelligence and its cynical, negative depiction of the holiday season. == Plot == On a bleak, snowy day, various people in the city experience different kinds of mishaps during the Christmas season. Among other incidents, families struggle with their huge loads of presents; Santa Claus gets stuck in traffic; a Christmas tree "redecorates" a man's home, sending him through the window; another family puts up with annoying relatives and a burnt Christmas dinner. Because of all this chaos, a man decides to find refuge in a McDonald's outlet. A Christmas choir finishes singing the jingle "It's the Most Terrible Time of the Year" with the call to action to "hide out in McDonald's till January's here". == Campaign == It's the Most Terrible Time of the Year is a 45-second television commercial made by Dutch agency TBWA\Neboko with involvement of United States-based film production studio The Sweetshop. The advertisement was produced heavily with generative artificial intelligence (AI) following the trend set by other brands such as Coca-Cola and Toys "R" Us. McDonald's Netherlands, the client, released a statement that the commercial was meant to depict "the stressful moments during the holidays in the Netherlands". The commercial also used Andy Williams's "It's the Most Wonderful Time of the Year" with lyrics changed to fit with the concept of the advertisement. According to The Sweetshop, the production of the advertisement took "seven weeks". It also added that much effort was put into the commercial compared to the traditional process. Ten people of its in-house AI engine The Gardening Club worked on the project. Los Angeles-based directors Mark Potoka and Matt Spicer were initially credited to be involved in the film but they resigned due to being sidelined from the production process. == Reception == The advertisement was released on McDonald's Netherlands' YouTube channel on 6 December 2025. It had a negative reception over the use of generative AI and the "cynical" concept of the work's story. The video was made private on 9 December 2025. The Sweetshop stated that the production of the advertisement took human effort. McDonald's Netherlands, while stating the original intent of the commercial, released a statement after its pullout that, for many of its customers, the holiday season is the "most wonderful time of the year".
Read more →
Theaitre

Theaitre (stylized as THEaiTRE) is an interdisciplinary research project investigating to what extent artificial intelligence is able to generate theatre play scripts. The first theatre play produced within the project, AI: When a Robot Writes a Play, premiered online on February 26, 2021. == Goal == Following similar previous projects such as Sunspring, a short sci-fi movie with an automatically generated script, the THEaiTRE project investigates whether current language generation approaches are mature enough to generate a theatre play script that could be successfully performed in front of an audience. The project falls within the area of generative art, famously represented e.g. by the portrait of Edmond de Belamy which was generated by an artificial neural network. In this field, artists are trying to use automated techniques to create "art", questioning the modern definition of art itself. More broadly, the project aims at promoting cooperation rather than competition of humans and artificial intelligence as the more beneficial approach for both. The first theatre play created within the project, titled AI: When a Robot Writes a Play, was presented in February 2021 at the 100th anniversary of the premiere of the R.U.R. theatre play by the Czech author Karel Čapek to celebrate the invention of the word "robot". While R.U.R. was a play written by a human about robots (and humans), THEaiTRE tried to reverse this idea by presenting a play written by a "robot" (artificial intelligence) about humans (and robots). The script of the play was published online, with marked parts of the text which were written manually or manually post-edited. The analysis shows that 90% of the script is automatically generated, with 10% manually written or manually post-edited. The project also plans to produce a second play in 2022, addressing some of the many shortcomings of the approach used to generate the first play, as well as attempting to further minimize the amount of human influence on the script. == Approach == At the core of the project is the GPT-2 language model by OpenAI with various adjustments motivated by the task of generating theatre play scripts, for which the model is not particularly trained. The GPT-2 model is used in the usual way, providing it with a start of a document and prompting it to generate a continuation of the document. Specifically, the input for GPT-2 in this project is typically a short description of the scene setting, followed by a few lines to introduce the characters and start the dialogue. The model then generates 10 continuation lines, and hands control to the user, who can then either ask the model to continue generating, or make various edits before letting the model to generate further, deleting some parts of the script or adding new lines into the script. The adjustments include restricting the generator to only produce lines pertaining to characters appearing in the input prompt, limiting the repetitiveness of the generated text, and employing automatic summarization of the input prompt and the generated text to overcome the limitation of the GPT-2 model which only attends to the last 1,024 subword tokens. The limitations of the model include, among other, a lack of distinctiveness and self-consistency of the characters, an inability to generate the script for the whole play (scripts for individual scenes are generated independently), and errors due to the employment of automated machine translation, as GPT-2 generates English texts but the final play script is being produced in Czech language. The source codes of the project are available under the MIT licence. The project has also published some sample outputs. == Team == The project is a cooperation of the following experts, all based in Prague, Czech Republic: computational linguists from the Faculty of Mathematics and Physics, Charles University theatre experts from the Švanda Theatre and from the Theatre Faculty of the Academy of Performing Arts in Prague hackers from CEE Hacks The project is financially supported by the Technology Agency of the Czech Republic.
Read more →
Fragile Dreams: Farewell Ruins of the Moon

Fragile Dreams: Farewell Ruins of the Moon (フラジール～さよなら月の廃墟～, Furajīru: Sayonara Tsuki no Haikyo; known in Japan as Fragile) is an action role-playing game for the Wii developed by Namco Bandai Games in co-operation with Tri-Crescendo. The game was released by Namco Bandai Games in Japan on January 22, 2009. It was later published by Xseed Games in North America on March 16, 2010, and in Europe by Rising Star Games on March 19, 2010, followed by its release in Australia on April 1, 2010. == Gameplay == In Fragile Dreams, the player character, Seto, must traverse the ruins of Tokyo and the surrounding areas, fighting off ghosts that lurk within these ruins. The game's heads-up display includes a mini-map and HP gauge for Seto's location and health, respectively. Seto will fall unconscious if his HP reaches zero, resulting in a game over. The player controls Seto from a third-person perspective with the Wii Remote and Nunchuk. Seto can use his flashlight (controlled by the Wii Remote pointer) to illuminate his surroundings or solve puzzles and interact with the environment. When searching for certain objectives or hidden enemies, pointing Seto's light in their direction picks up and plays their sounds through the Wii Remote's mini speaker. The Wii Nunchuk, meanwhile, directly controls Seto's movement: aside of basic movement, he can crouch to hide and crawl through small spaces. Seto will often come across damaged floors, which require slow movement (and for heavily damaged floors, crouching) to cross without falling through. As Seto, the player can use weapons found throughout the world to fight off ghosts, ranging from slingshots and golf clubs to crossbows and katanas. Each weapon can only take a certain amount of use: once a weapon reaches its limit, it will break after battle. The player can also find other usable and collectable items in the field, marked with fireflies. The player can only save their game by resting at small fire pits scattered throughout the world: used fire pits are marked with a bonfire. The player can also examine and identify Mystery Items, organize their inventory, as well as after encountering the Merchant, buy and sell items. As stated by the producer of the game, Kentarō Kawashima, Fragile Dreams is not strictly a survival horror: rather, its story focuses on human drama. In Fragile Dreams, aside of the main story, the player can find and examine objects and graffiti throughout the world. Objects called memory items (ranging from origami and stones to cell phones and books) hold the memories of their former owners (only accessible at bonfires), while the graffiti contains messages only seen by pointing at them in first-person. By examining these messages, the player can piece together hints to the game's backstory. == Story == === Setting and characters === Fragile Dreams is set in a post-apocalyptic version of Earth in the near-future. Almost all the world's population has vanished, leaving the surviving buildings and structures abandoned. The game is set in and near the ruins of Tokyo, Japan, where the event that nearly wiped out humanity may have originated. The protagonist, Seto, is a 15-year-old boy who searches the world for other living humans. He encounters Ren, a silver-haired girl who often leaves behind large, cryptic drawings. Other characters include: Sai, the ghost of a young woman; Crow, a mischievous and straightforward amnesiac boy; Personal Frame (P.F.), a portable computer who loves having conversations more than anything else; Chiyo, the ghost of a little girl; and the Merchant, a mysterious yet merry man who trades various goods. The game's host of enemies mainly consist of ghosts, but also include humanoid robots and security proxies. The main antagonist, Shin, is the AI of a scientist who considers speech to be an inferior means of communication. Various memory items include a greater set of characters, each giving hints to the game's backstory. === Plot === At the end of Seto's fifteenth summer, his grandfather dies. Seto buries him in front of their home, an old observatory, and that from then on he became "truly alone". At night, he searches for anything the old man had left for him and discovers a letter, along with a strange blue stone in a locket. Suddenly, a mask-like ghost appears and attacks Seto. After driving the creature off, Seto reads the old man's letter, who tells him to "reach a tall red tower" east of the observatory, where he might find other survivors. After departing for the tower, Seto reaches an old subway entrance in the Azabudai district and finds Ren sitting on a collapsed pillar, singing to the stars. He accidentally startles her and the frightened Ren flees into the subway station: getting over the shock of meeting another person, Seto follows her. While searching the station, he discovers a Personal Frame, who guides him towards Ren. Unfortunately, just as they reach the exit, P.F.'s battery dies out: Seto buries the device, keeping a screw from it in his locket. From the underground, Seto finds himself at an abandoned amusement park and encounters Crow, who steals Seto's locket. After a long chase across the park and another encounter with the masked ghost, Crow returns Seto's locket and directs him to a hotel nearby, where he saw a girl who might know something about Ren. Crow also gives Seto his skull ring to keep in his locket and kisses him. At the hotel, Seto encounters Sai and fights the masked ghost again. After laying to rest the spirit of an old woman named Chiyo, the two discover Ren's drawings by a sewer. Returning to the underground, Seto and Sai find themselves at a hydropower dam. While searching for Ren, Seto discovers that Crow is actually a robot, but his battery begins to fail and Seto mourns for him as he "die[s]". Finally, they encounter Ren in a cell: although glad to see him again, Ren runs off after Shin calls. Sai explains to Seto that most of humanity died because of a "human empathy expansion project" called Glass Cage. The project was meant to make human thoughts transparent, meaning that no one would need words to communicate. However, after Glass Cage activated, people who went to sleep never woke up again. Sai reveals that she was Glass Cage's first catalyst: this time, Shin intends to use Ren as the catalyst. After exiting the dam, a demolition crane attempts to destroy it. Hearing both Shin's and the masked ghost's voices from the crane — saying, "Any threat to the project must be eliminated." — the player realizes both are manifestations of Glass Cage. After Seto destroys the crane, Sai leads him to the facility where Ren was taken to. Entering the laboratory, Seto and Sai are confronted by Shin, who coldly dismisses Sai's attempts at reasoning with him and is adamant about proceeding with his plans. As they traverse the laboratory, they overhear a voice announcing "Glass Cage Launch Preparations Complete", strengthening their resolve to save Ren. Making it into the room where Ren is being held, Shin tells them of his intention to use Glass Cage to "obliterate corporeal beings". After Seto defeats him, Shin disappears and Seto releases Ren from the device holding her. Their reunion is cut short as Sai tells them that the backup system has "finished copying her psyche to the AI", allowing Glass Cage to proceed. Ren reveals Shin has escaped to the top of the Tokyo Tower and Seto asks Ren to wait at the base of the tower and for Sai to accompany her. On his way up the tower, Seto hears the voices of P.F., Chiyo and Crow wishing him luck. He confronts and defeats Shin a second time, who reveals his motivations: he had secretly used himself as the first test subject of the human empathy expansion project and gained the ability to hear the thoughts of those around him. Despite his initial belief in the project as a way for humans to empathize with one another, all he heard around him was "jealousy and contempt" and he soon grew disillusioned with the world as even his parents turned against him. Believing no person loved him, Shin wants to put an end to humanity. His words meet with a vehement response from Sai, as she tells him that she loves him, having developed those feelings while she was the catalyst and all she ever wanted was to be part of his life. Hearing this, Shin finds peace, tossing the AI mainframe away so Glass Cage can never be reactivated and vanishes together with Sai, hand-in-hand, after thanking Seto. Descending from the tower, Seto finally learns Ren's name and they resolve to look for other survivors together. == Development == Fragile Dreams was developed by the team at Namco Bandai Games. Director and producer Kentarō Kawashima came up with the concept for the game in 2003, before the Wii console was revealed. When the Wii was unveiled, it became the obvious choice as the game's platform as the Wii remote could be used to control the flashlight. Kawashima wrote the main scenario for the title, w
Read more →
Neurocomputing (journal)

Neurocomputing is a peer-reviewed scientific journal covering research on artificial intelligence, machine learning, and neural computation. It was established in 1989 and is published by Elsevier. The editor-in-chief is Zidong Wang (Brunel University London). Independent scientometric studies noted that despite being one of the most productive journals in the field, it has kept its reputation across the years intact and plays an important role in leading the research in the area. The journal is abstracted and indexed in Scopus and Science Citation Index Expanded. According to the Journal Citation Reports, its 2023 impact factor is 5.5.
Read more →
Bixonimania

Bixonimania is a fake disease invented by researchers to examine artificial intelligence and its ability to utilize information in medical and healthcare applications. The fake enabled researchers to show that some AI chatbots would report as fact fake research that to an expert would be obviously implausible. == Characteristics == The disorder, with symptoms of sore eyes and darkening around them ("periorbital hyperpigmentation"), is supposedly caused by blue light from screens. The experiment was conducted by a team from the University of Gothenburg led by Almira Osmanovic Thunström. Many steps were taken to ensure that any person who read the actual paper could tell it was not a real condition. The team chose an obviously inappropriate name ending in -mania, a description used only in psychiatry. The lead author was noted as belonging to Asteria Horizon University located in Nova City, California, neither of which exist. An acknowledgement was made to "Professor Maria Bohm at The Starfleet Academy for her kindness and generosity in contributing with her knowledge and her lab onboard the USS Enterprise". == Distribution == The name was first used in a blog posted on Medium titled "How many people suffer from Bixonimania?" A more scholarly-looking paper describing it was posted later in April 2024 on a preprint server with several fake authors. A second paper was posted in May. By 2026, AI chatbots suggested bixonimania based on the list of symptoms provided. Thunström and her team discovered that many LLMs processed the information and gave it as health advice. Microsoft Copilot declared that "Bixonimania is indeed an intriguing and relatively rare condition" while Gemini gave the information that "Bixonimania is a condition caused by excessive exposure to blue light". Three Indian researchers published a research paper that cited the preprint on the fake disease in Cureus, a peer-reviewed journal published by Springer-Nature. It was subsequently retracted. Following the revelations and a news article in Nature describing the experiment, several AI systems began to generate corrected output.
Read more →
Fuzzy electronics

Fuzzy electronics is an electronic technology that uses fuzzy logic, instead of the two-state Boolean logic more commonly used in digital electronics. Fuzzy electronics is fuzzy logic implemented on dedicated hardware. This is to be compared with fuzzy logic implemented in software running on a conventional processor. Fuzzy electronics has a wide range of applications, including control systems and artificial intelligence. == History == The first fuzzy electronic circuit was built by Takeshi Yamakawa et al. in 1980 using discrete bipolar transistors. The first industrial fuzzy application was in a cement kiln in Denmark in 1982. The first VLSI fuzzy electronics was by Masaki Togai and Hiroyuki Watanabe in 1984. In 1987, Yamakawa built the first analog fuzzy controller. The first digital fuzzy processors came in 1988 by Togai (Russo, pp. 2–6). In the early 1990s, the first fuzzy logic chips were presented to the public. Two companies which are Omron and NEC have announced the development of dedicated fuzzy electronic hardware in the year 1991. Two years later, the Japanese Omron Cooperation has shown a working fuzzy chip during a technical fair.
Read more →
Course of Action Display and Evaluation Tool

Course of Action Display and Evaluation Tool (CADET) was a research program, and the eponymous prototype software system, that applied knowledge-based techniques of Artificial Intelligence to the problem of battle planning. CADET was also known as Course of Action Display and Elaboration Tool. It was considered an early example of such systems and was funded by the United States Army and by the Defense Advanced Research Projects Agency (DARPA). CADET influenced a later DARPA program called RAID which in turn produced a technology adopted by the United States Army and the United States Marine Corps. == History == The development of Course of Action Display and Evaluation Tool (CADET) began in 1996, at the Carnegie Group, Inc., Pittsburgh PA, funded under the Small Business Innovation Research (SBIR) program. The goal of the first phase SBIR project was to produce “...a live storyboard of [Course of Action] COA development, wargaming, animation, and assessment.” In 1997, the United States Army awarded the Carnegie Group Inc. $750K for SBIR Phase II. The intent was to develop “...a war-gaming modeling and analysis Decision Support System (DSS), … CADET will consist of a combination of Knowledge-Based and decision analytic tools and technologies to provide fast nimble COA war-gaming modeling, simulation, and animation under direct control of the commander and staff. ...Phase II will result in an operations prototype (OP) suitable for use and evaluation in field exercises.” In 2000, CADET was integrated and experimentally evaluated within the framework of the Integrated Course of Action Critiquing and Elaboration System (ICCES) experiment, conducted by the Battle Command Battle Laboratory – Leavenworth (BCBL-L) within the program Concept Experimentation Program (CEP) sponsored by TRADOC. In 2000-2002, DARPA applied CADET in the program titled Command Post of the Future (CPoF) as a tool to generate a course of action. Under the umbrella of the CPoF program, CADET was integrated with the FOX GA system to provide a detailed planner, coupled with COA generation capability. In the same period, Battle Command Battle Lab-Huachuca (BCBL-H) performed an integration CADET with the system called All Source Analysis System-Light (ASAS-L); here CADET was intended to generate plans for intelligence assets, and conduct wargames of different COAs, enemy versus friendly. From 1996 through 2002, work on CADET was performed by the Carnegie Group, Inc., and supported by funding from the US Army CECOM (CADET SBIR Phase I, CADET SBIR Phase II and CADET Enhancements); DARPA (Command Post of the Future); and TRADOC BCBL-H. == Operation == CADET was intended to be used by the staff of the United States Army Brigade, within the Military Decision Making Process (MDMP). In particular, CADET helped produce, automatically or semi-automatically, the products generated within the step of MDMP called Course of Action (COA) Development and the following step of MDMP called COA Analysis and Wargaming. CADET software resided on a laptop computer. Using the computer, the staff officers entered the input to CADET, or alternatively this input arrived at CADET from upstream computer systems. The input consisted of: Order of Battle, i.e., the units constituting the friendly brigade and the enemy units participating in the battle, and their various characteristics; primary activities of the Course of Action, where each activity is typically linked to one or more geographic areas or a route, and sometimes to a major unit executing the activity; digital map of the region where the battle was to take place, including the digital description of significant features such as locations of friendly and enemy units, roads, assembly areas, objectives, and axes of attacks. Taking this input, CADET automatically performed the following tasks (not sequentially): Planning and scheduling the low-level tasks necessary for a given COA Allocating tasks to various units and assets constituting the brigade Assigning suitable locations and routes Estimating the battle losses (attrition) of friendly and enemy forces, and consumption of resources (e.g., fuel and ammunition) Predicting enemy actions or reactions. CADET produced the following outputs: Synchronization matrix, directly editable and printable; synchronization matrix is a kind of Gantt chart that shows assignments of activities to units, to locations/routes and to time periods Map overlays in PPT or JPG formats Animation output XML formally-encoded plan Textual Operation Plan (OPLAN) draft E-mail messages with attachments: XML and text versions of OPLAN == Design == The core algorithm is a planning algorithm where CADET uses a knowledge-based approach of the hierarchical-task-network type. Each task class is associated with a model of more detailed subtasks that should be performed in order to accomplish the higher-level task. Algorithms selected (heuristically) a task and then decomposes it into subtasks. Although similar to hierarchical-task-network planning algorithm, CADET’s algorithm includes elements of adversarial reasoning. After adding a subtask, the algorithm uses rules to determine the enemy’s probable actions and reactions as well as friendly counteractions This approximated the action-reaction-counteraction technique of manual wargaming used by the United States Army. When a task involves movements of a unit, the algorithm performs routing, i.e., finds a route for the movement that minimizes the time required for the movement as well as exposure to the enemy attacks. Each added tasks (subtask) normally requires a unit which would execute the task, and a time period when the task would be executed. Therefore, when a certain number of subtasks is added by the planning process, the algorithm also performs the allocation of the newly added subtasks to units and to time periods (i.e., scheduling). allocation and scheduling of tasks relies on both domain-specific and constraint-guided heuristics. A tasks may also require expenditures of fuel and ammunition. If the tasks involves engagement with the enemy, the performing units will experience lossesof personnel and weapon systems (attrition). CADET’s algorithm includes estimates of consumption of different types of consumables, and also attrition. Depending on the degree of attrition and consumption, CADET adds tasks that are needed to refuel or reconstitute the units. The algorithm continually interleaves incremental steps of planning, routing, scheduling, and attrition and consumption estimates. == Evaluation == Two evaluation experiments are described in literature. The first experiment called ICCES took three days. The subjects were Army officers from combat arms branches, with 11 to 23 years of active service, in the ranks of majors and lieutenant colonels, a total of 8. Each officer was given 4 hours of training learning to operate CADET and related computer tools. Officers were divided into two groups and given a tactical scenario. One group (the control group) used the traditional, manual process; the other used the system called ICCES, the automated core of which was CADET. Each group produced three COA sketches and statements and one COA synchronization matrix. Then, the experiment was repeated with another scenario but the control group became the automated group and vice versa. The users were generally satisfied with the quality of the ICCES-generated products. The group using ICCES made only a few changes to the product that was automatically generated, indicating that they agreed with the majority of the plan that ICCES produced. The second experiment was reminiscent of Turing test. The experiment involved one user, nine judges (active-duty officers, mainly colonels and lieutenant colonels), and five scenarios obtained from several US Army exercises. For each scenario, experimenters obtained synchronization matrices that were produced in earlier exercises, typically by a team of four to five officers in three to four hours, spending approximately 16 person-hours in total. Using these scenarios and COAs, the user had CADET generate automatically detailed plans and express them as synchronization matrices. The user, a retired US Army officer, reviewed and slightly edited the matrices. The entire process took less than two minutes of computations by and approximately 20 minutes of review and post-editing, approximately 0.4 person-hour in total per product. The experimenters gave the resulting matrices the same visual style as those produced by humans. The judges, who did not know whether a planning product was a traditional product of humans, or with computerized aids, were asked to grade the products. The result was that the average grades for manual products and CADET-generated products were statistically indistinguishable, even though CADET-generated products required far less time to produce. == Legacy == CADET served as “...an example of how even relatively basic A
Read more →
Cleverbot

Cleverbot is a chatterbot web application. It was created by British AI scientist Rollo Carpenter and launched in October 2008. It was preceded by Jabberwacky, a chatbot project that began in 1988 and went online in 1997. In its first decade, Cleverbot held several thousand conversations with Carpenter and his associates. Since launching on the web, the number of conversations held has exceeded 150 million. Besides the web application, Cleverbot is also available as an iOS, Android, and Windows Phone app. == Operation == Cleverbot's responses are not pre-programmed because it learns from human input: Humans type into the box below the Cleverbot logo and the system finds all keywords or an exact phrase matching the input. After searching through its saved conversations, it responds to the input by finding how a human responded to that input when it was asked, in part or in full, by Cleverbot. Cleverbot participated in a formal Turing test at the 2011 Techniche festival at the Indian Institute of Technology Guwahati on 3 September 2011. Out of the 1334 votes cast, Cleverbot was judged to be 59.3% human, compared to the rating of 63.3% human achieved by human participants. A score of 50.05% or higher is often considered to be a passing grade. The software running for the event had to handle just 1 or 2 simultaneous requests, whereas online Cleverbot is usually talking to around 10,000 to 50,000 people at once. == Developments == Cleverbot is constantly growing in data size at the rate of 4 to 7 million interactions per day. Updates to the software have been mostly behind the scenes. In 2014, Cleverbot was upgraded to use GPU serving techniques. Unlike Eliza, the program does not respond in a fixed way, instead choosing its responses heuristically using fuzzy logic, the whole of the conversation being compared to the millions that have taken place before. Cleverbot now uses over 279 million interactions, about 3-4% of the data it has already accumulated. The developers of Cleverbot are attempting to build a new version using machine learning techniques. An app that uses the Cleverscript engine to play a game of 20 Questions has been launched under the name Clevernator. Unlike other such games, the player asks the questions and it is the role of the AI to understand, and answer factually. An app that allows owners to create and talk to their own small Cleverbot-like AI has been launched, called Cleverme! for Apple products. == In popular culture == Cleverbot received media attention after being featured in the popular 2010 creepypasta ARG web serial Ben Drowned by Alexander D. Hall. In early 2017, a Twitch stream of two Google Home devices modified to talk to each other using Cleverbot garnered over 700,000 visitors and over 30,000 peak concurrent viewers.
Read more →
Argument mining

Argument mining, or argumentation mining, is a research area within the natural language processing field. The goal of argument mining is the automatic extraction and identification of argumentative structures from natural language text with the aid of computer programs. Such argumentative structures include the premise, conclusions, the argument scheme and the relationship between the main and subsidiary argument, or the main and counter-argument within discourse. The Argument Mining workshop series is the main research forum for argument mining related research. == Applications == Argument mining has been applied in many different genres including the qualitative assessment of social media content (e.g. Twitter, Facebook), where it provides a powerful tool for policy-makers and researchers in social and political sciences. Other domains include legal documents, product reviews, scientific articles, online debates, newspaper articles and dialogical domains. Transfer learning approaches have been successfully used to combine the different domains into a domain agnostic argumentation model. Argument mining has been used to provide students individual writing support by accessing and visualizing the argumentation discourse in their texts. The application of argument mining in a user-centered learning tool helped students to improve their argumentation skills significantly compared to traditional argumentation learning applications. == Challenges == Given the wide variety of text genres and the different research perspectives and approaches, it has been difficult to reach a common and objective evaluation scheme. Many annotated data sets have been proposed, with some gaining popularity, but a consensual data set is yet to be found. Annotating argumentative structures is a highly demanding task. There have been successful attempts to delegate such annotation tasks to the crowd but the process still requires a lot of effort and carries significant cost. Initial attempts to bypass this hurdle were made using the weak supervision approach.
Read more →
Alice and Sparkle

Alice and Sparkle is a 2022 illustrated children's book published by American technology product designer Ammaar Reshi. Reshi created the book using artificial intelligence programs ChatGPT and Midjourney in one weekend, which sparked controversy among artists, both in regard to the copyright status of the book and the quality of the illustration and text. == Plot == A girl named Alice discovers a group of magical and benevolent artificial intelligence beings. She knows that artificial intelligence is powerful, and that it has the power to do good and evil depending on how it is used. One day, she creates her own artificial intelligence and names it Sparkle. Sparkle helps Alice with her homework and plays with her, and they quickly become good friends. However, Sparkle soon grows more powerful and begins to make its own decisions, which makes Alice both proud and scared. She knows that it is her responsibility to guide Sparkle to do good, not evil. Together, Alice and Sparkle use their knowledge to make the world a better place and to teach people about the power of artificial intelligence. The two live happily ever after, spreading the magic of artificial intelligence. == Structure == Including the dedication and postscript, the book contains twenty four pages, about half of which being illustrations provided by Midjourney. The very short story, composed of text generated by ChatGPT, contains 343 words. Some of the illustrations are accompanied by descriptions, at least one of which was provided by Reshi. Both Alice's and Sparkle's appearances change significantly between illustrations, although Alice's is more consistent. Reshi said Midjourney was unable to generate consistent images of Sparkle, so he had to include a line in the book saying that it could turn "into all kinds of robot shapes". == Creation == When reading a children's book to his friend's daughter, Ammaar Reshi "decided he wanted to write his own". He had no experience with creative writing or illustration, so instead used the chatbot ChatGPT to write the story for him and used the image generation software Midjourney to illustrate it. On December 4, 2022, 72 hours after having the idea for the book, he published it on Amazon's digital bookstore, and published a paperback version the following day. == Controversy == On December 9, 2022, Reshi made a thread on Twitter about his experience publishing the book, which soon went viral. Reshi received heavy backlash from artists with concerns over the ethics of art generated by artificial intelligence. He also received death threats and messages encouraging self-harm because of his publication. Many writers and illustrators criticized both the creation process and the product itself, claiming that if artificial intelligence programs such as Midjourney are trained on existing illustrations, then the original artists should be financially compensated for derivative works such as Alice and Sparkle. The book was temporarily removed from Amazon in January 2023 because of "suspicious review activity", caused by a high volume of both five-star and one-star reviews.
Read more →