AI Art Video


Wave Financial

Wave is a Canadian company that provides financial services and software for small businesses. It is headquartered in the East Bayfront neighbourhood of Toronto, Canada. The company's first product was free online accounting software designed for businesses with 1–9 employees, followed by invoicing, personal finance and receipt-scanning (OCR) software. In 2012, Wave began branching into financial services, initially with Payments by Wave (credit card processing) and Payroll by Wave, followed in February 2017 by Lending by Wave, which has since been discontinued.

== History ==

CEO Kirk Simpson and CPO James Lochrie founded Wave Accounting Inc. in July 2009, and Wave Accounting launched to the public on November 16, 2010. In June 2011, the company closed a Series A funding round led by OMERS Ventures. In September 2011, FedDev Ontario invested one million dollars. In October 2011, a $5-million investment led by U.S. venture capital firm Charles River Ventures was announced. In May 2012, Wave Accounting closed its Series B financing round, led by The Social+Capital Partnership with follow-on participation from Charles River Ventures and OMERS Ventures.

Wave acquired a company called Small Payroll in November 2011, whose product later became Wave Payroll. In February 2012, Wave officially launched Wave Payroll to the public in Canada, followed by the American release in November of the same year. In August 2012, the company announced the acquisition of Vuru.co, an online stock-tracking service; terms of the deal were not disclosed. In December 2012, the company rebranded itself as Wave to emphasize its broadened spectrum of services.

On March 14, 2019, the company acquired Every, a Toronto-based fintech company that provides business accounts and debit cards to small businesses. On June 11, 2019, the company announced it was being acquired by the tax preparation company H&R Block for $537 million (CAD).
On June 15, 2022, Wave announced that Kirk Simpson would be leaving the company and that Zahir Khoja would replace him as CEO. In May 2025, US customers of Wave were transitioned to a new payroll processing system supported by CheckHQ. The new integration improved support for US employers by handling employer tax withholding and payments in all 50 US states.

== Products ==

The company's initial product, Accounting by Wave, is a double-entry accounting tool. Services include direct bank data imports, invoicing and expense tracking, a customizable chart of accounts, and journal transactions. Accounting by Wave integrates with the expense tracking software Shoeboxed and the e-commerce website Etsy.

The next product was Payroll by Wave, launched in 2012 after the acquisition of SmallPayroll.ca. Payroll by Wave is only available in the US and Canada. Invoicing by Wave is an offshoot of the company's earlier accounting tools.

Additional products launched on or shortly after the company's rebrand in December 2012 include:

* a credit card processing tool, Payments by Wave, built initially on integration with Stripe credit card processing. However, Wave does not report merchant fees correctly for countries where Stripe charges a tax such as GST: in these cases, the merchant fees are reported without tax and do not match the figures in the user's Stripe account.
* a receipt scanning tool, Receipts by Wave.

In 2017, Wave signed an agreement to provide its platform on RBC's online business banking site; the RBC-Wave service was to be co-branded.

== Taxes supported ==

The company's software supports tax-exclusive pricing, such as U.S. sales tax, where taxes are added on top of the prices quoted. This has two effects:

* When scanning receipts, users must manually add the tax and input the amount.
* When making an invoice, users must enter a price before tax, and the system adds the tax on top.
This makes Wave unable to handle taxes in countries like Australia, where prices must be quoted inclusive of all taxes, such as GST. There is no way to set an invoice total and have Wave calculate the tax portion as a percentage.

== Pricing and business model ==

As of June 10, 2024, Wave offers two tiers for its software: a free Starter plan with limitations on some features, and a paid Pro plan. In addition to its paid plan, the company earns revenue from other paid financial services it offers:

* Payments by Wave: card processing, which includes debit, credit and prepaid cards as well as ACH (bank payments) in the United States. Fees are a percentage of the transaction.
* Payroll by Wave: monthly subscription fee plus usage fees.

Wave previously included advertising on its pages as a source of revenue. Advertising was removed in January 2017.

In 2017, Wave raised $24 million (USD) in funding led by NAB Ventures. In 2019, H&R Block announced the acquisition of Wave in a cash deal worth $405 million (USD).
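The difference between tax-exclusive and tax-inclusive pricing described under "Taxes supported" comes down to simple arithmetic. The following is a minimal sketch, not Wave's implementation: the function names `add_tax` and `extract_tax` are hypothetical, and the flat 10% rate is chosen only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def add_tax(net: Decimal, rate: Decimal) -> tuple[Decimal, Decimal]:
    """Tax-exclusive pricing: tax is added on top of the quoted net price.

    Returns (tax, gross).
    """
    tax = (net * rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return tax, net + tax


def extract_tax(gross: Decimal, rate: Decimal) -> tuple[Decimal, Decimal]:
    """Tax-inclusive pricing: the quoted total already contains the tax.

    Returns (tax, net), where net = gross / (1 + rate).
    """
    net = (gross / (1 + rate)).quantize(CENT, rounding=ROUND_HALF_UP)
    return gross - net, net


if __name__ == "__main__":
    rate = Decimal("0.10")  # illustrative rate, an assumption of this sketch
    print(add_tax(Decimal("100.00"), rate))      # tax added on top of a net price
    print(extract_tax(Decimal("110.00"), rate))  # tax backed out of a total
```

The second function is the calculation the text says is unavailable: starting from an invoice total and deriving the tax portion from it.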
Federation of International Robot-soccer Association

The Federation of International Robot-soccer Association (FIRA) is an international organisation that organises competitive soccer competitions between autonomous robots. The matches are usually five-a-side.

== History ==

In 1996 and 1997, the competition was known as MiroSot and was held in Daejeon, Korea. The 1996 competition offered a challenging arena to the younger generation and to researchers working with autonomous mobile robotic systems. From 1998 through 2008, it was called the FIRA Cup, and in 2009 it became the FIRA RoboWorld Cup & Congress. The 15th RoboWorld Cup was held at Amrita Vishwa Vidyapeetham, Bangalore, India, in September 2010.

In 2013, it took place in Kuala Lumpur, Malaysia, from August 24 to August 29. At that time, its categories were Micro-Robot Soccer Tournament, Amire, Naro, Simulated Robot, Android, Robo and Humanoid Robot. It attracted 80 teams from 11 countries: Singapore, Indonesia, Taiwan, India, China, South Korea, the United Kingdom, Mexico, Canada, Russia and Malaysia. In 2018, 277 teams from 12 countries took part.

=== Past Events ===

== FIRA RoboWorld Cup & Congress ==

This competition has four leagues: FIRA AIR, FIRA Sports, FIRA Challenges, and FIRA Youth. Each league has its own competitions, and each competition can have several events.

=== FIRA AIR ===

The FIRA AIR league has two associated competitions: Autonomous Race and Emergency Service.

=== FIRA Sports ===

The FIRA Sports league has four associated competitions: HuroCup, RoboSot, SimuroSot, and AndroSot. This is the robot soccer league. HuroCup consists of single events for bipedal humanoid robots. The events are archery, sprint, marathon, united soccer, obstacle run, long jump, spartan race, weightlifting, and basketball. There is also an all-round competition for the single robot that performs best overall.
=== FIRA Challenges ===

The FIRA Challenges league has three associated competitions: Autonomous Cars, Autonomous Cars Simulation, and Innovation and Business.

=== FIRA Youth ===

The FIRA Youth league has six associated challenges: Sport Robots, HuroCup Junior, CityRacer, DRV_Explorer, Cliff Hanger, and Mission Impossible.
European Society for Fuzzy Logic and Technology

The European Society for Fuzzy Logic and Technology (EUSFLAT) is a scientific association that aims to disseminate and promote fuzzy logic and related subjects (sometimes grouped under the collective terms soft computing or computational intelligence) and to provide a platform for exchange between scientists and engineers working in these fields. The society is open to both academic and industrial members.

== History ==

EUSFLAT was founded in 1998 in Spain as the successor of the National Spanish Fuzzy Logic Society, ESTYLF, with the aim of opening the society to members from other European countries. Since then, the society has attracted a large share of members from outside Spain, and even from beyond Europe, although Spanish members remain the largest group within EUSFLAT. For these historical reasons, the society is officially registered in Spain.

== Conferences ==

Since 1999, EUSFLAT has organized its biennial conferences in odd-numbered years. Previous meetings:

* Palma de Mallorca, Balearic Islands, Spain, September 22–25, 1999 (jointly with the National Spanish conference, ESTYLF)
* Leicester, United Kingdom, September 5–7, 2001
* Zittau, Germany, September 10–12, 2003
* Barcelona, Catalonia, Spain, September 7–9, 2005 (jointly with the 11th Rencontres Francophones sur la Logique Floue et ses Applications)
* Ostrava, Czech Republic, September 11–14, 2007
* Lisbon, Portugal, July 20–24, 2009 (jointly with the 13th World Congress of the International Fuzzy Systems Association)
* Aix-les-Bains, France, July 18–22, 2011 (jointly with Les Rencontres Francophones sur la Logique Floue et ses Applications)
* Milan, Italy, September 11–13, 2013
* Gijón, Spain, June 30 – July 3, 2015

== Publications ==

EUSFLAT publishes the proceedings of its conferences in an open access manner. Until 2010, Mathware & Soft Computing was the official journal of EUSFLAT.
On July 1, 2010, the International Journal of Computational Intelligence Systems (Atlantis Press, ISSN 1875-6891 (print) / ISSN 1875-6883 (online)) became the official journal of EUSFLAT. EUSFLAT also publishes an electronic newsletter with three issues a year.

== Presidents ==

EUSFLAT is led by the President, who is elected for a two-year period and cannot serve for more than two consecutive periods.

* Francesc Esteva (1998–2001)
* Luis Magdalena (2001–2005)
* Ulrich Bodenhofer (2005–2009)
* Javier Montero (2009–2013)
* Gabriella Pasi (2013–present)
Painworth

PainWorth is a justice, legal and insurance services application founded by Canadian entrepreneurs Mike Zouhri, Chris Trudel and Ryan Bencic. The application is a "robot lawyer" that uses artificial intelligence to automate personal injury claims for injury victims. It is currently available in Canada and the United States. PainWorth has been featured by several news outlets, including CTV, Global News, CBC, and has also been featured by the American Bar Association and LexisNexis for its role addressing social issues such as access to justice and other systemic issues in the legal and insurance industry. == Application == PainWorth began as a tool for calculating non-pecuniary damages for injury victims but has since expanded beyond a personal injury calculator to include features that help injury victims and business users with pecuniary damages, economic calculations, prescribed rates and providing informational guides to help navigate settlement negotiation, managing claims records and other issues encountered by self-represented litigants or claims managers. The platform makes use of automation to provide free user-guided calculations, steps and processes to successfully settle an injury claim. The application is supported by Microsoft Azure. == Personal Injury Calculator == PainWorth is the first service to use Artificial Intelligence to interpret case law in order to determine the value of pain and suffering incurred by specific injury types and injury severities. The cited case law is used as evidence and presented in statistical models to determine an accurate valuation compliant with the jurisdiction, regulatory rules and case complexities. == General Damages Calculator == PainWorth also offers a personal injury settlement calculator that assesses general damages based on specific case complexities and jurisdiction. The service takes into account medical complications and recovery in order to calculate the fair valuation. 
== Injury Settlement Platform ==

PainWorth's insurance settlement platform provides a direct, automated resolution center for settling cases at their assessed value without enduring the hardship of litigation. In 2021, PainWorth won the title of World's Best Emerging Insurance Product for the development of this platform.

== History ==

In 2019, Mike Zouhri was struck by a drunk driver, which left him seriously injured and resulted in a lawsuit. Frustrated by the slow and expensive process, Zouhri went to the law library and learned how to manage injury claims. He then partnered with lawyers and legal advisors to create an app that allows users to settle their own injury claims quickly, fairly and accurately. After its launch, PainWorth was soon used by thousands of people and gained significant media coverage. Global News reported that the bot had helped people with more than $10 million in claims within a few months, all free of charge. In July 2020, PainWorth began raising concerns over injustices and gender bias in the legal system, specifically in Canadian courts.
Language Server Protocol

The Language Server Protocol (LSP) is an open, JSON-RPC-based protocol for use between source-code editors or integrated development environments (IDEs) and servers that provide "language intelligence tools": programming language-specific features like code completion, syntax highlighting and marking of warnings and errors, as well as refactoring routines. The goal of the protocol is to allow programming language support to be implemented and distributed independently of any given editor or IDE. In the early 2020s, LSP quickly became a "norm" for language intelligence tools providers. == History == LSP was originally developed for Microsoft Visual Studio Code and is now an open standard. On June 27, 2016, Microsoft announced a collaboration with Red Hat and Codenvy to standardize the protocol's specification. Its specification is hosted and developed on GitHub. == Background == Modern IDEs provide programmers with sophisticated features like code completion, refactoring, navigating to a symbol's definition, syntax highlighting, and error and warning markers. For example, in a text-based programming language, a programmer might want to rename a method read. The programmer could either manually edit the respective source code files and change the appropriate occurrences of the old method name into the new name, or instead use an IDE's refactoring capabilities to make all the necessary changes automatically. To be able to support this style of refactoring, an IDE needs a sophisticated understanding of the programming language that the program's source is written in. A programming tool without such an understanding—for example, one that performs a naive search-and-replace instead—could introduce errors. When renaming a read method, for example, the tool should not replace the partial match in a variable that might be called readyState, nor should it replace the portion of a code comment containing the word "already". 
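The renaming pitfall can be made concrete with a small sketch. The source string and the helper names `naive_rename` and `word_rename` are hypothetical, chosen only to mirror the `read`, `readyState` and "already" example; neither function is part of LSP or of any language server.

```python
import re

# A one-line "program" containing the method call, a similarly named
# variable, and a comment with the words "already" and "read".
SOURCE = "value = read(); state = readyState  # already read"


def naive_rename(text: str, old: str, new: str) -> str:
    """Plain search-and-replace: rewrites every substring match."""
    return text.replace(old, new)


def word_rename(text: str, old: str, new: str) -> str:
    """Replace whole words only, using regex word boundaries."""
    return re.sub(rf"\b{re.escape(old)}\b", new, text)


if __name__ == "__main__":
    print(naive_rename(SOURCE, "read", "fetch"))
    # value = fetch(); state = fetchyState  # alfetchy fetch
    print(word_rename(SOURCE, "read", "fetch"))
    # value = fetch(); state = readyState  # already fetch
```

The word-boundary version avoids the partial matches but still rewrites the word inside the comment, and it has no notion of scope. Distinguishing those cases requires the kind of language understanding a language service provides.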
Neither should renaming a local variable read, for example, end up altering identically named variables in other scopes.

Conventional compilers or interpreters for a specific programming language are typically unable to provide these language services, because they are written with the goal of either transforming the source code into object code or immediately executing the code. Language services must also be able to handle source code that is not well-formed, e.g. because the programmer is in the middle of editing and has not yet finished typing a statement, procedure, or other construct. Moreover, small changes made to a source code file during typing usually change the semantics of the program. To provide instant feedback to the user, the editing tool must be able to evaluate the syntactic and semantic consequences of a specific modification very quickly. Compilers and interpreters are therefore poor candidates for producing the information an editing tool needs.

Prior to the design and implementation of the Language Server Protocol for Visual Studio Code, most language services were tied to a given IDE or other editor. In the absence of the Language Server Protocol, language services are typically implemented using a tool-specific extension API. Providing the same language service to another editing tool requires effort to adapt the existing code so that the service can target the second editor's extension interfaces.

The Language Server Protocol decouples language services from the editor so that the services can be contained within a general-purpose language server. Any editor can inherit sophisticated support for many different languages by making use of existing language servers. Similarly, a programmer involved in the development of a new programming language can make services for that language available to existing editing tools.
Making use of language servers via the Language Server Protocol thus also reduces the burden on vendors of editing tools: they do not need to develop language services of their own for the languages they intend to support, as long as the language servers have already been implemented. The Language Server Protocol also enables the distribution and development of servers contributed by interested third parties, such as end users, without additional involvement by either the vendor of the compiler for the programming language in use or the vendor of the editor to which the language support is being added.

LSP is not restricted to programming languages. It can be used for any kind of text-based language, such as specifications or domain-specific languages (DSLs).

== Technical overview ==

When a user edits one or more source code files using a Language Server Protocol-enabled tool, the tool acts as a client that consumes the language services provided by a language server. The tool may be a text editor or IDE, and the language services could be refactoring, code completion, etc.

The client informs the server about what the user is doing, e.g. opening a file or inserting a character at a specific text position. The client can also request that the server perform a language service, e.g. format a specified range in the text document. The server answers a client's request with an appropriate response. For example, the formatting request is answered either by a response that transfers the formatted text to the client or by an error response containing details about the error.

The Language Server Protocol defines the messages to be exchanged between client and language server. They are JSON-RPC messages preceded by headers similar to those of HTTP. Messages may originate from the server or the client.

The protocol does not make any provisions about how requests, responses and notifications are transferred between client and server. For example, client and server could be components within the same process exchanging JSON strings via method calls. They could also be different processes on the same or on different machines communicating via network sockets.

== Registry ==

Lists of LSP-compatible implementations are maintained by the community-driven Langserver.org and by Microsoft.
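The message framing described in the technical overview (a JSON-RPC body preceded by HTTP-like headers) can be sketched as follows. This is a minimal illustration rather than a complete client. The helper names `encode_message` and `decode_message` and the file URI are hypothetical, and the sketch assumes the whole message is already in memory and that only the `Content-Length` header matters.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with a Content-Length header."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes) -> tuple[dict, bytes]:
    """Parse one framed message; return (payload, remaining bytes)."""
    header_block, _, rest = data.partition(b"\r\n\r\n")
    headers = {}
    for line in header_block.split(b"\r\n"):
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers[b"content-length"])  # body length in bytes
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# A request asking the server to format a range of a document.
# The URI below is a made-up example path.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/rangeFormatting",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.py"},
        "range": {
            "start": {"line": 0, "character": 0},
            "end": {"line": 10, "character": 0},
        },
        "options": {"tabSize": 4, "insertSpaces": True},
    },
}

if __name__ == "__main__":
    framed = encode_message(request)
    decoded, leftover = decode_message(framed)
    print(framed[:40])
    print(decoded["method"], leftover)
```

Because the framing is independent of the transport, the same bytes could be written to a pipe, a socket, or passed between components in one process.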
2024 Bilderberg Conference

The 2024 Bilderberg Conference was held from May 30 to June 2, 2024, at the Eurostars Suites Mirasierra hotel in Madrid, Spain. The 2024 meeting was the 70th edition of the event. A Bilderberg Group press release stated that there were 131 participants from around 25 countries.

Established in 1954 by Prince Bernhard of the Netherlands, Bilderberg conferences (or meetings) are an annual private gathering of the European and North American political and business elite. Each year, between 120 and 150 people attend at the invitation of the Bilderberg Group's steering committee, including prominent politicians, CEOs, national security experts, academics and journalists. Several US presidents attended the meetings before winning a presidential election, including Bill Clinton and Barack Obama. Bilderberg conferences operate under the Chatham House Rule, meaning that participants may use the information they receive but may not disclose the identity or affiliation of any particular speaker.

== Agenda ==

The key topics for discussion were announced on the Bilderberg website shortly before the meeting.

== Participants ==

A list of 131 participants was published on the Bilderberg website. This list may not be complete, as a source connected to the Bilderberg Group told The Daily Telegraph in 2013 that some attendees do not have their names publicized. King Felipe VI of Spain was reported to have attended the meeting despite his name not being on the list.
Agents of S.H.I.E.L.D. season 4

The fourth season of the American television series Agents of S.H.I.E.L.D., based on the Marvel Comics spy organization S.H.I.E.L.D., follows Phil Coulson and other S.H.I.E.L.D. agents and allies after the signing of the Sokovia Accords. It is set in the Marvel Cinematic Universe (MCU) and acknowledges the continuity of the franchise's films. The season was produced by ABC Studios, Marvel Television, and Mutant Enemy Productions, with Jed Whedon, Maurissa Tancharoen, and Jeffrey Bell serving as showrunners. Clark Gregg reprises his role as Coulson from the film series, starring alongside the returning series regulars Ming-Na Wen, Chloe Bennet, Iain De Caestecker, Elizabeth Henstridge, and Henry Simmons. They are joined by John Hannah who was promoted from his recurring guest role in the third season. The fourth season was ordered in March 2016, with production taking place from that July until the following April. Due to its broadcast schedule, the season was split into three "pods": Ghost Rider for the first eight episodes, featuring recurring guest star Gabriel Luna as the supernatural Robbie Reyes / Ghost Rider and exploring mysticism in the MCU alongside the film Doctor Strange (2016); LMD, referring to the new Life Model Decoy program, for the next seven episodes which focus on recurring guest star Mallory Jansen as the LMD Aida; and Agents of Hydra for the final seven episodes, partly set in a "what if" virtual reality that allowed the return of former series regular Brett Dalton as Grant Ward. The season is also affected by the events of the film Captain America: Civil War (2016), and continues storylines established in the canceled series Agent Carter. The first episode premiered at a screening on September 19, 2016, with the season then airing for 22 episodes on ABC, from September 20, 2016, until May 16, 2017. The premiere debuted to 3.58 million viewers, down from previous season premieres but average for the series. 
Critical response to the season was positive, with many feeling that each pod was better than the last and in particular praising the visual effects and tone of Ghost Rider, the writing and acting of LMD, and the character development and political commentary explored during Agents of Hydra. The season saw series low viewership, but was still considered to have solved ABC's problem during its new Tuesday night timeslot, and the series was renewed for a fifth season in May 2017. == Episodes == == Cast and characters == == Production == === Development === Agents of S.H.I.E.L.D. was renewed for a fourth season on March 3, 2016, earlier than usual for the series. Executive producer Jed Whedon said on this, "We're thrilled to know going into the end of [season three] with certainty that we will be returning, because we can build our story accordingly." Executive producer Maurissa Tancharoen also noted that logistics for hiring directors for the season in advance would be easier, "which is a very nice privilege to have...that's a luxury". The end of the episode "What If..." features an onscreen tribute to Bill Paxton, who died in February 2017 and had portrayed John Garrett in the series' first season. The series paid additional tribute to Paxton in "All the Madame's Men" with promos during The Bakshi Report news segment showcasing John Garrett as a fallen American hero. The end of "World's End" features a similar onscreen tribute to Powers Boothe, who died in May 2017 and had portrayed Gideon Malick in the series' third season. === Writing === The season shifted to the later 10 pm timeslot, allowing it to take on a darker, more mature tone than previous seasons. According to Tancharoen, "The whole tagline for this year is 'Agents of S.H.I.E.L.D. After Dark'". The timeslot gave the series the opportunity to present an increased level of violence and partial nudity, as well as take more risks and present edgier themes. 
Following the third-season finale, Tancharoen stated that the fourth season would explore the guilt Daisy Johnson has over Lincoln Campbell's death. Executive producer Jeffrey Bell noted the writers tried to continue the tradition of "finding new combinations and new conflicts" between different sets of characters, given "a lot of procedurals [see] the same people doing the same thing for five years". Pairings that would be explored included Coulson and Mack, continuing from the end of season three, who have a mutual respect for one another due to their relationships with Daisy, and Leo Fitz and Holden Radcliffe, who work together. The Fitz-Simmons relationship was also explored more, examining the new challenges it presented for the two "working together, loving each other and living together". Following the third season's dealing with the themes of Captain America: Civil War (2016), such as the opposing reactions to the Inhumans, Whedon said that the question of "How do you deal with a war with powered people at that level, a government level?" was one that they wanted to answer in the fourth season. Tancharoen called the Inhumans "a permanent part of our universe now", with Whedon adding, "we have a quick-fire way of introducing people with powers. It gives us a lot of leeway in our world, and it lets us explore the metaphors of what it is like to be different. We will never close that chapter." With the Inhumans film being removed from Marvel Studios' release schedule, the series had "a little more freedom" and were "able to do a little bit more" with the species, including the potential of introducing some of the "classic" Inhumans, though the series would focus less on Inhumans than the third season which saw "a real significant Inhuman agenda story". It was not intended to be a spin-off of Agents of S.H.I.E.L.D. On the evolution of S.H.I.E.L.D. to featuring so many powered characters, Whedon said "the dynamic in the world has changed. 
There was one person with powers, and then by The Avengers there were maybe six total ... now they're much more prevalent, so there's reaction from the public based on that." The season is structured into three "pods" based on its airing schedule: the first eight episodes, subtitled Ghost Rider; LMD (Life Model Decoy) for the subsequent seven episodes; and a third pod for the final seven episodes called Agents of Hydra. Elements and characters cross over between the different pods, but the sections "definitely have a different feel" from one another, as Bell explained that 22 episodes "is a long time to hold a big bad or a single plot line, especially for an audience", and for the past two seasons, the series was able to have two separated halves that "allows us to introduce a big bad. And then, something happens and we rise somebody new ... Now, there's three of those." "Financial considerations" were also taken into account in creating the pods for the season, as using LMDs does not "cost as much as setting a guy's head on fire via CGI". In terms of writing the "complicated season", Whedon said the writers were "aware that our fans are our fans and have spent some time with these characters and are clever and see things coming sometimes ... Part of our job is to create not just what we are presenting on plot, but letting the audience be one step ahead of us and being one step ahead of that." He added that the writers knew that they wanted to tell a Ghost Rider story, an LMD story, and a "what if" scenario, and the hardest part was making each pod still fit together as a single season. The major connection ultimately became the Darkhold, which leads from the magic of Ghost Rider to the advanced science of LMD and then the Framework in Agents of Hydra. Ghost Rider also reappears in the final episode of the season, "World's End", as an additional connection. 
==== Ghost Rider ==== While planning the fourth season, Marvel suggested that the series introduce Ghost Rider, after the character's film rights had returned to Marvel from Sony in May 2013. Loeb felt that this made the season unquestionably "the series' biggest" with the "most ambitious story yet". He added that "one of the things that we talked about is, S.H.I.E.L.D. always looked out for the weird, the unusual, the things that were and could be a problem for the public", and Marvel realized that Ghost Rider's abilities, which are more mystical than anything seen in the series to date, opened up "a quarter of the universe that we haven't really spent a lot of time exploring ... what happens if our very real, our very grounded agents who are very much a family have to take on something that is as bizarre and powerful and unique as Ghost Rider." Bell added that the producers would have been willing to give an entire season of the show to a Ghost Rider arc if the season was 13 episodes or less, but 22 episodes seemed too long to "feel like one flavor". The Robbie Reyes version of Ghost Rider was chosen over other versions of the character from the comics because of his relationship with his brother Gabe, w
Computing Machinery and Intelligence

"Computing Machinery and Intelligence" is a paper written by Alan Turing on the topic of artificial intelligence. The paper, published in 1950 in Mind, was the first to introduce his concept of what is now known as the Turing test to the general public. Turing's paper considers the question "Can machines think?" Turing says that since the words "think" and "machine" cannot clearly be defined, we should "replace the question by another, which is closely related to it and is expressed in relatively unambiguous words." To achieve this objective, Turing proposes a three-step approach. First, he identifies a simple and unambiguous concept to substitute for the term "think." Second, he delineates the specific "machines" under consideration. Third, armed with these tools, he poses a new question related to the first, which he believes he can answer in the affirmative. == Turing's test == Rather than trying to determine if a machine is thinking, Turing suggests we should ask if the machine can win a game, called the "Imitation Game". The original Imitation game, that Turing described, is a simple party game involving three players. Player A is a man, player B is a woman and player C (who plays the role of the interrogator) can be of either sex. In the Imitation Game, player C is unable to see either player A or player B (and knows them only as X and Y), and can communicate with them only through written notes or any other form that does not give away any details about their gender. By asking questions of player A and player B, player C tries to determine which of the two is the man and which is the woman. Player A's role is to trick the interrogator into making the wrong decision, while player B attempts to assist the interrogator in making the right one. Turing proposes a variation of this game that involves the computer: We now ask the question, "What will happen when a machine takes the part of A in this game?" 
Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? These questions replace our original, "Can machines think?"

So the modified game becomes one that involves three participants in isolated rooms: a computer (which is being tested), a human, and a (human) judge. The human judge can converse with both the human and the computer by typing into a terminal. Both the computer and the human try to convince the judge that they are the human. If the judge cannot consistently tell which is which, then the computer wins the game.

Researchers in the United Kingdom had been exploring "machine intelligence" for up to ten years prior to the founding of the field of artificial intelligence (AI) research in 1956. It was a common topic among the members of the Ratio Club, an informal group of British cybernetics and electronics researchers that included Alan Turing. Turing, in particular, had been tackling the notion of machine intelligence since at least 1941, and one of the earliest known mentions of "computer intelligence" was made by him in 1947.

As Stevan Harnad notes, the question has become "Can machines do what we (as thinking entities) can do?" In other words, Turing is no longer asking whether a machine can "think"; he is asking whether a machine can act indistinguishably from the way a thinker acts. This question avoids the difficult philosophical problem of pre-defining the verb "to think" and focuses instead on the performance capacities that being able to think makes possible, and on how a causal system can generate them.

Since Turing introduced his test, it has been both highly influential and widely criticised, and it has become an important concept in the philosophy of artificial intelligence. Some of the criticisms, such as John Searle's Chinese room, are themselves controversial.
Some have taken Turing's question to have been "Can a computer, communicating over a teleprinter, fool a person into believing it is human?" but it seems clear that Turing was not talking about fooling people but about generating human cognitive capacity. == Digital machines == Turing also notes that we need to determine which "machines" we wish to consider. He points out that a human clone, while man-made, would not provide a very interesting example. Turing suggested that we should focus on the capabilities of digital machinery—machines which manipulate the binary digits of 1 and 0, rewriting them into memory using simple rules. He gave two reasons. First, there is no reason to speculate whether or not they can exist. They already did in 1950. Second, digital machinery is "universal". Turing's research into the foundations of computation had proved that a digital computer can, in theory, simulate the behaviour of any other digital machine, given enough memory and time. (This is the essential insight of the Church–Turing thesis and the universal Turing machine.) Therefore, if any digital machine can "act like it is thinking", then every sufficiently powerful digital machine can. Turing writes, "all digital computers are in a sense equivalent." This allows the original question to be made even more specific. Turing now restates the original question as "Let us fix our attention on one particular digital computer C. Is it true that by modifying this computer to have an adequate storage, suitably increasing its speed of action, and providing it with an appropriate programme, C can be made to play satisfactorily the part of A in the imitation game, the part of B being taken by a man?" Hence, Turing states that the focus is not on "whether all digital computers would do well in the game nor whether the computers that are presently available would do well, but whether there are imaginable computers which would do well". 
What is more important is to consider the advancements possible in the state of our machines today regardless of whether we have the available resource to create one or not. == Nine common objections == Having clarified the question, Turing turned to answering it: he considered the following nine common objections, which include all the major arguments against artificial intelligence raised in the years since his paper was first published. Religious Objection: This states that thinking is a function of man's immortal soul; therefore, a machine cannot think. "In attempting to construct such machines," wrote Turing, "we should not be irreverently usurping His power of creating souls, any more than we are in the procreation of children: rather we are, in either case, instruments of His will providing mansions for the souls that He creates." 'Heads in the Sand' Objection: "The consequences of machines thinking would be too dreadful. Let us hope and believe that they cannot do so." This thinking is popular among intellectual people, as they believe superiority derives from higher intelligence and the possibility of being overtaken is a threat (as machines have efficient memory capacities and processing speed, machines exceeding the learning and knowledge capabilities are highly probable). This objection is a fallacious appeal to consequences, confusing what should not be with what can or cannot be (Wardrip-Fruin, 56). The Mathematical Objection: This objection uses mathematical theorems, such as Gödel's incompleteness theorem, to show that there are limits to what questions a computer system based on logic can answer. Turing suggests that humans are too often wrong themselves and pleased at the fallibility of a machine. (This argument would be made again by philosopher John Lucas in 1961 and physicist Roger Penrose in 1989, and later would be called Penrose–Lucas argument.) 
Argument From Consciousness: This argument, suggested by Professor Geoffrey Jefferson in his 1949 Lister Oration (acceptance speech for his 1948 award of Lister Medal) states that "not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain." Turing replies by saying that we have no way of knowing that any individual other than ourselves experiences emotions, and that therefore we should accept the test. He adds, "I do not wish to give the impression that I think there is no mystery about consciousness ... [b]ut I do not think these mysteries necessarily need to be solved before we can answer the question [of whether machines can think]." (This argument, that a computer can't have conscious experiences or understanding, would be made in 1980 by philosopher John Searle in his Chinese room argument. Turing's reply is now known as the "other minds reply". See also Can a machine have a mind? in the philosophy of AI.) Arguments from various disabilities. These arguments all have the form "a computer will never do X". Turing offers a selection:Be kind, resourceful, beautiful, friendly, have initiative, have a sense of humour, tell right from wrong, make mistakes, fall in love, enjo
Apache OpenNLP

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as language detection, tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing and coreference resolution. These tasks are usually required to build more advanced text processing services.
Grokipedia

Grokipedia is an AI-generated online encyclopedia operated by the American company xAI. The site was launched on October 27, 2025. Some entries are generated by Grok, a large language model owned by the same company, while others were forked from Wikipedia, with some altered and some used nearly verbatim. Articles cannot be directly edited, though logged-in visitors to the encyclopedia can suggest new articles or corrections via a pop-up form, which are reviewed by Grok. The xAI founder Elon Musk suggested Grokipedia could be an alternative to Wikipedia that would "purge out the propaganda" he believes is promoted by the latter, describing Wikipedia as "woke" and an "extension of legacy media propaganda". External analysis of Grokipedia's content has focused on its accuracy and biases due to hallucinations and potential algorithmic bias, which reviewers have described as promoting right-wing perspectives and Musk's views. The majority of coverage has described the website as validating, promoting, and legitimizing a variety of debunked conspiracy theories and ideas against scientific consensus on topics such as HIV/AIDS denialism, vaccines and autism, climate change, and race and intelligence. The site has been accused of whitewashing far-right extremism, such as by falsely claiming a white genocide is actively occurring. Several right-wing figures have welcomed the site. Studies have highlighted its use of sources deemed as having very low credibility such as X conversations and neo-Nazi websites, and for writing about far-right figures and topics in a promotional manner. == Background == Wikipedia is an online encyclopedia written and maintained by a community of volunteers. Its possible bias has been studied and debated. In 2018, Haaretz noted "Wikipedia has succeeded in being accused of being both too liberal and too conservative, and has critics from across the spectrum". xAI is an American AI company founded by Elon Musk in 2023. 
Its flagship product is the family of large language models called Grok. == History == In 2021, Musk expressed affection for Wikipedia on its 20th anniversary. In 2022, however, Musk argued that Wikipedia was "losing its objectivity", and in 2023, said he would donate US$1 billion to the project if it was pejoratively renamed "Dickipedia". In December 2024, Musk called for a boycott of donations to Wikipedia over its perceived left-wing bias, calling it "Wokepedia". In January 2025, Musk made a series of statements on Twitter denouncing Wikipedia for its description of the incident where he made a controversial gesture, which many viewed as resembling a Nazi salute, at president Donald Trump's second inauguration. Musk has since positioned Grokipedia as an alternative to Wikipedia that would "purge out the propaganda" in the latter, with Musk describing Wikipedia as "woke" and an "extension of legacy media propaganda". === Idea and announcement === In September 2025, Musk spoke at the All-In podcast conference with David O. Sacks, the White House advisor on AI and cryptocurrency, about how Grok consumed data from Wikipedia and other sources to gain more complete knowledge of the world. Sacks suggested publishing its knowledge base as an artifact called "Grokipedia", saying "Wikipedia is so biased, it's a constant war". Following the conversation, Musk announced that xAI was building a new AI-generated online encyclopedia called Grokipedia. According to Musk's announcement, it would be an AI-powered knowledge base designed to rival Wikipedia by addressing its perceived biases, errors, and ideological slants. The project positioned itself within a history of ideologically driven alternatives to Wikipedia, such as the conservative Conservapedia (launched in 2006) and the Russian-government-friendly Ruwiki (launched in 2023). However, Grokipedia is distinct in its core reliance on artificial intelligence rather than human community editing. 
=== Launch and traffic === On October 6, 2025, Musk announced that the early version of Grokipedia was scheduled for release in two weeks, but the project was postponed briefly to address content quality issues. It launched on October 27, 2025, labeled "v 0.1", with over 800,000 articles, compared to over seven million English Wikipedia articles as of September 1, 2025. According to an initial analysis of usage figures by Similarweb, which evaluates data from registered users and partners, Grokipedia recorded a peak of over 460,000 website visits in the US on October 28, 2025. After that, traffic dropped significantly and settled at around 35,000 visits per day between November 8 and 11, 2025. As of early 2026, it had over 5.6 million articles. In January 2026, The Guardian reported that GPT-5.2 frequently cited Grokipedia as a source in responses, raising concerns of misinformation on ChatGPT. The same month, The Verge reported that Google's AI Overviews, AI Mode, and Gemini language model, as well as Microsoft Copilot and Perplexity AI, used Grokipedia to answer niche, obscure, or highly specific factual questions or "non-sensitive queries." According to a case study published by SEO Engico, the site received only 19 clicks from Google Search in November 2025 but reached approximately 3.2 million monthly clicks by January 2026, with over 900,000 pages indexed and millions of ranking keywords. Analysts attributed the surge in part to the site's technical structure and large-scale AI-generated content production. In early February 2026, Grokipedia's visibility in Google Search declined sharply. SEO analysts, including Glenn Gabe and Malte Landwehr, reported a significant drop in rankings across Google organic results as well as in Google AI Overviews and AI Mode. The same case study cited independent reviews that identified citation quality concerns, including references to low-credibility sources and instances of self-citation. 
By mid-February 2026, Grokipedia had reportedly lost much of its previous search visibility, and Wikipedia ranked above it for searches related to its own name. === Updates === ==== Future ==== In November 2025, Musk announced that he eventually plans to change the name of the site to Encyclopedia Galactica when Grokipedia is "good enough", saying that it had a "long way to go". This name is taken from the publication of that title in the works of Isaac Asimov and Douglas Adams. Musk said that he hoped to send copies of the encyclopedia to "the Moon and Mars and out to deep space". == Content == The Grok large language model generates and fact-checks articles on Grokipedia. Users cannot directly edit Grokipedia articles, but logged-in users can suggest edits and report errors, with such submissions being reviewed and implemented by the Grok AI. Some articles are nearly identical to their Wikipedia entries, but the format of Grokipedia citations is different, and some Grokipedia articles were republished almost verbatim, accompanied by a disclaimer noting that the content was "adapted from Wikipedia" under a Creative Commons license. Others were completely rewritten from scratch using Musk's AI chatbot, Grok. Forbes identified the articles AMD, Lamborghini, and PlayStation 5 as examples of copied Wikipedia articles. Articles attributed to Wikipedia carry a Creative Commons Attribution-ShareAlike license, while the license of other articles is licensed under the "X Community License", a license that accepts reuse and remixing for "non-commercial and research purposes" and commercial use that abides to "all of the guardrails provided in xAI's Acceptable Use Policy". On October 31, 2025, Musk clarified that the duplication of Wikipedia articles was intentional, saying that the Grokipedia team instructed Grok to compile Wikipedia's top 1 million articles and make content changes to them. 
The site's design has been described as minimalist with a simple homepage including little more than a large search bar. In a comparative textual analysis of the most heavily edited matched article pairs from Grokipedia and Wikipedia, Grokipedia entries are substantially longer and less densely referenced, indicating that AI-produced encyclopedias prioritize exposition rather than source-based validation. Starting in version 0.2, Grok reviews and implements approved suggested edits, and a small panel rotates through a display of the names of several recently edited articles. In February 2026, the Columbia Journalism Review reported on an analysis by the Tow Center for Digital Journalism finding that Grok, the AI behind Grokipedia, had increasingly begun suggesting and approving edits to the site itself without human involvement. According to the report, AI-generated edit suggestions overtook human submissions in December 2025 and accounted for more than three-quarters of proposed changes. The analysis raised concerns about transparency, editorial oversight, and fact-checking standards, particularly after instances in which Grok proposed or modified politically s
Riffusion

Riffusion is a neural network, designed by Seth Forsgren and Hayk Martiros, that generates music using images of sound rather than audio. The resulting music has been described as "de otro mundo" (otherworldly), although unlikely to replace man-made music. The model was made available on December 15, 2022, with the code also freely available on GitHub. The first version of Riffusion was created as a fine-tuning of Stable Diffusion, an existing open-source model for generating images from text prompts, on spectrograms, resulting in a model which used text prompts to generate image files which could then be put through an inverse Fourier transform and converted into audio files. While these files were only several seconds long, the model could also use latent space between outputs to interpolate different files together (using the img2img capabilities of SD). It was one of many models derived from Stable Diffusion. In December 2022, Mubert similarly used Stable Diffusion to turn descriptive text into music loops. In January 2023, Google published a paper on their own text-to-music generator called MusicLM. Forsgren and Martiros formed a startup, also called Riffusion, and raised $4 million in venture capital funding in October 2023.
Resistance Database Initiative

HIV Resistance Response Database Initiative (RDI) was formed in 2002 to use artificial intelligence (AI) to predict how patients will respond to HIV drugs using data from more than 250,000 patients from around 50 countries around the world. The RDI used its models to power its HIV Treatment Response Prediction System (HIV-TRePS). Launched in 2010, this free online tool enabled healthcare professionals to upload their patient’s data and obtain highly accurate predictions of how they would respond to different combinations of the 30 or more drugs available. The tool enabled physicians to individualize their patients’ treatment, using these predictions based on more than a million patient-years of treatment experience. HIV-TRePS was possibly the first ever AI-based system for medical decision-making to be developed, successfully tested, and used in clinical practice. It has since been used by thousands of healthcare professionals to optimise the treatment of tens of thousands of patients. Since the RDI’s inception the treatment of HIV infection has progressed enormously, with more effective and better tolerated drugs available in ever more convenient combination formulations. In most countries HIV is now considered a chronic, manageable condition. Moreover, the success of the drugs in reducing the amount of virus is substantially reducing the onward transmission of the virus and cases of new infections are falling in many settings. This improvement in HIV treatment means the need for sophisticated AI to support HIV treatment decisions has significantly reduced. In response, the RDI ceased development of further models and, in March 2024, withdrew its HIV-TRePS system. == Background == Human immunodeficiency virus (HIV) is the virus that causes acquired immunodeficiency syndrome (AIDS), a condition in which the immune system begins to fail, leading to life-threatening opportunistic infections. 
There are approximately 30 HIV antiretroviral drugs that have been approved for the treatment of HIV infection, from six different classes, based on the point in the HIV life-cycle at which they act. They are used in combination, typically 3 or more drugs from 2 or more different classes, a form of therapy known as highly active antiretroviral therapy (HAART). The aim of therapy is to suppress the virus to very low, ideally undetectable, levels in the blood. This prevents the virus from depleting the immune cells that it preferentially attacks (CD4 cells) and prevents or delays illness and death. Despite the expanding availability of these drugs and the impact of their use, treatments continue to fail, often due to the development of resistance. During drug therapy, low-level virus replication may still occur, particularly when a patient misses a dose. HIV makes errors in copying its genetic material and, if a mutation makes the virus resistant to one or more of the drugs in the patient's treatment, it may begin to replicate more successfully in the presence of that drug and undermine the effect of the treatment. If this happens, the treatment needs to be changed to re-establish control over the virus. == RDI's Approach == The RDI’s approach was to use artificial intelligence (including neural network and random forest models), trained with data from hundreds of thousands of patients, treated with different drugs in a variety of clinical settings all over the world, to predict how an individual patient will respond to any new combination of HIV drugs. The models were tested with independent data sets and consistently achieved accuracy of approximately 80%.
Moving object detection

Moving object detection is a technique used in computer vision and image processing. Multiple consecutive frames from a video are compared by various methods to determine whether any moving object is present. Moving object detection has been used for a wide range of applications, such as video surveillance, activity recognition, road condition monitoring, airport safety, and monitoring of protection along marine borders. == Definition == Moving object detection is the task of recognizing the physical movement of an object in a given place or region. By segmenting moving objects from the stationary area or region, the objects' motion can be tracked and analyzed later. Treating a video as a sequence of single frames, moving object detection finds the foreground moving target(s), either in each video frame or only when the moving target first appears in the video. == Traditional methods == Traditional moving object detection methods can be categorized into four major approaches: background subtraction, frame differencing, temporal differencing, and optical flow. === Frame differencing === Rather than using the traditional approach of applying an image subtraction operator to the second and subsequent images, the frame differencing method compares two successive frames to detect moving targets. === Temporal differencing === The temporal differencing method identifies the moving object by applying a pixel-wise difference to two or three consecutive frames.
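The frame and temporal differencing ideas above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: frames are assumed to be small grayscale images stored as nested lists of integers, and the function names and the `threshold` value of 25 are hypothetical choices made for the example. The three-frame variant shown here (a pixel-wise AND of two successive difference masks) is one common way to combine three frames, and is an assumption rather than something the text specifies.

```python
def frame_difference(prev_frame, next_frame, threshold=25):
    """Binary motion mask: 1 where the absolute pixel change exceeds `threshold`.

    `threshold` is an illustrative value, not one taken from the article.
    """
    return [
        [1 if abs(b - a) > threshold else 0 for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_frame, next_frame)
    ]


def three_frame_difference(f1, f2, f3, threshold=25):
    """Pixel-wise AND of the masks for (f1, f2) and (f2, f3).

    A pixel is kept only if it changed in both successive pairs, which tends
    to isolate the object's position in the middle frame.
    """
    d12 = frame_difference(f1, f2, threshold)
    d23 = frame_difference(f2, f3, threshold)
    return [
        [a & b for a, b in zip(row12, row23)]
        for row12, row23 in zip(d12, d23)
    ]


def has_motion(mask, min_pixels=1):
    """Report motion when at least `min_pixels` mask pixels are set."""
    return sum(map(sum, mask)) >= min_pixels


if __name__ == "__main__":
    # A bright pixel moving one step to the right in each frame.
    f1 = [[200, 0, 0], [0, 0, 0]]
    f2 = [[0, 200, 0], [0, 0, 0]]
    f3 = [[0, 0, 200], [0, 0, 0]]
    print(frame_difference(f1, f2))
    print(three_frame_difference(f1, f2, f3))
    print(has_motion(frame_difference(f1, f1)))
```

With two frames, the mask marks both the position the object left and the position it entered; the three-frame combination keeps only the pixel occupied in the middle frame.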
Iron Man 2020 (event)

"Iron Man 2020" is a storyline published by Marvel Comics in 2020 which follows the character Arno Stark as he attempts to take over Stark Industries and the mantle of his estranged brother Tony Stark (Iron Man). The crossover characters of two different brands meeting up in one storyline received mixed reviews from critics. == Publication history == Marvel Comics released the teaser for the event at New York Comic Con in November 2019. It was also alluded to in December 2019's Incoming! In the original checklist released for the event, 2020 Force Works was originally titled Force Works 2020, while 2020 Machine Man was previously named Machine Man 2020, and so on. Additionally, 2020 Wolverine was going to be called Weapon.EXE 2020. The publication of this event was intended to span from January to June 2020, however, due to the COVID-19 pandemic, Diamond Comic Distributors suspended the distribution of new print titles between April 1 and May 27, which also caused digital releases by Marvel Entertainment to be postponed. The rescheduling of the postponed issues to new dates pushed the event's conclusion to August, and certain issues, namely 2020 Force Works #3 and 2020 Ironheart #1–2, were released exclusively in a digital format. == Main plot == Arno Stark wakes up from a nightmare involving the Extinction Entity, a monstrous amalgamation of alien and machine. He dreams that the Extinction Entity is going to come to Earth in a matter of weeks and create an artificial intelligence (A.I.) army to consume humanity. After eating breakfast with duplicates of Howard Stark and Maria Stark, Arno suits up as Iron Man and saves a construction worker from a hostage situation involving several Nick Fury Life Model Decoys, which represent the A.I. army trying to liberate construction robots. Over different news outlets, the media wonders about the whereabouts of Tony Stark, who declared himself as nothing more than a simulation of the real, late Tony Stark. At the A.I. 
army's base, Machine Man is commanding the robots' moves when Arno appears, having planned for the A.I. army's leader to show himself. Machine Man activates the bomb, forcing Arno to fly it away so it explodes somewhere safe while he escapes. Machine Man reaches the Thirteenth Floor, a dimensional-shunted plane of existence made of solid light, and a haven for robotkind that humans cannot access or comprehend. Aaron meets with the leader of the A.I. army and creator of Thirteenth Floor: Tony Stark -- who is now going by the name Mark One, having embraced his nature as artificial intelligence. Also in the A.I. army are Albert, Awesome Android, H.E.R.B.I.E., Machinesmith, and Quasimodo. The A.I. army continues its efforts to liberate artificial life forms by raiding places where robots are being subjugated. Iron Man intercepts an attack on a Futura Motors testing site by Quasimodo and H.E.R.B.I.E. and manages to recover an Un-Inhibitor allowing him to take control of all A.I.s. On the Thirteenth Floor, Mark One receives a transmission from a mole inside Baintronics -- codenamed Ghost in the Machine --revealing that Arno used the submission code on Jocasta, who received a new body, making her entirely compliant. Stark plans to upload the submission code to the internet to instantly infect robots. With only three hours before the code is transmitted to Stark Unlimited's satellite network, Mark One devises a heist on Bain Tower to tamper with the code before launch. Having discovered the secret behind the Thirteenth Floor, Arno shuts out the A.I. army, uses Jocasta to lure Machine Man away from the tower, infects Machinesmith with the submission code, and confronts Mark One. H.E.R.B.I.E., Awesome Android, and Machinesmith escape from Bain Tower and call for help to every robot in New York City. Mark One is left to fight Iron Man and is defeated. 
Meanwhile, Sunset Bain confronts and fires Andy Bhang under the accusation of working as a mole inside Stark Unlimited and feeding Bethany Cabe information to relay to the A.I. army. Arno takes Mark One inside Bain Tower to meet Howard and Maria Stark and asks Tony to join him, but he refuses and dismisses his rationale as lunacy. The robotic mob assembled by Machine Man reaches Bain Tower, giving Mark a distraction which allows him to fly off and disable the transmission dish from which Arno intends to broadcast the obedience O.S. to subjugate every robot. Tony manages to stop the upload and make the antenna unusable. In retaliation, Arno fires all of his armor's firepower at Tony as he falls to the ground. Tony Stark's remaining allies escape with his body as Arno attacks the robot protesters. Tony wakes up inside the Thirteenth Floor and is greeted by F.R.I.D.A.Y., who had plucked Tony's consciousness from his body during his fall. In the streets, Arno Stark tracks down Howard and Maria, who die from an illness inherited from Arno. When Sunset Bain objects to Arno creating new bodies for his parents and trying to control people, he reveals she is an A.I., a duplicate of the real Bain whom Arno replaced back when she solicited him to heal a scar on her face. He makes new bodies for Howard and Maria by recreating the Arsenal and Mistress bodies from the eScape. After learning of Arno's new plan, Dr. Shapiro (who is the actual mole) sneaks into a computer and warns F.R.I.D.A.Y. about it. When F.R.I.D.A.Y. relays that only Tony Stark can stop Arno, Tony insists that he is not the real Tony Stark, but is confronted by holographic manifestations of himself in different points of his life, until they all merge into him and he acknowledges that he has always been Tony. 
As Arno Stark sets off to the Stark Space Station to install his mind-controlling device to enslave all of humanity, Tony Stark's allies assault the Stark Unlimited HQ, confronting Sunset Bain's duplicate and Arno's Iron Legion. Jocasta uploads a submission code to Bain and they place Tony's body inside a bio-pod that restores it to normalcy and uploads his consciousness back into it. Using the Thirteenth Floor's access mechanisms, Tony and his allies reach the Stark Space Station from one of the elevators within. Employing his new Virtual Armor, Tony defeats Arno in combat. When Arno prepares to activate his mind-controlling device, the Extinction Entity suddenly appears. Arno ultimately defeats the Extinction Entity by willingly assimilating with it, causing it to explode. The entity is revealed to be a delusion caused by Arno's terminal disease, of which he would die by the end of 2020. Unable to stop Arno, Tony placed him in a simulation where he successfully stopped the entity. Afterwards, Jocasta uses the submission code to force Sunset Bain's duplicate to confess all of Baintronics' crimes, also claiming responsibility for tricking Tony into thinking he was an artificial intelligence and pulling the strings of the A.I. Army, putting an end to the robot revolution. Tony gives up Stark Unlimited to Bhang Robotics and flies off in a new armor, reasserting himself as Iron Man. == Issues involved == === Main issues === Iron Man 2020 (vol. 2) #1–6 === Tie-In issues === 2020 Force Works #1–3; 2020 Iron Age #1; 2020 Ironheart #1–2; 2020 Machine Man #1–2; 2020 Rescue #1–2; 2020 iWolverine #1–2 == Critical reception == According to Comic Book Roundup, the entire crossover received an average score of 6.4 out of 10 based on 36 reviews. William Tucker from ButWhyTho Podcast stated "Iron Man 2020 #6 is an initially exciting end to a great event that eventually feels deflated. 
There is absolutely nothing wrong with the art, Woods has been incredible throughout, but the ending that Slott and Gage chose to round out an epic tale like this left me feeling cold. And while there were loads of enjoyable cameos, their involvement ultimately didn't seem important to the story as a whole. Which is disappointing, as the rest of the event really was a fun and exciting ride." Anthony Wendel from MonkeysFightingRobots wrote "The 2020 event seems like it is taking some big risk, and it doesn't inspire a lot of confidence from the start. Iron Man 2020 #1 has set the stakes and shown some very intense players on both sides of the board. Sadly, if it doesn't unfold just the right way, many may feel cheated about defending the path characters are taking." == Collected editions ==
Model collapse

Model collapse, also known by other names such as "AI inbreeding", "AI cannibalism", "Habsburg AI", and "model autophagy disorder" or "MAD", is a phenomenon noted in artificial intelligence studies, where machine learning models gradually degrade due to errors coming from uncurated synthetic data, or due to training on the outputs of another model such as prior versions of itself. It is unclear to what extent the phenomenon threatens the long-term development of such models, and some techniques have been proposed to mitigate the effect. == Characteristics == Shumailov et al. coined the term to describe two specific stages in the degradation of machine learning models, early model collapse and late model collapse. In early model collapse, the model begins losing information about the tails of the distribution, mostly affecting minority data. Later work highlighted that early model collapse is hard to notice, since overall performance may appear to improve while the model loses performance on minority data. In late model collapse, the model loses a significant proportion of its performance, confusing concepts and losing most of its variance. == Mechanism == Using synthetic data as training data can lead to issues with the quality and reliability of the trained model. Model collapse occurs for three main reasons: functional approximation errors, sampling errors, and learning errors. Importantly, it happens in even the simplest of models, where not all of the error sources are present. In more complex models the errors often compound, leading to faster collapse. == Disagreement over real-world impact == Some researchers and commentators on model collapse warn that the phenomenon could fundamentally threaten future generative AI development: As AI-generated data is shared on the Internet, it will inevitably end up in future training datasets, which are often crawled from the Internet. 
If training on "slop" (large quantities of unlabeled synthetic data) inevitably leads to model collapse, this could therefore pose a difficult problem.

More recently, however, other researchers have disagreed with this argument, showing that if synthetic data accumulates alongside human-generated data, model collapse is avoided. The researchers argue that data accumulating over time is a more realistic description of reality than deleting all existing data every year, and that the real-world impact of model collapse may not be as catastrophic as feared.

An alternative branch of the literature investigates the use of machine learning detectors and watermarking to identify model-generated data and filter it out.

== Mathematical models of the phenomenon ==

=== 1D Gaussian model ===

In 2024, a first attempt was made at illustrating collapse for the simplest possible model: a one-dimensional normal distribution fit using unbiased estimators of mean and variance, computed on samples from the previous generation.

To make this more precise, we say that the original data follows a normal distribution $X^{0}\sim\mathcal{N}(\mu,\sigma^{2})$, and we possess $M_{0}$ samples $X_{j}^{0}$ for $j\in\{1,\dots,M_{0}\}$. Denoting a general sample $X_{j}^{i}$ as sample $j\in\{1,\dots,M_{i}\}$ at generation $i$, the next generation model is estimated using the sample mean and variance:

$$\mu_{i+1}=\frac{1}{M_{i}}\sum_{j}X_{j}^{i};\quad \sigma_{i+1}^{2}=\frac{1}{M_{i}-1}\sum_{j}\left(X_{j}^{i}-\mu_{i+1}\right)^{2}.$$

This leads to a conditionally normal next generation model $X_{j}^{i+1}\mid\mu_{i+1},\sigma_{i+1}\sim\mathcal{N}(\mu_{i+1},\sigma_{i+1}^{2})$. In theory, this is enough to calculate the full distribution of $X_{j}^{i}$. However, even after the first generation, the full distribution is no longer normal: it follows a variance-gamma distribution. To continue the analysis, instead of writing the probability density function at each generation, it is possible to explicitly construct the samples in terms of independent random variables using Cochran's theorem. To be precise, $\mu_{1}$ and $\sigma_{1}$ are independent, with $\mu_{1}\sim\mathcal{N}\left(\mu,\frac{\sigma^{2}}{M_{0}}\right)$ and $(M_{0}-1)\,\sigma_{1}^{2}\sim\sigma^{2}\,\Gamma\left(\frac{M_{0}-1}{2},\frac{1}{2}\right)$, following a Gamma distribution. Denoting with $Z$ Gaussian random variables distributed according to $\mathcal{N}(0,1)$ and with $S^{i}$ random variables distributed with $\frac{1}{M_{i-1}-1}\Gamma\left(\frac{M_{i-1}-1}{2},\frac{1}{2}\right)$, it turns out to be possible to write samples at each generation as

$$X_{j}^{0}=\mu+\sigma Z_{j}^{0},$$

$$X_{j}^{1}=\mu+\frac{\sigma}{\sqrt{M_{0}}}Z^{1}+\sigma\sqrt{S^{1}}Z_{j}^{1},$$

and more generally

$$X_{j}^{n}=\mu+\frac{\sigma}{\sqrt{M_{0}}}Z^{1}+\frac{\sigma}{\sqrt{M_{1}}}\sqrt{S^{1}}Z^{2}+\dots+\frac{\sigma}{\sqrt{M_{n-1}}}\sqrt{S^{1}\times\dots\times S^{n-1}}Z^{n}+\sigma\sqrt{S^{1}\times\dots\times S^{n}}Z_{j}^{n}.$$

Note that these are not joint distributions, as $Z^{n}$ and $S^{n}$ depend directly on $Z_{j}^{n-1}$, but when considering $X_{j}^{n}$ on its own the formula above provides all the information about the full distribution.

To analyse the model collapse, we can first calculate the variance and mean of samples at generation $n$. This tells us what kind of distribution we expect to arrive at after $n$ generations. It is possible to find the exact value in closed form, but the mean and variance of the square root of a gamma distribution are expressed in terms of gamma functions, making the result quite clunky. Instead, all results can be expanded to second order in each of $1/M_{i}$, assuming each sample size to be large. It is then possible to show that

$$\frac{1}{\sigma^{2}}\operatorname{Var}(X_{j}^{n})=\frac{1}{M_{0}}+\frac{1}{M_{1}}+\dots+\frac{1}{M_{n-1}}+1+\mathcal{O}\left(M_{i}^{-2}\right).$$

If all sample sizes $M_{i}=M$ are constant, this diverges linearly as $n\to\infty$:

$$\operatorname{Var}(X_{j}^{n})=\sigma^{2}\left(1+\frac{n}{M}\right);\quad \mathbb{E}(X_{j}^{n})=\mu.$$

This is the same scaling as for a one-dimensional Gaussian random walk.
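The linear growth of the variance above can be checked numerically. The following is a minimal sketch, not taken from the cited work: it runs many independent chains of the fit-and-resample procedure with a constant sample size, draws one sample $X_{j}^{n}$ from each, and estimates the variance of those draws. The function names (`fit_gaussian`, `sample_after_generations`, `empirical_variance`) and the default parameter values are illustrative assumptions. Only Python's standard library is used.

```python
import math
import random


def fit_gaussian(samples):
    """Return the sample mean and the unbiased (M - 1 denominator) sample variance."""
    m = len(samples)
    mean = sum(samples) / m
    var = sum((x - mean) ** 2 for x in samples) / (m - 1)
    return mean, var


def sample_after_generations(rng, mu, sigma, m, n):
    """Run one chain for n generations and return a single draw X^n.

    Each generation draws m samples from the current Gaussian model and
    refits the mean and standard deviation from those samples alone.
    """
    mean, std = mu, sigma
    for _ in range(n):
        data = [rng.gauss(mean, std) for _ in range(m)]
        mean, var = fit_gaussian(data)
        std = math.sqrt(var)
    return rng.gauss(mean, std)


def empirical_variance(mu=0.0, sigma=1.0, m=50, n=10, chains=4000, seed=0):
    """Estimate Var(X^n) across many independent chains (hypothetical defaults)."""
    rng = random.Random(seed)
    draws = [sample_after_generations(rng, mu, sigma, m, n) for _ in range(chains)]
    _, var = fit_gaussian(draws)
    return var
```

With $\sigma=1$, $M=50$ and $n=10$, the formula gives $1+10/50=1.2$, so `empirical_variance(m=50, n=10)` should return a value close to that, up to Monte Carlo noise. Calling it with `n=0` should return a value close to 1, the variance of the original distribution.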
However, divergence of the variance of $X_{j}^{n}$ does not directly provide any information about the corresponding estimates of $\mu_{n+1}$ and $\sigma_{n+1}$, particularly how different they are from the original $\mu$ and $\sigma$. It turns out to be possible to calculate the distance between the true distribution and the approximated distribution at step $n+1$, using the Wasserstein-2 distance (which is also sometimes referred to as risk):

$$\mathbb{E}\left[\mathbb{W}_{2}^{2}\left(\mathcal{N}(\mu,\sigma^{2}),\mathcal{N}(\mu_{n+1},\sigma_{n+1}^{2})\right)\right]=\frac{3}{2}\sigma^{2}\left(\frac{1}{M_{0}}+\frac{1}{M_{1}}+\dots+\frac{1}{M_{n}}\right)+\mathcal{O}\left(M_{i}^{-2}\right),$$

$$\operatorname{Var}\left[\mathbb{W}_{2}^{2}\left(\mathcal{N}(\mu,\sigma^{2}),\mathcal{N}(\mu_{n+1},\sigma_{n+1}^{2})\right)\right]=\frac{1}{2}\sigma^{4}\left(\frac{3}{M_{0}^{2}}+\frac{3}{M_{1}^{2}}+\dots+\frac{3}{M_{n}^{2}}+\sum_{i\neq j}\frac{4}{M_{i}M_{j}}\right)+\mathcal{O}\left(M_{i}^{-3}\right).$$

This directly shows why model collapse occurs in this simple model. Due to errors from re-sampling the approximated distribution, each generation ends up corresponding to a