AI Art Pragmata

AI Art Pragmata — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Deep learning

    Deep learning

    In machine learning, deep learning (DL) focuses on utilizing multilayered neural networks to perform tasks such as classification, regression, and representation learning. The field takes inspiration from biological neuroscience and revolves around stacking artificial neurons into layers and "training" them to process data. The adjective "deep" refers to the use of multiple layers (ranging from three to several hundred or thousands) in the network. Methods used can be supervised, semi-supervised or unsupervised. Some common deep learning network architectures include fully connected networks, deep belief networks, recurrent neural networks, convolutional neural networks, generative adversarial networks, transformers, and neural radiance fields. These architectures have been applied to fields including computer vision, speech recognition, natural language processing, machine translation, bioinformatics, drug design, medical image analysis, climate science, material inspection and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance. Early forms of neural networks were inspired by information processing and distributed communication nodes in biological systems, particularly the human brain. However, current neural networks do not intend to model the brain function of organisms, and are generally seen as low-quality models for that purpose. == Overview == Most modern deep learning models are based on multi-layered neural networks such as convolutional neural networks and transformers, although they can also include propositional formulas or latent variables organized layer-wise in deep generative models such as the nodes in deep belief networks and deep Boltzmann machines. Fundamentally, deep learning refers to a class of machine learning algorithms in which a hierarchy of layers is used to transform input data into a progressively more abstract and composite representation. For example, in an image recognition model, the raw input may be an image (represented as a tensor of pixels). The first representational layer may attempt to identify basic shapes such as lines and circles, the second layer may compose and encode arrangements of edges, the third layer may encode a nose and eyes, and the fourth layer may recognize that the image contains a face. Importantly, a deep learning process can learn which features to optimally place at which level on its own. Prior to deep learning, machine learning techniques often involved hand-crafted feature engineering to transform the data into a more suitable representation for a classification algorithm to operate on. In the deep learning approach, features are not hand-crafted and the model discovers useful feature representations from the data automatically. This does not eliminate the need for hand-tuning; for example, varying numbers of layers and layer sizes can provide different degrees of abstraction. The word "deep" in "deep learning" refers to the number of layers through which the data is transformed. More precisely, deep learning systems have a substantial credit assignment path (CAP) depth. The CAP is the chain of transformations from input to output. CAPs describe potentially causal connections between input and output. For a feedforward neural network, the depth of the CAPs is that of the network and is the number of hidden layers plus one (as the output layer is also parameterized). For recurrent neural networks, in which a signal may propagate through a layer more than once, the CAP depth is potentially unlimited. No universally agreed-upon threshold of depth divides shallow learning from deep learning, but most researchers agree that deep learning involves CAP depth higher than two. CAP of depth two has been shown to be a universal approximator in the sense that it can emulate any function. Beyond that, more layers do not add to the function approximator ability of the network. Deep models (CAP > two) are able to extract better features than shallow models and hence, extra layers help in learning the features effectively. Deep learning architectures can be constructed with a greedy layer-by-layer method. Deep learning helps to disentangle these abstractions and pick out which features improve performance. Deep learning algorithms can be applied to unsupervised learning tasks. This is an important benefit because unlabeled data is more abundant than labeled data. Examples of deep structures that can be trained in an unsupervised manner are deep belief networks. The term deep learning was introduced to the machine learning community by Rina Dechter in 1986, and to artificial neural networks by Igor Aizenberg and colleagues in 2000, in the context of Boolean threshold neurons. The etymology of the term is more complicated. == Interpretations == Deep neural networks are generally interpreted in terms of the universal approximation theorem or probabilistic inference. The classic universal approximation theorem concerns the capacity of feedforward neural networks with a single hidden layer of finite size to approximate continuous functions. In 1989, the first proof was published by George Cybenko for sigmoid activation functions and was generalised to feed-forward multi-layer architectures in 1991 by Kurt Hornik. Recent work also showed that universal approximation also holds for non-bounded activation functions such as Kunihiko Fukushima's rectified linear unit. The universal approximation theorem for deep neural networks concerns the capacity of networks with bounded width but the depth is allowed to grow. Lu et al. proved that if the width of a deep neural network with ReLU activation is strictly larger than the input dimension, then the network can approximate any Lebesgue integrable function; if the width is smaller or equal to the input dimension, then a deep neural network is not a universal approximator. The probabilistic interpretation derives from the field of machine learning. It features inference, as well as the optimization concepts of training and testing, related to fitting and generalization, respectively. More specifically, the probabilistic interpretation considers the activation nonlinearity as a cumulative distribution function. The probabilistic interpretation led to the introduction of dropout as regularizer in neural networks. The probabilistic interpretation was introduced by researchers including Hopfield, Widrow and Narendra and popularized in surveys such as the one by Bishop. == History == === Before 1980 === There are two types of artificial neural network (ANN): feedforward neural network (FNN) or multilayer perceptron (MLP) and recurrent neural networks (RNN). RNNs have cycles in their connectivity structure, whereas FNNs do not. In the 1920s, Wilhelm Lenz and Ernst Ising created the Ising model which is essentially a non-learning RNN architecture consisting of neuron-like threshold elements. In 1972, Shun'ichi Amari made this architecture adaptive. His learning RNN was republished by John Hopfield in 1982. Other early recurrent neural networks were published by Kaoru Nakano in 1971. Already in 1948, Alan Turing produced work on "Intelligent Machinery" that was not published in his lifetime, containing "ideas related to artificial evolution and learning RNNs". Frank Rosenblatt (1958) proposed the perceptron, an MLP with 3 layers: an input layer, a hidden layer with randomized weights that did not learn, and an output layer. He later published a 1962 book that also introduced variants and computer experiments, including a version with four-layer perceptrons "with adaptive preterminal networks" where the last two layers have learned weights (here he credits H. D. Block and B. W. Knight). The book cites an earlier network by R. D. Joseph (1960) "functionally equivalent to a variation of" this four-layer system (the book mentions Joseph over 30 times). Should Joseph therefore be considered the originator of proper adaptive multilayer perceptrons with learning hidden units? Unfortunately, the learning algorithm was not a functional one, and fell into oblivion. The first working deep learning algorithm was the Group method of data handling, a method to train arbitrarily deep neural networks, published by Alexey Ivakhnenko and Lapa in 1965. They regarded it as a form of polynomial regression, or a generalization of Rosenblatt's perceptron to handle more complex, nonlinear, and hierarchical relationships. A 1971 paper described a deep network with eight layers trained by this method, which is based on layer by layer training through regression analysis. Superfluous hidden units are pruned using a separate validation set. Since the activation functions of the nodes are Kolmogorov-Gabor polynomials, these were also the first deep networks with multiplicative units or "gates". The first deep learning multilayer perceptron trained by stochastic gradient descent was published in 1967 by Shun'ichi

    Read more →
  • Computer-assisted legal research

    Computer-assisted legal research

    Computer-assisted legal research (CALR) or computer-based legal research is a mode of legal research that uses databases of court opinions, statutes, court documents, and secondary material. Electronic databases make large bodies of case law easily available. Databases also have additional benefits, such as Boolean searches, evaluating case authority, organizing cases by topic, and providing links to cited material. Databases are available through paid subscription or for free. Subscription-based services include Westlaw, LexisNexis, JustCite, HeinOnline, Bloomberg Law, Lex Intell, VLex and LexEur. As of 2015, the commercial market grossed $8 billion. Free services include OpenJurist, Google Scholar, AltLaw, Ravel Law, WIPO Lex, Law Delta and the databases of the Free Access to Law Movement. == Purposes == Computer-assisted legal research is undertaken by a variety of actors. It is taught as a topic in many law degrees and is used extensively by undergraduate and postgraduate law students in meeting the work requirements of their degree courses. Professors of Law rely on the digitization of primary and secondary sources of law when conducting their research and writing the material that they submit for publication. Professional lawyers rely on computer-assisted legal research in order to properly understand the status of the law and so to act effectively in the best interest of their client. They may also consult the text of case judgements and statutes specifically, as well as wider academic comment, in order to form the basis of (or response to) an appeal. The availability of legal information online differs by type, jurisdiction and subject matter. The types of information available include: Texts of statutes, statutory instruments, civil codes, etc. Explanatory notes and government publications relating to statutes and their operation Texts of governing documents such as constitutions and treaties Case judgements Journals on legal matters or legal theory Dictionaries and legal encyclopedia Legal texts and materials in the form of e-books Current affairs and market information Educational information on the law and its operation == Before the Internet == Prior to the advent and popularization of the World Wide Web, access to digital legal information was largely through the use of CD-ROMs, designed and sold by commercial organizations. Dial-up services were also available from the 1970s. As the use of the Internet spread in the early 1990s, companies such as LexisNexis and Westlaw incorporated Internet connectivity into their software packages. Browser-based legal information started to be published by Legal Information Institutes from 1992. == Publicly available information == The first effort to provide free computer access to legal information was made by two academics, Peter Martin and Tom Bruce, in 1992. Today, the Legal Information Institute freely publishes such resources as the text of the United States Constitution, judgements of the United States Supreme Court, and the text of the United States Code. The Australasian Legal Information Institute (AusLII) was established soon after in 1995. Other legal information institutes, such as those of Great Britain and Ireland (BAILII), Canada (CII) and South Africa (SAfLI) soon followed. LIIs were partially formalized in 2002 following the signing of the Declaration of Free Access to the Law, which has been signed by 54 countries. At the time of writing, the World Legal Information Institute contains in excess of 1800 databases from 123 jurisdictions. Many governments also publish legal information online. For example, UK legislation and statutory instruments have been publicly available online since 2010. Depending on the jurisdiction in question, the decisions of higher appellate courts may also be published online, either by the Legal Information Institute or by the court service directly. Sources of European Union Law are published for free by EUR-Lex in 23 languages, including judgments of the European Courts. Similarly, judgements of the European Court of Human Rights are published on its website.

    Read more →
  • Dry Drowning

    Dry Drowning

    Dry Drowning is a cyberpunk mystery visual novel developed by Studio V and published by VLG Publishing and WhisperGames for Microsoft Windows on August 2, 2019. It was released on the Nintendo Switch on February 22, 2021. == Gameplay == The player takes control of Mordred Foley and has to read through the story, while making decisions at certain points. Depending on the choices, the player can influence the relationship to other characters as well as the course of the game, discovering more than 150 story branches, and eventually reach one out of three different endings with variations. The game also includes passages where the player has to find clues or items on the screen by clicking on them. These can be used in interrogation scenes with certain characters in order to unmask them and discover their lies. Throughout the game, the player has access to an in-game operating system called AquaOS. With that, they can re-read their conversations, look at their found items, and read biographies of the characters encountered. == Plot == The game is set in the fictional and totalitarian city Nova Polemos in Europa in 2066. Mordred Foley and Hera Kairis are private investigators and before the events of the game, they sent two of the most dangerous serial killers ever, Jennifer Kingston and Robert Herrington, to the electric chair. However, after their execution, their agency underwent an investigation for falsifying the evidence presented during the case, which completely destroyed its reputation. Now they want to restart their careers and lives, while dealing with their past traumas. Soon, Mordred is caught up in several cases that all led him to believe that the dreaded serial killer named Pandora has returned. In order to solve these cases, both Mordred and Hera have to face their pasts and fears, all while a racist political party is about to make the lives of refugees in Nova Polemos even worse. == Development == The game was initially conceived by Giacomo Masi and Samuele Zolfanelli, then developed by Studio V and directed and written by Giacomo Masi. It was originally written in Italian and translated into English, Chinese, Japanese, Korean, and German. The soundtrack was composed, written, and performed by Giorgio Maioli. The ending theme and Hera's pieces, performed on piano, were created by Alessandro Masi. The background and character artworks were made by Giulia Carli, other graphic elements such as the UI were created by Samuele Zolfanelli. The developers cited L.A. Noire, Ace Attorney, Blade Runner and Heavy Rain as some of their inspirations for the game. === Releases === Dry Drowning was originally released on Microsoft Windows through Steam, GOG, Itch.io, and Utomik in August 2019. In July 2019, Giacomo Masi announced the game would be released for Xbox One in 2020, though it was not released that year. A Nintendo Switch port was released on February 22, 2021, and a version for PlayStation 4 is set to release in 2021. == Reception == According to review aggregator platform Metacritic, Dry Drowning received "mixed or average reviews" for PC based on 11 reviews and "generally favorable reviews" for Nintendo Switch based on 6 reviews. Fellow review aggregator OpenCritic assessed that the game received fair approval, being recommended by 55% of critics. 4players.de gave a positive rating of 80% and wrote: "Stylish noir thriller with an interesting story, but mechanical limitations – despite a variety of possible interactions." Screen Rant gave a mixed rating of 3 out of 5 stars and wrote, "Dry Drowning may be a fair bit messy, but there's charm here. Players who are willing to embrace the cheesier elements will find some joy in its well-crafted setting and a decent murder mystery plot. The game is constrictive and lacks the genuine shock and engagement of top tier visual novels like Doki Doki Literature Club!, but there are some moments of clever world building and a strong enough mystery propelling it." The Italian review site SpazioGames gave a positive rating of 8.5 out of 10 points and wrote: "Dry Drowning is a very good game with great narrative experience. Every relationship between the characters is layered to increase player involvement, and each choice has different consequences. A thriller game that deserves to be played." === Awards === The game won Best of EGS 2019 and Best of JOIN 2019 awards, an honorable mention at GAMEROME and was nominated as "Best Italian Debut Game" at the Italian Video Game Awards 2020. It was also declared Best Game at Join The Indie 2019.

    Read more →
  • TCEC Season 14

    TCEC Season 14

    The 14th season of the Top Chess Engine Championship took place between 17 November 2018 and 24 February 2019. Stockfish was the defending champion, having defeated Komodo in the previous season's superfinal. The season is notable for two things: the emergence of two strong, new engines, the Komodo variant Komodo Monte Carlo tree search (MCTS) and the neural network engine Leela Chess Zero, and the dramatic superfinal. Komodo MCTS and Leela fought their way from Division 4 and Division 3 respectively to the Premier Division, with Leela further qualifying for the superfinal against Stockfish. The superfinal was a topsy-turvy affair with the lead changing hands several times. It finished as the closest superfinal TCEC has ever seen, with Stockfish winning by a single game, 50.5–49.5 (+10 =81 -9). == Overview == === Structure === The season comprised five divisions: from the lowest Division 4 to the Premier Division. The top two engines of each division promote to the division above, while the bottom two engines relegate. The top two engines of the Premier Division contest a 100-game superfinal. The lengths of the opening books used increases as the divisions progress. The superfinal itself used a custom opening book designed by Jeroen Noomen. === Rules === The TCEC draw and win rules were slightly modified for Season 14. The game is now adjudicated as drawn if, after move 30, both engines have evals ±0.08 for five consecutive moves, and there are neither pawn moves nor a capture. Win adjudication now occurs if both engines have an eval of ±10 for five consecutive moves. Following the controversy over DeusX's participation last season, the uniqueness rule for neural networks was modified such that at least two of the following three hallmarks must be unique: The code for training the neural network The neural network (and weights file) itself The engine that executes this network This change meant DeusX did not meet the uniqueness criteria and therefore did not participate. Aside from this change, the season used the standard rules of the TCEC. == Results == === Division 4 === New entrant Komodo MCTS dominated Division 4, winning by a clear four points, although it did lose a game to second-place finisher rofChade. Fellow new entrant Scorpio NN performed badly and finished last, drawing only one game and losing the rest. === Division 3 === The neural network engine Leela Chess Zero had just missed promotion to Division 2 in the previous season. Since its relatively weak performance last season was partly due to hardware problems, and since it had shown a lot of improvement in strength, it was the hot favourite in this division. Leela lived up to its billing by comprehensively defeating everyone else. In a portent of future divisions however, Leela surprisingly dropped a game to third-place Arasan. Komodo MCTS was also improving quickly, and an updated version finished second behind Leela. The gap between second and third was 6.5 points, illustrating the gulf in class. === Division 2 === Although Division 2 engines are significantly stronger than Division 3, Leela and Komodo MCTS continued to dominate the competition, and again finished first and second. Komodo MCTS only lost one game to Leela, while Leela's tendency to occasionally lose to weaker engines saw her losing a game to 4th-placed Booot. Third place finisher Xiphos gave Leela and Komodo MCTS a run for their money, and was in the running up until the final rounds when it lost a crucial game to Leela. This loss left it one point behind Komodo MCTS in the final standings. === Division 1 === Leela and Komodo MCTS's rampage through the lower divisions continued, and they again finished first and second. In a demonstration of how much it had improved, Leela scored 20/28 in this division, the same score it had achieved in Division 2. This was also a TCEC points record for this division. However, Leela dropped a game against fourth-place finisher Chiron. Komodo MCTS, which had yet to lose a game in the lower divisions except to Leela, also conceded its first loss to third-place Fizbo. At the other end of the table, former champions Jonny and Fritz, which had not been updated, found themselves outclassed and finished second-last and last respectively; however with fellow competitor Ginkgo crashing five times (and therefore being disqualified), Jonny managed to stay in the division. The penultimate game for this division set a new TCEC moves record for a decisive game: 308 moves before Leela defeated Fritz. === Premier division === This was the strongest premier division ever, with multiple-time champions Stockfish, Komodo, and Houdini in the mix. Right from the start it became clear that Stockfish was in a league of its own, and it dominated the division, scoring wins against every other engine without losing a game. Second place however was a hotly-contested affair, with Leela, Komodo and Houdini neck-and-neck for most of the division. Houdini took the early lead, but Komodo gained second after winning two games by forfeit when its sibling Komodo MCTS crashed. This led to murmurs of a "Konspiracy". However, when both Komodo and Houdini failed to score more wins against the lower half of the field, Leela was able to take the lead. Halfway through the division the race was upended again when Leela went through a bad streak, losing three games in a row to Stockfish, Komodo, and Fire. This led to Komodo regaining second place, only for Komodo MCTS to crash yet again. By TCEC rules this meant Komodo MCTS was disqualified and all its scores were zeroed out, which put Leela back in second place. With three games left, Leela missed a win against Andscacs, which would've more or less secured her a place in the superfinal. Meanwhile, Komodo kept the division interesting by winning two of its last three games. Because Komodo had superior tiebreakers to Leela, this meant Komodo would qualify for the superfinal unless Leela managed to hold Stockfish to a draw with Black in the last game of the division. In a tense final game, Stockfish came close to winning, but missed the winning line. Leela managed to draw and qualified for the superfinal. At the other end of the table, it was quickly apparent that Ethereal and Andscacs were the weakest engines and would likely relegate. However, when Komodo MCTS was disqualified (and therefore relegated), it threw both engines a lifeline, since they could now stay in the division by beating the other. Andscacs was able to score a head-to-head win against Ethereal, but was crushed by Stockfish (+0 =2 -4) and Leela (+0 =3 -3). Ethereal didn't manage to score a win in the entire division, but did manage to score more draws than Andscacs, condemning Andscacs to relegation. === Superfinal === Going into the superfinal expectations were high for Leela: she had received a new network and had just won her first major competition when she defeated Houdini in the second TCEC cup. However, she had won the tournament without having played Stockfish (who had been surprisingly eliminated by Houdini in the semifinals). That, plus the fact that Stockfish dominated Premier Division and had never lost a match to Leela, left it unclear which engine was superior, although most spectators favored Stockfish. The superfinal turned out to be a roller-coaster. It began with Stockfish drawing first blood in game 7, and then scoring another win in game 10. Leela hit back with wins in game 11 and 13, but then lost games 20, 21, and 22. This gave Stockfish a 3-point lead. However, in the next 30 games, Leela was the only one to score wins: it first equalized by winning games 25, 27, and 29, and then took the lead by winning games 49 and 53. Stockfish won game 56, but Leela won game 63, maintaining her lead. There followed two dramatic games. In game 65, Leela built up a winning position. Stockfish showed a +153 evaluation, indicating that it had found a forced line leading to an endgame tablebase win; indeed analysis with 7-piece tablebases showed that Leela's position was winning. Under previous seasons' rules, the game would have been adjudicated as a win because Leela's evaluation was above 6.5. However under the new rules, Leela's +8.92 evaluation was not enough to adjudicate. It turned out that Leela could not see the winning line, and shuffled her pieces aimlessly, leading to a 50-move draw. In game 66, Stockfish was given a substantial advantage by the opening, but failed to make the most of it. The evaluations were leveling out to zero when the internet connection to the GPU servers was cut off. By tournament rules, this meant the game was replayed from scratch. After a further internet disconnection and restart, Stockfish handled the opening better and won, leaving Leela with a 1-point lead. In the last third of the superfinal, there followed more drama as Leela often built up strong advantages, but Stockfish showed great resourcefulness in defending inferior positions. Meanwh

    Read more →
  • History of machine translation

    History of machine translation

    Machine translation is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one natural language to another. In the 1950s, machine translation became a reality in research, although references to the subject can be found as early as the 17th century. The Georgetown experiment, which involved successful fully automatic translation of more than sixty Russian sentences into English in 1954, was one of the earliest recorded projects. Researchers of the Georgetown experiment asserted their belief that machine translation would be a solved problem within a few years. In the Soviet Union, similar experiments were performed shortly after. Consequently, the success of the experiment ushered in an era of significant funding for machine translation research in the United States. The achieved progress was much slower than expected; in 1966, the ALPAC report found that ten years of research had not fulfilled the expectations of the Georgetown experiment and resulted in dramatically reduced funding. Interest grew in statistical models for machine translation, which became more common and also less expensive in the 1980s as available computational power increased. Although there exists no autonomous system of "fully automatic high quality translation of unrestricted text," there are many programs now available that are capable of providing useful output within strict constraints. Several of these programs are available online, such as Google Translate and the SYSTRAN system that powers AltaVista's BabelFish (which was replaced by Microsoft Bing translator in May 2012). == The beginning == The origins of machine translation can be traced back to the work of Al-Kindi, a 9th-century Arabic cryptographer who developed techniques for systemic language translation, including cryptanalysis, frequency analysis, and probability and statistics, which are used in modern machine translation. The idea of machine translation later appeared in the 17th century. In 1629, René Descartes proposed a universal language, with equivalent ideas in different tongues sharing one symbol. In the mid-1930s the first patents for "translating machines" were applied for by Georges Artsrouni, for an automatic bilingual dictionary using punched tape. Russian Peter Troyanskii submitted a more detailed proposal that included both the bilingual dictionary and a method for dealing with grammatical roles between languages, based on the grammatical system of Esperanto. This system was separated into three stages: stage one consisted of a native-speaking editor in the source language to organize the words into their logical forms and to exercise the syntactic functions; stage two required the machine to "translate" these forms into the target language; and stage three required a native-speaking editor in the target language to normalize this output. Troyanskii's proposal remained unknown until the late 1950s, by which time computers were well-known and utilized. == The early years == The first set of proposals for computer based machine translation was presented in 1949 by Warren Weaver, a researcher at the Rockefeller Foundation, "Translation memorandum". These proposals were based on information theory, successes in code breaking during the Second World War, and theories about the universal principles underlying natural language. A few years after Weaver submitted his proposals, research began in earnest at many universities in the United States. On 7 January 1954 the Georgetown–IBM experiment was held in New York at the head office of IBM. This was the first public demonstration of a machine translation system. The demonstration was widely reported in the newspapers and garnered public interest. The system itself, however, was no more than a "toy" system. It had only 250 words and translated 49 carefully selected Russian sentences into English – mainly in the field of chemistry. Nevertheless, it encouraged the idea that machine translation was imminent and stimulated the financing of the research, not only in the US but worldwide. Early systems used large bilingual dictionaries and hand-coded rules for fixing the word order in the final output which was eventually considered too restrictive in linguistic developments at the time. For example, generative linguistics and transformational grammar were exploited to improve the quality of translations. During this period operational systems were installed. The United States Air Force used a system produced by IBM and Washington University in St. Louis, while the Atomic Energy Commission and Euratom, in Italy, used a system developed at Georgetown University. While the quality of the output was poor it met many of the customers' needs, particularly in terms of speed. At the end of the 1950s, Yehoshua Bar-Hillel was asked by the US government to look into machine translation, to assess the possibility of fully automatic high-quality translation by machines. Bar-Hillel described the problem of semantic ambiguity or double-meaning, as illustrated in the following sentence: Little John was looking for his toy box. Finally he found it. The box was in the pen. The word pen may have two meanings: the first meaning, something used to write in ink with; the second meaning, a container of some kind. To a human, the meaning is obvious, but Bar-Hillel claimed that without a "universal encyclopedia" a machine would never be able to deal with this problem. At the time, this type of semantic ambiguity could only be solved by writing source texts for machine translation in a controlled language that uses a vocabulary in which each word has exactly one meaning. == The 1960s, the ALPAC report and the seventies == Research in the 1960s in both the Soviet Union and the United States concentrated mainly on the Russian–English language pair. The objects of translation were chiefly scientific and technical documents, such as articles from scientific journals. The rough translations produced were sufficient to get a basic understanding of the articles. If an article discussed a subject deemed to be confidential, it was sent to a human translator for a complete translation; if not, it was discarded. A great blow came to machine-translation research in 1966 with the publication of the ALPAC report. The report was commissioned by the US government and delivered by ALPAC, the Automatic Language Processing Advisory Committee, a group of seven scientists convened by the US government in 1964. The US government was concerned that there was a lack of progress being made despite significant expenditure. The report concluded that machine translation was more expensive, less accurate and slower than human translation, and that despite the expenditures, machine translation was not likely to reach the quality of a human translator in the near future. The report recommended, however, that tools be developed to aid translators – automatic dictionaries, for example – and that some research in computational linguistics should continue to be supported. The publication of the report had a profound impact on research into machine translation in the United States, and to a lesser extent the Soviet Union and United Kingdom. Research, at least in the US, was almost completely abandoned for over a decade. In Canada, France and Germany, however, research continued. In the US the main exceptions were the founders of SYSTRAN (Peter Toma) and Logos (Bernard Scott), who established their companies in 1968 and 1970 respectively and served the US Department of Defense. In 1970, the SYSTRAN system was installed for the United States Air Force, and subsequently by the Commission of the European Communities in 1976. The METEO System, developed at the Université de Montréal, was installed in Canada in 1977 to translate weather forecasts from English to French, and was translating close to 80,000 words per day or 30 million words per year until it was replaced by a competitor's system on 30 September 2001. While research in the 1960s concentrated on limited language pairs and input, demand in the 1970s was for low-cost systems that could translate a range of technical and commercial documents. This demand was spurred by the increase of globalisation and the demand for translation in Canada, Europe, and Japan. == The 1980s and early 1990s == By the 1980s, both the diversity and the number of installed systems for machine translation had increased. A number of systems relying on mainframe technology were in use, such as SYSTRAN, Logos, Ariane-G5, and Metal. As a result of the improved availability of microcomputers, there was a market for lower-end machine translation systems. Many companies took advantage of this in Europe, Japan, and the USA. Systems were also brought onto the market in China, Eastern Europe, Korea, and the Soviet Union. During the 1980s there was a lot of activity in MT in Japan especially. With the fifth-generation co

    Read more →
  • Nature Manifesto

    Nature Manifesto

    Nature Manifesto is an Immersive sound piece and multimedia installation by Icelandic artist Björk and artist and curator Aleph Molinari, created in collaboration with the French Institute for Research and Coordination in Acoustics/Music (IRCAM). The installation was showcased at the Centre Pompidou in Paris, France from November 20, 2024 to December 9, 2024, as part of the museum's "Biodiversity: Which Culture for Which Future?" forum. It combines natural soundscapes, calls of extinct animals reconstructed through artificial intelligence, and Björk's narration to address damages to biodiversity and the collapse of ecosystems. == Background == Björk's work intricately weaves themes of nature and technology, reflecting her deep engagement with both realms. In 2008, she co-founded the Náttúra campaign to protest the construction of foreign-backed aluminum factories in Iceland, aiming to protect the country's natural landscapes. She released the single "Náttúra" featuring Thom Yorke, with all proceeds supporting this environmental initiative. Her 2011 album Biophilia further exemplifies this synthesis, exploring the relationships between music, nature, and technology through a multimedia project that included interactive apps, custom-made instruments, and educational workshops. Björk's Cornucopia tour (2019-2023) seamlessly integrates themes of nature preservation and environmental activism, and featured a recorded message by Swedish climate activist Greta Thunberg. The tour's fusion of music, technology, and natural imagery reflects Björk's vision of a harmonious coexistence between humanity and nature, advocating for sustainable futures. Björk has previously used artificial intelligence in her works. In 2020, she collaborated with Microsoft to create Kórsafn, a sound installation for the Sister City Hotel lobby in New York City which used an AI-powered model that elaborated choral recordings from her discography through a sensor on the rooftop of the building that would generate music according to data like the weather and the seasons. For her charity single "Oral", featuring Spanish singer Rosalía, she released a music video directed by photographer and visual artist Carlota Guerrero, who used AI-generated deepfake versions of the artists. == Concept == Nature Manifesto is a three-minute and forty-second immersive sound piece. The composition merges Björk's voice, as she articulates a manifesto on biodiversity and the climate crisis, with cries of extinct and endangered animals, harmonizing them with natural soundscapes. The installation was curated by Chloé Siganos and Aleph Molinari, with associate curator Delphine Le Gatt. The primary goal of Nature Manifesto is to foster a deeper understanding of humanity's impact on the natural world. Conceived as a "post-optimistic" manifesto, Aleph Molinari stated that the project's purpose was to "offer a voice to nature". He stated that "the modern concept of nature itself is problematic [...] because it’s a concept born in the Romantic period and, with the rise of the industrial era, became an antithesis to human civilisation and everything urban. Nature came to define what was outside, the savage Other... But nature is everything that we’re part of." The soundscape features recreated calls of extinct and endangered species, developed in collaboration with the French sound research institute IRCAM. Artificial intelligence was employed to simulate the vocalizations of animals that no longer exist in the wild. To save energy and lessen the ecological impact of the use of AI, the research institute developed a "frugal AI" model capable of generating audio in real-time on local servers without a graphics processing unit. The sounds were then produced and edited by Björk in collaboration with Robin Meier Wiratunga and Bergur Þórisson. The installation was located within the Centre Pompidou's escalator, known as the "caterpillar". The installation was further supported by videos created by visual artist Sam Balfus (also known as Balfua) by using artificial intelligence, and edited by Santiago Molinari. == Activism == To sustain and broaden the themes presented in Nature Manifesto, Björk publicly urged French President Emmanuel Macron to prohibit bottom trawling within France's marine protected areas (MPA). She criticized the French government's claim of protecting 30% of its marine territories, highlighting that over 90% of these MPAs exist only on paper, allowing destructive practices like bottom trawling to continue unchecked. She collaborated with non-governmental organizations Sustainable Ocean Alliance, Ungir umhverfissinnar and Bloom, to advocate for genuine ocean conservation. Björk promoted the cause through her social media profiles by sharing petitions. In November 2024, Björk lent her Instagram account to French environmental activists to directly address Macron. The activists used the platform to call for stronger protection of the ocean, urging Macron to impose stricter restrictions on harmful fishing practices, particularly bottom trawling. == Reception == Nature Manifesto received mixed to positive reviews from critics. Some critiques focused on the installation's setting, suggesting that the movement inherent to the escalator space diminished the immersive potential of the soundscape. The choice of using artificial intelligence was also questioned. Björk and Molinari defended this, as both see AI as a tool that can be used creatively and sustainably, with Björk focusing on the importance of human input to give AI a "soul", and Molinari stressing the need for sustainable technological practices in the broader context of digital life. After the exhibition ended, Björk further opinionated: "this is how we will work in the future. [...] if there is no soul in tomorrow's music made by AI it is because [no one] put it there and we have to speak out and guard this as listeners", further stating that there is already "soulless muzak" [sic] on Spotify, "mass manufactured without the attention of creativity".

    Read more →
  • WebCrow

    WebCrow

    The WebCrow is a research project carried out at the Information Engineering Department of the University of Siena with the purpose of automatically solving crosswords. == The Project == The scientific relevance of the project can be understood considering that cracking crosswords requires human-level knowledge. Unlike chess and related games and there is no closed world configuration space. A first nucleus of technology, such as search engines, information retrieval, and machine learning techniques enable computers to enfold with semantics real-life concepts. The project is based on a software system whose major assumption is to attack crosswords making use of the Web as its primary source of knowledge. WebCrow is very fast and often thrashes human challengers in competitions, especially on multi language crossword schemes. A distinct feature of the WebCrow software system is to combine properly natural language processing (NLP) techniques, the Google web search engine, and constraint satisfaction algorithms from artificial intelligence to acquire knowledge and to fill the schema. The most important component of WebCrow is the Web Search Module (WSM), which implements a domain specific web based question answering algorithm. The way WebCrow approaches crosswords solving is quite different with respect to humans: Whereas we tend to first answer clues we are sure of and then proceed filling the schema by exploiting the already answered clues as hints, WebCrow uses two clearly distinct stages. In the first one, it processes all the clues and tries to answer them all: For each clue it finds many possible candidates and sorts them according to complex ranking models mainly based on a probability criteria. In the second stage, WebCrow uses constraint satisfaction algorithms to fill the grid with the overall most likely combination of clue answers. In order to interact with Google, first of all, WebCrow needs to compose queries on the basis of the given clues. This is done by query expansion, whose purpose is to convert the clue into a query expressed by a simplified and more appropriate language for Google. The retrieved documents are parsed so as to extract a list of word candidates that are congruent with the crossword length constraints. Crosswords can hardly be faced by using encyclopedic knowledge only, since many clues are wordplays or are otherwise purposefully very ambiguous. This enigmatic component of crosswords is faced by a massive use of database of solved crosswords, and by automatic reasoning on a properly organized knowledge base of wired rules. Last but not the least, the final constraint satisfaction step is very effective to fill the correct candidate, even though, unlike humans, the system can not rely on very high confidence on the correctness of the answer. == Competitions == WebCrow speed and effectiveness has been tested many times in man-machine competitions on Italian, English and multi-language crosswords The outcome of the tests is that WebCrow can successfully compete with average human players on single language schemes and reaches expert level performance in multi-language crosswords. However, WebCrow has not reached expert level in single-language crosswords, yet. === ECAI-06 Competition === On August 30, 2006, at the European Conference on Artificial Intelligence (ECAI2006), 25 conference attendees and 53 internet connected crosswords lovers, competed with WebCrow in an official challenge organized within the conference program. The challenge consisted in 5 different crosswords (2 in Italian, 2 in English and one multi-language in Italian and English) and 15 minutes were assigned for each crossword. WebCrow ranked 21 out of 74 participants in the Italian competition, and won both the bilingual and English competitions. === Other Competitions === Several competitions have been held in Florence, Italy within the Creativity Festival in December 2006, and another official conference competition took place in Hyderabad, India in January 2007, within the International Conference of Artificial Intelligence, where it ranked second out of 25 participants.

    Read more →
  • Grokipedia

    Grokipedia

    Grokipedia is an AI-generated online encyclopedia operated by the American company xAI. The site was launched on October 27, 2025. Some entries are generated by Grok, a large language model owned by the same company, while others were forked from Wikipedia, with some altered and some used nearly verbatim. Articles cannot be directly edited, though logged-in visitors to the encyclopedia can suggest new articles or corrections via a pop-up form, which are reviewed by Grok. The xAI founder Elon Musk suggested Grokipedia could be an alternative to Wikipedia that would "purge out the propaganda" he believes is promoted by the latter, describing Wikipedia as "woke" and an "extension of legacy media propaganda". External analysis of Grokipedia's content has focused on its accuracy and biases due to hallucinations and potential algorithmic bias, which reviewers have described as promoting right-wing perspectives and Musk's views. The majority of coverage has described the website as validating, promoting, and legitimizing a variety of debunked conspiracy theories and ideas against scientific consensus on topics such as HIV/AIDS denialism, vaccines and autism, climate change, and race and intelligence. The site has been accused of whitewashing far-right extremism, such as by falsely claiming a white genocide is actively occurring. Several right-wing figures have welcomed the site. Studies have highlighted its use of sources deemed as having very low credibility such as X conversations and neo-Nazi websites, and for writing about far-right figures and topics in a promotional manner. == Background == Wikipedia is an online encyclopedia written and maintained by a community of volunteers. Its possible bias has been studied and debated. In 2018, Haaretz noted "Wikipedia has succeeded in being accused of being both too liberal and too conservative, and has critics from across the spectrum". xAI is an American AI company founded by Elon Musk in 2023. Its flagship product is the family of large language models called Grok. == History == In 2021, Musk expressed affection for Wikipedia on its 20th anniversary. In 2022, however, Musk argued that Wikipedia was "losing its objectivity", and in 2023, said he would donate US$1 billion to the project if it was pejoratively renamed "Dickipedia". In December 2024, Musk called for a boycott of donations to Wikipedia over its perceived left-wing bias, calling it "Wokepedia". In January 2025, Musk made a series of statements on Twitter denouncing Wikipedia for its description of the incident where he made a controversial gesture, which many viewed as resembling a Nazi salute, at president Donald Trump's second inauguration. Musk has since positioned Grokipedia as an alternative to Wikipedia that would "purge out the propaganda" in the latter, with Musk describing Wikipedia as "woke" and an "extension of legacy media propaganda". === Idea and announcement === In September 2025, Musk spoke at the All-In podcast conference with David O. Sacks, the White House advisor on AI and cryptocurrency, about how Grok consumed data from Wikipedia and other sources to gain more complete knowledge of the world. Sacks suggested publishing its knowledge base as an artifact called "Grokipedia", saying "Wikipedia is so biased, it's a constant war". Following the conversation, Musk announced that xAI was building a new AI-generated online encyclopedia called Grokipedia. According to Musk's announcement, it would be an AI-powered knowledge base designed to rival Wikipedia by addressing its perceived biases, errors, and ideological slants. The project positioned itself within a history of ideologically driven alternatives to Wikipedia, such as the conservative Conservapedia (launched in 2006) and the Russian-government-friendly Ruwiki (launched in 2023). However, Grokipedia is distinct in its core reliance on artificial intelligence rather than human community editing. === Launch and traffic === On October 6, 2025, Musk announced that the early version of Grokipedia was scheduled for release in two weeks, but the project was postponed briefly to address content quality issues. It launched on October 27, 2025, labeled "v 0.1", with over 800,000 articles, compared to over seven million English Wikipedia articles as of September 1, 2025. According to an initial analysis of usage figures by Similarweb, which evaluates data from registered users and partners, Grokipedia recorded a peak of over 460,000 website visits in the US on October 28, 2025. After that, traffic dropped significantly and settled at around 35,000 visits per day between November 8 and 11, 2025. As of early 2026, it had over 5.6 million articles. In January 2026, The Guardian reported that GPT-5.2 frequently cited Grokipedia as a source in responses, raising concerns of misinformation on ChatGPT. The same month, The Verge reported that Google's AI Overviews, AI Mode, and Gemini language model, as well as Microsoft Copilot and Perplexity AI, used Grokipedia to answer niche, obscure, or highly specific factual questions or "non-sensitive queries." According to a case study published by SEO Engico, the site received only 19 clicks from Google Search in November 2025 but reached approximately 3.2 million monthly clicks by January 2026, with over 900,000 pages indexed and millions of ranking keywords. Analysts attributed the surge in part to the site's technical structure and large-scale AI-generated content production. In early February 2026, Grokipedia's visibility in Google Search declined sharply. SEO analysts, including Glenn Gabe and Malte Landwehr, reported a significant drop in rankings across Google organic results as well as in Google AI Overviews and AI Mode. The same case study cited independent reviews that identified citation quality concerns, including references to low-credibility sources and instances of self-citation. By mid-February 2026, Grokipedia had reportedly lost much of its previous search visibility, and Wikipedia ranked above it for searches related to its own name. === Updates === ==== Future ==== In November 2025, Musk announced that he eventually plans to change the name of the site to Encyclopedia Galactica when Grokipedia is "good enough", saying that it had a "long way to go". This name is taken from the publication of that title in the works of Isaac Asimov and Douglas Adams. Musk said that he hoped to send copies of the encyclopedia to "the Moon and Mars and out to deep space". == Content == The Grok large language model generates and fact-checks articles on Grokipedia. Users cannot directly edit Grokipedia articles, but logged-in users can suggest edits and report errors, with such submissions being reviewed and implemented by the Grok AI. Some articles are nearly identical to their Wikipedia entries, but the format of Grokipedia citations is different, and some Grokipedia articles were republished almost verbatim, accompanied by a disclaimer noting that the content was "adapted from Wikipedia" under a Creative Commons license. Others were completely rewritten from scratch using Musk's AI chatbot, Grok. Forbes identified the articles AMD, Lamborghini, and PlayStation 5 as examples of copied Wikipedia articles. Articles attributed to Wikipedia carry a Creative Commons Attribution-ShareAlike license, while the license of other articles is licensed under the "X Community License", a license that accepts reuse and remixing for "non-commercial and research purposes" and commercial use that abides to "all of the guardrails provided in xAI's Acceptable Use Policy". On October 31, 2025, Musk clarified that the duplication of Wikipedia articles was intentional, saying that the Grokipedia team instructed Grok to compile Wikipedia's top 1 million articles and make content changes to them. The site's design has been described as minimalist with a simple homepage including little more than a large search bar. In a comparative textual analysis of the most heavily edited matched article pairs from Grokipedia and Wikipedia, Grokipedia entries are substantially longer and less densely referenced, indicating that AI-produced encyclopedias prioritize exposition rather than source-based validation. Starting in version 0.2, Grok reviews and implements approved suggested edits, and a small panel rotates through a display of the names of several recently edited articles. In February 2026, the Columbia Journalism Review reported on an analysis by the Tow Center for Digital Journalism finding that Grok, the AI behind Grokipedia, had increasingly begun suggesting and approving edits to the site itself without human involvement. According to the report, AI-generated edit suggestions overtook human submissions in December 2025 and accounted for more than three-quarters of proposed changes. The analysis raised concerns about transparency, editorial oversight, and fact-checking standards, particularly after instances in which Grok proposed or modified politically s

    Read more →
  • Kernel (image processing)

    Kernel (image processing)

    In image processing, a kernel, convolution matrix, or mask is a small matrix used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by doing a convolution between the kernel and an image. Or more simply, when each pixel in the output image is a function of the nearby pixels (including itself) in the input image, the kernel is that function. == Details == The general expression of a convolution is g x , y = ω ∗ f x , y = ∑ i = − a a ∑ j = − b b ω i , j f x − i , y − j , {\displaystyle g_{x,y}=\omega f_{x,y}=\sum _{i=-a}^{a}{\sum _{j=-b}^{b}{\omega _{i,j}f_{x-i,y-j}}},} where g ( x , y ) {\displaystyle g(x,y)} is the filtered image, f ( x , y ) {\displaystyle f(x,y)} is the original image, ω {\displaystyle \omega } is the filter kernel. Every element of the filter kernel is considered by − a ≤ i ≤ a {\displaystyle -a\leq i\leq a} and − b ≤ j ≤ b {\displaystyle -b\leq j\leq b} . Depending on the element values, a kernel can cause a wide range of effects: The above are just a few examples of effects achievable by convolving kernels and images. === Origin === The origin is the position of the kernel which is above (conceptually) the current output pixel. This could be outside of the actual kernel, though usually it corresponds to one of the kernel elements. For a symmetric kernel, the origin is usually the center element. == Convolution == Convolution is the process of adding each element of the image to its local neighbors, weighted by the kernel. This is related to a form of mathematical convolution. The matrix operation being performed—convolution—is not traditional matrix multiplication, despite being similarly denoted by . For example, if we have two three-by-three matrices, the first a kernel, and the second an image piece, convolution is the process of flipping both the rows and columns of the kernel and multiplying locally similar entries and summing. The element at coordinates [2, 2] (that is, the central element) of the resulting image would be a weighted combination of all the entries of the image matrix, with weights given by the kernel: ( [ a b c d e f g h i ] ∗ [ 1 2 3 4 5 6 7 8 9 ] ) [ 2 , 2 ] = {\displaystyle \left({\begin{bmatrix}a&b&c\\d&e&f\\g&h&i\end{bmatrix}}{\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}}\right)[2,2]=} ( i ⋅ 1 ) + ( h ⋅ 2 ) + ( g ⋅ 3 ) + ( f ⋅ 4 ) + ( e ⋅ 5 ) + ( d ⋅ 6 ) + ( c ⋅ 7 ) + ( b ⋅ 8 ) + ( a ⋅ 9 ) . {\displaystyle (i\cdot 1)+(h\cdot 2)+(g\cdot 3)+(f\cdot 4)+(e\cdot 5)+(d\cdot 6)+(c\cdot 7)+(b\cdot 8)+(a\cdot 9).} The other entries would be similarly weighted, where we position the center of the kernel on each of the boundary points of the image, and compute a weighted sum. The values of a given pixel in the output image are calculated by multiplying each kernel value by the corresponding input image pixel values. This can be described algorithmically with the following pseudo-code: for each image row in input image: for each pixel in image row: set accumulator to zero for each kernel row in kernel: for each element in kernel row: if element position corresponding to pixel position then multiply element value corresponding to pixel value add result to accumulator endif set output image pixel to accumulator corresponding input image pixels are found relative to the kernel's origin. If the kernel is symmetric then place the center (origin) of the kernel on the current pixel. The kernel will overlap the neighboring pixels around the origin. Each kernel element should be multiplied with the pixel value it overlaps with and all of the obtained values should be summed. This resultant sum will be the new value for the current pixel currently overlapped with the center of the kernel. If the kernel is not symmetric, it has to be flipped both around its horizontal and vertical axis before calculating the convolution as above. The general form for matrix convolution is [ x 11 x 12 ⋯ x 1 n x 21 x 22 ⋯ x 2 n ⋮ ⋮ ⋱ ⋮ x m 1 x m 2 ⋯ x m n ] ∗ [ y 11 y 12 ⋯ y 1 n y 21 y 22 ⋯ y 2 n ⋮ ⋮ ⋱ ⋮ y m 1 y m 2 ⋯ y m n ] = ∑ i = 0 m − 1 ∑ j = 0 n − 1 x ( m − i ) ( n − j ) y ( 1 + i ) ( 1 + j ) {\displaystyle {\begin{bmatrix}x_{11}&x_{12}&\cdots &x_{1n}\\x_{21}&x_{22}&\cdots &x_{2n}\\\vdots &\vdots &\ddots &\vdots \\x_{m1}&x_{m2}&\cdots &x_{mn}\\\end{bmatrix}}{\begin{bmatrix}y_{11}&y_{12}&\cdots &y_{1n}\\y_{21}&y_{22}&\cdots &y_{2n}\\\vdots &\vdots &\ddots &\vdots \\y_{m1}&y_{m2}&\cdots &y_{mn}\\\end{bmatrix}}=\sum _{i=0}^{m-1}\sum _{j=0}^{n-1}x_{(m-i)(n-j)}y_{(1+i)(1+j)}} === Edge handling === Kernel convolution usually requires values from pixels outside of the image boundaries. There are a variety of methods for handling image edges. Extend The nearest border pixels are conceptually extended as far as necessary to provide values for the convolution. Corner pixels are extended in 90° wedges. Other edge pixels are extended in lines. Wrap The image is conceptually wrapped (or tiled) and values are taken from the opposite edge or corner. Mirror The image is conceptually mirrored at the edges. For example, attempting to read a pixel 3 units outside an edge reads one 3 units inside the edge instead. Crop / Avoid overlap Any pixel in the output image which would require values from beyond the edge is skipped. This method can result in the output image being slightly smaller, with the edges having been cropped. Move kernel so that values from outside of image is never required. Machine learning mainly uses this approach. Example: Kernel size 10x10, image size 32x32, result image is 23x23. Kernel Crop Any pixel in the kernel that extends past the input image isn't used and the normalizing is adjusted to compensate. Constant Use constant value for pixels outside of image. Usually black or sometimes gray is used. Generally this depends on application. === Normalization === Normalization is defined as the division of each element in the kernel by the sum of all kernel elements, so that the sum of the elements of a normalized kernel is unity. This will ensure the average pixel in the modified image is as bright as the average pixel in the original image. === Optimization === Fast convolution algorithms include: separable convolution ==== Separable convolution ==== 2D convolution with an M × N kernel requires M × N multiplications for each sample (pixel). If the kernel is separable, then the computation can be reduced to M + N multiplications. Using separable convolutions can significantly decrease the computation by doing 1D convolution twice instead of one 2D convolution. === Implementation === Here a concrete convolution implementation done with the GLSL shading language :

    Read more →
  • Emi Kusano

    Emi Kusano

    Emi Kusano (Japanese: 草野 絵美, Hepburn: Kusano Emi; born August 4, 1990) is a Tokyobased Japanese multidisciplinary artist known for creating photography, video, and installations using generative AI technology. Her work explores themes of nostalgia, pop culture, and collective memory. Her work explores themes of nostalgia, pop culture, and collective memory. She is recognized as one of the early practitioners of generative AI art. Her work has been exhibited at the 21st Century Museum of Contemporary Art, Kanazawa, and screened at the M+ Museum’s Asian Avant-Garde Film Festival. Additionally, she has participated in prestigious international art fairs, including Paris Photo and Art Basel Hong Kong. In 2025, she was named one of the World Economic Forum's Young Global Leaders. In 2026, she was selected as a fellow for the AI x Arts Fellowship at Mohamed bin Zayed University of Artificial Intelligence. Kusano serves as a part-time lecturer at the Tokyo University of the Arts and is the producer and vocalist for the Synthwave music unit, Satellite Young. == Early life == === Photography === Kusano was born and raised in Tokyo. Kusano's career began during her high school years before 2008 when she became involved in street fashion photography. Her photographs, primarily taken in Harajuku, were published on "Japanese Streets", "Metropolis", CNN's travel guide magazine "CNN GO","WGSN". Her photography was exhibited at the FIT Museum in New York and the Victoria and Albert Museum in London. == Career == === Music and Installation work === Since 2014, in collaboration with BelleMaison Sekine, Kusano has led "Satellite Young," a synthwave music unit s the lead vocalist, she sings about blending 1980s idol culture with lyrics that tackle contemporary issues such as planned obsolescence ("Sony Timer"), online dating, artificial intelligence, and social media. Their music, known for its conceptual depth, has earned international niche recognition. "Satellite Young" has participated in music festivals, including "South by Southwest," showcasing their unique fusion of retro aesthetics and modern critiques. In 2018, she was selected to participate in "Art Hack Day," an interdisciplinary art hackathon held at The National Museum of Emerging Science and Innovation. where she presented "Singing Dream," a karaoke machine endowed with artificial life, earning the Jury Prize. "Instababy Generator," a 2019 installation co-created with Junichi Yamaoka, explored the concept of designer babies and received recognition at the SIGGRAPH Art Gallery. In October 2020, operating under the name Emi Satellite, she debuted as a solo singer with her first single "Glass Ceiling," an empowerment anthem that addresses the challenges faced by women and encourages progress towards the future. The music video for this song features a direction where strong women rewrite the roles of protagonists in a Bishōjo game, a type of dating simulation game. This concept later served as a prototype for Shinsei Galverse. === Challenge for Blockchain Art === In 2021, she explored the financial world through her single "IPO" and entered the NFT space with "Love Is an IPO," her first NFT work on Ethereum, sold on Foundation. In April 2022, she co-founded the crowdfunded anime project "Shinsei Galverse" with Ayaka Ohira, Devin Mancuso, and Jack Baldwin. serving as one of the executive directors overseeing the creative direction and story. The project's NFT collection of 8,888 ranked #1 on OpenSea's "Top NFTs" for several days, marking one of Japan's first globally successful blockchain art projects. In 2023, Shinsei Galverse produced the official "I like u" music video by Grammy-nominated singer Tove Lo as an initial anime endeavor. Kusano also contributed to discussions on Web3.0 and blockchain technology as a panelist in seminars organized by the Digital Agency of Japan. === AI art === In May 2023, Kusano's first AI art collection "Neural Fad" depicting imaginary fashion history sold out 100 pieces within 24 hours at the "Bright Moments Tokyo" In June, she created WWDJAPAN's first AI-generated magazine cover using her own face. It is the first AI cover in Japanese fashion media. She was also appointed t to the Cultural Affairs Agency's Copyright Subcommittee, she participates in discussions on generative AI and copyright. Her "Synthetic Reflections" self-portrait series debuted on SuperRare, with the first piece auctioned for 3.5 ETH (equivalent to 6,480 US dollars at the time). In July 2023, she co-exhibited a 3D AI-generated dress at Christie's "Future Frequencies" auction with Gucci, alongside Claire Silver. In September, her 30-piece "Pixelated Perception" exhibit at Art Blocks Marfa explored 1990s media and gender, also showcased at the 21st Century Museum of Contemporary Art, Kanazawa. In December, her "Techno-Animism" AI art collection fused Japanese animism with technology. Collaborating with a U.S. gallery, she unveiled 336 pieces during a two-week Art Basel world tour. Throughout the two-week tour, she sold a total of 336 pieces, generating 11.2 ETH (equivalent to 21,264 US dollars at the time). === Generative art === In February 2024, the generative art platform Art Blocks selected the work "Melancholic Magical Maiden," for its Curated category. This piece reconstructs the aesthetics of 1990s magical girl anime, offering a critique of past anime heroines. It sold out within an hour, with all 300 pieces going for a total of 57 ETH (equivalent to approximately 215,385US dollars at the time). In April 2024, Emi Kusano spoke at the Standing Committee on Copyright and Other Rights at the World Intellectual Property Organization (WIPO) in Geneva, Switzerland, where she presented AI-specific information for discussion. == Style and technique == Kusano draws inspiration from Japanese retro-futurism as a foundation for her artwork, which explores the cutting-edge of technology. This approach is fueled by nostalgia for the pre-internet era, specifically the postwar period when Japanese mass media held significant sway. By blending modern technology with retro-culture, she captures the complex feelings of love, hate, and ambivalence towards present and future accelerationism. While at university, Kusano was profoundly influenced by Naoki Sakai, the industrial designer responsible for igniting the retro-futurism movement. In her musical project "Satellite Young", Kusano dons the persona of an '80s female idol and sings about contemporary technology. In her installation piece "Singing Dream", she investigates the concept of an artificial life form inhabiting a karaoke machine, which has been popular since the 1980s, compelling people to sing. In the collaborative NFT art project "Shinsei Galverse", Kusano reimagines a cyberpunk anime primarily featuring female characters, incorporating elements of magical girls popular in the early Heisei period. == Personal life == Kusano has two sons. In August 2021, she minted her older son Zombie Zoo Keeper's pixel art on "OpenSea" as part of his summer research project. The artwork was purchased by notable figures including Brud CEO Trevor McFedries and Steve Aoki, who bought the piece for the equivalent of 21.82 thousand US dollars, highlighting the intersection of art, technology, and family in her work.

    Read more →
  • CogX Festival

    CogX Festival

    CogX Festival is a global festival focusing on the impact of artificial intelligence (AI) and emerging technology on industry, government, and society. It takes place annually, usually in September, in London, England. Founded by Charlie Muirhead and Tabitha Goldstaub in 2017, CogX aims to facilitate dialogue and understanding about AI and its implications across various sectors. CogX Festival 2023 was held from September 12 to September 14 across multiple sites in London. == History == The inaugural CogX event took place in 2017, intending to bring together experts from diverse fields to discuss the role and impact of AI and emerging technologies. Since then, it has evolved to include a broader range of topics and attract a diverse audience. In 2018, the first CogX Awards festival was hosted. That year, over 50 awards were shown to 300 guests. In 2021, CogX and Hopin, a video conferencing software, signed an agreement lasting 4 years to make CogX a hybrid conference due to the COVID-19 pandemic. CogX 2021 attracted over 5,000 attendees in-person and over 100,000 virtually. In 2022, they returned to a live event format after two years of hybrid events and controlled physical attendance. They also launched the CogX app, which curated insights from the world's top podcasts. In 2023, after he had delivered the keynote address guest speaker Stephen Fry fell off the stage and subsequently broke his leg, hip, pelvis and a "bunch of ribs". A court filing in 2026 revealed that Fry was seeking £100,000 in damages from CogX Festival Ltd and creative agency Blonstein Events. == Programming == The festival features sessions, discussions, workshops, and exhibitions, encompassing various domains of AI and technology. In recent CogX Festivals, they have featured summits encompassing topics like global leadership and industry transformation.

    Read more →
  • Computer-automated design

    Computer-automated design

    Design Automation usually refers to electronic design automation, or Design Automation which is a Product Configurator. Extending Computer-Aided Design (CAD), automated design and Computer-Automated Design (CAutoD) are more concerned with a broader range of applications, such as automotive engineering, civil engineering, composite material design, control engineering, dynamic system identification and optimization, financial systems, industrial equipment, mechatronic systems, steel construction, structural optimisation, and the invention of novel systems. The concept of CAutoD perhaps first appeared in 1963, in the IBM Journal of Research and Development, where a computer program was written. to search for logic circuits having certain constraints on hardware design to evaluate these logics in terms of their discriminating ability over samples of the character set they are expected to recognize. More recently, traditional CAD simulation is seen to be transformed to CAutoD by biologically-inspired machine learning, including heuristic search techniques such as evolutionary computation, and swarm intelligence algorithms. == Guiding designs by performance improvements == To meet the ever-growing demand of quality and competitiveness, iterative physical prototyping is now often replaced by 'digital prototyping' of a 'good design', which aims to meet multiple objectives such as maximised output, energy efficiency, highest speed and cost-effectiveness. The design problem concerns both finding the best design within a known range (i.e., through 'learning' or 'optimisation') and finding a new and better design beyond the existing ones (i.e., through creation and invention). This is equivalent to a search problem in an almost certainly, multidimensional (multivariate), multi-modal space with a single (or weighted) objective or multiple objectives. == Normalized objective function: cost vs. fitness == Using single-objective CAutoD as an example, if the objective function, either as a cost function J ∈ [ 0 , ∞ ) {\displaystyle J\in [0,\infty )} , or inversely, as a fitness function f ∈ ( 0 , 1 ] {\displaystyle f\in (0,1]} , where f = J 1 + J {\displaystyle f={\tfrac {J}{1+J}}} , is differentiable under practical constraints in the multidimensional space, the design problem may be solved analytically. Finding the parameter sets that result in a zero first-order derivative and that satisfy the second-order derivative conditions would reveal all local optima. Then comparing the values of the performance index of all the local optima, together with those of all boundary parameter sets, would lead to the global optimum, whose corresponding 'parameter' set will thus represent the best design. However, in practice, the optimization usually involves multiple objectives and the matters involving derivatives are a lot more complex. == Dealing with practical objectives == In practice, the objective value may be noisy or even non-numerical, and hence its gradient information may be unreliable or unavailable. This is particularly true when the problem is multi-objective. At present, many designs and refinements are mainly made through a manual trial-and-error process with the help of a CAD simulation package. Usually, such a posteriori learning or adjustments need to be repeated many times until a ‘satisfactory’ or ‘optimal’ design emerges. == Exhaustive search == In theory, this adjustment process can be automated by computerised search, such as exhaustive search. As this is an exponential algorithm, it may not deliver solutions in practice within a limited period of time. == Search in polynomial time == One approach to virtual engineering and automated design is evolutionary computation such as evolutionary algorithms. === Evolutionary algorithms === To reduce the search time, the biologically-inspired evolutionary algorithm (EA) can be used instead, which is a (non-deterministic) polynomial algorithm. The EA based multi-objective "search team" can be interfaced with an existing CAD simulation package in a batch mode. The EA encodes the design parameters (encoding being necessary if some parameters are non-numerical) to refine multiple candidates through parallel and interactive search. In the search process, 'selection' is performed using 'survival of the fittest' a posteriori learning. To obtain the next 'generation' of possible solutions, some parameter values are exchanged between two candidates (by an operation called 'crossover') and new values introduced (by an operation called 'mutation'). This way, the evolutionary technique makes use of past trial information in a similarly intelligent manner to the human designer. The EA based optimal designs can start from the designer's existing design database, or from an initial generation of candidate designs obtained randomly. A number of finely evolved top-performing candidates will represent several automatically optimized digital prototypes. There are websites that demonstrate interactive evolutionary algorithms for design. allows you to evolve 3D objects online and have them 3D printed. allows you to do the same for 2D images.

    Read more →
  • Color balance

    Color balance

    In photography and image processing, color balance is the global adjustment of the intensities of the colors (typically red, green, and blue primary colors). An important goal of this adjustment is to render specific colors – particularly neutral colors like white or grey – correctly. Hence, the general method is sometimes called gray balance, neutral balance, or white balance. Color balance changes the overall mixture of colors in an image and is used for color correction. Generalized versions of color balance are used to correct colors other than neutrals or to deliberately change them for effect. White balance is one of the most common kinds of balancing, and is when colors are adjusted to make a white object (such as a piece of paper or a wall) appear white and not a shade of any other colour. Image data acquired by sensors – either film or electronic image sensors – must be transformed from the acquired values to new values that are appropriate for color reproduction or display. Several aspects of the acquisition and display process make such color correction essential – including that the acquisition sensors do not match the sensors in the human eye, that the properties of the display medium must be accounted for, and that the ambient viewing conditions of the acquisition differ from the display viewing conditions. The color balance operations in popular image editing applications usually operate directly on the red, green, and blue channel pixel values, without respect to any color sensing or reproduction model. In film photography, color balance is typically achieved by using color correction filters over the lights or on the camera lens. == Generalized color balance == Sometimes the adjustment to keep neutrals neutral is called white balance, and the phrase color balance refers to the adjustment that in addition makes other colors in a displayed image appear to have the same general appearance as the colors in an original scene. It is particularly important that neutral (gray, neutral, white) colors in a scene appear neutral in the reproduction. === Psychological color balance === Humans relate to flesh tones more critically than other colors. Trees, grass and sky can all be off without concern, but if human flesh tones are 'off' then the human subject can look sick or dead. To address this critical color balance issue, the tri-color primaries themselves are formulated to not balance as a true neutral color. The purpose of this color primary imbalance is to more faithfully reproduce the flesh tones through the entire brightness range. == Illuminant estimation and adaptation == Most digital cameras have means to select color correction based on the type of scene lighting, using either manual lighting selection, automatic white balance, or custom white balance. The algorithms for these processes perform generalized chromatic adaptation. Many methods exist for color balancing. Setting a button on a camera is a way for the user to indicate to the processor the nature of the scene lighting. Another option on some cameras is a button which one may press when the camera is pointed at a gray card or other neutral colored object. This captures an image of the ambient light, which enables a digital camera to set the correct color balance for that light. There is a large literature on how one might estimate the ambient lighting from the camera data and then use this information to transform the image data. A variety of algorithms have been proposed, and the quality of these has been debated. A few examples and examination of the references therein will lead the reader to many others. Examples are Retinex, an artificial neural network or a Bayesian method. == Chromatic colors == Color balancing an image affects not only the neutrals, but other colors as well. An image that is not color balanced is said to have a color cast, as everything in the image appears to have been shifted towards one color. Color balancing may be thought in terms of removing this color cast. Color balance is also related to color constancy. Algorithms and techniques used to attain color constancy are frequently used for color balancing, as well. Color constancy is, in turn, related to chromatic adaptation. Conceptually, color balancing consists of two steps: first, determining the illuminant under which an image was captured; and second, scaling the components (e.g., R, G, and B) of the image or otherwise transforming the components so they conform to the viewing illuminant. Viggiano found that white balancing in the camera's native RGB color model tended to produce less color inconstancy (i.e., less distortion of the colors) than in monitor RGB for over 4000 hypothetical sets of camera sensitivities. This difference typically amounted to a factor of more than two in favor of camera RGB. This means that it is advantageous to get color balance right at the time an image is captured, rather than edit later on a monitor. If one must color balance later, balancing the raw image data will tend to produce less distortion of chromatic colors than balancing in monitor RGB. == Mathematics of color balance == Color balancing is sometimes performed on a three-component image (e.g., RGB) using a 3x3 matrix. This type of transformation is appropriate if the image was captured using the wrong white balance setting on a digital camera, or through a color filter. Changing the color balance of an image can improve classifier results on a trained ML model. === Scaling monitor R, G, and B === In principle, one wants to scale all relative luminances in an image so that objects which are believed to be neutral appear so. If, say, a surface with R = 240 {\displaystyle R=240} was believed to be a white object, and if 255 is the count which corresponds to white, one could multiply all red values by 255/240. Doing analogously for green and blue would result, at least in theory, in a color balanced image. In this type of transformation the 3x3 matrix is a diagonal matrix. [ R G B ] = [ 255 / R w ′ 0 0 0 255 / G w ′ 0 0 0 255 / B w ′ ] [ R ′ G ′ B ′ ] {\displaystyle \left[{\begin{array}{c}R\\G\\B\end{array}}\right]=\left[{\begin{array}{ccc}255/R'_{w}&0&0\\0&255/G'_{w}&0\\0&0&255/B'_{w}\end{array}}\right]\left[{\begin{array}{c}R'\\G'\\B'\end{array}}\right]} where R {\displaystyle R} , G {\displaystyle G} , and B {\displaystyle B} are the color balanced red, green, and blue components of a pixel in the image; R ′ {\displaystyle R'} , G ′ {\displaystyle G'} , and B ′ {\displaystyle B'} are the red, green, and blue components of the image before color balancing, and R w ′ {\displaystyle R'_{w}} , G w ′ {\displaystyle G'_{w}} , and B w ′ {\displaystyle B'_{w}} are the red, green, and blue components of a pixel which is believed to be a white surface in the image before color balancing. This is a simple scaling of the red, green, and blue channels, and is why color balance tools in Photoshop have a white eyedropper tool. It has been demonstrated that performing the white balancing in the phosphor set assumed by sRGB tends to produce large errors in chromatic colors, even though it can render the neutral surfaces perfectly neutral. === Scaling X, Y, Z === If the image may be transformed into CIE XYZ tristimulus values, the color balancing may be performed there. This has been termed a "wrong von Kries" transformation. Although it has been demonstrated to offer usually poorer results than balancing in monitor RGB, it is mentioned here as a bridge to other things. Mathematically, one computes: [ X Y Z ] = [ X w / X w ′ 0 0 0 Y w / Y w ′ 0 0 0 Z w / Z w ′ ] [ X ′ Y ′ Z ′ ] {\displaystyle \left[{\begin{array}{c}X\\Y\\Z\end{array}}\right]=\left[{\begin{array}{ccc}X_{w}/X'_{w}&0&0\\0&Y_{w}/Y'_{w}&0\\0&0&Z_{w}/Z'_{w}\end{array}}\right]\left[{\begin{array}{c}X'\\Y'\\Z'\end{array}}\right]} where X {\displaystyle X} , Y {\displaystyle Y} , and Z {\displaystyle Z} are the color-balanced tristimulus values; X w {\displaystyle X_{w}} , Y w {\displaystyle Y_{w}} , and Z w {\displaystyle Z_{w}} are the tristimulus values of the viewing illuminant (the white point to which the image is being transformed to conform to); X w ′ {\displaystyle X'_{w}} , Y w ′ {\displaystyle Y'_{w}} , and Z w ′ {\displaystyle Z'_{w}} are the tristimulus values of an object believed to be white in the un-color-balanced image, and X ′ {\displaystyle X'} , Y ′ {\displaystyle Y'} , and Z ′ {\displaystyle Z'} are the tristimulus values of a pixel in the un-color-balanced image. If the tristimulus values of the monitor primaries are in a matrix P {\displaystyle \mathbf {P} } so that: [ X Y Z ] = P [ L R L G L B ] {\displaystyle \left[{\begin{array}{c}X\\Y\\Z\end{array}}\right]=\mathbf {P} \left[{\begin{array}{c}L_{R}\\L_{G}\\L_{B}\end{array}}\right]} where L R {\displaystyle L_{R}} , L G {\displaystyle L_{G}} , and L B {\displaystyle L_{B}} are the un-gamma corrected monitor RGB, one may use: [ L R L G L B ] = P − 1 [ X w / X w ′ 0 0

    Read more →
  • Deepfake

    Deepfake

    Deepfakes (a portmanteau of 'deep learning' and 'fake') are images, videos, or audio that have been edited or generated using artificial intelligence, AI-based tools or audio-video editing software. They may depict real or fictional people and are considered a form of synthetic media, that is media that is usually created by artificial intelligence systems by combining various media elements into a new media artifact. While the act of creating fake content is not new, deepfakes uniquely leverage machine learning and artificial intelligence techniques, including facial recognition algorithms and artificial neural networks such as variational autoencoders and generative adversarial networks (GANs). In turn, the field of image forensics has worked to develop techniques to detect manipulated images. Deepfakes have garnered widespread attention for their potential use in creating child sexual abuse material, celebrity pornographic videos, revenge porn, fake news, hoaxes, bullying, and financial fraud. Academics have raised concerns about the potential for deepfakes to promote disinformation and hate speech, as well as interfere with elections. In response, the information technology industry and governments have proposed recommendations and methods to detect and mitigate their use. Academic research has also delved deeper into the factors driving deepfake engagement online as well as potential countermeasures to malicious application of deepfakes. From traditional entertainment to gaming, deepfake technology has evolved to be increasingly convincing and available to the public, allowing for the disruption of the entertainment and media industries. == History == Photo manipulation was developed in the 19th century and soon applied to motion pictures. Technology steadily improved during the 20th century, and more quickly with the advent of digital video. Deepfake technology has been developed by researchers at academic institutions beginning in the 1990s, and later by amateurs in online communities. More recently, the methods have been adopted by industry. The development of generative adversarial networks (GANs) in the mid-2010s represented a key technical turning point in the evolution of deepfakes. GANs allowed for the creation of highly realistic fake images and videos by training competing neural networks, achieving a much improved visual fidelity over previous methods of creating the content using rules or by using autoencoders, and formed the basis for modern deepfake methods. === Academic research === Academic research related to deepfakes is split between the field of computer vision, a sub-field of computer science, which develops techniques for creating and identifying deepfakes, and humanities and social science approaches that study the social, ethical, aesthetic implications as well as journalistic and informational implications of deepfakes. As deepfakes have risen in prominence in popularity with innovations provided by AI tools, significant research has gone into detection methods and defining the factors driving engagement with deepfakes on the internet. Deepfakes have been shown to appear on social media platforms and other parts of the internet for purposes ranging from entertainment and education related to deepfakes to misinformation to elicit strong reactions. There are gaps in research related to the propagation of deepfakes on social media. Negativity and emotional response are the primary driving factors for users sharing deepfakes. === Social science and humanities approaches to deepfakes === In cinema studies, deepfakes illustrate how "the human face is emerging as a central object of ambivalence in the digital age". Video artists have used deepfakes to "playfully rewrite film history by retrofitting canonical cinema with new star performers". Film scholar Christopher Holliday analyses how altering the gender and race of performers in familiar movie scenes destabilizes gender classifications and categories. The concept of "queering" deepfakes is also discussed in Oliver M. Gingrich's discussion of media artworks that use deepfakes to reframe gender, including British artist Jake Elwes' Zizi: Queering the Dataset, an artwork that uses deepfakes of drag queens to intentionally play with gender. The aesthetic potentials of deepfakes are also beginning to be explored. Theatre historian John Fletcher notes that early demonstrations of deepfakes are presented as performances, and situates these in the context of theater, discussing "some of the more troubling paradigm shifts" that deepfakes represent as a performance genre. While most English-language academic studies of deepfakes focus on the Western anxieties about disinformation and pornography, digital anthropologist Gabriele de Seta has analyzed the Chinese reception of deepfakes, which are known as huanlian, which translates to "changing faces". The Chinese term does not contain the "fake" of the English deepfake, and de Seta argues that this cultural context may explain why the Chinese response has centered on practical regulatory measures to "fraud risks, image rights, economic profit, and ethical imbalances". === Computer science research on deepfakes === A landmark early project was the "Video Rewrite" program, published in 1997. The program modified existing video footage of a person speaking to depict that person mouthing the words from a different audio track. It was the first system to fully automate this kind of facial reanimation, and it did so using machine learning techniques to make connections between the sounds produced by a video's subject and the shape of the subject's face. Contemporary academic projects have focused on creating more realistic videos and improving deepfake techniques. The "Synthesizing Obama" program, published in 2017, modifies video footage of former president Barack Obama to depict him mouthing the words contained in a separate audio track. The project lists as a main research contribution to its photorealistic technique for synthesizing mouth shapes from audio. The "Face2Face" program, published in 2016, modifies video footage of a person's face to depict them mimicking another person's facial expressions. The project highlights its primary research contribution as the development of the first method for re-enacting facial expressions in real time using a camera that does not capture depth, enabling the technique to work with common consumer cameras. Researchers have also shown that deepfakes are expanding into other domains such as medical imagery. In this work, it was shown how an attacker can automatically inject or remove lung cancer in a patient's 3D CT scan. The result was so convincing that it fooled three radiologists and a state-of-the-art lung cancer detection AI. To demonstrate the threat, the authors successfully performed the attack on a hospital in a White hat penetration test. A survey of deepfakes, published in May 2020, provides a timeline of how the creation and detection of deepfakes have advanced over the last few years. The survey identifies that researchers have been focusing on resolving the following challenges of deepfake creation: Generalization. High-quality deepfakes are often achieved by training on hours of footage of the target. This challenge is to minimize the amount of training data and the time to train the model required to produce quality images and to enable the execution of trained models on new identities (unseen during training). Paired Training. Training a supervised model can produce high-quality results, but requires data pairing. This is the process of finding examples of inputs and their desired outputs for the model to learn from. Data pairing is laborious and impractical when training on multiple identities and facial behaviors. Some solutions include self-supervised training (using frames from the same video), the use of unpaired networks such as Cycle-GAN, or the manipulation of network embeddings. Identity leakage. This is where the identity of the driver (i.e., the actor controlling the face in a reenactment) is partially transferred to the generated face. Some solutions proposed include attention mechanisms, few-shot learning, disentanglement, boundary conversions, and skip connections. Occlusions. When part of the face is obstructed with a hand, hair, glasses, or any other item then artifacts can occur. A common occlusion is a closed mouth which hides the inside of the mouth and the teeth. Some solutions include image segmentation during training and in-painting. Temporal coherence. In videos containing deepfakes, artifacts such as flickering and jitter can occur because the network has no context of the preceding frames. Some researchers provide this context or use novel temporal coherence losses to help improve realism. As the technology improves, the interference is diminishing. Overall, deepfakes are expected to have several implications in media and society, med

    Read more →
  • Generative AI pornography

    Generative AI pornography

    Generative AI pornography or simply AI pornography is a digitally created pornography produced through generative artificial intelligence (AI) technologies. Unlike traditional pornography, which involves real actors and cameras, this content is synthesized entirely by AI algorithms. These algorithms, including generative adversarial networks (GANs) and text-to-image models, generate lifelike images, videos, or animations from textual descriptions or datasets. == Functions and production strategies == AI pornography platforms, beyond account creation and social media linking, primarily enable users to generate sexual images through feature selection or text prompting. Users can customize bodies, clothing, and sociodemographic traits, and browse categorized galleries of user‑generated content. Several sites also support short pornographic videos or GIFs and modification tools such as nudifiers, deepfakes, and facemorphing. Platforms often allow fine‑tuning of parameters such as settings, style, or theme, and provide prompt enhancers or suggestions to improve outputs. Users may edit generated images, refine prior prompts, modify others’ work, or upload personal material as a basis, with iterative and collaborative content creation. Some websites additionally host interactive “erobots,” customizable in real time for appearance, personality, memories, speech, and profession, enabling tailored sexual and non‑sexual interactions. Less common features include VR integration, AI porn games, audio or doodle prompts, and consensual replication of individuals with verification. == History == The use of generative AI in the adult industry began in the late 2010s, initially focusing on AI-generated art, music, and visual content. This trend accelerated in 2022 with Stability AI's release of Stable Diffusion (SD), an open-source text-to-image model that enables users to generate images, including NSFW content, from text prompts using the LAION-Aesthetics subset of the LAION-5B dataset. Despite Stability AI's warnings against sexual imagery, SD's public release led to dedicated communities exploring both artistic and explicit content, sparking ethical debates over open-access AI and its use in adult media. By 2020, AI tools had advanced to generate highly realistic adult content, amplifying calls for regulation. === AI-generated influencers === One application of generative AI technology is the creation of AI-generated influencers on platforms such as OnlyFans and Instagram. These AI personas interact with users in ways that can mimic real human engagement, offering an entirely synthetic but convincing experience. While popular among niche audiences, these virtual influencers have prompted discussions about authenticity, consent, and the blurring line between human and AI-generated content, especially in adult entertainment. === The growth of AI porn sites === By 2023, websites dedicated to AI-generated adult content had gained traction, catering to audiences seeking customizable experiences. These platforms allow users to create or view AI-generated pornography tailored to their preferences. These platforms enable users to create or view AI-generated adult content appealing to different preferences through prompts and tags, customizing body type, facial features, and art styles. Tags further refine the output, creating niche and diverse content. Many sites feature extensive image libraries and continuous content feeds, combining personalization with discovery and enhancing user engagement. AI porn sites, therefore, attract those seeking unique or niche experiences, sparking debates on creativity and the ethical boundaries of AI in adult media. == Ethical concerns and misuse == The growth of generative AI pornography has also attracted some cause for criticism. AI technology can be exploited to create non-consensual pornographic material, posing risks similar to those seen with deepfake revenge porn and AI-generated NCII (Non-Consensual Intimate Image). A 2023 analysis found that 98% of deepfake videos online are pornographic, with 99% of the victims being women. Some famous celebrities victims of deepfake include Scarlett Johansson, Taylor Swift, and Maisie Williams. OpenAI is exploring whether NSFW content, such as erotica, can be responsibly generated in age-appropriate contexts while maintaining its ban on deepfakes. This proposal has attracted criticism from child safety campaigners who argue it undermines OpenAI's mission to develop "safe and beneficial" AI. Additionally, the Internet Watch Foundation has raised concerns about AI being used to generate sexual abuse content involving children. === AI-generated non-consensual intimate imagery (AI Undress) === Generative AI have extensively been used to produce pornography images and videos of non-consenting individuals. 404 Media reported a particular AI generated porn bot on Telegram has more than 100,000 monthly users. Alibaba, the Chinese tech company, released an AI video generation model in 2025 called Wan 2.1, which was modified to produce non-consensual pornography. Several US states are taking actions against using deepfake apps and sharing them on the internet. In 2024, San Francisco filed a landmark lawsuit to shut down "undress" apps that allow users to generate non-consensual AI nude images, citing violations of state laws. The case aligns with California's recent legislation—SB 926, SB 942, and SB 981—championed by Senators Aisha Wahab and Josh Becker and signed by Governor Gavin Newsom. These bills aim to protect individuals from AI-generated explicit images by criminalizing non-consensual distribution, mandating disclosures, and empowering victims to report and remove harmful content from platforms. === Differences from deepfake pornography === While both generative AI pornography and deepfake pornography rely on synthetic media, they differ in their methods and ethical considerations. Deepfake pornography typically involves altering existing footage of real individuals, often without their consent, using AI to superimpose faces, undress said persons, or modify scenes. In contrast, generative AI pornography is created using algorithms, producing hyper-realistic content without the need to upload real pictures of people. Hany Farid, digital image analysis expert, also described the difference between "AI porn" and "deepfake porn." == Legality == The legality of generative AI pornography varies widely by jurisdiction and remains an evolving issue. In some countries, laws addressing digital impersonation, obscenity, or deepfake technologies may indirectly apply, particularly when AI-generated content involves the likeness of real individuals without consent. The absence of a physical performer further complicates traditional regulatory frameworks, which are often grounded in performer protection and distribution laws. In the United States, legal responses have primarily focused on non-consensual deepfakes and impersonation. Some states, such as Virginia, California, and Texas, have enacted legislation criminalising the creation or distribution of non-consensual explicit deepfake content. However, there is no comprehensive federal law addressing AI-generated pornography, leaving a patchwork of legal interpretations and enforcement standards across different jurisdictions. According to a 2023 report, South Korea accounts for approximately 53% of global deepfake pornography production. In September 2024, South Korea's National Assembly amended the Act on Special Cases Concerning the Punishment of Sexual Crimes, introducing two significant reforms related to deepfake content. The first criminalises the possession, viewing, purchase, and storage of non-consensual deepfake material, with penalties of up to three years in prison or fines of up to 30 million won (approximately USD 20,000). The second reform specifically addresses the exploitation of minors, establishing that individuals who use deepfakes to threaten or blackmail minors face a minimum of three years' imprisonment, and at least five years if they coerce minors into unwanted acts. In England and Wales the Data (Use and Access) Act 2025 has legislated against the creation, or the request for creation, of intimate images by nudifying software or websites of another person who has not consented to this. However as of January 2026 this has not yet been brought into force.

    Read more →