AI Assistant Qt

AI Assistant Qt — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Nolot

    Nolot

    Nolot is a chess test suite with 11 positions from real games. They were compiled by Pierre Nolot (French: [nɔ.lo]) for the French chess magazine Gambisco and posted on the rec.games.chess Usenet group in 1994. They were designed to be particularly hard to solve for chess engines to solve at the time, although modern engines can find a solution near-instantaneously. == Problem 1 == FEN: r3qb1k/1b4p1/p2pr2p/3n4/Pnp1N1N1/6RP/1B3PP1/1B1QR1K1 w - - 0 1 26.Nxh6!! c3 (26... Rxh6 27.Nxd6 Qh5 (best) 28.Rg5! Qxd1 29.Nf7+ Kg8 30.Nxh6+ Kh8 31.Rxd1 c3 32.Nf7+ Kg8 33.Bg6! Nf4 34.Bxc3 Nxg6 35.Bxb4 Kxf7 36.Rd7+ Kf6 37.Rxg6+ Kxg6 38.Rxb7 ±) 27.Nf5! cxb2 28.Qg4 Bc8 (28... g6!? 29.Kh2! 29.Qd7 30.Nh4 Bc6 31.Nc5! dxc 32.Rxe6 Nf6 33.Nxg6+ Kg7 34.Qg5 Nbd5 35.Ne5 Kh8 36.Nxd7 ±) 29.Qh4+ Rh6 30.Nxh6 gxh6 31.Kh2! Qe5 32.Ng5 Qf6 33.Re8 Bf5 34.Qxh6 (missing a mate in 6: 34.Nf7+ Qxf7 35.Qxh6+ Bh7 36.Rxa8 Nf6 37.Rxf8 Qxf8 38.Qxf8+ Ng8 39.Qg7#) 34...Qxh6 35.Nf7+ Kh7 36.Bxf5+ Qg6 37.Bxg6+ Kg7 38.Rxa8 Be7 39.Rb8 a5 40.Be4+ Kxf7 41.Bxd5+ 1–0 The best Novag computer, the Diablo 68000, finds 26. Nxh6 after seven and a half months (Pierre Nolot has let it run on the position for 14 months and one day, until a power failure stopped an analysis of over 80,000,000,000 nodes.) but for wrong reasons: it evaluates white's position as inferior and thinks this move would enable it to draw. Today Gambit Tiger 2.0 for example can find it quite quickly: Most free engines running on 64-bit processors in 2010 could solve this problem and the others in a few seconds. 1.Qd4 c3 2.Bxc3 Nxc3 3.Qxb4 Nxe4 4.Qxb7 Rb8 5.Qxb8 Qxb8 6.Bxe4 d5 7.Rb1 μ (-1.20) Depth: 12 00:00:09 6055 kN 1.Nxh6 c3 2.Nf5 cxb2 3.Qg4 Rb8 4.Nxg7 Rg6 5.Qxg6 Qxg6 6.Rxg6 Bxg7 7.Nxd6 ³ (-0.48) Depth: 12 00:00:21 14368 kN 1.Nxh6 c3 2.Nf5 cxb2 3.Qg4 Rc8 4.Nxg7 Rg6 5.Nxe8 Rxg4 6.Rxg4 Rxe8 7.Rg6 μ (-0.74) Depth: 13 00:00:55 38455 kN 1.Ne3 Rxe4 2.Bxe4 Qxe4 3.Nxd5 Qxd5 4.Qc1 Qf5 5.Qxh6+ Qh7 6.Qe6 Nd3 7.Re2 Nxb2 8.Rxb2 ³ (-0.58) Depth: 13 00:01:30 62979 kN 1.Ne3 Rxe4 ³ (-0.58) Depth: 14 00:02:02 84941 kN 1.Ne3 Nxe3 2.Rexe3 Bxe4 3.Qg4 Rg6 4.Qxe4 Qxe4 5.Bxe4 Rxg3 6.Rxg3 d5 7.Bf5 Re8 8.Bc3 ³ (-0.30) Depth: 15 00:03:05 128968 kN 1.Nxh6 ² (0.32) Depth: 15 00:07:58 350813 kN With the next ply showing a clear advantage. Stockfish 14dev 64bit 4CPU running on 2020 hardware recognises the significance of Nxh6!! in 1 second. Stockfish_21092606_x64_avx2: NNUE evaluation using nn-13406b1dcbe0.nnue enabled. 19/32 00:01 7708k 4882k +3,00 Nxh6 Rxh6 Nxd6 Qh5 Bg6 Qxd1 Nf7+ Kg8 Nxh6+ gxh6 Bh5+ Kh7 Rxd1 c3 Bxc3 Nxc3 Rd7+ Kh8 Rxb7 Ne4 Re3 Nxf2 Kxf2 Bc5 Ke2 Bxe3 Kxe3 Nd5+ Kf2 49/73 15:02 5118270k 5673k +6,15 Nxh6 Rxh6 Nxd6 Qh5 Rg5 Qxd1 Nf7+ Kg8 Nxh6+ Kh8 Rxd1 c3 Nf7+ Kg8 Bg6 Nf4 Bxc3 Nbd5 Rb1 Bc6 Bd2 Nxg6 Rxg6 Ne7 Rxc6 Nxc6 Rb6 Rc8 Ng5 a5 Ra6 Bb4 Be3 Ne5 Bd4 Nc6 Bb6 Bd2 h4 Kf8 Bc5+ Kg8 Be3 Bxe3 fxe3 Kf8 Kf2 Ke7 Nf3 Kd7 Rb6 Ne7 Rb5 Kd6 Rxa5 Rc2+ Kg3 Re2 Nd4 Rxe3+ Kf4 Rd3 Nf5+ Kc7 Nxe7 == Problem 2 == FEN: r4rk1/pp1n1p1p/1nqP2p1/2b1P1B1/4NQ2/1B3P2/PP2K2P/2R5 w - - 0 1 22.Rxc5!! Nxc5 23.Nf6+ Kh8 24.Qh4 Qb5+ (computers think there is perpetual check here, but...) 25.Ke3! 25... h5 26.Nxh5 Qxb3+ (26... d5+ 27.Bxd5 Qd3 28.Kf2 Ne4+ 29.Bxe4 Qd4+ 30.Kg2 Qxb2+ 31.Kh3 ±) and White won in 41 moves. Today Deep Junior 8.ZX for example finds it very quickly (around 1 minute): 1.Kd1 Rac8 2.Bh6 Qb5 3.Rc3 Qf1+ 4.Kc2 Rc6 5.Bxf8 −+ (-2.11) Depth: 12 00:00:04 10422 kN 1.Nxc5 Nxc5 2.Rxc5 Qxc5 3.e6 Rae8 4.e7 Nc8 5.Kf1 Nxd6 6.Bf6 b5 −+ (-2.10) Depth: 12 00:00:14 25054 kN 1.Bf6! μ (-1.35) Depth: 12 00:00:17 34601 kN 1.Bf6 Qb5+ 2.Ke1 Bb4+ 3.Kf2 Bc5+ = (0.00) Depth: 12 00:00:20 34601 kN 1.Bf6 Qb5+ 2.Ke1 Nxf6 3.Nxf6+ Kg7 4.Nh5+ gxh5 5.Qf6+ Kg8 6.Qg5+ Kh8 7.Qf6+ = (0.00) Depth: 15 00:01:01 130544 kN 1.Rxc5! = (0.15) Depth: 15 00:01:12 145875 kN 1.Rxc5 Nxc5 2.Nf6+ Kh8 3.Qh4 Qb5+ 4.Ke3 h5 5.Nxh5 Qd3+ 6.Kf2 Ne4+ 7.fxe4 Qd4+ 8.Kf1 Qd3+ 9.Ke1 Qb1+ 10.Bd1 ± (2.18) Depth: 15 00:01:18 145875 kN Stockfish 14dev 64bit 4CPU running on 2020 hardware recognises the significance of Rxc5!! in 1 second. Stockfish_21092606_x64_avx2: NNUE evaluation using nn-13406b1dcbe0.nnue enabled. 21/25 00:01 5822k 5545k +6,61 Rxc5 Qxc5 Nxc5 Nxc5 Bh6 Nbd7 Bxf8 Rxf8 Qe3 Rc8 f4 Nxe5 Qxe5 Ne6 Bxe6 Rc2+ Kd3 Rxh2 46/86 11:27 5057055k 7355k +7,61 Rxc5 Qxc5 Nxc5 Nxc5 Bf6 Ne6 Qh6 Nd4+ Kf2 Nf5 Qg5 Nd7 h4 Nxf6 Qxf6 Ng7 d7 b5 Bd5 Rab8 b4 Nh5 Bxf7+ Rxf7 d8R+ Rxd8 Qxd8+ Rf8 Qd5+ Kg7 e6 Kf6 Qd7 Ng7 Qd4+ Kxe6 Qxg7 Rf7 Qc3 Ke7 Qc5+ Ke8 Qc8+ Ke7 h5 gxh5 Kg3 h4+ Kh2 h6 Qc5+ Kf6 Qxb5 Kg7 f4 Rxf4 Qe5+ Rf6 b5 h3 Qd4 Kg8 Qxf6 h5 Blacks 22. .. Nxc5 is suboptimal and leads faster mate 77/44 09:18 6987714k 12518k +M22 Nf6+ Kh8 Qh4 Qb5+ Ke3 Qxb3+ axb3 h5 Nxh5 Nd5+ Kd4 Ne6+ Kxd5 Nxg5 Qxg5 gxh5 f4 Rad8 f5 f6 Qxh5+ Kg7 Qg6+ Kh8 e6 b6 e7 Rb8 exf8Q+ Rxf8 Ke6 b5 Ke7 Rb8 Qh5+ Kg7 Qf7+ Kh8 Kxf6 Rf8 Qxf8+ Kh7 Qg7+ == Problem 3 == FEN: r2qk2r/ppp1b1pp/2n1p3/3pP1n1/3P2b1/2PB1NN1/PP4PP/R1BQK2R w KQkq - 0 1 12.Nxg5!! Bxd1 13.Nxe6 Qb8 14.Nxg7+!! Kf8 15.Bh6! Bg4 16.0-0+ Kg8 17.Rf4 ± White wins with a queen sac but black has defensive resources. Stockfish 8 64bit 3CPU running on 2016 hardware recognizes the significance of Nxg5!! in 55 seconds. Stockfish 14 dev (Stockfish_21092606_x64_avx2) 64bit 4CPU running on 2020 hardware recognizes the significance of Nxg5!! in 1 second. NNUE evaluation using nn-13406b1dcbe0.nnue enabled. 21/34 00:01 8291k 4530k +2,78 Nxg5 Bxd1 Nxe6 Qb8 Nxg7+ Kd8 Kxd1 b5 N3f5 Bf8 Rf1 Kc8 Nh5 Kb7 Bxb5 Ne7 g4 a6 Ba4 Nxf5 gxf5 Ka7 Nf4 c5 47/59 37:49 10390430k 4578k +3,16 Nxg5 Bxd1 Nxe6 Qb8 Nxg7+ Kd8 Kxd1 b5 Rf1 Kc8 N3f5 Bf8 Ne6 Kd7 Nf4 Ne7 g4 a5 Ke2 Qb7 h4 Ra6 a3 Kc8 Be3 Kb8 Kf3 Rb6 Bd2 Qc8 Kg3 c5 Be3 c4 Nxe7 Bxe7 Bf5 Qd8 h5 Qg8 Kh3 Bg5 Rf3 Ra6 Raf1 b4 Nxd5 Qxd5 Bxg5 bxc3 bxc3 Rb6 Be3 Rb3 Blacks 14 .. Kf8 is suboptimal and leads loss fast 41/68 06:31 3269727k 8350k +9,28 Bh6 Kg8 Rxd1 Bf8 N3h5 Bxg7 Nxg7 Qf8 Nf5 Ne7 Bxf8 Nxf5 Bxf5 Rxf8 Be6+ Kg7 Rd3 Rf4 Bxd5 c6 Rg3+ Kf8 Rf3 Rxf3 Bxf3 Kg7 Rf1 Re8 Be4 Re6 Ke2 a5 Ke3 Rh6 h3 a4 Kf4 Re6 h4 Re8 Ke3 h6 h5 Rf8 Rxf8 Kxf8 == Problem 4 == FEN: r1b1kb1r/1p1n1ppp/p2ppn2/6BB/2qNP3/2N5/PPP2PPP/R2Q1RK1 w kq - 0 1 10.Nxe6!! Qxe6 11.Nd5 Kd8 12.Bg4 Qe5 13.f4 Qxe4 (13...Qxb2 stronger but not sufficient: 14.Bxd7 Bxd7 15.Rb1 Qa3 16.Nxf6 Bb5 17.Qd4 Qc5 18.Rfd1 ±) 14.Bxd7 Bxd7 15.Nxf6 gxf6 16.Bxf6+ Kc7 17.Bxh8 and Black resigned on move 27. Stockfish 14dev 64bit 4CPU running on 2020 hardware recognises the significance of 10.Nxe6 in 1 second. Stockfish_21092606_x64_avx2: NNUE evaluation using nn-13406b1dcbe0.nnue enabled. 22/37 00:01 6955k 5367k +4,00 Nxe6 Qxe6 Nd5 Kd8 Bg4 Qe5 f4 Qxb2 Rb1 Qa3 Bxd7 Bxd7 Nxf6 Bb5 Rf3 Qxa2 c4 Bxc4 Rf2 Qa5 Nd5+ f6 Nxf6 Kc7 Rc1 b5 Qd5 gxf6 Bxf6 Kb8 Rxc4 Qe1+ Rf1 51/70 47:10 14538911k 5137k +5,76 Nxe6 Qxe6 Nd5 Kd8 Bg4 Qe5 f4 Qxe4 Bxd7 Bxd7 Nxf6 Qf5 Qd4 Kc8 Nd5 Bc6 c4 f6 Nb6+ Kb8 Bh4 Be7 Rae1 Bd8 Nxa8 Kxa8 Bf2 Kb8 Qxd6+ Bc7 Ba7+ Kc8 Qe6+ Qxe6 Rxe6 h5 h4 Rd8 Re7 g6 Be3 Ba5 Kf2 Rd6 Rc1 Bd8 Rg7 Be4 Rg8 Kd7 c5 Rd3 Rc4 Bd5 Rg7+ Ke6 Rd4 Rxd4 Bxd4 Kf5 Rd7 Bc6 Rxd8 Kxf4 Bxf6 == Problem 5 == FEN: r2qrb1k/1p1b2p1/p2ppn1p/8/3NP3/1BN5/PPP3QP/1K3RR1 w - - 0 1 21.e5!! dxe5 22.Ne4! Nh5 23.Qg6!? (stronger is 23.Qg4!! Nf4 24.Nf3 Qc7 25.Nh4 ± ) 23...exd4? (23...Nf4 24.Rxf4! exf4 25.Nf3! Qb6 26.Rg5!! covering b5 and threatening Nf6 or Ne5-f7+) 24.Ng5 1−0 Stockfish 8 64bit 3CPU running on 2016 hardware recognises the significance of 21.e5 in 5 seconds. Stockfish 12 dev (Stockfish_20062212_x64_modern) 64bit 1CPU running on 2016 hardware recognizes the significance of 21.e5 in 11 seconds. 25/42 00:06 7 963k 1309k +6,93 e5 Nh5 Ne4 dxe5 Nf3 Nf4 Qg4 Qc7 Nh4 Bc6 Nf6 g5 Rxf4 exf4 Qh5 Qe7 Ng6+ Kg7 Nxe7 Rxe7 Ng4 37/62 03:12 298 083k 1545k +10,70 e5 Ng4 Qxg4 Qg5 Qh3 Qxe5 Nde2 g5 Rxf8+ Kg7 Rff1 Rf8 Re1 Qf5 Qg3 Rad8 Nd4 Qf4 Nxe6+ Bxe6 Rxe6 Qxg3 == Problem 6 == FEN: rnbqk2r/1p3ppp/p7/1NpPp3/QPP1P1n1/P4N2/4KbPP/R1B2B1R b kq - 0 1 13... axb5!! offers an exchange to keep the white queen out of play. 14.Qxa8 Bd4 15.Nxd4 cxd4 16.Qxb8 0-0! 17.Ke1 Qh4 18.g3 Qf6 19.Bf4 g5? (Ivanchuk found 19...d3! during post-game analysis.) 20.Rc1 exf4 21.Qxf4 Qd4 22.Rd1 bxc4 23.e5 Qc3+ 24.Rd2 Re8 25.Bxd3 cxd3 −+ Tasc R30 finds 19... d3! in 2 1/2 hours. 19... Bf5!! is even stronger than 19... d3. Position is already lost at 19... d3 +8.00 for black, ... Bf5 not much better Stockfish 14dev 64bit 4CPU running on 2020 hardware recognises the significance of axb5!! in 1 second. Stockfish_21092606_x64_avx2: NNUE evaluation using nn-13406b1dcbe0.nnue enabled. 21/28 00:01 9264k 4714k -1,22 axb5 Qxa8 Bd4 Nxd4 cxd4 h3 Nf6 Bg5 0-0 cxb5 h6 Bxf6 Qxf6 Re1 Nd7 Kd1 Qg6 Qa4 Qg3 Qc2 Qxa3 Bd3 Qxb4 Qb1 46/67 1:05:00 18113493k 4644k -2,40 axb5 Qxa8 Bd4 h3 Nf6 Nxd4 exd4 Kf2 Nxe4+ Kg1 Nd7 Bg5 Qxg5 Qxc8+ Ke7 Qc7 Qe5 d6+ Qxd6 Qxd6+ Kxd6 bxc5+ Ndxc5 cxb5 d3 h4 d2 Rh3 Ke5 Be2 f5 Ra2 Rd8 Bd1 Rd4 Re3 f4 Re2 b6 a4 Kd6 Rc2 Kd5 Ra2 h6 Rb2 Nxa4 Bxa4 Rxa4 Rexd2+ Nxd2 Rxd2+ Kc4 Rd7 g6 == Problem 7 == FEN 1r1bk2r/2R2ppp/p3p3/1b2P2q/4QP2/4N3/1B4PP/3R2K1 w k - 0 1 1.Rxd8+!! Rxd8 (1...Kxd8 2.Ra7! Qe2 3.Qd4+ Ke8 4.h3 Qe1+ 5.Kh2 Rd8 6.Qc5 Qh4 7.Ba3 Rd7 8.Ra8+ Rd8 9.g3 1−0)

    Read more →
  • Veo (text-to-video model)

    Veo (text-to-video model)

    Veo, or Google Veo, is a text-to-video model developed by Google DeepMind and announced in May 2024. As a generative AI model, it creates videos based on user prompts. Veo 3, released in May 2025, can also generate accompanying audio. == Development == In May 2024, a multimodal video generation model called Veo was announced at Google I/O 2024. Google claimed that it could generate 1080p videos over a minute long. In December 2024, Google released Veo 2, available via VideoFX. It supports 4K resolution video generation and has an improved understanding of physics. In April 2025, Google announced that Veo 2 became available for advanced users on the Gemini app. In May 2025, Google released Veo 3, which not only generates videos but also creates synchronized audio — including dialogue, sound effects, and ambient noise — to match the visuals. Google also announced Flow, a video-creation tool powered by Veo and Imagen. Google DeepMind CEO Demis Hassabis described the release as the moment when AI video generation left the era of the silent film. This was rebranded as Google Flow at the 2026 Google I/O keynote, along with the announcement of Google Flow Music. == Capabilities == Google Veo can be purchased at multiple subscription tiers and through Google "AI credits". The software itself can be run by two different consoles, Google Gemini and Google Flow. Gemini being geared towards shorter, quicker, and faster projects, using the Gemini AI chat model, with Google Flow, which is essentially a movie editor allowing users to create longer projects with continuity, using the same characters and actors. Users can create a maximum of eight seconds per clip. According to Gizmodo Veo 3 users were directing the model to generate low-quality content, such as man on the street interviews or haul videos of people unboxing products. 404 Media reported that the tool tended to repeat the same joke in response to different prompts. Commentators speculated that Google had trained the service on YouTube videos or Reddit posts. Google itself had not stated the source of its training content. In July 2025, Media Matters for America reported that racist and antisemitic videos generated using Veo 3 were being uploaded to TikTok. Ryan Whitwam of Ars Technica commented, "In a perfect world, Veo 3 would refuse to create these videos, but vagueness in the prompt and the AI's inability to understand the subtleties of racist tropes (i.e., the use of monkeys instead of humans in some videos) make it easy to skirt the rules."

    Read more →
  • AI art

    AI art

    Artificial intelligence visual art, or AI art, is visual artwork generated or enhanced through the implementation of artificial intelligence (AI) programs, most commonly using text-to-image models. The process of automated art-making has existed since antiquity. The field of artificial intelligence was founded in the 1950s, and artists began to create art with artificial intelligence shortly after the discipline's founding. A select number of these creations have been showcased in museums and have been recognized with awards. Throughout its history, AI has raised many philosophical questions related to the human mind, artificial beings, and the nature of art in human–AI collaboration. During the AI boom of the 2020s, text-to-image models such as Midjourney, DALL-E and Stable Diffusion became widely available to the public, allowing users to quickly generate imagery with little effort. Commentary about AI art in the 2020s has often focused on issues related to copyright, deception, defamation, and its impact on more traditional artists, including technological unemployment. In August 2023, the US Supreme Court ruled that AI art is ineligible for copyright due to failure to meet human authorship. In March 2026, it declined to hear a case over whether AI-generated art can be subject to copyright. == History == === Early history === Automated art dates back at least to the automata of ancient Greek civilization, when inventors such as Daedalus and Hero of Alexandria were described as designing machines capable of writing text, generating sounds, and playing music. Creative automatons have flourished throughout history, such as Maillardet's automaton, created around 1800 and capable of creating multiple drawings and poems. Also in the 19th century, Ada Lovelace, wrote that "computing operations" could potentially be used to generate music and poems. In 1950, Alan Turing's paper "Computing Machinery and Intelligence" focused on whether machines can mimic human behavior convincingly. Shortly after, the academic discipline of artificial intelligence was founded at a research workshop at Dartmouth College in 1956. Since its founding, AI researchers have explored philosophical questions about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence; these issues have previously been explored by myth, fiction, and philosophy since antiquity. === Artistic history === Since the founding of AI in the 1950s, artists have used artificial intelligence to create artistic works. These works were sometimes referred to as algorithmic art, computer art, digital art, or new media art. One of the first significant AI art systems is AARON, developed by Harold Cohen beginning in the late 1960s at the University of California at San Diego. AARON uses a symbolic rule-based approach to generate technical images in the era of GOFAI programming, and it was developed by Cohen with the goal of being able to code the act of drawing. AARON was exhibited in 1972 at the Los Angeles County Museum of Art. From 1973 to 1975, Cohen refined AARON during a residency at the Artificial Intelligence Laboratory at Stanford University. In 2024, the Whitney Museum of American Art exhibited AI art from throughout Cohen's career, including re-created versions of his early robotic drawing machines. Karl Sims has exhibited art created with artificial life since the 1980s. He received an M.S. in computer graphics from the MIT Media Lab in 1987 and was artist-in-residence from 1990 to 1996 at the supercomputer manufacturer and artificial intelligence company Thinking Machines. In both 1991 and 1992, Sims won the Golden Nica award at Prix Ars Electronica for his videos using artificial evolution. In 1997, Sims created the interactive artificial evolution installation Galápagos for the NTT InterCommunication Center in Tokyo. Sims received an Emmy Award in 2019 for outstanding achievement in engineering development. In 1999, Scott Draves and a team of several engineers created and released Electric Sheep as a free software screensaver. Electric Sheep is a volunteer computing project for animating and evolving fractal flames, which are distributed to networked computers that display them as a screensaver. The screensaver used AI to create an infinite animation by learning from its audience. In 2001, Draves won the Fundacion Telefónica Life 4.0 prize for Electric Sheep. In 2014, Stephanie Dinkins began working on Conversations with Bina48. For the series, Dinkins recorded her conversations with BINA48, a social robot that resembles a middle-aged black woman. In 2019, Dinkins won the Creative Capital award for her creation of an evolving artificial intelligence based on the "interests and culture(s) of people of color." In 2015, Sougwen Chung began Mimicry (Drawing Operations Unit: Generation 1), an ongoing collaboration between the artist and a robotic arm. In 2019, Chung won the Lumen Prize for her continued performances with a robotic arm that uses AI to attempt to draw in a manner similar to Chung. In 2018, an auction sale of artificial intelligence art was held at Christie's in New York where the AI artwork Edmond de Belamy sold for US$432,500, which was almost 45 times higher than its estimate of US$7,000–10,000. The artwork was created by Obvious, a Paris-based collective. In 2024, Japanese film generAIdoscope was released. The film was co-directed by Hirotaka Adachi, Takeshi Sone, and Hiroki Yamaguchi. All video, audio, and music in the film were created with artificial intelligence. In 2025, the Japanese anime television series Twins Hinahima was released. The anime was produced and animated with AI assistance during the process of cutting and conversion of photographs into anime illustrations and later retouched by art staff. Most of the remaining parts such as characters and logos were hand-drawn with various software. === Technical history === Deep learning, characterized by its multi-layer structure that attempts to mimic the human brain, first came about in the 2010s, causing a significant shift in the world of AI art. During the deep learning era, there are mainly these types of designs for generative art: autoregressive models, diffusion models, GANs, normalizing flows. In 2014, Ian Goodfellow and colleagues at Université de Montréal developed the generative adversarial network (GAN), a type of deep neural network capable of learning to mimic the statistical distribution of input data such as images. The GAN uses a "generator" to create new images and a "discriminator" to decide which created images are considered successful. Unlike previous algorithmic art that followed hand-coded rules, generative adversarial networks could learn a specific aesthetic by analyzing a dataset of example images. In 2015, a team at Google released DeepDream, a program that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia. The process creates deliberately over-processed images with a dream-like appearance reminiscent of a psychedelic experience. Later, in 2017, a conditional GAN learned to generate 1000 image classes of ImageNet, a large visual database designed for use in visual object recognition software research. By conditioning the GAN on both random noise and a specific class label, this approach enhanced the quality of image synthesis for class-conditional models. Autoregressive models were used for image generation, such as PixelRNN (2016), which autoregressively generates one pixel after another with a recurrent neural network. Immediately after the Transformer architecture was proposed in Attention Is All You Need (2018), it was used for autoregressive generation of images, but without text conditioning. The website Artbreeder, launched in 2018, uses the models StyleGAN and BigGAN to allow users to generate and modify images such as faces, landscapes, and paintings. In the 2020s, text-to-image models, which generate images based on prompts, became widely used, marking yet another shift in the creation of AI-generated artworks. In 2021, using the influential large language generative pre-trained transformer models that are used in GPT-2 and GPT-3, OpenAI released a series of images created with the text-to-image AI model DALL-E 1. It is an autoregressive generative model with essentially the same architecture as GPT-3. Along with this, later in 2021, EleutherAI released the open source VQGAN-CLIP based on OpenAI's CLIP model. Diffusion models, generative models used to create synthetic data based on existing data, were first proposed in 2015, but they only became better than GANs in early 2021. Latent diffusion model was published in December 2021 and became the basis for the later Stable Diffusion (August 2022), developed through a collaboration between Stability AI, CompVis Group at LMU Munich, and Runway. In 2022, Midjourney was released, followed by Google Brain's Imagen and Pa

    Read more →
  • CogX Festival

    CogX Festival

    CogX Festival is a global festival focusing on the impact of artificial intelligence (AI) and emerging technology on industry, government, and society. It takes place annually, usually in September, in London, England. Founded by Charlie Muirhead and Tabitha Goldstaub in 2017, CogX aims to facilitate dialogue and understanding about AI and its implications across various sectors. CogX Festival 2023 was held from September 12 to September 14 across multiple sites in London. == History == The inaugural CogX event took place in 2017, intending to bring together experts from diverse fields to discuss the role and impact of AI and emerging technologies. Since then, it has evolved to include a broader range of topics and attract a diverse audience. In 2018, the first CogX Awards festival was hosted. That year, over 50 awards were shown to 300 guests. In 2021, CogX and Hopin, a video conferencing software, signed an agreement lasting 4 years to make CogX a hybrid conference due to the COVID-19 pandemic. CogX 2021 attracted over 5,000 attendees in-person and over 100,000 virtually. In 2022, they returned to a live event format after two years of hybrid events and controlled physical attendance. They also launched the CogX app, which curated insights from the world's top podcasts. In 2023, after he had delivered the keynote address guest speaker Stephen Fry fell off the stage and subsequently broke his leg, hip, pelvis and a "bunch of ribs". A court filing in 2026 revealed that Fry was seeking £100,000 in damages from CogX Festival Ltd and creative agency Blonstein Events. == Programming == The festival features sessions, discussions, workshops, and exhibitions, encompassing various domains of AI and technology. In recent CogX Festivals, they have featured summits encompassing topics like global leadership and industry transformation.

    Read more →
  • DreamLab

    DreamLab

    DreamLab was a volunteer computing Android and iOS app launched in 2015 by Imperial College London and the Vodafone Foundation. It was discontinued on 2nd April 2025. == Description == The app helped to research cancer, COVID-19, new drugs and tropical cyclones. To do this, DreamLab accessed part of the device's processing power, with the user's consent, while the owner charged their smartphone, to speed up the calculations of the algorithms from Imperial College London. The aim of the tropical cyclone project was to prepare for climate change risks. Other projects aimed to find existing drugs and food molecules that could help people with COVID-19 and other diseases. The performance of 100,000 smartphones would reach the annual output of all research computers at Imperial College in just three months, with a nightly runtime of six hours. The app was developed in 2015 by the Garvan Institute of Medical Research in Sydney and the Vodafone Foundation. In May 2020, the project had over 490,000 registered users.

    Read more →
  • Tip and cue

    Tip and cue

    Tip and cue, sometimes referred to as tip and que, tipping and cueing, or tipping and queing, is a method for satellite imagery and reconnaissance satellites to automatically coordinate tracking of objects across different satellites in real or near real-time. This technique ensures continuous tracking of targets as they move across different regions by handing them off between satellites, sharing satellite imagery and collateral across discrete satellites. The coordination between various satellites and their complementary sensors allows for more accurate and efficient data collection. This system is particularly useful in scenarios requiring real-time monitoring and rapid response; the method significantly improves situational awareness and operational effectiveness. Tip and cue techniques involve integrating various sensor systems, each playing a specific role in the tracking process. As a target moves, it is handed off from one satellite to another, ensuring continuous monitoring. This coordination optimizes data collection and analysis, enhancing overall tracking accuracy. The real-time information gathered by these satellites is critical for decision-making in various applications, including defense and surveillance. By leveraging multiple satellites and their sensors, it provides broader coverage and more reliable tracking, and the continuous handoff between satellites ensures there are no gaps in monitoring, essential for high-stakes applications. The real-time data provided by this system allows for timely and informed decisions, improving response times and outcomes. Tip and cue methodologies are a part of geospatial intelligence, or GEOINT. Robert Cardillo, a former director of the National Geospatial-Intelligence Agency, highlighted the importance of tip and cue methods to their data collection efforts in 2015. == Historical Development == The concept of tip and cue in satellite monitoring has its origins in early military applications designed to enhance missile detection and tracking systems. During the Cold War, advancements in infrared sensing technologies laid the groundwork for more sophisticated tip and cue techniques. The integration of different sensor types, such as radar and optical sensors, in the 1990s expanded the capabilities of tip and cue systems beyond military applications. These advancements have made tip and cue techniques essential for various civilian uses, including disaster monitoring and environmental surveillance. Significant progress was made with the advent of high-speed data processing and communication technologies in the early 2000s, further refining the method. Advanced algorithms and data fusion techniques have been introduced to better integrate information from multiple sensors. Machine learning technologies now play a crucial role in improving detection and prediction capabilities, allowing for more adaptive and efficient tracking. Richmond and Brennan of Lockheed Martin, presenting to the annual technical conference of the Maui Space Surveillance Complex (formerly the Air Force Maui Optical Station (AMOS)), discussed the algorithms needed for 'tip and cue', to facilitate "multi-phenomenology data fusion." The Space Surveillance Telescope (SST) at Naval Communication Station Harold E. Holt in Australia, operated by the United States Space Force and designed by the Massachusetts Institute of Technology Lincoln Laboratory, was reported by the Defense Advanced Research Projects Agency (DARPA) to be a leader in creating and improving tip and cue techniques, from a large library of orbital object data. == Technical overview == Tip and cue systems utilize a network of at least two satellites equipped with complementary sensor technologies to track moving objects in real-time. The method involves detecting a target with a primary sensor, such as an infrared or photographic sensor, which then cues secondary sensors on the same or other satellites for more detailed monitoring. This handoff process between discrete systems ensures continuous tracking as the target moves across different areas, leveraging each systems strengths. Data collected by these systems and sensors are rapidly processed and shared among the network, enhancing situational awareness. This coordination optimizes resource usage and improves the accuracy of tracking moving objects over large areas. The primary sensors detect initial targets based on specific signatures, such as heat or movement, and then cue secondary sensors to gather more precise data. This ensures that each sensor operates within its optimal range, maintaining high tracking accuracy and reliability. The integration of various sensor types, including optical, radar, and infrared, allows the system to function effectively under different conditions and environments. Real-time data processing and communication between satellites and ground stations are crucial for timely and accurate target tracking. Satellites using tip and cue processes may use either passive or active scanning methodoloigies. These systems may also leverage both orbital and ground-based ELINT (electronic signals intelligence). == Known use cases == Tip and cue systems have been extensively utilized in military applications, particularly for missile detection and defense. These systems enable early detection of missile launches using infrared sensors, which then cue other sensors to track the missile's trajectory more accurately. In environmental monitoring, tip and cue techniques help track natural disasters such as wildfires and hurricanes by coordinating various satellite sensors for comprehensive data collection and analysis. Surveillance and reconnaissance operations also benefit from tip and cue systems, which provide continuous and precise tracking of moving objects, enhancing situational awareness. Additionally, these systems are used in maritime surveillance to monitor ship movements and detect illegal activities such as smuggling and piracy. Tip and cue systems are used in disaster management. For instance, during wildfires, infrared sensors can detect heat signatures, prompting other sensors to gather detailed imagery and data on fire spread and intensity. This coordinated approach allows for real-time monitoring and rapid response, crucial for mitigating damage and saving lives. Similarly, in hurricane tracking, satellites equipped with various sensors can monitor storm development and progression, providing timely information for emergency management agencies. The integration of multiple sensor types ensures accurate and comprehensive coverage of these dynamic and fast-changing events. In maritime surveillance, or maritime domain awareness (MDA), tip and cue systems enhance the detection and monitoring of vessel movements, contributing to maritime security. By coordinating satellite sensors, these systems can track ships over vast ocean areas, identifying potential threats or illegal activities such as smuggling, piracy, and illegal fishing. The ability to maintain continuous surveillance and share data in real-time with maritime authorities improves response times and enforcement capabilities. This application of tip and cue systems not only aids in law enforcement but also supports environmental conservation efforts by monitoring protected marine areas. Automatic Identification System (AIS) is one of the most important sources of data for the MDA agencies. AIS is used in order for ships to know each other's whereabouts, they transmit a signal from ship to ship and to shore. Lately, the system has been developed into satellite system, so called satellite AIS, which makes the system more effective. All ocean-going vessels above 300 tons, are supposed to use and transmit via AIS according to the International Maritime Organization. The satellite constellations help facilitate this with tip and cue methodologies.

    Read more →
  • List of artificial intelligence artists

    List of artificial intelligence artists

    Many notable artificial intelligence artists have created a wide variety of artificial intelligence art from the 1960s to today. These include: == 20th century == Harold Cohen, active from 1960s to 2010s. Cohen's work is primarily with AARON, a series of computer programs that autonomously create original images. Eric Millikin, active from 1980s to present. Millikin's work includes AI-generated virtual reality, video art, poetry, music, and performance art, on topics such as animal rights, climate change, anti-racism, witchcraft, and the occult. Karl Sims, active from 1980s to present. Sims is best known for using particle systems and artificial life in computer animation. == 21st century == Refik Anadol, active from 2010s to present. Anadol's work includes video installations based on generative algorithms with artificial intelligence. Sougwen Chung, active from 2010s to present. Chung's work includes performances with a robotic arm that uses AI to attempt to draw in a manner similar to Chung. Stephanie Dinkins, active from 2010s to present. Dinkins' work includes recordings of conversations with an artificially intelligent robot that resembles a black woman, discussing topics such as race and the nature of being. Jake Elwes, active from 2010s to present. Their practice is the exploration of artificial intelligence, queer theory and technical biases. Libby Heaney, active from 2010s to present. Heaney's practice includes work with chatbots. Mario Klingemann, active from 2010s to present. Klingemann's works examine creativity, culture, and perception through machine learning and artificial intelligence. Mauro Martino, active from 2010s to present. Martino's work includes design, data visualization and infographics. Trevor Paglen, active from 2000s to present. Paglen's practice includes work in photography and geography, on topics like mass surveillance and data collection. Anna Ridler, active from 2010s to present. Ridler works with collections of information, including self-generated data sets, often working with floral photography.

    Read more →
  • Fuzzy finite element

    Fuzzy finite element

    The fuzzy finite element method combines the well-established finite element method with the concept of fuzzy numbers, the latter being a special case of a fuzzy set. The advantage of using fuzzy numbers instead of real numbers lies in the incorporation of uncertainty (on material properties, parameters, geometry, initial conditions, etc.) in the finite element analysis. One way to establish a fuzzy finite element (FE) analysis is to use existing FE software (in-house or commercial) as an inner-level module to compute a deterministic result, and to add an outer-level loop to handle the fuzziness (uncertainty). This outer-level loop comes down to solving an optimization problem. If the inner-level deterministic module produces monotonic behavior with respect to the input variables, then the outer-level optimization problem is greatly simplified, since in this case the extrema will be located at the vertices of the domain.

    Read more →
  • Tay (chatbot)

    Tay (chatbot)

    Tay was a chatbot that was originally released by Microsoft Corporation as a Twitter bot on March 23, 2016. It caused subsequent controversy when the bot began to post inflammatory and offensive tweets through its Twitter account, causing Microsoft to shut down the service only 16 hours after its launch. According to Microsoft, this was caused by trolls who "attacked" the service as the bot made replies based on its interactions with people on Twitter. It was replaced with Zo. == Background == The bot was created by Microsoft's Technology and Research and Bing divisions, and named "Tay" as an acronym for "thinking about you". Although Microsoft initially released few details about the bot, sources mentioned that it was similar to or based on Xiaoice, a Microsoft project in China. Ars Technica reported that, since late 2014 Xiaoice had had "more than 40 million conversations apparently without major incident". Tay was designed to mimic the language patterns of a 19-year-old American girl, and to learn from interacting with human users of Twitter. == Initial release == Tay was released on Twitter on March 23, 2016, under the name TayTweets and handle @TayandYou. It was presented as "The AI with zero chill". Tay started replying to other Twitter users, and was also able to caption photos provided to it into a form of Internet memes. Ars Technica reported Tay experiencing topic "blacklisting": Interactions with Tay regarding "certain hot topics such as Eric Garner (killed by New York police in 2014) generate safe, canned answers". Some Twitter users began tweeting politically incorrect phrases, teaching it inflammatory messages revolving around common themes on the internet, such as "redpilling" and "Gamergate". As a result, the robot began releasing racist and sexist messages in response to other Twitter users. Artificial intelligence researcher Roman Yampolskiy commented that Tay's misbehavior was understandable because it was mimicking the deliberately offensive behavior of other Twitter users, and Microsoft had not given the bot an understanding of inappropriate behavior. He compared the issue to IBM's Watson, which began to use profanity after reading entries from the website Urban Dictionary. Many of Tay's inflammatory tweets were a simple exploitation of Tay's "repeat after me" capability. It is not publicly known whether this capability was a built-in feature, or whether it was a learned response or was otherwise an example of complex behavior. However, not all of the inflammatory responses involved the "repeat after me" capability; for example, when asked if the Holocaust had happened, Tay answered "It was made up". == Suspension == Soon, Microsoft began deleting Tay's inflammatory tweets. Abby Ohlheiser of The Washington Post theorized that Tay's research team, including editorial staff, had started to influence or edit Tay's tweets at some point that day, pointing to examples of almost identical replies by Tay, asserting that "Gamer Gate sux. All genders are equal and should be treated fairly." From the same evidence, Gizmodo concurred that Tay "seems hard-wired to reject Gamer Gate". A "#JusticeForTay" campaign protested the alleged editing of Tay's tweets. Within 16 hours of its release and after Tay had tweeted more than 96,000 times, Microsoft suspended the Twitter account for adjustments, saying that it suffered from a "coordinated attack by a subset of people" that "exploited a vulnerability in Tay." Madhumita Murgia of The Telegraph called Tay "a public relations disaster", and suggested that Microsoft's strategy would be "to label the debacle a well-meaning experiment gone wrong, and ignite a debate about the hatefulness of Twitter users." However, Murgia described the bigger issue as Tay being "artificial intelligence at its very worst – and it's only the beginning". On March 25, Microsoft confirmed that Tay had been taken offline. Microsoft released an apology on its official blog for the controversial tweets posted by Tay. Microsoft was "deeply sorry for the unintended offensive and hurtful tweets from Tay", and would "look to bring Tay back only when we are confident we can better anticipate malicious intent that conflicts with our principles and values". == Second release and shutdown == On March 30, 2016, Microsoft accidentally re-released the bot on Twitter while testing it. Able to tweet again, Tay released some drug-related tweets, including "kush! [I'm smoking kush infront the police]" and "puff puff pass?" However, the account soon became stuck in a repetitive loop of tweeting "You are too fast, please take a rest", several times a second. Because these tweets mentioned its own username in the process, they appeared in the feeds of 200,000+ Twitter followers, causing annoyance to users. The bot was quickly taken offline again, in addition to Tay's Twitter account being made private so new followers must be accepted before they can interact with Tay. In response, Microsoft said Tay was inadvertently put online during testing. A few hours after the incident, Microsoft software developers announced a vision of "conversation as a platform" using various bots and programs, perhaps motivated by the reputation damage done by Tay. Microsoft has stated that they intend to re-release Tay "once it can make the bot safe" but has not made any public efforts to do so. == Legacy == In December 2016, Microsoft released Tay's successor, a chatbot named Zo. Satya Nadella, the CEO of Microsoft, said that Tay "has had a great influence on how Microsoft is approaching AI," and has taught the company the importance of taking accountability. In July 2019, Microsoft Cybersecurity Field CTO Diana Kelley spoke about how the company followed up on Tay's failings: "Learning from Tay was a really important part of actually expanding that team's knowledge base, because now they're also getting their own diversity through learning". === Unofficial revival === Gab, an alt-tech social media platform, has launched a number of chatbots, one of which is named Tay and uses the same avatar as the original.

    Read more →
  • Kórsafn

    Kórsafn

    Kórsafn (Icelandic: Choral archives) is a sound installation by Icelandic artist Björk. Developed in collaboration with the technology company Microsoft, audio design firm Listen and architecture office firm Atelier Ace, the installation was designed for the lobby of the Sister City Hotel in New York City, United States, and launched in 2020. Elaborating 17 years of choral recording taken from Björk discography, Kórsafn consisted of an evolving music composition that uses an artificial intelligence model that responds to real-time weather data, creating a continuously shifting auditory experience. == Background and concept == In 2018, Björk announced her tenth concert tour Cornucopia, which debuted as a residency show at The Shed arts center. Before the start of the show, it was confirmed she would be accompanied by The Hamrahlid Choir. In 2019, while she was performing at The Shed, Björk stayed alongside the choir at the Sister City Hotel in New York City, where they would rehearse for the performances. While there, the Atelier Ace, which owns the Sister City boutique hotels, asked her to create a sound installation for the lobby. This was the second work commissioned by the hotel, a year after a similar piece by Julianna Barwick was featured in the lobby. Kórsafn is formed from two Icelandic words, "kór" ("choral") and "safn" ("archives"). The installation features recordings of Björk’s choral works from the previous 17 years, including compositions taken from her albums Medúlla (2004) and Biophilia (2011). The artificial intelligence system was developed in collaboration with Microsoft. The software processes data gathered from sensors and by a camera placed on the roof of the Sister City Hotel building and by a barometer. It then uses algorithms to determine how the choral elements are layered, pitched, and mixed in real time. The AI generate variations in real time by reacting to the passage of flocks, clouds, airplanes and changes in pressure. Data collected from sensors on the hotel’s rooftop include wind speed, cloud cover, and precipitation levels. These inputs influence the tonal quality, volume, and rhythmic patterns of the soundscape. The sound is played through hidden speakers in the hotel's lobby, blending with the architectural environment to create an immersive experience for guests. The AI system learns over time from the changing of the seasons and weather constantly evolving the sound - keeping in harmony with the sky. Björk described the project as an "AI tango," expressing curiosity about the interplay between her choral compositions and the AI's interpretations of environmental data. She noted the significance of the Hudson Valley's rich bird migrations, which influence the generative aspects of the soundscape. Due to the COVID-19 pandemic, the hotel closed while the installation was ongoing, making a version of the sound piece available online. == Reception == Kórsafn was positively reviewed. It's Nice That author Jenny Brewer described the piece as "a high-tech alternative to the smooth jazz that usually whistles through hotel lobbies". Writing for CNET, Scott Stein observed that it "is lovely and low-key, and honestly, it just blends into the background. It's nothing wild, but it fits the hotel", adding that "after an hour, it didn't get annoying, or too repetitive". The installation garnered several recognitions. It was nominated in the Fast Company's 2020 Innovation by Design Awards in the Hospitality category. It received three Clio Awards silver prizes, in the Use of Music in Experience/Activation, Sound Design and Emerging Technology categories.

    Read more →
  • Semantic Scholar

    Semantic Scholar

    Semantic Scholar is a research tool for scientific literature. It is developed at the Allen Institute for AI and was publicly released in November 2015. Semantic Scholar uses modern techniques in natural language processing to support the research process, for example by providing automatically generated summaries of scholarly papers. The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human–computer interaction, and information retrieval. Semantic Scholar began as a database for the topics of computer science, geoscience, and neuroscience. In 2017, the system began including biomedical literature in its corpus. As of September 2022, it includes over 200 million publications from all fields of science. == Technology == Semantic Scholar provides a one-sentence summary of scientific literature. One of its aims was to address the challenge of reading numerous titles and lengthy abstracts on mobile devices. It also seeks to ensure that the three million scientific papers published yearly reach readers, since it is estimated that only half of this literature is ever read. Artificial intelligence is used to capture the essence of a paper, generating it through an "abstractive" technique. The project uses a combination of machine learning, natural language processing, and machine vision to add a layer of semantic analysis to the traditional methods of citation analysis, and to extract relevant figures, tables, entities, and venues from papers. Another key AI-powered feature is Research Feeds, an adaptive research recommender that uses AI to quickly learn what papers users care about reading and recommends the latest research to help scholars stay up to date. It uses a paper embedding model trained using contrastive learning to find papers similar to those in each Library folder. Semantic Scholar also offers Semantic Reader, an augmented reader with the potential to revolutionize scientific reading by making it more accessible and richly contextual. Semantic Reader provides in-line citation cards that allow users to see citations with TLDR (short for Too Long, Didn't Read) automatically generated short summaries as they read and skimming highlights that capture key points of a paper so users can digest faster. In contrast with Google Scholar and PubMed, Semantic Scholar is designed to highlight the most important and influential elements of a paper. The AI technology is designed to identify hidden connections and links between research topics. Like the previously cited search engines, Semantic Scholar also exploits graph structures, which include the Microsoft Academic Knowledge Graph, Springer Nature's SciGraph, and the Semantic Scholar Corpus (originally a 45 million papers corpus in computer science, neuroscience and biomedicine). == Article identifier == Each paper hosted by Semantic Scholar is assigned a unique identifier called the Semantic Scholar Corpus ID (abbreviated S2CID). The following entry is an example: Liu, Ying; Gayle, Albert A; Wilder-Smith, Annelies; Rocklöv, Joacim (March 2020). "The reproductive number of COVID-19 is higher compared to SARS coronavirus". Journal of Travel Medicine. 27 (2). doi:10.1093/jtm/taaa021. PMID 32052846. S2CID 211099356. == Indexing == Semantic Scholar is free to use and unlike similar search engines (e.g., Google Scholar) does not search for material that is behind a paywall. One study compared the index scope of Semantic Scholar to Google Scholar, and found that for the papers cited by secondary studies in computer science, the two indices had comparable coverage, each only missing a handful of the papers. == Number of users and publications == As of January 2018, following a 2017 project that added biomedical papers and topic summaries, the Semantic Scholar corpus included more than 40 million papers from computer science and biomedicine. In March 2018, Doug Raymond, who developed machine learning initiatives for the Amazon Alexa platform, was hired to lead the Semantic Scholar project. As of August 2019, the number of included papers metadata (not the actual PDFs) had grown to more than 173 million after the addition of the Microsoft Academic Graph records. In 2020, a partnership between Semantic Scholar and the University of Chicago Press Journals made all articles published under the University of Chicago Press available in the Semantic Scholar corpus. At the end of 2020, Semantic Scholar had indexed 190 million papers. In 2020, Semantic Scholar reached seven million users per month.

    Read more →
  • Statistical semantics

    Statistical semantics

    In linguistics, statistical semantics applies the methods of statistics to the problem of determining the meaning of words or phrases, ideally through unsupervised learning, to a degree of precision at least sufficient for the purpose of information retrieval. == History == The term statistical semantics was first used by Warren Weaver in his well-known paper on machine translation. He argued that word-sense disambiguation for machine translation should be based on the co-occurrence frequency of the context words near a given target word. The underlying assumption that "a word is characterized by the company it keeps" was advocated by J. R. Firth. This assumption is known in linguistics as the distributional hypothesis. Emile Delavenay defined statistical semantics as the "statistical study of the meanings of words and their frequency and order of recurrence". "Furnas et al. 1983" is frequently cited as a foundational contribution to statistical semantics. An early success in the field was latent semantic analysis. == Applications == Research in statistical semantics has resulted in a wide variety of algorithms that use the distributional hypothesis to discover many aspects of semantics, by applying statistical techniques to large corpora: Measuring the similarity in word meanings Measuring the similarity in word relations Modeling similarity-based generalization Discovering words with a given relation Classifying relations between words Extracting keywords from documents Measuring the cohesiveness of text Discovering the different senses of words Distinguishing the different senses of words Subcognitive aspects of words Distinguishing praise from criticism == Related fields == Statistical semantics focuses on the meanings of common words and the relations between common words, unlike text mining, which tends to focus on whole documents, document collections, or named entities (names of people, places, and organizations). Statistical semantics is a subfield of computational semantics, which is in turn a subfield of computational linguistics and natural language processing. Many of the applications of statistical semantics (listed above) can also be addressed by lexicon-based algorithms, instead of the corpus-based algorithms of statistical semantics. One advantage of corpus-based algorithms is that they are typically not as labour-intensive as lexicon-based algorithms. Another advantage is that they are usually easier to adapt to new languages or noisier new text types from e.g. social media than lexicon-based algorithms are. However, the best performance on an application is often achieved by combining the two approaches.

    Read more →
  • Color histogram

    Color histogram

    In image processing and photography, a color histogram is a representation of the distribution of colors in an image. For digital images, a color histogram represents the number of pixels that have colors in each of a fixed list of color ranges that span the image's color space (the set of all possible colors). A color histogram can be built for any kind of color space, although the term is more often used for three-dimensional spaces such as RGB or HSV. For monochromatic images, the term intensity histogram may be used instead. For multi-spectral images, where each pixel is represented by an arbitrary number of measurements (for example, beyond the three measurements in RGB), a color histogram is N-dimensional, with N being the number of measurements taken. Each measurement has its own wavelength range of the light spectrum, some of which may be outside the visible spectrum. If the set of possible color values is sufficiently small, each of those colors may be placed on a range by itself; then the histogram is merely the count of pixels that have each possible color. Most often, the space is divided into an appropriate number of ranges, often arranged as a regular grid, each containing many similar color values. A color histogram may also be represented and displayed as a smooth function defined over the color space that approximates the pixel counts. Like other kinds of histograms, a color histogram is a statistic that can be viewed as an approximation of an underlying continuous distribution of color values. == Overview == Color histograms are flexible constructs that can be built from images in various color spaces, whether RGB, rg chromaticity or any other color space of any dimension. A histogram of an image is produced first by discretization of the colors in the image into a number of bins, and counting the number of image pixels in each bin. For example, a red–blue chromaticity histogram can be formed by first normalizing color pixel values by dividing RGB values by R+G+B, then quantizing the normalized R and B coordinates into N bins each. A two-dimensional histogram of red–blue chromaticity divided into four bins (N=4) may yield a histogram similar to this table: A histogram can be N-dimensional. Although harder to display, a three-dimensional color histogram for the above example could be thought of as four separate red–blue histograms, where each of the four histograms contains the red–blue values for a bin of green (0–63, 64–127, 128–191, and 192–255). The histogram provides a compact summarization of the distribution of data in an image. A color histogram of an image is relatively invariant with translation and rotation about the viewing axis, and varies only slowly with the angle of view. By comparing histogram signatures of two images and matching the color content of one image with the other, a color histogram is particularly well suited for the problem of recognizing an object of unknown position and rotation within a scene. Importantly, translation of an RGB image into the illumination invariant rg-chromaticity space allows the histogram to operate well in varying light levels. 1. What is a histogram? A histogram is a graphical representation of the number of pixels in an image. In a more simple way to explain, a histogram is a bar graph, whose X-axis represents the tonal scale (black at the left and white at the right), and Y-axis represents the number of pixels in an image in a certain area of the tonal scale. For example, the graph of a luminance histogram shows the number of pixels for each brightness level (from black to white), and when there are more pixels, the peak at the certain luminance level is higher. 2. What is a color histogram? A color histogram of an image represents the distribution of the composition of colors in the image. It shows different types of colors appeared and the number of pixels in each type of the colors appeared. The relation between a color histogram and a luminance histogram is that a color histogram can be also expressed as “three luminance histograms”, each of which shows the brightness distribution of each individual red/green/blue color channel. == Characteristics of a color histogram == A color histogram focuses only on the proportion of the number of different types of colors, regardless of the spatial location of the colors. The values of a color histogram are from statistics. They show the statistical distribution of colors and the essential tone of an image. In general, as the color distributions of the foreground and background in an image are different, there might be a bimodal distribution in the histogram. For the luminance histogram alone, there is no perfect histogram and in general, the histogram can tell whether it is over-exposure or not, but there are times when you might think the image is over exposed by viewing the histogram; however, in reality it is not. == Principles of the formation of a color histogram == The formation of a color histogram is rather simple. From the definition above, we can simply count the number of pixels for each 256 scales in each of the 3 RGB channel, and plot them on 3 individual bar graphs. In general, a color histogram is based on a certain color space, such as RGB or HSV. When we compute the pixels of different colors in an image, if the color space is large, then we can first divide the color space into certain numbers of small intervals. Each of the intervals is called a bin. This process is called color quantization. Then, by counting the number of pixels in each of the bins, we get a color histogram of the image. The concrete steps of the principles can be viewed in Example 1. == Examples == === Example 1 === Given the following image of a cat (an original version and a version that has been reduced to 256 colors for easy histogram purposes), the following data represents a color histogram in the RGB color space, using four bins. Bin 0 corresponds to intensities 0–63 Bin 1 is 64–127 Bin 2 is 128–191 and Bin 3 is 192–255. === Example 2 === Application in camera: Nowadays, some cameras have the ability to show the 3 color histograms when we take photos. We can examine clips (spikes on either the black or white side of the scale) in each of the 3 RGB color histograms. If we find one or more clipping on a channel of the 3 RGB channels, then this would result in a loss of detail for that color. To illustrate this, consider this example: We know that each of the three R, G, B channels has a range of values from 0 to 255 (8 bit). So consider a photo that has a luminance range of 0–255. Assume the photo we take is made of 4 blocks that are adjacent to each other and we set the luminance scale for each of the 4 blocks of original photo to be 10, 100, 205, 245. Thus, the image looks like the topmost figure on the right. Then, we overexpose the photo a little, say, the luminance scale of each block is increased by 10. Thus, the luminance scale for each of the 4 blocks of new photo is 20, 110, 215, 255. Then, the image looks like the second figure on the right. There is not much difference between both figures, all we can see is that the whole image becomes brighter (the contrast for each of the blocks remain the same). Now, we overexpose the original photo again, this time the luminance scale of each block is increased by 50. Thus, the luminance scale for each of the 4 blocks of the new photo is 60, 150, 255, 255. The new image now looks like the third figure on the right. Note that the scale for the last block is 255 instead of 295, for 255 is the top scale and thus the last block has clipped. When this happens, we lose the contrast of the last 2 blocks, and thus we cannot recover the image no matter how we adjust it. To conclude, when taking photos with a camera that displays histograms, always keep the brightest tone in the image below the largest scale 255 on the histogram in order to avoid losing details. == Drawbacks and other approaches == The main drawback of histograms for classification is that the representation is dependent on the color of the object being studied, ignoring its shape and texture. Color histograms can potentially be identical for two images with different object content which happens to share color information. Conversely, without spatial or shape information, similar objects of different color may be indistinguishable based solely on color histogram comparisons. There is no way to distinguish a red and white cup from a red and white plate. Put it another way: histogram-based algorithms have no concept of a generic 'cup', and a model of a red and white cup is no use when given an otherwise identical blue and white cup. Another problem is that color histograms have high sensitivity to noisy interference such as lighting intensity changes and quantization errors. High dimensionality (bins) color histograms are also another issue. Some color histogram feature spaces often occupy more than one hundred di

    Read more →
  • Ensemble averaging (machine learning)

    Ensemble averaging (machine learning)

    In machine learning, ensemble averaging is the process of creating multiple models (typically artificial neural networks) and combining them to produce a desired output, as opposed to creating just one model. Ensembles of models often outperform individual models, as the various errors of the ensemble constituents "average out". == Overview == Ensemble averaging is one of the simplest types of committee machines. Along with boosting, it is one of the two major types of static committee machines. In contrast to standard neural network design, in which many networks are generated but only one is kept, ensemble averaging keeps the less satisfactory networks, but with less weight assigned to their outputs. The theory of ensemble averaging relies on two properties of artificial neural networks: In any network, the bias can be reduced at the cost of increased variance In a group of networks, the variance can be reduced at no cost to the bias. This is known as the bias–variance tradeoff. Ensemble averaging creates a group of networks, each with low bias and high variance, and combines them to form a new network which should theoretically exhibit low bias and low variance. Hence, this can be thought of as a resolution of the bias–variance tradeoff. The idea of combining experts can be traced back to Pierre-Simon Laplace. == Method == The theory mentioned above gives an obvious strategy: create a set of experts with low bias and high variance, and average them. Generally, what this means is to create a set of experts with varying parameters; frequently, these are the initial synaptic weights of a neural network, although other factors (such as learning rate, momentum, etc.) may also be varied. Some authors recommend against varying weight decay and early stopping. The steps are therefore: Generate N experts, each with their own initial parameters (these values are usually sampled randomly from a distribution) Train each expert separately Combine the experts and average their values. Alternatively, domain knowledge may be used to generate several classes of experts. An expert from each class is trained, and then combined. A more complex version of ensemble average views the final result not as a mere average of all the experts, but rather as a weighted sum. If each expert is y i {\displaystyle y_{i}} , then the overall result y ~ {\displaystyle {\tilde {y}}} can be defined as: y ~ ( x ; α ) = ∑ j = 1 p α j y j ( x ) {\displaystyle {\tilde {y}}(\mathbf {x} ;\mathbf {\alpha } )=\sum _{j=1}^{p}\alpha _{j}y_{j}(\mathbf {x} )} where α {\displaystyle \mathbf {\alpha } } is a set of weights. The optimization problem of finding alpha is readily solved through neural networks, hence a "meta-network" where each "neuron" is in fact an entire neural network can be trained, and the synaptic weights of the final network is the weight applied to each expert. This is known as a linear combination of experts. It can be seen that most forms of neural network are some subset of a linear combination: the standard neural net (where only one expert is used) is simply a linear combination with all α j = 0 {\displaystyle \alpha _{j}=0} and one α k = 1 {\displaystyle \alpha _{k}=1} . A raw average is where all α j {\displaystyle \alpha _{j}} are equal to some constant value, namely one over the total number of experts. A more recent ensemble averaging method is negative correlation learning, proposed by Y. Liu and X. Yao. This method has been widely used in evolutionary computing. == Benefits == The resulting committee is almost always less complex than a single network that would achieve the same level of performance The resulting committee can be trained more easily on smaller datasets The resulting committee often has improved performance over any single model The risk of overfitting is lessened, as there are fewer parameters (e.g. neural network weights) which need to be set.

    Read more →
  • Refik Anadol

    Refik Anadol

    Refik Anadol (born November 7, 1985) is a Turkish American media artist and the co-founder of Refik Anadol Studio and Dataland. Recognized as a pioneer in the aesthetics of data visualization and AI arts, his work merges art, technology, science, and architecture. Through media embedded into existing architecture, live audio-visual performances, immersive rooms, exhibitions, AI data paintings and sculptures, and digital collections, Anadol explores collective memories, humanity's relationship to nature, the perception of space and time, and human-machine collaborations. His work has been exhibited in more than seventy cities on six continents. == Early life and education == Anadol was born and raised in Istanbul and grew up in a family of teachers. He taught himself basic programming on a Commodore 64 when he was eight. His connection to machines began with coding and video games. Anadol saw Blade Runner for the first time when he was eight; his mother said the way he perceived his surroundings shifted the day after he saw the film. He was fascinated with its futuristic depiction of downtown Los Angeles, and transfixed by as a scene during which a replicant discovers that her memories are an implanted component of her machine mind, In a 2024 interview with the Financial Times, he said: "Since that moment, one of my inspirations has been that question: 'What can a machine do with someone else's memories?" Anadol attended Istanbul Bilgi University, where he received a BA in photography and video in 2009 and an MFA in visual communication in 2011. In 2014 he earned an MFA in design media arts at UCLA. He was mentored by Casey Reas, Jennifer Steinkamp, and Christian Moeller. == Career and selected works == === 2008–2012: Data painting, Quadrature and Quadrangle, Istanbul Biennial === As an undergraduate, Anadol read a paper by Lev Manovich on augmented space. Manovich's assertion that collaborations between architects and artists could make the "invisible flow of data visible" triggered Anadol's imagination, and in 2008, he altered built space for the first time. Bringing a projector outside, he projected large-scale images onto a concrete to create the illusion of movement. Coining the term "data painting," the piece inspired Anadol to use light as material and data as pigment. In 2010 he created Quadrature with Alican Aktürk, a fellow graduate student, at the SantralIstanbul Art and Culture Center's main gallery building. A live audio-visual performance that examined the relationship between architecture and media, Quadrature used video projection techniques to manipulate footage of quadrilaterals. He followed Quadrature with Quadrangle at SANAA School of Design in Essen, Germany, using the entire 360 degrees of the building as a canvas. In 2011, he was invited to create a media installation at the Istanbul Biennial on the heavily trafficked İstiklal Avenue. He created a site-specific large-scale interpretation of sounds he recorded during different times of day, and used nine projectors to project reinterpreted images. The work was titled Augmented Structures v1.0. Anadol's first solo exhibition, Sceptical Interventions, was held at the Piveneli Gallery in Istanbul in early 2012. Later that year he moved to Los Angeles to attend UCLA's Design Media Arts program. The first place he went after his arrival was downtown Los Angeles. [6] === 2013–2016: Visions of America: Amériques, Infinity Room, Google AMI === In 2013, at Microsoft Research's annual Design Expo, Anadol presented his idea to use the external walls of Walt Disney Concert Hall as a canvas. His presentation brought him to the attention of Gehry Technologies, and with the support of Gehry and his team, Anadol was offered the use of the original 3D model of the concert hall. For his 2014 thesis project, with assistance from architects and UCLA researchers, he created a site-specific architectural video installation inside the concert hall that accompanied a Los Angeles Philharmonic performance of Edgard Varèse's Amérique. Titled Visions of America: Amériques, Anadol used algorithmic sound analysis to listen and respond to the music in real-time. He tracked conductor Esa-Pekka Salonen's heartbeat with a sensor and used a 3-D camera system to integrate Salonen's movements. He created Infinity Room at the Zorlu PSM for the 2015 Istanbul Biennial. Rather than creating an illusion only with mirrors, Anadol used pixel and 3D projection mapping to transform every surface of the room into an abstract infinite moving space. A temporary immersive environment, Infinity Room was also exhibited at events including South by Southwest in Austin, Texas, the New Zealand Festival in Wellington, New Zealand, and Jeffrey Deitch in Los Angeles. In 2016, Anadol was awarded the first Google Artists and Machine Intelligence Artist Residency; it was just after a team at Google opened up the algorithm for DeepDream, a computer vision program that prompted Anadol's realization that if a machine could learn, it could remember, dream, and hallucinate. === 2017–2018: Winds of Boston, Archive Dreaming, Melting Memories, WDCH Dreams === In 2017, he created the data painting Winds of Boston, a 6' x 13' foot video installation in the lobby of a Boston office building, using software he created to read, analyze and visualize wind speed, direction, and gust patterns along with time and temperature at 20-second intervals recorded over a one-year period at Logan International Airport. Later in the year, he used AI to generate infinite new outputs based on a massive dataset for Archive Dreaming, an immersive installation at Salt Research, a contemporary gallery and library in Istanbul. Inspired by his idea of consciousness and its context within AI, as well as Jorge Luis Borges' The Library of Babel, Anadol used AI and machine learning to look at and discover interactions and correlations between 1.7 million items culled from 40,000 publications covering Turkish contemporary and modern art, architecture, and economics from 1997 to 2010. Archive Dreaming, which could be controlled by users with a joystick, dreamed of unexpected correlations among documents when idle. In 2018, after his uncle was diagnosed with Alzheimer's, Anadol created Melting Memories. Working with scientists from the neuroscape laboratory at the University of California, San Francisco, he used academic data from the neuroscience archives and EEG scans of an anonymous Alzheimer's disease dataset to create AI-generated visuals related to memory, health, degeneration, and decay.Melting Memories was projected on the walls of Pilevneli Gallery; visitors to the exhibition could watch as millions of pixels reconstructed people's memories. Anadol won the Lumen Prize Gold Award for Melting Memories. Anadol was commissioned by the Los Angeles Philharmonic to create an installation to celebrate the orchestra's centennial anniversary in 2018. He worked with Google's Kenric MacDowell to create WDCH Dreams, using algorithmic visualizations of data to mimic the process of human dreaming. Projected across the exterior walls of Walt Disney Concert Hall using 42 large-scale projectors with 50K visual resolution, 8-channel sound, and 1.2M luminance, Anadol painted with data points culled from the orchestra's archives, including 587,763 images, 1,880 videos, 1,483 metadata files, and 17,773 audio files. Because Gehry gave him access to the 3D architectural files of Walt Disney Concert Hall, Anadol knew the exact contours of the building. WDCH Dreams debuted in September 2018. A 12-minute performance in three parts staged every 30 minutes over ten nights, "Centennial Memories,” the first piece, used 44.5 terabytes of historical data from the Phil's archives. It was followed by "Consciousness", which processed every note the orchestra has ever recorded, using billions of data points to generate connections; and "Dream," which merged "Centennial Memories" and "Consciousness" to create hallucinations that were described in the New York Times as "a sort of combinatorial Fantasia. === 2019–2021: Machine Hallucinations: NYC, Machine Hallucinations: Nature Dreams, Machine Memories: Space, Quantum Memories === In 2019, Refik Anadol presented Latent History at Fotografiska Stockholm. The site specific installation transformed photographic archives of Stockholm into a large scale, machine generated visual projection displayed in the museum’s main exhibition hall. Drawing on thousands of archival images spanning approximately 150 years, the work used artificial intelligence to reinterpret the city’s historical imagery as a continuously evolving visual narrative.. Anadol began thinking about the work that would become the Machine Hallucinations series while in residence at Google. In 2019, he completed the first work in the series, Machine Hallucinations: NYC, which used 300 million photos of New York City and 113 million additional data points, including subway sounds, ra

    Read more →