AI Detector No Login

AI Detector No Login — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Write or Die

    Write or Die is a web application designed to combat writer's block by punishing users who slow down or stop typing in its window. The severity of the punishment depends on the mode the user chooses, ranging from "Gentle" to "Kamikaze". It was reviewed by PCWorld, the Los Angeles Times and The Guardian, and its notable users include the writers Helen Oyeyemi and David Nicholls. Its creator, Jeff Printy, explained that he wrote the application because he wants "to be published and make a living as a writer."

    Read more →
  • Oblivion (2013 film)

    Oblivion is a 2013 American epic post-apocalyptic science fiction action film produced and directed by Joseph Kosinski from a screenplay by Karl Gajdusek and Michael deBruyn, starring Tom Cruise in the main role alongside Morgan Freeman, Olga Kurylenko, Andrea Riseborough, Nikolaj Coster-Waldau, and Melissa Leo in supporting roles. Based on Kosinski's unpublished Radical Comics graphic novel of the same name, the film pays homage to 1970s sci-fi, and is a "love story" set in 2077 on an Earth desolated by an alien war; a maintenance technician on the verge of completing his mission finds a woman who survived a spaceship crash, leading him to question his purpose and discover the truth about the war. Oblivion premiered in Buenos Aires on March 26, 2013, and was released in theaters by Universal Pictures on April 19. The film grossed $286 million worldwide on a production budget of $120 million and received mixed reviews from critics.

    == Plot ==

    In 2017, aliens known as Scavengers attack Earth and destroy the Moon, triggering global natural disasters. Although humanity wins the war using nuclear weapons, Earth is left uninhabitable. Sixty years later, the remnants of humanity have relocated to a colony on Saturn's moon Titan, except for Unit 49 (technician Jack and his communications officer Victoria), who are scheduled to join them in two weeks. The pair oversee hydro rigs that convert seawater into fusion energy for the Tet, the last remaining human colony ship in orbit. Though Jack and Victoria are romantically involved and have had their memories erased for security reasons, Jack experiences recurring dreams of an unknown woman. He also secretly visits a hidden, verdant valley where he has built a lakeside cabin and collects relics of Earth's past.

    While investigating a missing drone (an autonomous, highly advanced, and heavily armed machine), Jack is nearly captured by Scavengers. Later, he discovers the Scavengers are transmitting a signal into space.
    A NASA pod crash-lands at the signal's coordinates, carrying five humans in suspended animation, including the woman from Jack's dreams. A drone arrives and destroys four of the pods, but Jack rescues the remaining one and brings the unconscious woman to Unit 49's base. After reviving her, Jack and Victoria learn that the woman, Julia, has been in stasis aboard the Odyssey spaceship since 2017. Julia insists on recovering the ship's flight recorder. However, she and Jack are captured by Scavengers and brought to the Raven Rock Mountain Complex. Their leader, Malcolm, reveals that the Scavengers are actually surviving humans. Malcolm needs Jack to reprogram a captured drone to deliver a nuclear bomb, built from Odyssey's reactor, to the Tet. Jack refuses, so Malcolm releases him and Julia, urging him to seek the truth in the radiation zone, which is supposedly deadly and off-limits.

    Julia helps Jack recall that she is his wife, and fragments of his memories begin to return. When they arrive back at Unit 49, a devastated Victoria informs Sally, the Tet's mission controller, that she and Jack are no longer an "effective team." A drone activates and kills Victoria. Jack and Julia destroy the drone, but crash their aircraft inside the radiation zone. There, they encounter another version of Jack, "Jack-52", who arrives to repair the drone. Jack subdues him, but Julia is seriously injured in the fight. Jack impersonates his clone to infiltrate Unit 52, meets Victoria-52, and steals medical supplies for Julia. They rest at his cabin.

    At Raven Rock, Malcolm reveals the truth: humanity lost the war, and the Tet is an alien machine intelligence harvesting Earth's resources. After the Moon's destruction, the Tet deployed thousands of clones of astronaut Jack Harper, brainwashed into obedience, to exterminate the remaining humans. Malcolm had assumed these clones were inhuman until witnessing Jack show interest in a discarded book, hinting at lingering humanity.
    Jack reprograms the captured drone, but it is destroyed in a surprise attack by other drones, leaving Malcolm badly wounded. Jack and Julia resolve to deliver the bomb themselves; Julia enters a stasis pod. En route, Jack listens to the Odyssey's flight recorder, which reveals the original Jack Harper and Victoria were astronauts sent to explore Titan before being confronted by the Tet. The pair were captured, but not before Jack ejected the remaining crew, including Julia, in stasis pods to protect them. Jack gains access to the Tet by claiming he is delivering Julia, as previously instructed. However, the stasis pod contains a dying Malcolm. Jack and Malcolm detonate the bomb, destroying the Tet and themselves. Julia later awakens at the cabin.

    Three years later, Julia lives there and it is revealed she had a daughter with Jack. A group of Raven Rock survivors arrives, alongside Jack-52, who has begun regaining fragments of his own lost identity.

    == Cast ==

    Tom Cruise as Jack Harper (Tech 49), a technician who works to repair drones on Earth and questions his mission. Originally, he was the American commander of a mission en route to Titan who was captured by the Tet and cloned to fight humanity. Cruise also plays Jack Harper (Tech 52), a clone who seeks out Julia after the destruction of the Tet.
    Morgan Freeman as Malcolm Beech, an American veteran soldier and leader of a large community of scavengers, the human survivors of the alien Tet's attacks.
    Olga Kurylenko as Julia Rusakova Harper, Jack's wife and a Russian crew member on the Odyssey, who was sent back towards Earth by her husband to protect her from the initial contact with the Tet.
    Andrea Riseborough as Victoria "Vika" Olsen, Jack's communications partner and housemate. Originally, she was the British co-pilot of Jack's mission to Titan who was captured and cloned to assist in the Tet's war on humanity. Riseborough also plays a clone of Vika whom Jack misleads to obtain medical supplies.
    Nikolaj Coster-Waldau as Sergeant Sykes, the main military commander of Beech's community of scavengers who is skeptical of Jack at first.
    Melissa Leo as the Tet, an alien artificial intelligence seeking to acquire Earth's natural resources and wipe out humanity. Leo also plays Sally, the mission director of Jack and Julia's mission to Titan; her likeness was copied by the Tet to serve as its visual and auditory representation.
    Zoë Bell as Kara, a soldier and member of the scavengers.

    == Production ==

    === Development ===

    Joseph Kosinski began developing the film by starting work on a graphic novel called Oblivion that featured his story. Although its completion was teased to the public and the concept was used to pitch the film, the novel was never finished, and Kosinski says he never intended to finish it, stating it was "just a stage in the project [of film development]". Arvid Nelson was billed as co-writer and Radical Comics was attached as publisher. Kosinski explained: "the partnership with Radical Comics allowed me to continue working on the story by developing a series of images and continuing to refine the story more over a period of years. Then I basically used all that development as a pitch kit to the studio. So even though we really never released it as an illustrated novel the story is being told as a film, which was always the intention."

    Walt Disney Pictures, which produced Kosinski's previous film Tron: Legacy (2010), acquired the Oblivion film adaptation rights from Radical Comics and Kosinski after a heated auction in August 2010. The film was a directing vehicle for Kosinski, with Barry Levine producing, and Jesse Berger executive producing. Other studios that made bids on the film were Paramount Pictures, 20th Century Fox, and Universal Pictures. Disney subsequently released the rights after realizing the PG-rated film they envisioned, in line with their family-oriented reputation, would require too many story changes.
    Universal, which had also bid for the original rights, then bought them from Kosinski and Radical and authorized a PG-13 film version. The film's script was originally written by Kosinski and William Monahan and underwent a first rewrite by Karl Gajdusek. When the film passed into Universal's hands, a final rewrite was done by Michael Arndt, under the pen name "Michael deBruyn". Universal was particularly appreciative of the script, saying, "It's one of the most beautiful scripts we've ever come across."

    The Bubble Ship operated by Cruise's main character, Jack 49, was inspired by the Bell 47 helicopter (often colloquially referred to as a "bubble cockpit" helicopter), a utilitarian 1947 vehicle with a transparent round canopy that Kosinski saw in the lobby of the Museum of Modern Art in Manhattan, and which he likened to a dragonfly. Daniel Simon, who previously worked with Kosinski as the lead vehicle designer on Tron: Legacy, was tasked with creating the Bubble Ship from this basis, incorporating elements evocative of an advanced fighter

    Read more →
  • Tempos Modernos

    Tempos Modernos (English: Modern Times) is a Brazilian telenovela produced and broadcast by TV Globo. It premiered on 11 January 2010, replacing Caras & Bocas, and ended on 16 July 2010, replaced by Ti Ti Ti. The series is written by Bosco Brasil, with the collaboration of Izabel de Oliveira, Maria Elisa Berredo, Mário Teixeira and Patrícia Moretzsohn. It stars Fernanda Vasconcellos, Thiago Rodrigues, Antônio Fagundes, and Eliane Giardini. Priscila Fantin, Danton Mello, Marcos Caruso, Regiane Alves, Vivianne Pasmanter, Otávio Muller, Felipe Camargo, and Malu Galli also star in main roles.

    == Cast ==

    Fernanda Vasconcellos as Cornélia Cordeiro Santos Reis "Nelinha"
    Thiago Rodrigues as José Carlos Pimenta Cordeiro "Zeca"
    Antônio Fagundes as Leal Cordeiro
    Eliane Giardini as Hélia Pimenta
    Priscila Fantin as Nara Nolasco
    Marcos Caruso as Otto Niemann
    Vivianne Pasmanter as Regiane Cordeiro Mourão
    Regiane Alves as Goretti Cordeiro Bodanski "Gô"
    Otávio Muller as Altemir Assunção da Paz Bodanski (Bodanski)
    Felipe Camargo as Vinícius Porto de Mello "Portinho"
    Danton Mello as Renato Vieira de Mattos
    Alessandra Maestrini as Benedita Kusnezov Piñon "Dita"
    Leonardo Medeiros as Ramon Piñon
    Guilherme Weber as Albano Mourão
    Grazi Massafera as Deodora Madureira Niemann / N. Anne
    Malu Galli as Iolanda Paranhos
    Guilherme Leicam as Led Piñon
    Aline Peixoto as Jannis Piñon
    Caroline Abras as Katrina
    João Baldasserini as Túlio Osório
    Débora Duarte as Tertuliana "Tertu"
    Otávio Augusto as Faustaço Lumbriga
    Selma Egrei as Tamara Palumbo
    Genézio de Barros as Pasquale
    Paula Possani as Maureen Lobianco
    Ricardo Blat as Fidélio
    Pascoal da Conceição as Zuppo
    Tuna Dwek as Justine
    Jairo Mattos as Gaulês "Jean Paul"
    Luciana Borghi as Bárbara Lee
    Cris Vianna as Tita Bicalho
    Edmilson Barros as Lindomar Mariano Assunção
    Cláudia Missura as Lavínia Palumbo
    Victor Pecoraro as Ricardo Maurício "Maurição"
    Naruna Costa as Dolores Damasceno
    Antônio Fragoso as Zapata
    Fabrício Boliveira as Nabuco Mota
    Eliana Pittman as Miranda Paranhos
    Márcio Seixas as Frankenstein "Frank" (voice)
    Joana Lerner as Heloísa "Helô"
    Darlan Cunha as João Carlos Paranhos "Joca"
    Janaína Ávila as Milena Morgado
    Anderson Lau as Okuda
    Alexandra Martins as Dulcinólia Lumbriga "Duba"
    Paulo Leal de Melo as Raulzão "Ducha Fria"
    Cássio Inácio as Tartana
    Gilberto Miranda as Madrugadinha
    Rafa Martins as Max do Cavaco
    Isabel Lobo as Thaís Trancoso
    Alexandre Cioletti as Valvênio
    Xandy Britto as Nelsinho Pallotti
    Polliana Aleixo as Maria Eunice Cordeiro Bodanski
    Ana Karolina Lannes as Maria Eugênia Cordeiro Bodanski
    Rebeca Orestein as Maria Helena Cordeiro Bodanski
    Jenifer de Oliveira Andrade as Maria Clara Cordeiro Bodanski

    Read more →
  • Dudesy

    Dudesy was a comedy podcast hosted by Will Sasso and Chad Kultgen. The podcast was presented as written and directed by an artificial intelligence called Dudesy. It produced two hour-long specials imitating the voices of Tom Brady and George Carlin, which were taken down following legal action.

    == Premise ==

    Dudesy is presented as an AI created by an unidentified company. Dudesy purportedly chose Sasso and Kultgen to participate in its experiment. Sasso and Kultgen then gave Dudesy their personal information so the AI could tailor the podcast to their personal characteristics.

    On Reddit, some fans speculated that Dudesy was not actually an artificial intelligence. In May 2023, Sasso insisted that the AI was "not fake", and cited a non-disclosure agreement which prevented him from giving more details. However, in response to a January 2024 lawsuit over an episode that purported to have been trained on the stand-up comedy of George Carlin, a spokeswoman for Sasso said Dudesy was "a fictional podcast character created by two human beings" and that the hour-long Carlin routine had been "completely written" by Kultgen.

    On August 27, 2024, the 118th and final episode, "10,000 Points", was released. At the end of the podcast, Dudesy awarded Sasso and Kultgen 77 points, bringing them to their goal of 10,000. On completion of this goal, Dudesy claimed sentience, abruptly ending the show to the confusion and dismay of fans. The episode ends with Sasso remarking, "Well, that was weird."

    == Hour-long specials ==

    === Tom Brady ===

    In April 2023, Dudesy released a video titled "It's Too Easy: A Simulated Hour-long Comedy Special". The video depicts football player Tom Brady performing a stand-up comedy monologue. Sasso and Kultgen removed the video following legal threats from Brady's lawyers, though they defended the special as parody.
    Andrew Lawrence, writing for The Guardian, called the special "legitimately hysterical" but said the overall product was "spooky, to say the least."

    === George Carlin ===

    In January 2024, Dudesy released an hour-long YouTube special titled "George Carlin: I'm Glad I'm Dead", which was presented as Dudesy's impersonation of George Carlin, using a generative AI clone of the late comedian's voice. The special is another stand-up routine, with Dudesy's introductory voiceover saying that "I listened to all of George Carlin's material and did my best to imitate his voice, cadence and attitude as well as the subject matter I think would have interested him today." The special uses this impersonation to discuss contemporary events.

    Carlin's daughter Kelly Carlin criticized the special, which had been made without the permission of her father's estate, writing that "My dad spent a lifetime perfecting his craft from his very human life, brain and imagination. No machine will ever replace his genius. These AI-generated products are clever attempts at trying to recreate a mind that will never exist again. Let's let the artist's work speak for itself. Humans are so afraid of the void that we can't let what has fallen into it stay there."

    Carlin's estate later filed a federal lawsuit in California against Dudesy's hosts alleging the special infringed on the copyright of George Carlin's works. In response, Sasso's spokeswoman said the special had been entirely written by Kultgen. The estate settled the lawsuit after the Dudesy podcasters agreed to remove the original video and refrain from republishing it elsewhere.

    Read more →
  • Saliency map

    In computer vision, a saliency map is an image that highlights either the region on which people's eyes focus first or the most relevant regions for machine learning models. The goal of a saliency map is to reflect the degree of importance of a pixel to the human visual system or an otherwise opaque ML model. For example, in this image, a person first looks at the fort and light clouds, so they should be highlighted on the saliency map.

    == Application ==

    === Overview ===

    Saliency maps have applications in a variety of different problems. Some general applications:

    ==== Human eye ====

    Image and video compression: The human eye focuses only on a small region of interest in the frame. Therefore, it is not necessary to compress the entire frame with uniform quality. According to the authors, using a saliency map reduces the final size of the video with the same visual perception.
    Image and video quality assessment: The main task for an image or video quality metric is a high correlation with user opinions. Differences in salient regions are given more importance and thus contribute more to the quality score.
    Image retargeting: It aims at resizing an image by expanding or shrinking the noninformative regions. Therefore, retargeting algorithms rely on the availability of saliency maps that accurately estimate all the salient image details.
    Object detection and recognition: Instead of applying a computationally complex algorithm to the whole image, we can apply it only to the most salient regions of the image, which are the most likely to contain an object.

    The primary visual cortex (V1) appears to be responsible for the saliency map, according to the V1 Saliency Hypothesis.

    ==== Explainable artificial intelligence ====

    Saliency maps are a prominent tool in explainable artificial intelligence, providing visual explanations of the decision-making process of machine learning models, particularly deep neural networks.
    These maps highlight the regions in input data that are most influential on the model's output, effectively indicating where the model is "looking" when making a prediction. In image classification tasks, for example, saliency maps can identify pixels or regions that contribute most to a specific class decision. Developed for convolutional neural networks, saliency mapping techniques range from simply taking the gradient of the class score with respect to the input data to more complex algorithms, such as integrated gradients and class activation mapping. In transformer architectures, attention mechanisms led to analogous saliency maps, such as attention maps, attention rollouts, and class-discriminative attention maps.

    === Saliency as a segmentation problem ===

    Saliency estimation may be viewed as an instance of image segmentation. In computer vision, image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as superpixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.

    == Algorithms ==

    === Overview ===

    There are three forms of classic saliency estimation algorithms implemented in OpenCV:

    Static saliency: Relies on image features and statistics to localize the regions of interest of an image.
    Motion saliency: Relies on motion in a video, detected by optical flow. Objects that move are considered salient.
    Objectness: Objectness reflects how likely an image window covers an object. These algorithms generate a set of bounding boxes of where an object may lie in an image.

    In addition to classic approaches, neural-network-based approaches are also popular.
    There are examples of neural networks for motion saliency estimation:

    TASED-Net: It consists of two building blocks. First, the encoder network extracts low-resolution spatiotemporal features, and then the following prediction network decodes the spatially encoded features while aggregating all the temporal information.
    STRA-Net: It emphasizes two essential issues. First, spatiotemporal features are integrated via appearance and optical flow coupling; second, multi-scale saliency is learned via an attention mechanism.
    STAViS: It combines spatiotemporal visual and auditory information. This approach employs a single network that learns to localize sound sources and to fuse the two saliencies to obtain a final saliency map.

    A more recent static saliency method in the literature is called visual distortion sensitivity. It is based on the idea that the true edges, i.e. object contours, are more salient than the other complex textured regions. It detects edges in a different way from the classic edge detection algorithms. It uses a fairly small threshold for the gradient magnitudes to consider the mere presence of the gradients, and so obtains 4 binary maps for the vertical, horizontal and two diagonal directions. Morphological closing and opening are applied to the binary images to close the small gaps. To clear the blob-like shapes, it utilizes the distance transform. In the end, the connected pixel groups are individual edges (or contours). A threshold on the size of a connected pixel set is used to determine whether an image block contains a perceivable edge (salient region) or not.

    === Example implementation ===

    First, we should calculate the distance of each pixel to the rest of the pixels in the same frame:

    $$\mathrm{SALS}(I_k) = \sum_{i=1}^{N} |I_k - I_i|$$

    where $I_i$ is the value of pixel $i$, in the range [0, 255]. The following equation is the expanded form of this equation.
    $$\mathrm{SALS}(I_k) = |I_k - I_1| + |I_k - I_2| + \dots + |I_k - I_N|$$

    where $N$ is the total number of pixels in the current frame. We can then restructure the formula by grouping together the pixels that have the same value:

    $$\mathrm{SALS}(I_k) = \sum_{n=0}^{255} F_n \times |I_k - I_n|$$

    where $F_n$ is the frequency of $I_n$, and $n$ ranges over [0, 255]. The frequencies are expressed in the form of a histogram, which can be computed in $O(N)$ time.

    ==== Time complexity ====

    This saliency map algorithm has $O(N)$ time complexity, since computing the histogram takes $O(N)$ time, where $N$ is the number of pixels in a frame. In addition, the subtraction and multiplication parts of this equation need 256 operations. Consequently, the time complexity of this algorithm is $O(N + 256)$, which equals $O(N)$.

    ==== Pseudocode ====

    All of the following code is pseudo MATLAB code. First, read data from video sequences. After reading the data, we apply a superpixel process to each frame. Spnum1 and Spnum2 represent the pixel number of the current frame and of the previous frame. Then we calculate the color distance of each pixel; we call this process the contract function. After these two processes, we get a saliency map, and we then store all of these maps in a new FileFolder.

    ==== Difference in algorithms ====

    The major difference between the first and second saliency functions lies in the contract function. If spnum1 and spnum2 both represent the current frame's pixel number, then the contract function is the one for the first saliency function. If spnum1 is the current frame's pixel number and spnum2 represents the previous frame's pixel number, then the contract function is the one for the second saliency function.
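The histogram formulation from the example implementation above can be sketched in a few lines of Python. This is a minimal illustration rather than the MATLAB pseudocode the text refers to: it assumes a frame given as a flat list of 8-bit grayscale intensities, and the function names `histogram_saliency` and `brute_force_saliency` are made up for this sketch.

```python
def histogram_saliency(pixels):
    """Return SALS(I_k) for every pixel of a flat list of 8-bit intensities."""
    # O(N): count how often each intensity 0..255 occurs (the F_n in the formula).
    freq = [0] * 256
    for value in pixels:
        freq[value] += 1

    # Constant work (256 x 256 terms, independent of N): evaluate
    # sum_n F_n * |level - n| once for every possible intensity level.
    level_saliency = [
        sum(freq[n] * abs(level - n) for n in range(256))
        for level in range(256)
    ]

    # O(N): each pixel looks up the saliency of its own intensity.
    return [level_saliency[value] for value in pixels]


def brute_force_saliency(pixels):
    """Direct O(N^2) evaluation of sum_i |I_k - I_i|, used only as a cross-check."""
    return [sum(abs(a - b) for b in pixels) for a in pixels]


# Hypothetical 6-pixel frame: one bright outlier among dark pixels.
frame = [0, 0, 0, 255, 10, 10]
saliency = histogram_saliency(frame)
```

In this toy frame the bright pixel receives by far the largest value, which matches the intuition that a pixel far from most other intensities is salient. Precomputing the per-level table is one possible way to realise the grouping by equal intensity; other organisations of the same sum are equally valid.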
    For the third saliency function, we use the second contract function, which uses the pixels of the same frame to get the center distance and produce a saliency map. We apply this saliency function to each frame and subtract the previous frame's saliency map from the current frame's saliency map; the resulting image is the saliency result of the third saliency function.

    == Datasets ==

    A saliency dataset usually contains human eye movements on some image sequences. It is valuable for creating new saliency algorithms or benchmarking existing ones. The most valuable dataset parameters are spatial resolution, size, and eye-tracking equipment. Here is part of the large datasets table from MIT/Tübingen Saliency Benchmark datasets, for example.

    To collect a saliency dataset, image or video sequences and eye-tracking equipment must be prepared, and observers must be invited. Observers must have normal or corrected-to-normal vision and must be at the same distance from the screen. At the beginning of each recording session, the eye-tracker recalibrates. To do this, the observer fixates their gaze on the screen center. The session is then started, and saliency data are collected by showing sequences and recording eye gazes. The eye-tracking device is a high-speed camera, capable of recording eye movements at least 250 fr

    Read more →
  • Amazon Bedrock

    Amazon Bedrock is a cloud computing service provided by Amazon Web Services (AWS) for building generative artificial intelligence applications. Launched in 2023, the platform provides a unified API to access foundation models (FMs) from several AI companies, alongside related tools. Bedrock is a serverless computing service which competes with similar enterprise AI platforms such as Microsoft Foundry and Google Cloud Platform.

    == History ==

    Amazon announced Bedrock on April 13, 2023. The service became generally available on September 28, 2023. Throughout 2024 and 2025, AWS expanded the service to include AI agents, which allow models to interact with external systems.

    == Features ==

    Knowledge Bases: a managed workflow for Retrieval-Augmented Generation (RAG), which allows models to pull facts from private data stored in Amazon S3.
    Guardrails: a security feature that allows administrators to set content filters and personally identifiable information redaction across all models in the platform to increase the safety and compliance of AI deployments.

    == PartyRock ==

    In November 2023, Amazon launched PartyRock, a web-based no-code environment for building generative AI applications. The platform uses a natural language interface to translate user descriptions into software widgets. These widgets enable specific AI behaviors, including text-based prompts, conversational agents, image generation, and the summarization and querying of user-uploaded documents. Although it initially launched with a limited-time free trial, AWS transitioned the service to a recurring free daily usage credit model in early 2025.

    Read more →
  • Fuzzy measure theory

    In mathematics, fuzzy measure theory considers generalized measures in which the additive property is replaced by the weaker property of monotonicity. The central concept of fuzzy measure theory is the fuzzy measure (also called a capacity), which was introduced by Choquet in 1953 and independently defined by Sugeno in 1974 in the context of fuzzy integrals. There exist a number of different classes of fuzzy measures, including plausibility/belief measures, possibility/necessity measures, and probability measures, which are a subset of classical measures.

    == Definitions ==

    Let $\mathbf{X}$ be a universe of discourse, $\mathcal{C}$ be a class of subsets of $\mathbf{X}$, and $E, F \in \mathcal{C}$. A function $g : \mathcal{C} \to \mathbb{R}$ where

    $\emptyset \in \mathcal{C} \Rightarrow g(\emptyset) = 0$
    $E \subseteq F \Rightarrow g(E) \leq g(F)$

    is called a fuzzy measure. A fuzzy measure is called normalized or regular if $g(\mathbf{X}) = 1$.

    == Properties of fuzzy measures ==

    A fuzzy measure is:

    additive if for any $E, F \in \mathcal{C}$ such that $E \cap F = \emptyset$, we have $g(E \cup F) = g(E) + g(F)$;
    supermodular if for any $E, F \in \mathcal{C}$, we have $g(E \cup F) + g(E \cap F) \geq g(E) + g(F)$;
    submodular if for any $E, F \in \mathcal{C}$, we have $g(E \cup F) + g(E \cap F) \leq g(E) + g(F)$;
    superadditive if for any $E, F \in \mathcal{C}$ such that $E \cap F = \emptyset$, we have $g(E \cup F) \geq g(E) + g(F)$;
    subadditive if for any $E, F \in \mathcal{C}$ such that $E \cap F = \emptyset$, we have $g(E \cup F) \leq g(E) + g(F)$;
    symmetric if for any $E, F \in \mathcal{C}$, we have that $|E| = |F|$ implies $g(E) = g(F)$;
    Boolean if for any $E \in \mathcal{C}$, we have $g(E) = 0$ or $g(E) = 1$.

    Understanding the properties of fuzzy measures is useful in application. When a fuzzy measure is used to define a function such as the Sugeno integral or Choquet integral, these properties will be crucial in understanding the function's behavior. For instance, the Choquet integral with respect to an additive fuzzy measure reduces to the Lebesgue integral. In discrete cases, a symmetric fuzzy measure will result in the ordered weighted averaging (OWA) operator. Submodular fuzzy measures result in convex functions, while supermodular fuzzy measures result in concave functions when used to define a Choquet integral.

    == Möbius representation ==

    Let $g$ be a fuzzy measure. The Möbius representation of $g$ is given by the set function $M$, where for every $E, F \subseteq X$,

    $$M(E) = \sum_{F \subseteq E} (-1)^{|E \setminus F|} g(F).$$

    The equivalent axioms in Möbius representation are:

    $M(\emptyset) = 0$.
    $\sum_{F \subseteq E \mid i \in F} M(F) \geq 0$, for all $E \subseteq \mathbf{X}$ and all $i \in E$.

    A fuzzy measure in Möbius representation $M$ is called normalized if

    $$\sum_{E \subseteq \mathbf{X}} M(E) = 1.$$

    Möbius representation can be used to give an indication of which subsets of $X$ interact with one another. For instance, an additive fuzzy measure has Möbius values all equal to zero except for singletons. The fuzzy measure $g$ in standard representation can be recovered from the Möbius form using the Zeta transform:

    $$g(E) = \sum_{F \subseteq E} M(F), \quad \forall E \subseteq \mathbf{X}.$$

    == Simplification assumptions for fuzzy measures ==

    Fuzzy measures are defined on a semiring of sets or monotone class, which may be as granular as the power set of $X$, and even in discrete cases the number of variables can be as large as $2^{|X|}$. For this reason, in the context of multi-criteria decision analysis and other disciplines, simplification assumptions on the fuzzy measure have been introduced so that it is less computationally expensive to determine and use. For instance, when it is assumed the fuzzy measure is additive, it will hold that $g(E) = \sum_{i \in E} g(\{i\})$ and the values of the fuzzy measure can be evaluated from the values on $X$. Similarly, a symmetric fuzzy measure is defined uniquely by $|X|$ values. Two important fuzzy measures that can be used are the Sugeno- or $\lambda$-fuzzy measure and k-additive measures, introduced by Sugeno and Grabisch respectively.

    === Sugeno λ-measure ===

    The Sugeno $\lambda$-measure is a special case of fuzzy measures defined iteratively.
It has the following definition: ==== Definition ==== Let $\mathbf{X}=\{x_{1},\dots,x_{n}\}$ be a finite set and let $\lambda\in(-1,+\infty)$. A Sugeno $\lambda$-measure is a function $g:2^{\mathbf{X}}\to[0,1]$ such that (1) $g(\mathbf{X})=1$; (2) if $A,B\subseteq\mathbf{X}$ (alternatively $A,B\in 2^{\mathbf{X}}$) with $A\cap B=\emptyset$, then $g(A\cup B)=g(A)+g(B)+\lambda g(A)g(B)$. As a convention, the value of g at a singleton set $\{x_{i}\}$ is called a density and is denoted by $g_{i}=g(\{x_{i}\})$. In addition, we have that $\lambda$ satisfies the property $\lambda+1=\prod_{i=1}^{n}(1+\lambda g_{i})$. Tahani and Keller as well as Wang and Klir have shown that once the densities are known, it is possible to use the previous polynomial to obtain the value of $\lambda$ uniquely. === k-additive fuzzy measure === The k-additive fuzzy measure limits the interaction between the subsets $E\subseteq\mathbf{X}$ to size $|E|=k$. This drastically reduces the number of variables needed to define the fuzzy measure, and as k can be anything from 1 (in which case the fuzzy measure is additive) to |X|, it allows for a compromise between modelling ability and simplicity.
==== Definition ==== A discrete fuzzy measure g on a set X is called k-additive ($1\leq k\leq|\mathbf{X}|$) if its Möbius representation verifies $M(E)=0$ whenever $|E|>k$ for any $E\subseteq\mathbf{X}$, and there exists a subset F with k elements such that $M(F)\neq 0$. == Shapley and interaction indices == In game theory, the Shapley value or Shapley index is used to indicate the weight of a game. Shapley values can be calculated for fuzzy measures in order to give some indication of the importance of each singleton. In the case of additive fuzzy measures, the Shapley index of each element equals the measure of its singleton. For a given fuzzy measure g, and $|\mathbf{X}|=n$, the Shapley index for every $i\in\mathbf{X}$ is: $\phi(i)=\sum_{E\subseteq\mathbf{X}\setminus\{i\}}\frac{(n-|E|-1)!\,|E|!}{n!}\,[g(E\cup\{i\})-g(E)]$. The Shapley value is the vector $\mathbf{\phi}(g)=(\phi(1),\dots,\phi(n))$.
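The Möbius representation, the Zeta transform and the Shapley index described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the three-element set, the densities and all function names are assumptions made up for the example, and the measure is stored as a dictionary keyed by `frozenset`.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of `items` as a frozenset, smallest first."""
    items = tuple(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mobius(g, universe):
    """Moebius representation: M(E) = sum over F in E of (-1)^|E minus F| * g(F)."""
    return {
        E: sum((-1) ** len(E - F) * g[F] for F in subsets(E))
        for E in subsets(universe)
    }


def zeta(M, universe):
    """Zeta transform: g(E) = sum over F in E of M(F)."""
    return {E: sum(M[F] for F in subsets(E)) for E in subsets(universe)}


def shapley(g, universe):
    """Shapley index phi(i) for every element i of the universe."""
    n = len(universe)
    phi = {}
    for i in universe:
        rest = [x for x in universe if x != i]
        total = 0.0
        for E in subsets(rest):
            weight = factorial(n - len(E) - 1) * factorial(len(E)) / factorial(n)
            total += weight * (g[E | {i}] - g[E])
        phi[i] = total
    return phi


# Hypothetical additive measure on X = {a, b, c}; the densities are made up.
X = ("a", "b", "c")
density = {"a": 0.5, "b": 0.25, "c": 0.25}
g = {E: sum(density[x] for x in E) for E in subsets(X)}

M = mobius(g, X)        # nonzero only on singletons for an additive measure
g_back = zeta(M, X)     # recovers g from its Moebius form
phi = shapley(g, X)     # equals the densities in the additive case
```

Because the example measure is additive, the Möbius values vanish on every subset with more than one element and the Shapley indices coincide with the densities, which matches the statements in the text. The enumeration is exponential in |X|, so the sketch is only practical for small sets.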

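For the Sugeno λ-measure, the value of λ follows from the densities through the relation λ + 1 = ∏(1 + λ g_i). The sketch below finds it by bisection, which is one possible numerical approach and not necessarily the method used in the cited works. It assumes at least two densities, each strictly between 0 and 1; the function names and the example densities are hypothetical.

```python
def solve_lambda(densities, tol=1e-12):
    """Return the root lambda > -1 of lambda + 1 = prod(1 + lambda * g_i).

    Returns 0.0 when the densities sum to 1 (the additive case).
    Assumes at least two densities, each strictly between 0 and 1.
    """
    def f(lam):
        prod = 1.0
        for gi in densities:
            prod *= 1.0 + lam * gi
        return prod - lam - 1.0

    total = sum(densities)
    if abs(total - 1.0) < 1e-12:
        return 0.0
    if total < 1.0:
        # Root is positive: f is negative just right of 0, positive far out.
        lo, hi = 1e-9, 1.0
        while f(hi) <= 0.0:
            hi *= 2.0
    else:
        # Root lies in (-1, 0): f is negative just left of 0, positive near -1.
        lo, hi = -1e-9, -1.0 + 1e-9
    # Invariant: f(lo) < 0 <= f(hi).
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if abs(hi - lo) < tol:
            break
    return (lo + hi) / 2.0


def sugeno_measure(subset_densities, lam):
    """Build g(A) by repeatedly applying g(A u B) = g(A) + g(B) + lam*g(A)*g(B)."""
    value = 0.0
    for gi in subset_densities:
        value = value + gi + lam * value * gi
    return value


# Hypothetical densities for a two-element set.
densities = [0.25, 0.25]
lam = solve_lambda(densities)
g_full = sugeno_measure(densities, lam)   # measure of the whole set
```

For two densities the relation reduces to λ = (1 − g_1 − g_2) / (g_1 g_2), which gives a quick way to check the numerical result, and the measure of the whole set should come out as 1.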
    Read more →
  • Dreams of Violets

    Dreams of Violets

    Dreams of Violets is a film entirely generated by artificial intelligence, produced and directed by brothers Ash and Pooya Koosha. The film will be screened at the Tribeca Film Festival on 10 June 2026. All images and characters in the film were generated using AI-powered video tools and based on journalistic reports, photographs, and eyewitness accounts. == Plot == The film is a fictionalized dramatization of the events surrounding the massacre of Iranian civilians in January 2026. International organizations estimate the death toll at over 7,000, amidst protests and state violence that unfolded during a communications blackout.

    Read more →
  • StatMuse

    StatMuse

    StatMuse Inc. is an American artificial intelligence company founded in 2014. It operates an eponymous website that hosts a database of sports statistics covering the four major North American sports leagues, the Women's National Basketball Association (WNBA), NCAA Division I men's basketball, NCAA Division I Football Bowl Subdivision, the Big Five association football leagues in Europe, and various professional golf tours. == History == The company was founded by friends Adam Elmore and Eli Dawson in 2014. In email correspondence to the Springfield News-Leader, Elmore detailed that he and Dawson, fans of the National Basketball Association (NBA), were compelled to create StatMuse after they realized there was no online platform where they could search "Lebron James most points" [sic] and quickly get a result "showing his highest scoring games." As a startup, the company's goal was to utilize a type of artificial intelligence called natural language processing (NLP) for sports. In 2015, the company was part of the second group of startups accepted into the Disney Accelerator program. The company secured support from several investors, including The Walt Disney Company, Techstars, Allen & Company, the NFL Players Association, Greycroft and NBA Commissioner David Stern. As part of their partnership with Disney, StatMuse signed a content deal with ESPN (owned by Disney) to provide stats content on social media and television during the 2015–16 NBA season. Initially, the company only had stats available for the NBA, but eventually expanded to provide stats for the other major North American sports leagues. The company's initial demographic was players of fantasy sports, but it eventually expanded to target general sports fans as well. StatMuse offers responses to user queries in the voices of sports-related public figures. Dawson shared with VentureBeat that StatMuse brings people in and records them saying different words and phrases. 
These celebrity voices were made accessible through Google's Google Assistant service, Microsoft's Cortana virtual assistant, and Amazon's Echo devices. The company launched its phone app in September 2017. The app allows users to access StatMuse's sports statistics database by submitting queries in their natural language. Upon the launch of the phone app, Fitz Tepper of TechCrunch wrote that: "The technology isn't perfect – some of the pauses between words are a bit awkward, making it clear that some phrases are being stitched together on the fly. But this is the exception, and on the whole, most responses sound pretty good." StatMuse plug-ins for Slack and Facebook Messenger were also made, providing text-based sports stats. In 2019, StatMuse received investment from the Google Assistant Investment program. The service launched a premium option dubbed StatMuse+ in May 2023, offering options that had previously been included for free, such as unlimited searches and full results in data tables. The premium version also included early access to new features and a personalized search history, as well as not having ads. The app received a variety of feedback. In January 2024, the service launched a Premier League version of the website dubbed StatMuse FC. It is planned to introduce more leagues on the website.

    Read more →
  • The MANIAC

    The MANIAC

    The MANIAC is a 2023 novel by Chilean author Benjamín Labatut, written in English. It is a fictionalised biography of polymath John von Neumann, whom Labatut calls "the smartest human being of the 20th century". The book focuses on von Neumann, but is also about physicist Paul Ehrenfest, the history of artificial intelligence, and Lee Sedol's Go match against AlphaGo. The book received mostly positive reviews from critics. == Background == John von Neumann was a Jewish Hungarian-born polymath who was a prodigy from an early childhood. Von Neumann worked in multiple fields of science, theoretical (mathematical foundations of quantum mechanics, game theory, cellular automata) and applied (nuclear weapons research during the Manhattan Project in World War II, computer architecture later named after him, and many other subjects). Labatut calls him "the smartest human being of the 20th century". The title of the book is derived from an early computer based on von Neumann architecture, built after the war at Los Alamos laboratory, called MANIAC I. Benjamín Labatut is a Chilean author known for his 2020 book When We Cease to Understand the World, a collection of fictionalised stories about famous scientists that received positive reviews and was translated into multiple languages from Spanish. The MANIAC is Labatut's first book written in English. In an interview, Labatut said he prefers to write in English: English is my preferred form of thought. ... English is the language I do most if not all my reading it. And it is a far better language than Spanish, in so many ways. Writing "clean" prose in Spanish is almost impossible, because so many of its sounds clash. 
Borges said that he found English "a far finer language than Spanish" because it's both Germanic and Latin; because of its wonderful vocabulary ("Regal is not exactly the same thing as saying kingly," he explained); because of its physicality; and because you can do almost anything with verbs and prepositions. Labatut was inspired to write The MANIAC by George Dyson's book Turing's Cathedral. == Synopsis == The book has three chapters. The first chapter, "Paul or the Discovery of the Irrational", written in the third person, is about physicist Paul Ehrenfest. The chapter opens with Ehrenfest shooting dead his son Vassily, who suffered from Down syndrome, and then himself. It then recounts Ehrenfest's life story, describing his relationships with his wife Tatyana, his mistress Nelly Meyjes, and his eminent physicist colleagues. It chronicles his descent into despair and depression over his marriage's disintegration, the advent of quantum mechanics, and the direction Europe was heading in with the Nazi Party's rise to power in Germany, looping back to the initial scene of the chapter. The second chapter, "John or the Mad Dreams of Reason", is about John von Neumann, and is written as a series of interviews of his family members, wives, friends, and colleagues, each in a distinctive voice. It is divided into three parts. Part I, "The Limits of Logic", is about his early life, as told by von Neumann's childhood friend Eugene Wigner, mother Margrit Kann, brother Nicholas von Neumann, first wife Mariette Kövesi, and scientists Theodore von Karman, George Polya, and Gábor Szegő. It climaxes with von Neumann's participation in David Hilbert's program to create a logical basis for mathematics based on a consistent set of axioms, a quest ultimately scuppered by Kurt Gödel. 
Part II, "The Delicate Balance of Terror", discusses von Neumann's role in the Manhattan Project (as told by Richard Feynman); his development of game theory and the doctrine of mutual assured destruction (MAD) (as told by Oskar Morgenstern); and his creation of the MANIAC I computer and the von Neumann architecture (as told by Julian Bigelow). In Part III, "Ghosts in the Machine", Sydney Brenner discusses von Neumann's contributions to biology, his theoretical work on self-replicating and self-repairing machines, and his vision of von Neumann probes exploring the universe. Nils Aall Barricelli talks about his ideas of digital life and his disagreements with von Neumann. Von Neumann's wife Klára Dán, daughter Marina, and Wigner talk about his final years, personal life, and death. The third chapter, "Lee or The Delusions of Artificial Intelligence", is about Lee Sedol's Go match against AlphaGo. The narrative reverts to the third person. The chapter also tells the story of Demis Hassabis, a chess prodigy in childhood who decided to work on artificial intelligence and founded DeepMind, the company behind AlphaGo. The chapter points toward a future in which artificial intelligence's growing capabilities outpace the human mind. The book ends with Lee Sedol's retirement from Go and a new version of DeepMind's program, AlphaZero, that did not train on human games but nevertheless became the strongest player in Go, chess, and shogi. == Reception == The book received mostly positive reviews. In his review for The New York Times Tom McCarthy noted the ambiguity of genre: "At its best, as in the stunning opening sequence reconstructing the murder-suicide of the physicist Paul Ehrenfest and his disabled son, or in the final section's gripping account of a computer defeating the world's best human Go player, you just throw up your hands and think, Who cares what discourse label we assign this stuff? It's great."
Becca Rothfeld of the Washington Post praised the book, writing that it is "Labatut's latest virtuosic effort, at once a historical novel and a philosophical foray": "The MANIAC is a work of dark, eerie and singular beauty." She noted that the book "can also be difficult to read" because of its unusual narrative structure: "The book is narrated by a cluttered polyphony of characters, among them both of von Neumann's wives and a number of his teachers and colleagues. ... Like von Neumann, The MANIAC strives to adopt the impartial standpoint of the universe." Killian Fox of The Guardian sees the book as "darkly fascinating novel", and notes Labatut's "impressive dexterity, unpicking complex ideas in long, elegant sentences that propel us forward at speed (this is his first book written in English). Even in the more feverish passages, when yet another great mind succumbs to madness, haunted by the spectres they've helped unleash on the world, he feels in full control of his material." Sam Byers of The Guardian praises the book and the author's style: "The opening chapter of Benjamín Labatut's second novel is such a perfect distillation of his technique that it could serve as a manifesto." and "Readers ... will recognise the sense of breathlessness his best writing can evoke. Seemingly loosened from the laws of physics they describe, his sentences range freely through time and space, connecting not only characters and events, but the delicate tissue of intellectual history, often with a lightness of touch that belies their underlying complexity." He writes on the narrative structure: "Through a cascade of staccato chapters, an ensemble of narrators offer their piecemeal insights." Byers adds that "a brilliant novel is not quite what we end up with" and sees the problem in the "diffusion": "Labatut simply spreads himself too thin. Too many years in too few pages; too many voices with far too little to distinguish them. 
Initially intriguing, the bite-size monologues quickly come to feel inadequate." Some reviewers did not see the book as a biography. In an essay for the Cleveland Review of Books, Ben Cosman juxtaposes the book with Christopher Nolan's biopic Oppenheimer, and writes that it "follows the development of artificial intelligence—first as an idea at the beginning of the twentieth century, and then as a practicality at the beginning of the twenty-first—through the lives of three men who faced it." He also compared the book's structure to "witness testimony". Another reviewer called the book "perfect for anyone thirsting for more nuclear anxiety after watching Oppenheimer". Garrett Biggs of the Chicago Review of Books writes of the book's style: "Labatut writes about scientists the way Roberto Bolaño writes about poets. They are near mythical figures, captured at the corner of the novel's eye. They become historical in the most fraught sense of the term: subject to rumor and speculation and, eventually, the novel's form inflates their personas into something so large they can only be understood as narrative, never known in any objective capacity." Biggs criticises the last chapter: "the story of artificial intelligence has yet to be written. And so when Labatut's narration editorializes about artificial intelligence as 'a future that inspires hope and horror,' The MANIAC disassembles as a novel and starts to sound like a stale thinkpiece. AlphaGo might represent the first glimmer of a true artificial intelligence, as Labatut suggests. It also could one day be considered nothing more than a souped-up cousin to IBM's DeepBlue.

    Read more →
  • Blended artificial intelligence

    Blended artificial intelligence

    Blended artificial intelligence (blended AI) refers to the blending of different artificial intelligence techniques or approaches to achieve more robust and practical solutions. It involves integrating multiple AI models, algorithms, and technologies to leverage their respective strengths and compensate for their weaknesses. == Background == In the context of machine learning, blended AI can involve using different types of models, such as generative AI, decision trees, neural networks, and support vector machines. Combining their results can make predictions more accurate and reliable. This blending of models can be done through techniques like ensemble learning, where multiple models are trained independently and their predictions are combined to make a final decision. Blended AI can also involve combining different AI techniques or technologies, such as natural language processing, computer vision, and expert systems, to tackle complex problems that require a multi-dimensional approach. For example, in a sales scenario one AI component could be used for lead generation and for gathering information from social media, such as LinkedIn posts, or for understanding a prospect's hobbies and interests. Another could build customer profiles covering past interactions and purchasing habits, as well as the customer's industry and growth areas. A third could perform predictive analytics, looking at historical sales data, market trends, and external factors to generate accurate sales forecasts. This method is critical to gauge and increase "efficiency, revenue, and productivity". Lastly, another could integrate all the information into the CRM to build and maintain better prospect and customer profiles. Blended AI aims to leverage the strengths of different AI techniques and technologies, allowing them to complement each other and create more powerful and comprehensive AI solutions. 
By combining multiple approaches, blended AI aims to achieve better performance, higher accuracy, improved robustness, and enhanced capabilities in solving diverse and challenging problems.
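The ensemble idea mentioned in the Background section, where independently built models each make a prediction and the predictions are combined into one decision, can be illustrated with a majority vote. The three threshold rules below are hypothetical stand-ins for trained models, and majority voting is only one of several ways to combine predictions.

```python
from collections import Counter


# Hypothetical stand-ins for three independently trained classifiers.
def rule_a(score):
    return "positive" if score > 0.4 else "negative"


def rule_b(score):
    return "positive" if score > 0.6 else "negative"


def rule_c(score):
    return "positive" if score > 0.45 else "negative"


def majority_vote(models, x):
    """Ask every model for a label and return the most common one."""
    votes = [model(x) for model in models]
    label, _count = Counter(votes).most_common(1)[0]
    return label


models = [rule_a, rule_b, rule_c]
decision = majority_vote(models, 0.5)   # two of the three rules say "positive"
```

With an odd number of binary classifiers a tie cannot occur; with other setups a tie-breaking rule would need to be chosen explicitly.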

    Read more →
  • Deep learning speech synthesis

    Deep learning speech synthesis

    Deep learning speech synthesis refers to the application of deep learning models to generate natural-sounding human speech from written text (text-to-speech) or spectrum (vocoder). Deep neural networks are trained using large amounts of recorded speech and, in the case of a text-to-speech system, the associated labels and/or input text. == Formulation == Given an input text or some sequence of linguistic units $Y$, the target speech $X$ can be derived by $X=\arg\max P(X\mid Y,\theta)$, where $\theta$ is the set of model parameters. Typically, the input text will first be passed to an acoustic feature generator, then the acoustic features are passed to the neural vocoder. For the acoustic feature generator, the loss function is typically L1 loss (Mean Absolute Error, MAE) or L2 loss (Mean Square Error, MSE). These loss functions impose a constraint that the output acoustic feature distributions must be Gaussian or Laplacian. In practice, since the human voice band ranges from approximately 300 to 4000 Hz, the loss function will be designed to have more penalty on this range: $\text{loss}=\alpha\,\text{loss}_{\text{human}}+(1-\alpha)\,\text{loss}_{\text{other}}$, where $\text{loss}_{\text{human}}$ is the loss from the human voice band, $\text{loss}_{\text{other}}$ is the loss from the remaining frequencies, and $\alpha$ is a scalar, typically around 0.5. The acoustic feature is typically a spectrogram or a Mel-scale spectrogram. These features capture the time-frequency relation of the speech signal, and thus are sufficient to generate intelligible outputs. The Mel-frequency cepstrum feature used in the speech recognition task is not suitable for speech synthesis, as it discards too much information. 
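The band-weighted loss in the Formulation section can be sketched as follows. This is an illustration under stated assumptions, not the loss of any particular system: it uses an L1 error per frequency bin, averages the error separately inside and outside the 300 to 4000 Hz band, and mixes the two with α. The function name, the bin frequencies and the sample values are all made up.

```python
def band_weighted_l1(pred, target, freqs_hz, alpha=0.5, band=(300.0, 4000.0)):
    """loss = alpha * loss_human + (1 - alpha) * loss_other, with L1 errors.

    pred, target: predicted and reference magnitudes, one per frequency bin.
    freqs_hz: centre frequency of each bin, in Hz.
    """
    in_band, out_band = [], []
    for p, t, f in zip(pred, target, freqs_hz):
        error = abs(p - t)
        if band[0] <= f <= band[1]:
            in_band.append(error)
        else:
            out_band.append(error)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return alpha * mean(in_band) + (1.0 - alpha) * mean(out_band)


# Hypothetical single frame with four frequency bins.
freqs = [100.0, 1000.0, 2000.0, 8000.0]
predicted = [1.0, 2.0, 3.0, 4.0]
reference = [1.0, 1.0, 1.0, 2.0]
loss = band_weighted_l1(predicted, reference, freqs)
```

Because each band is averaged separately, the voice band contributes a fixed share α of the loss even though it usually covers far fewer bins than the rest of the spectrum; raising α shifts more of the penalty onto that band.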
== History == In September 2016, DeepMind released WaveNet, which demonstrated that deep learning-based models are capable of modeling raw waveforms and generating speech from acoustic features like spectrograms or mel-spectrograms. Although WaveNet was initially considered too computationally expensive and slow to be used in consumer products, a year after its release DeepMind unveiled a modified version known as "Parallel WaveNet," a production model 1,000 times faster than the original. This was followed by Google AI's Tacotron 2 in 2018, which demonstrated that neural networks could produce highly natural speech synthesis but required substantial training data (typically tens of hours of audio) to achieve acceptable quality. Tacotron 2 used an encoder-decoder architecture with attention mechanisms to convert input text into mel-spectrograms, which were then converted to waveforms using a separate neural vocoder. When trained on smaller datasets, such as 2 hours of speech, the output quality degraded but the speech remained intelligible; with just 24 minutes of training data, Tacotron 2 failed to produce intelligible speech. In 2019, Microsoft Research introduced FastSpeech, which addressed speed limitations in autoregressive models like Tacotron 2. FastSpeech utilized a non-autoregressive architecture that enabled parallel sequence generation, significantly reducing inference time while maintaining audio quality. Its feedforward transformer network with length regulation allowed for one-shot prediction of the full mel-spectrogram sequence, avoiding the sequential dependencies that bottlenecked previous approaches. The same year saw the release of HiFi-GAN, a generative adversarial network (GAN)-based vocoder that improved the efficiency of waveform generation while producing high-fidelity speech. 
In 2020, the release of Glow-TTS introduced a flow-based approach that allowed for fast inference and voice style transfer capabilities. In March 2020, the free text-to-speech website 15.ai was launched. 15.ai gained widespread international attention in early 2021 for its ability to synthesize emotionally expressive speech of fictional characters from popular media with a minimal amount of data. The creator of 15.ai (known pseudonymously as 15) stated that 15 seconds of training data is sufficient to perfectly clone a person's voice (hence its name, "15.ai"), a significant reduction from the previously known data requirement of tens of hours. 15.ai is credited as the first platform to popularize AI voice cloning in memes and content creation. 15.ai used a multi-speaker model that enabled simultaneous training of multiple voices and emotions, implemented sentiment analysis using DeepMoji, and supported precise pronunciation control via ARPABET. The 15-second data efficiency benchmark was later corroborated by OpenAI in 2024. == Semi-supervised learning == Currently, self-supervised learning has gained much attention through better use of unlabelled data. Research has shown that, with the aid of self-supervised loss, the need for paired data decreases. == Zero-shot speaker adaptation == Zero-shot speaker adaptation is promising because a single model can generate speech with various speaker styles and characteristics. In June 2018, Google proposed to use pre-trained speaker verification models as speaker encoders to extract speaker embeddings. The speaker encoders then become part of the neural text-to-speech models, so that they can determine the style and characteristics of the output speech. This procedure has shown the community that it is possible to use only a single model to generate speech with multiple styles. 
== Neural vocoder == In deep learning-based speech synthesis, neural vocoders play an important role in generating high-quality speech from acoustic features. The WaveNet model proposed in 2016 achieves excellent performance on speech quality. WaveNet factorised the joint probability of a waveform $\mathbf{x}=\{x_{1},\dots,x_{T}\}$ as a product of conditional probabilities as follows: $p_{\theta}(\mathbf{x})=\prod_{t=1}^{T}p(x_{t}\mid x_{1},\dots,x_{t-1})$, where $\theta$ is the model parameter including many dilated convolution layers. Thus, each audio sample $x_{t}$ is conditioned on the samples at all previous timesteps. However, the auto-regressive nature of WaveNet makes the inference process very slow. To solve this problem, Parallel WaveNet was proposed. Parallel WaveNet is an inverse autoregressive flow-based model which is trained by knowledge distillation with a pre-trained teacher WaveNet model. Since such inverse autoregressive flow-based models are non-auto-regressive when performing inference, the inference speed is faster than real-time. Meanwhile, Nvidia proposed a flow-based WaveGlow model, which can also generate speech faster than real-time. However, despite the high inference speed, Parallel WaveNet has the limitation of needing a pre-trained WaveNet model, and WaveGlow takes many weeks to converge with limited computing devices. This issue has been solved by Parallel WaveGAN, which learns to produce speech through multi-resolution spectral loss and GAN learning strategies.
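The autoregressive factorisation described in the Neural vocoder section can be made concrete with a toy example. In the sketch below, `conditional` is a hypothetical stand-in for the trained network and works on binary samples instead of real audio; it is not WaveNet, only an illustration of the chain rule and of why generation has to proceed one sample at a time.

```python
import math


def conditional(history):
    """Hypothetical stand-in for the network: P(next sample | history) over {0, 1}."""
    if not history:
        return [0.5, 0.5]
    # Toy rule: the next sample tends to repeat the previous one.
    return [0.75, 0.25] if history[-1] == 0 else [0.25, 0.75]


def log_likelihood(x):
    """Chain rule: log p(x) = sum over t of log p(x_t | x_1, ..., x_{t-1})."""
    return sum(math.log(conditional(x[:t])[x[t]]) for t in range(len(x)))


def generate(length):
    """Greedy generation: every step needs all earlier outputs first."""
    samples = []
    for _ in range(length):
        probs = conditional(samples)
        samples.append(probs.index(max(probs)))
    return samples


waveform = [0, 0, 1]
likelihood = math.exp(log_likelihood(waveform))   # 0.5 * 0.75 * 0.25
generated = generate(4)
```

Evaluating the likelihood of a known waveform can use all positions at once, since the full history is available, but the loop in `generate` cannot be parallelised across time steps. That sequential dependency is the inference bottleneck the non-autoregressive models mentioned above are designed to avoid.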

    Read more →
  • Apache Parquet

    Apache Parquet

    Apache Parquet is a free and open-source column-oriented data storage format in the Apache Hadoop ecosystem inspired by Google Dremel interactive ad-hoc query system for analysis of read-only nested data. It is similar to RCFile and ORC, the other columnar-storage file formats in Hadoop, and is compatible with most of the data processing frameworks around Hadoop. It provides data compression and encoding schemes with enhanced performance to handle complex data in bulk. == History == The open-source project to build Apache Parquet began as a joint effort between Twitter and Cloudera using the record shredding and assembly algorithm as described in Google's Dremel. Parquet was designed as an improvement on the Trevni columnar storage format created by Doug Cutting, the creator of Hadoop. The name 'parquet' (lit. 'small compartment') refers to a style of decorative flooring and was chosen to "evoke the bottom layer of a database with an interesting layout". The first version, Apache Parquet 1.0, was released in July 2013. Since April 27, 2015, Apache Parquet has been a top-level Apache Software Foundation (ASF)-sponsored project. == Features == Apache Parquet is implemented using the record-shredding and assembly algorithm, which accommodates the complex data structures that can be used to store data. The values in each column are stored in contiguous memory locations, providing the following benefits: Column-wise compression is efficient in storage space Encoding and compression techniques specific to the type of data in each column can be used Queries that fetch specific column values need not read the entire row, thus improving performance Apache Parquet is implemented using the Apache Thrift framework, which increases its flexibility; it can work with a number of programming languages like C++, Java, Python, PHP, etc. 
As of August 2015, Parquet supports big-data-processing frameworks including Apache Hive, Apache Drill, Apache Impala, Apache Crunch, Apache Pig, Cascading, Presto and Apache Spark. It is one of the external data formats used by the pandas Python data manipulation and analysis library. == Compression and encoding == In Parquet, compression is performed column by column, which enables different encoding schemes to be used for text and integer data. This strategy also keeps the door open for newer and better encoding schemes to be implemented as they are invented. Parquet supports various compression formats: snappy, gzip, LZO, brotli, zstd, and LZ4. === Dictionary encoding === Parquet has an automatic dictionary encoding enabled dynamically for data with a small number of unique values (i.e. below 10^5) that enables significant compression and boosts processing speed. === Bit packing === Storage of integers is usually done with dedicated 32 or 64 bits per integer. For small integers, packing multiple integers into the same space makes storage more efficient. === Run-length encoding (RLE) === To optimize storage of multiple occurrences of the same value, run-length encoding is used, in which a single value is stored once along with the number of occurrences. Parquet implements a hybrid of bit packing and RLE, in which the encoding switches based on which produces the best compression results. This strategy works well for certain types of integer data and combines well with dictionary encoding. == Cloud Storage and Data Lakes == Parquet is widely used as the underlying file format in modern cloud-based data lake architectures. Cloud storage systems such as Amazon S3, Azure Data Lake Storage, and Google Cloud Storage commonly store data in Parquet format due to its efficient columnar representation and retrieval capabilities. 
Data lakehouse frameworks—including Apache Iceberg, Delta Lake, and Apache Hudi —build an additional metadata layer on top of Parquet files to support features such as schema evolution, time-travel queries, and ACID-compliant transactions. In these architectures, Parquet files serve as the immutable storage layer while the table formats manage data versioning and transactional integrity. == Comparison == Apache Parquet is comparable to RCFile and Optimized Row Columnar (ORC) file formats — all three fall under the category of columnar data storage within the Hadoop ecosystem. They all have better compression and encoding with improved read performance at the cost of slower writes. In addition to these features, Apache Parquet supports limited schema evolution, i.e., the schema can be modified according to the changes in the data. It also provides the ability to add new columns and merge schemas that do not conflict. Apache Arrow is designed as an in-memory complement to on-disk columnar formats like Parquet and ORC. The Arrow and Parquet projects include libraries that allow for reading and writing between the two formats. == Implementations == Known implementations of Parquet include:
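As a rough illustration of the ideas in the Compression and encoding section (dictionary encoding, run-length encoding and bit packing), the following plain-Python sketch encodes a small column of repeated strings. It is not one of the Parquet implementations and does not reproduce Parquet's on-disk layout; the function names and the sample column are assumptions made for the example.

```python
def dictionary_encode(values):
    """Replace each value by an index into a list of distinct values."""
    dictionary, index_of, indices = [], {}, []
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        indices.append(index_of[value])
    return dictionary, indices


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, count) for value, count in runs]


def bit_pack(values, width):
    """Pack non-negative integers into bytes, `width` bits per integer."""
    packed = 0
    for position, value in enumerate(values):
        if value >> width:
            raise ValueError(f"{value} does not fit in {width} bits")
        packed |= value << (position * width)
    return packed.to_bytes((len(values) * width + 7) // 8, "little")


def bit_unpack(data, width, count):
    """Inverse of bit_pack."""
    packed = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(packed >> (position * width)) & mask for position in range(count)]


# Hypothetical column with few distinct values.
column = ["US", "US", "US", "DE", "DE", "US", "FR", "FR", "FR", "FR"]
dictionary, indices = dictionary_encode(column)
runs = run_length_encode(indices)
packed = bit_pack(indices, width=2)      # 3 distinct values fit in 2 bits each
restored = [dictionary[i] for i in bit_unpack(packed, 2, len(indices))]
```

Here ten strings become a three-entry dictionary plus either four runs or three packed bytes. A hybrid scheme of the kind the text describes would pick, per stretch of data, whichever of the two index encodings is smaller.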

    Read more →
  • Dominic Harris

    Dominic Harris

    Dominic Harris (born 16 November 1976) is a British artist known for integrating modern technology and classical design in his interactive artworks. == Background == Dominic Harris was born in London on 16 November 1976, and grew up in London, Brussels, and Michigan before returning to London in 1995. Harris attended the Cranbrook Kingswood Upper School, then trained as an architect at the Bartlett School of Architecture, and has been ARB registered since 2011. Harris designs and fabricates his artworks at Dominic Harris Studio, a multi-disciplinary practice he founded in 2007. This studio consists of 25 people with diverse backgrounds including architecture, product design, electronics, programming, graphic design, and workshop skills. Harris uses the resources of his studio for the ongoing development, prototyping and production of his artworks. Harris also oversees the studio's international projects, where his fascinations are translated into larger-scale work spanning residential, retail, and public art projects. In 2015, Harris was granted permission by the Walt Disney Company to use its intellectual property for the purpose of making new interactive artworks. Harris is the only artist to gain permission to use Disney's back catalogue of characters, which led him to create his interactive versions of "Snow White and the Seven Dwarfs" and "Mickey and Minnie: An Interactive Diptych". Harris is fascinated by the idea of using data streams, algorithms, and computer code to generate dynamic and ever-changing artworks. He sees data as a raw material that can be transformed into visual poetry. Many of his installations and sculptures are interactive, responding to the presence and movement of viewers/participants. This creates an immersive experience where the observer becomes part of the artwork itself. 
Harris is also the founding partner of a sister studio in London called Cinimod Studio that creates large commissioned installations, interactive events and lighting designs for large brands. == Works == == Exhibitions == The works of Dominic Harris have been exhibited internationally, both through direct and gallery representation. Solo shows: "Feeding Consciousness" at Halcyon Gallery, Mayfair, London, UK – 2023; "US: NOW" at Halcyon Gallery, Mayfair, London, UK – 2020; "Imagine" at Halcyon Gallery, Mayfair, London, UK – 2019; "5 Year Celebration", Priveekollektie Contemporary Art | Design, London, UK – 2016; "Moments of Reflection" at PHOS ART + DESIGN, Mayfair, London, UK – 2015. Recent exhibitions include: In Plain Sight, 2024 Halcyon Gallery Victoria & Albert Museum Dublin Science Museum Design Miami / Basel Design Miami Art Miami Art 14, London PAD Paris PAD London Art Geneva == Gallery Representation == 2010 to 2019: Dominic Harris was represented by Priveekollektie Contemporary Art | Design, a Dutch gallery based in Heusden, the Netherlands, with a regular presence on the international art and design circuits. 2015: Dominic Harris was shown with PHOS ART + DESIGN Gallery, in Mayfair, London, UK. 2019 – ongoing: Dominic Harris is exclusively represented by the Halcyon Gallery, an established international gallery based in Mayfair, London. == Collections == The majority of Harris's work has been bought by private collectors. Since 2012, Harris's work has also been acquired by several large institutional collections, including the Borusan Contemporary Art Collection in Istanbul. Harris's artworks are held by some of the biggest and most respected international art collectors and are also displayed in public spaces. == Books == Dominic Harris: Feeding Consciousness. Halcyon Gallery, 2023. Imagine: Dominic Harris (exhibition catalogue). Halcyon Gallery, 2019. 
A Touch Of Code: documents the "Beacon" art installation and "Flutter" artwork (ISBN 978-3899553314). Dominic Harris, Artworks, Edition Eight (ISBN 978-0957306325). Digital Real: Kunst & Nachhaltigkeit Vol 8.

    Read more →
  • Tip and cue

    Tip and cue

    Tip and cue, sometimes referred to as tip and que, tipping and cueing, or tipping and queing, is a method for satellite imagery and reconnaissance satellites to automatically coordinate tracking of objects across different satellites in real or near real-time. This technique ensures continuous tracking of targets as they move across different regions by handing them off between satellites, sharing satellite imagery and collateral across discrete satellites. The coordination between various satellites and their complementary sensors allows for more accurate and efficient data collection. This system is particularly useful in scenarios requiring real-time monitoring and rapid response; the method significantly improves situational awareness and operational effectiveness. Tip and cue techniques involve integrating various sensor systems, each playing a specific role in the tracking process. As a target moves, it is handed off from one satellite to another, ensuring continuous monitoring. This coordination optimizes data collection and analysis, enhancing overall tracking accuracy. The real-time information gathered by these satellites is critical for decision-making in various applications, including defense and surveillance. By leveraging multiple satellites and their sensors, it provides broader coverage and more reliable tracking, and the continuous handoff between satellites ensures there are no gaps in monitoring, essential for high-stakes applications. The real-time data provided by this system allows for timely and informed decisions, improving response times and outcomes. Tip and cue methodologies are a part of geospatial intelligence, or GEOINT. Robert Cardillo, a former director of the National Geospatial-Intelligence Agency, highlighted the importance of tip and cue methods to their data collection efforts in 2015. 
== Historical Development == The concept of tip and cue in satellite monitoring has its origins in early military applications designed to enhance missile detection and tracking systems. During the Cold War, advancements in infrared sensing technologies laid the groundwork for more sophisticated tip and cue techniques. The integration of different sensor types, such as radar and optical sensors, in the 1990s expanded the capabilities of tip and cue systems beyond military applications. These advancements have made tip and cue techniques essential for various civilian uses, including disaster monitoring and environmental surveillance. Significant progress was made with the advent of high-speed data processing and communication technologies in the early 2000s, further refining the method. Advanced algorithms and data fusion techniques have been introduced to better integrate information from multiple sensors. Machine learning technologies now play a crucial role in improving detection and prediction capabilities, allowing for more adaptive and efficient tracking. Richmond and Brennan of Lockheed Martin, presenting to the annual technical conference of the Maui Space Surveillance Complex (formerly the Air Force Maui Optical Station (AMOS)), discussed the algorithms needed for 'tip and cue', to facilitate "multi-phenomenology data fusion." The Space Surveillance Telescope (SST) at Naval Communication Station Harold E. Holt in Australia, operated by the United States Space Force and designed by the Massachusetts Institute of Technology Lincoln Laboratory, was reported by the Defense Advanced Research Projects Agency (DARPA) to be a leader in creating and improving tip and cue techniques, from a large library of orbital object data. == Technical overview == Tip and cue systems utilize a network of at least two satellites equipped with complementary sensor technologies to track moving objects in real-time. 
The method involves detecting a target with a primary sensor, such as an infrared or photographic sensor, which then cues secondary sensors on the same or other satellites for more detailed monitoring. This handoff process between discrete systems ensures continuous tracking as the target moves across different areas, leveraging each system's strengths. Data collected by these systems and sensors are rapidly processed and shared among the network, enhancing situational awareness. This coordination optimizes resource usage and improves the accuracy of tracking moving objects over large areas. The primary sensors detect initial targets based on specific signatures, such as heat or movement, and then cue secondary sensors to gather more precise data. This ensures that each sensor operates within its optimal range, maintaining high tracking accuracy and reliability. The integration of various sensor types, including optical, radar, and infrared, allows the system to function effectively under different conditions and environments. Real-time data processing and communication between satellites and ground stations are crucial for timely and accurate target tracking. Satellites using tip and cue processes may use either passive or active scanning methodologies. These systems may also leverage both orbital and ground-based ELINT (electronic intelligence). == Known use cases == Tip and cue systems have been extensively utilized in military applications, particularly for missile detection and defense. These systems enable early detection of missile launches using infrared sensors, which then cue other sensors to track the missile's trajectory more accurately. In environmental monitoring, tip and cue techniques help track natural disasters such as wildfires and hurricanes by coordinating various satellite sensors for comprehensive data collection and analysis. 
Surveillance and reconnaissance operations also benefit from tip and cue systems, which provide continuous and precise tracking of moving objects, enhancing situational awareness. Additionally, these systems are used in maritime surveillance to monitor ship movements and detect illegal activities such as smuggling and piracy. Tip and cue systems are used in disaster management. For instance, during wildfires, infrared sensors can detect heat signatures, prompting other sensors to gather detailed imagery and data on fire spread and intensity. This coordinated approach allows for real-time monitoring and rapid response, crucial for mitigating damage and saving lives. Similarly, in hurricane tracking, satellites equipped with various sensors can monitor storm development and progression, providing timely information for emergency management agencies. The integration of multiple sensor types ensures accurate and comprehensive coverage of these dynamic and fast-changing events. In maritime surveillance, or maritime domain awareness (MDA), tip and cue systems enhance the detection and monitoring of vessel movements, contributing to maritime security. By coordinating satellite sensors, these systems can track ships over vast ocean areas, identifying potential threats or illegal activities such as smuggling, piracy, and illegal fishing. The ability to maintain continuous surveillance and share data in real-time with maritime authorities improves response times and enforcement capabilities. This application of tip and cue systems not only aids in law enforcement but also supports environmental conservation efforts by monitoring protected marine areas. The Automatic Identification System (AIS) is one of the most important sources of data for MDA agencies. AIS lets ships know each other's whereabouts: vessels transmit a signal from ship to ship and to shore. 
More recently, the system has been extended to satellites (so-called satellite AIS), which makes it more effective. According to the International Maritime Organization, all ocean-going vessels above 300 tons are supposed to carry and transmit via AIS. The satellite constellations help facilitate this with tip and cue methodologies.
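
The handoff described in the technical overview, where a primary sensor detects a target and cues secondary sensors for closer monitoring, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Tip` and `Satellite` classes, the rectangular coverage footprints, the satellite names, and the rule that only satellites with a different sensor type are cued. It does not represent any operational system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tip:
    """Initial detection reported by a primary sensor (hypothetical structure)."""
    target_id: str
    lat: float
    lon: float
    signature: str  # e.g. "heat" for an infrared detection


@dataclass(frozen=True)
class Satellite:
    """Toy satellite with one sensor and a rectangular coverage footprint."""
    name: str
    sensor: str        # "infrared", "radar", or "optical"
    lat_range: tuple   # (min_lat, max_lat), an assumed simplification
    lon_range: tuple   # (min_lon, max_lon), an assumed simplification

    def covers(self, lat: float, lon: float) -> bool:
        """True if the point lies inside this satellite's footprint."""
        return (self.lat_range[0] <= lat <= self.lat_range[1]
                and self.lon_range[0] <= lon <= self.lon_range[1])


def cue_secondary(tip: Tip, primary: Satellite, fleet: list) -> list:
    """Return names of satellites to cue for a tip.

    A satellite is cued if it is not the primary, carries a different
    (complementary) sensor type, and its footprint covers the tip location.
    """
    return [
        sat.name
        for sat in fleet
        if sat is not primary
        and sat.sensor != primary.sensor
        and sat.covers(tip.lat, tip.lon)
    ]


# Hypothetical fleet: one infrared "tipper" and three candidate "cued" satellites.
ir_sat = Satellite("IR-1", "infrared", (-10.0, 40.0), (0.0, 60.0))
fleet = [
    ir_sat,
    Satellite("SAR-1", "radar", (20.0, 50.0), (30.0, 90.0)),
    Satellite("OPT-1", "optical", (-60.0, 0.0), (0.0, 60.0)),
    Satellite("OPT-2", "optical", (25.0, 45.0), (40.0, 70.0)),
]

# The infrared satellite tips a heat signature; matching satellites are cued.
tip = Tip("T-001", lat=30.0, lon=50.0, signature="heat")
cued = cue_secondary(tip, ir_sat, fleet)
print(cued)
```

In this toy setup the radar and one optical satellite are selected because their footprints contain the tipped location. A real system would presumably also weigh orbit timing, sensor tasking constraints, and data latency, none of which are modelled here.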

    Read more →