AI App How To Use

AI App How To Use — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Inverse consistency

    Inverse consistency

    In image registration, inverse consistency measures the consistency of mappings between images produced by a registration algorithm. The inverse consistency error, introduced by Christensen and Johnson in 2001, quantifies the distance between the identity function and the composition of the two mappings that the registration procedure produces from each image to the other. It is used as a regularisation constraint in the loss function of many registration algorithms to enforce consistent mappings. Inverse consistency is necessary for good image registration but not sufficient, since a mapping can be perfectly consistent and still fail to register the images at all.

    == Definition ==

    Image registration is the process of establishing a common coordinate system between two images. Given two images

    $$I_1 : \Omega_1 \to \mathbb{R}, \qquad I_2 : \Omega_2 \to \mathbb{R},$$

    registering a source image $I_1$ to a target image $I_2$ consists of determining a transformation $f_1 : \Omega_2 \to \Omega_1$ that maps points from the target space to the source space.

    An ideal registration algorithm should not be sensitive to which image in the pair is used as source or target, and the registration operator should be antisymmetric, such that the mappings

    $$f_1 : \Omega_2 \to \Omega_1, \qquad f_2 : \Omega_1 \to \Omega_2$$

    produced when registering $I_1$ to $I_2$ and $I_2$ to $I_1$ respectively are the inverse of each other, i.e. $f_2 = f_1^{-1}$ and $f_1 = f_2^{-1}$ or, equivalently, $f_2 \circ f_1 = \operatorname{id}_{\Omega_2}$ and $f_1 \circ f_2 = \operatorname{id}_{\Omega_1}$, where $\circ$ denotes the function composition operator.

    Real algorithms are not perfect, and when the roles of source and target image are swapped in a registration problem, the transformations obtained are not the inverse of each other. Inverse consistency can be enforced by adding to the loss function of the registration a symmetric regularisation term that penalises inconsistent transformations:

    $$\int_{\Omega_2} \left\Vert f_2(f_1(x)) - x \right\Vert^2 \mathrm{d}x + \int_{\Omega_1} \left\Vert f_1(f_2(x)) - x \right\Vert^2 \mathrm{d}x.$$

    Inverse consistency can also be used as a quality metric to evaluate image registration results. The inverse consistency error ($ICE$) measures the distance between the composition of the two transforms and the identity function, and it can be formulated as either an average ($ICE_a$) or a maximum ($ICE_m$) over a region of interest $\Omega$ of the image:

    $$\begin{aligned} ICE_a &= \frac{1}{\int_{\Omega} \mathrm{d}x} \int_{\Omega} \left\Vert f_2(f_1(x)) - x \right\Vert \mathrm{d}x \\ ICE_m &= \max_{x \in \Omega} \left\Vert f_2(f_1(x)) - x \right\Vert. \end{aligned}$$

    While inverse consistency is a necessary property of good registration algorithms, the inverse consistency error alone is not a sufficient metric to evaluate the quality of image registration results, since a perfectly consistent mapping, with no other constraint, may not even come close to correctly registering a pair of images.
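The average and maximum inverse consistency errors can be approximated numerically by sampling the region of interest on a grid. The following is a minimal sketch under stated assumptions, not a reference implementation: the transforms `f1`, `f2_exact` and `f2_biased` and the sampling grid are hypothetical and chosen only for illustration, and the normalised integral is replaced by a plain mean over the sample points.

```python
import math

def inverse_consistency_error(f1, f2, points):
    """Return (ICE_a, ICE_m) for the composition f2(f1(x)) over sample points."""
    distances = [math.dist(f2(f1(x)), x) for x in points]
    # The mean over evenly spaced samples approximates the normalised integral.
    return sum(distances) / len(distances), max(distances)

# Hypothetical forward transform: anisotropic scaling plus a shift.
def f1(p):
    return (2.0 * p[0] + 1.0, 0.5 * p[1] - 3.0)

# Its exact inverse, so f2_exact(f1(x)) == x.
def f2_exact(p):
    return ((p[0] - 1.0) / 2.0, (p[1] + 3.0) / 0.5)

# An inconsistent "inverse" that is off by a constant offset of length 0.5.
def f2_biased(p):
    x, y = f2_exact(p)
    return (x + 0.3, y + 0.4)

# Regular grid standing in for the region of interest.
grid = [(i * 0.5, j * 0.5) for i in range(21) for j in range(21)]

ice_exact = inverse_consistency_error(f1, f2_exact, grid)
ice_biased = inverse_consistency_error(f1, f2_biased, grid)
print(ice_exact, ice_biased)
```

The exact inverse gives an error of zero, while the biased one gives an average and maximum error of about 0.5, the length of the offset. Note that a pair of identity maps would also score zero, which is why the error cannot measure registration quality on its own.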

  • Content-based image retrieval

    Content-based image retrieval

    Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. Content-based image retrieval is opposed to traditional concept-based approaches (see Concept-based image indexing).

    "Content-based" means that the search analyzes the contents of the image rather than metadata such as keywords, tags, or descriptions associated with the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. CBIR is desirable because searches that rely purely on metadata depend on the quality and completeness of the annotations.

    == Comparison with metadata searching ==

    An image meta search requires humans to have manually annotated images by entering keywords or metadata in a large database, which can be time-consuming and may not capture the keywords desired to describe the image. The evaluation of the effectiveness of keyword image search is subjective and has not been well defined. CBIR systems face similar challenges in defining success. Keywords also limit the scope of queries to a set of predetermined criteria and, having been set up beforehand, are less reliable than the content itself.

    == History ==

    The term "content-based image retrieval" seems to have originated in 1992, when it was used by Japanese Electrotechnical Laboratory engineer Toshikazu Kato to describe experiments into automatic retrieval of images from a database, based on the colors and shapes present. Since then, the term has been used to describe the process of retrieving desired images from a large collection on the basis of syntactical image features.
    The techniques, tools, and algorithms that are used originate from fields such as statistics, pattern recognition, signal processing, and computer vision.

    === QBIC - Query By Image Content ===

    The earliest commercial CBIR system was developed by IBM and was called QBIC (Query By Image Content). While the storing of multiple images as part of a single entity preceded the term BLOB (Binary Large OBject), the ability to fully search by content, rather than by description, had to await IBM's QBIC. Recent network- and graph-based approaches have presented a simple and attractive alternative to existing methods.

    == Technical progress ==

    The interest in CBIR has grown because of the limitations inherent in metadata-based systems, as well as the large range of possible uses for efficient image retrieval. Textual information about images can be easily searched using existing technology, but this requires humans to manually describe each image in the database. This can be impractical for very large databases or for images that are generated automatically, e.g. those from surveillance cameras. It is also possible to miss images that use different synonyms in their descriptions. Systems based on categorizing images in semantic classes, like "cat" as a subclass of "animal", can avoid the miscategorization problem, but require more effort from a user to find images that might be "cats" but are only classified as "animal". Many standards have been developed to categorize images, but all still face scaling and miscategorization issues.

    Initial CBIR systems were developed to search databases based on image color, texture, and shape properties. After these systems were developed, the need for user-friendly interfaces became apparent. Therefore, efforts in the CBIR field started to include human-centered design that tried to meet the needs of the user performing the search.
    This typically means inclusion of: query methods that may allow descriptive semantics, queries that may involve user feedback, systems that may include machine learning, and systems that may understand user satisfaction levels.

    == Techniques ==

    Many CBIR systems have been developed, but as of 2006, the problem of retrieving images on the basis of their pixel content remained largely unsolved. Different query techniques and implementations of CBIR make use of different types of user queries.

    === Query By Example ===

    QBE (Query By Example) is a query technique that involves providing the CBIR system with an example image that it will then base its search upon. The underlying search algorithms may vary depending on the application, but result images should all share common elements with the provided example. Options for providing example images to the system include:

      • A preexisting image may be supplied by the user or chosen from a random set.
      • The user draws a rough approximation of the image they are looking for, for example with blobs of color or general shapes.

    This query technique removes the difficulties that can arise when trying to describe images with words.

    === Semantic retrieval ===

    Semantic retrieval starts with a user making a request like "find pictures of Abraham Lincoln". This type of open-ended task is very difficult for computers to perform: Lincoln may not always be facing the camera or in the same pose. Many CBIR systems therefore generally make use of lower-level features like texture, color, and shape. These features are used in combination either with interfaces that allow easier input of the criteria or with databases that have already been trained to match features (such as faces, fingerprints, or shape matching). However, in general, image retrieval requires human feedback in order to identify higher-level concepts.
    === Relevance feedback (human interaction) ===

    Combining the available CBIR search techniques with the wide range of potential users and their intent can be a difficult task. Part of making CBIR successful relies entirely on the ability to understand the user's intent. CBIR systems can make use of relevance feedback, where the user progressively refines the search results by marking images in the results as "relevant", "not relevant", or "neutral" to the search query, then repeating the search with the new information. Examples of this type of interface have been developed.

    === Iterative/machine learning ===

    Machine learning and the application of iterative techniques are becoming more common in CBIR.

    === Other query methods ===

    Other query methods include browsing for example images, navigating customized/hierarchical categories, querying by image region (rather than the entire image), querying by multiple example images, querying by visual sketch, querying by direct specification of image features, and multimodal queries (e.g. combining touch, voice, etc.).

    == Content comparison using image distance measures ==

    The most common method for comparing two images in content-based image retrieval (typically an example image and an image from the database) is using an image distance measure. An image distance measure compares the similarity of two images in various dimensions such as color, texture, shape, and others. For example, a distance of 0 signifies an exact match with the query, with respect to the dimensions that were considered. A value greater than 0 indicates a lesser degree of similarity between the images. Search results can then be sorted based on their distance to the queried image. Many measures of image distance (similarity models) have been developed.
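Sorting results by an image distance measure can be sketched in a few lines. This is only an illustration under stated assumptions: the file names and three-component feature vectors below are hypothetical, and Euclidean distance stands in for whichever distance measure a real system would use.

```python
import math

def rank_by_distance(query, database):
    """Return (distance, name) pairs sorted from most to least similar."""
    return sorted(
        (math.dist(query, features), name) for name, features in database.items()
    )

# Hypothetical feature vectors, e.g. one score each for colour, texture and shape.
database = {
    "sunset.jpg": (0.9, 0.2, 0.1),
    "forest.jpg": (0.1, 0.8, 0.3),
    "beach.jpg": (0.8, 0.3, 0.2),
}

# Querying with the features of an image that is already in the database.
query = (0.9, 0.2, 0.1)
ranking = rank_by_distance(query, database)
for distance, name in ranking:
    print(f"{name}: {distance:.3f}")
```

The image whose features equal the query comes first with a distance of 0, and the remaining images follow in order of increasing distance.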
    === Color ===

    Computing distance measures based on color similarity is achieved by computing a color histogram for each image that identifies the proportion of pixels within an image holding specific values. Examining images based on the colors they contain is one of the most widely used techniques because it can be completed without regard to image size or orientation. However, research has also attempted to segment color proportion by region and by spatial relationship among several color regions.

    === Texture ===

    Texture measures look for visual patterns in images and how they are spatially defined. Textures are represented by texels, which are then placed into a number of sets, depending on how many textures are detected in the image. These sets define not only the texture, but also where in the image the texture is located.

    Texture is a difficult concept to represent. The identification of specific textures in an image is achieved primarily by modeling texture as a two-dimensional gray level variation. The relative brightness of pairs of pixels is computed such that the degree of contrast, regularity, coarseness and directionality may be estimated. The problem is in identifying patterns of co-pixel variation and associating them with particular classes of textures such as silky or rough.

    Other methods of classifying textures include:

      • Co-occurrence matrix
      • Laws texture energy
      • Wavelet transform
      • Orthogonal transforms (discrete Chebyshev moments)
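As an illustration of how the relative brightness of pixel pairs can yield a contrast estimate, the sketch below builds a gray-level co-occurrence matrix for one pixel offset and derives a contrast value from it. It is a simplified example under stated assumptions: the tiny images, the four gray levels and the function names are hypothetical, and real systems typically combine several offsets and statistics.

```python
def cooccurrence(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix of gray levels for pixel pairs at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(len(image)):
        for x in range(len(image[0])):
            ny, nx = y + dy, x + dx
            if 0 <= ny < len(image) and 0 <= nx < len(image[0]):
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]

def contrast(p):
    """Average squared gray-level difference between paired pixels."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))

flat = [[1, 1, 1],
        [1, 1, 1]]
stripes = [[0, 3, 0],
           [0, 3, 0]]  # vertical stripes: brightness changes only horizontally

flat_contrast = contrast(cooccurrence(flat, levels=4))
horizontal_contrast = contrast(cooccurrence(stripes, levels=4, dx=1, dy=0))
vertical_contrast = contrast(cooccurrence(stripes, levels=4, dx=0, dy=1))
print(flat_contrast, horizontal_contrast, vertical_contrast)
```

The flat image has zero contrast. The striped image has high contrast for horizontal pixel pairs and none for vertical pairs, which is the kind of directionality cue the text refers to.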

  • Neuroshima

    Neuroshima

    Neuroshima is a Polish tabletop roleplaying system inspired by such films and games as Mad Max, Fallout, The Matrix, Terminator and Deadlands: Hell on Earth. It is currently available only in Polish. The game's motto is "never trust the machines". Its designers include Michal Oracz and Ignacy Trzewiczek.

    == Setting ==

    The game describes the United States in the mid-21st century, after a nuclear war started by a cybernetic revolt turned the continent into a barren wasteland. The war appears to have been started by a sentient artificial intelligence commonly referred to as Moloch, made up of an interconnected network of military computers, automated factories, military facilities, power plants and the like, which now covers the whole north of the U.S., from Oregon to the Great Lakes. In the south there is another creation, called the Neojungle, that poses a threat to those who survived the war. It is a semi-intelligent carnivorous vegetation that grows very quickly, advancing north from Latin America. Right in the middle, there are humans. They are surrounded by mutant creatures, some bred by Moloch and hostile towards humans, and some simply animals and humans misshapen by nuclear fallout. On top of that, Moloch's deadly machines lurk everywhere. But what the book stresses is that the worst enemy of humans is within them: hatred, indifference, greed.

    === Landscapes of Neuroshima ===

    Car wrecks, ruined towns and villages, collapsed roofs on deserted houses, and broken glass in the windows of abandoned gas stations fill the landscape of the United States of the mid-21st century. Technology is history: cars will not start, radios are jammed, and there is no electricity almost anywhere the characters go. Shops and malls are looted, prosperous villages are burned by gangers, and safe places are very sparse.
    === People in Neuroshima ===

    No one knows how many people survived the war with the machines, but their number is estimated at around 2-3 million. Some people have reverted to nomadic lifestyles and live in the deserts, some try to build civilisation anew in devastated cities, some form gangs of highwaymen (called gangers), some just try to make a living by growing crops, and finally there are those who simply wander the wasteland; this adventuring sort is mostly represented by player characters. Each village they visit is a discrete microcosm, and it is never certain whether the inhabitants will welcome strangers or shoot them on sight. The continent is full of small, anonymous settlements, but there are also places which aspire to become post-nuclear states.

    === Places in Neuroshima ===

    In this world it is very important where you come from, because people are prejudiced and afraid of strangers. Different places produce different kinds of people, and who you are is determined by where you are from. Examples:

      • The Southern Hegemony (commonly referred to as 'the Hegemony') - located in what was once Arizona, New Mexico and part of Texas. A place where brute force determines one's place in society. Dominated by gangs and unhampered by Moloch, the Hegemony is a threat to neighbouring lands.
      • Vegas - the only well-lit city in the post-apocalyptic world. Home to many playhouses and casinos, it attracts people from every part of the country.
      • Mother Desert - if you were born in the desert, you feel at home whenever you leave civilisation. Many Native Americans still live out there and are doing fine; after all, the warheads did not hit the deserts.
      • Detroit - known for some of the best drivers and racers in the post-nuclear US. Home of many gangs, such as The Shultz (mafia styled), Hurons (punkers), The League (racers), Parker Lots (gothic assassins) and the Gas Drinkers (mutant barbarians).
      • New York - a place which has established a strong government and would like to rebuild America. They maintain schools, factories and railways and send soldiers to fight Moloch. Surprisingly enough, they sometimes succeed.
      • Texas - the healthiest place in America, and the only place where one can find green vegetation. Modern Texans still grow crops, breed horses and herd cattle, as their ancestors did in the 19th century.
      • The Appalachian Federation - a place ruled by feudal lords. It has a social class system in which people are divided into nobility and peasantry. Thanks to its iron and coal deposits, it is one of the richest places in the post-nuclear U.S.
      • The Outpost - a mobile settlement run by scientists who aim to destroy Moloch. In coalition with New York, they manage an army, which has yet to stop Moloch's advance south. They steal technology from the machines they destroy and apply it to their own advantage.

    == System ==

    The game uses its own custom system of rules, played with twenty-sided dice (d20). The system does not have an official name, but it is unconnected to the d20 system, as it typically uses three twenty-sided dice.

    === Four colours ===

    Neuroshima relies on the division of the gameplay into something the authors called the Four Colours, namely steel, chrome, rust and mercury. The choice of a particular colour is made by the gamemaster (who can consult the players in order to enhance the game experience) and determines the mood, atmosphere and the type of events and characters present in the story. The name of the colour itself implies the kind of gameplay it symbolises. These colours are:

      • Steel - this kind of gameplay is characterised by a slightly optimistic attitude towards the world. The aim is to raise the spirit of the characters by showing them that the ongoing war with the machines may be a difficult one, but it is not unwinnable, and that humans, when strong and united, can build the world anew. Example: a unit of soldiers dispatched from the Outpost is sent to build a bunker and establish a relay base far in the north in order to plan a counter-tactic against Moloch's advance south.
      • Chrome - characterised by a hedonistic attitude. The characters are supposed to enjoy whatever is left of the world after the war, and the story is supposed to allow them to do that. Example: the characters are offered a well-paid job by a local ganger boss who extorts wares from local tradesmen. Their job is to drive around the county, pick up the extorted items and trade them for drugs.
      • Rust - a depressing, pessimistic mood. The characters will encounter rust, dilapidation and ruin everywhere they go. All the elements and NPCs of a story played in this mood are supposed to put the characters down and destroy their spirit. Example: the characters, badly wounded after a gunfight and robbed of all their possessions, find refuge in a village which is constantly raided by gangers. The characters' quest is to repel those attacks, but the enemies outnumber them and are well equipped, whereas the characters have nothing to fight with.
      • Mercury (Quicksilver) - the most depressing side of the game; stories played in this mood usually end with the death of all the characters. The aim of this mood is to show that any kind of action undertaken is futile and that the war is already over, hence all the people are already dead, which is a fact they just need to realise. Example: a group of soldiers stationed in a bunker is awaiting an attack by mutants. They are well armed and trained, but there is a mistake in the intelligence they were given, and they do not yet know that they are seriously outnumbered. The attack commences at dusk and it is already too late to retreat, so the characters decide to seal off the bunker, hoping that the mutants will not be able to get inside and will simply go away. The mutants attack the bunker with chemical weapons instead. The characters do not have enough gas masks to go around. As a result, those strong enough will kill the weaker ones to get their masks, not knowing that the mutants will blow up the sealed entrance the following morning.

    == Official rulebooks and sourcebooks ==

    The current edition is 1.5 [1]. Sourcebooks have been appearing since the release of the game in 2003. The game grows with every add-on, as does the storyline, which is updated in those sourcebooks and in Space Pirate (pl. Gwiezdny Pirat) magazine, also published by Portal.

    === List of released rulebooks and sourcebooks ===

      • Neuroshima 1.0 - the original edition of the core rulebook (out of print).
      • Neuroshima 1.5 - enhanced and revised core rulebook, with new material added and some material cut out.
      • Wyścig (The Race) - sourcebook dedicated to cars and racing; contains rules concerning building your own vehicle and new character classes connected with driving.
      • Gladiator - sourcebook describing in detail the "Gladiator" character class.
      • Supplement (Supplement) - sourcebook revising the core rulebook.
      • Detroit - sourcebook describing the city of Detroit, its inhabi

  • Fuzzy Control Language

    Fuzzy Control Language

    Fuzzy Control Language, or FCL, is a language for implementing fuzzy logic, especially fuzzy control. It was standardized by IEC 61131-7. It is a domain-specific programming language: it has no features unrelated to fuzzy logic, so it is impossible even to print "Hello, world!". Therefore, one does not write a whole program in FCL, but one may write part of it in FCL.

    == Example ==

    RULE 0: IF (temperature IS cold) THEN (output IS low)
    RULE 1: IF (temperature IS very cold) THEN (output IS high)

    == Limitations ==

    FCL is not an entirely complete fuzzy language. For instance, it does not support "hedges", which are adverbs that modify the set. The programmer cannot write:

    RULE 0: If (Temperature is VERY COLD) then (Output is VERY HIGH)

    However, the programmer can simply define new sets for "very cold" and "very high". FCL also lacks support for higher-order fuzzy sets, subsets, and so on. None of these features are essential to fuzzy control, although they may be nice to have.
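To make the workaround concrete, the following Python sketch evaluates the two example rules with "cold" and "very cold" defined as two separate fuzzy sets rather than through a hedge. It is not FCL and does not reproduce any FCL runtime: the membership shapes, the temperature breakpoints, the singleton output values and the weighted-average defuzzification are all assumptions chosen for illustration.

```python
def ramp_down(x, full, zero):
    """Membership that is 1 at or below `full` and falls linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)

# "very cold" is defined as its own set instead of applying a hedge to "cold".
def cold(temperature):
    return ramp_down(temperature, full=5.0, zero=15.0)

def very_cold(temperature):
    return ramp_down(temperature, full=-5.0, zero=5.0)

# Hypothetical singleton values for the output terms.
OUTPUT = {"low": 20.0, "high": 80.0}

def infer(temperature):
    """Fire both rules and combine the outputs with a firing-strength weighted average."""
    firing = {
        "low": cold(temperature),        # RULE 0: IF temperature IS cold THEN output IS low
        "high": very_cold(temperature),  # RULE 1: IF temperature IS very cold THEN output IS high
    }
    total = sum(firing.values())
    if total == 0:
        return None  # no rule fired
    return sum(firing[term] * OUTPUT[term] for term in firing) / total

for t in (20.0, 10.0, 0.0, -10.0):
    print(t, infer(t))
```

At 10 degrees only the "cold" rule fires and the output is the "low" value. At 0 degrees both rules fire and the output moves towards "high".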

  • Convolutional neural network

    Convolutional neural network

    A convolutional neural network (CNN) is a type of feedforward neural network that learns features via filter (or kernel) optimization. This type of deep learning network has been applied to process and make predictions from many different types of data, including text, images and audio. CNNs are the de facto standard in deep learning-based approaches to computer vision and image processing, and have only recently been replaced, in some cases, by newer architectures such as the transformer.

    Vanishing gradients and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by the regularization that comes from using shared weights over fewer connections. For example, each neuron in a fully connected layer would require 10,000 weights to process an image of 100 × 100 pixels. With cascaded convolution (or cross-correlation) kernels, however, only 25 weights per convolutional layer are required to process 5 × 5 tiles. Higher-layer features are extracted from wider context windows than lower-layer features.

    Some applications of CNNs include: image and video recognition, recommender systems, image classification, image segmentation, medical image analysis, natural language processing, brain–computer interfaces, and financial time series.

    CNNs are also known as shift invariant or space invariant artificial neural networks, based on the shared-weight architecture of the convolution kernels or filters that slide along input features and provide translation-equivariant responses known as feature maps. Counter-intuitively, most convolutional neural networks are not invariant to translation, due to the downsampling operation they apply to the input.

    Feedforward neural networks are usually fully connected networks, that is, each neuron in one layer is connected to all neurons in the next layer. The "full connectivity" of these networks makes them prone to overfitting data.
    Typical ways of regularization, or preventing overfitting, include penalizing parameters during training (such as weight decay) or trimming connectivity (skipped connections, dropout, etc.). Robust datasets also increase the probability that CNNs will learn the generalized principles that characterize a given dataset rather than the biases of a poorly populated set.

    Convolutional networks were inspired by biological processes in that the connectivity pattern between neurons resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap such that they cover the entire visual field.

    CNNs use relatively little pre-processing compared to other image classification algorithms. This means that the network learns to optimize the filters (or kernels) through automated learning, whereas in traditional algorithms these filters are hand-engineered. This simplifies and automates the process, enhancing efficiency and scalability by overcoming human-intervention bottlenecks.

    == Architecture ==

    A convolutional neural network consists of an input layer, hidden layers and an output layer. In a convolutional neural network, the hidden layers include one or more layers that perform convolutions. Typically this includes a layer that performs a dot product of the convolution kernel with the layer's input matrix. This product is usually the Frobenius inner product, and its activation function is commonly ReLU. As the convolution kernel slides along the input matrix for the layer, the convolution operation generates a feature map, which in turn contributes to the input of the next layer. This is followed by other layers such as pooling layers, fully connected layers, and normalization layers. In this respect, a convolutional neural network is closely related to a matched filter.
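The sliding-kernel operation can be sketched directly in plain Python. This is a minimal illustration under stated assumptions, not how any particular framework implements it: it handles a single channel with no padding and a stride of 1, and the small image and the edge-detecting kernel are hypothetical. Each output value is the Frobenius inner product of the kernel with the image window beneath it, followed by ReLU.

```python
def cross_correlate2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Frobenius inner product of the kernel with the current window.
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, value) for value in row] for row in feature_map]

# Dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
dark_to_bright = [[-1, 1],
                  [-1, 1]]
bright_to_dark = [[1, -1],
                  [1, -1]]

edge_map = relu(cross_correlate2d(image, dark_to_bright))
opposite_map = relu(cross_correlate2d(image, bright_to_dark))
print(edge_map)
print(opposite_map)
```

The first kernel responds only at the column where the image switches from dark to bright. The second kernel produces negative responses there, which ReLU sets to zero.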
    === Convolutional layers ===

    In a CNN, the input is a tensor with shape:

    (number of inputs) × (input height) × (input width) × (input channels)

    After passing through a convolutional layer, the image becomes abstracted to a feature map, also called an activation map, with shape:

    (number of inputs) × (feature map height) × (feature map width) × (feature map channels)

    Convolutional layers convolve the input and pass the result to the next layer. This is similar to the response of a neuron in the visual cortex to a specific stimulus. Each convolutional neuron processes data only for its receptive field.

    Although fully connected feedforward neural networks can be used to learn features and classify data, this architecture is generally impractical for larger inputs (e.g., high-resolution images), which would require massive numbers of neurons because each pixel is a relevant input feature. A fully connected layer for an image of size 100 × 100 has 10,000 weights for each neuron in the second layer. Convolution reduces the number of free parameters, allowing the network to be deeper. For example, using a 5 × 5 tiling region, each with the same shared weights, requires only 25 learnable weights. Using shared weights means there are many fewer parameters, which helps avoid the vanishing gradients and exploding gradients problems seen during backpropagation in earlier neural networks.

    To speed processing, standard convolutional layers can be replaced by depthwise separable convolutional layers, which are based on a depthwise convolution followed by a pointwise convolution. The depthwise convolution is a spatial convolution applied independently over each channel of the input tensor, while the pointwise convolution is a standard convolution restricted to the use of $1 \times 1$ kernels.

    === Pooling layers ===

    Convolutional networks may include local and/or global pooling layers along with traditional convolutional layers.
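The parameter counts quoted above follow from simple arithmetic, which the sketch below spells out. The helper names and the 64-to-128-channel example are hypothetical, and biases are left out so that the numbers match the weight counts in the text.

```python
def dense_weights_per_neuron(height, width, channels=1):
    """Weights needed by one fully connected neuron that sees the whole input."""
    return height * width * channels

def conv_weights(kernel, in_channels, out_channels):
    """Weights of a standard convolution with square kernels (biases omitted)."""
    return kernel * kernel * in_channels * out_channels

def depthwise_separable_weights(kernel, in_channels, out_channels):
    """Depthwise convolution (one spatial kernel per input channel) plus 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise

print(dense_weights_per_neuron(100, 100))        # fully connected neuron on a 100 x 100 image
print(conv_weights(5, 1, 1))                     # one shared 5 x 5 kernel
print(conv_weights(3, 64, 128))                  # standard 3 x 3 layer, 64 -> 128 channels
print(depthwise_separable_weights(3, 64, 128))   # depthwise separable equivalent
```

For the hypothetical 64-to-128-channel layer, the depthwise separable form needs 8,768 weights against 73,728 for the standard convolution, which is where the speed-up comes from.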
    Pooling layers reduce the dimensions of data by combining the outputs of neuron clusters at one layer into a single neuron in the next layer. Local pooling combines small clusters; tiling sizes such as 2 × 2 are commonly used. Global pooling acts on all the neurons of the feature map. There are two common types of pooling in popular use: max and average. Max pooling uses the maximum value of each local cluster of neurons in the feature map, while average pooling takes the average value.

    === Fully connected layers ===

    Fully connected layers connect every neuron in one layer to every neuron in another layer. This is the same as a traditional multilayer perceptron neural network (MLP). Each neuron in the fully connected layer receives input from all the neurons in the previous layer. These inputs are weighted and summed with the corresponding biases, and then passed through an activation function to perform a nonlinear transformation, generating the output. The flattened matrix goes through a fully connected layer to classify the images.

    === Receptive field ===

    In neural networks, each neuron receives input from some number of locations in the previous layer. In a convolutional layer, each neuron receives input from only a restricted area of the previous layer called the neuron's receptive field. Typically the area is a square (e.g. 5 by 5 neurons), whereas in a fully connected layer the receptive field is the entire previous layer. Thus, in each successive convolutional layer, each neuron takes input from a larger area of the original input than neurons in previous layers do. This is due to applying the convolution over and over, which takes the value of a pixel into account, as well as its surrounding pixels. When using dilated layers, the number of pixels in the receptive field remains constant, but the field is more sparsely populated as its dimensions grow when combining the effect of several layers.
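How the receptive field grows with depth and with dilation can be worked out with a short recurrence. The sketch below is a simplified one-dimensional calculation under stated assumptions: every layer is described by a hypothetical `(kernel_size, stride, dilation)` triple, the same figure applies along each axis of a square kernel, and padding is ignored because it does not change the size of the field.

```python
def receptive_field(layers):
    """Receptive field size, in input pixels along one axis, of a stack of layers.

    `layers` is a list of (kernel_size, stride, dilation) triples, first layer first.
    """
    size = 1   # a single input pixel sees only itself
    jump = 1   # distance in input pixels between neighbouring outputs of the current layer
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        size += (effective_kernel - 1) * jump
        jump *= stride
    return size

single_5x5 = receptive_field([(5, 1, 1)])
three_3x3 = receptive_field([(3, 1, 1)] * 3)
dilated_3x3 = receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)])
strided = receptive_field([(3, 2, 1), (3, 1, 1)])
print(single_5x5, three_3x3, dilated_3x3, strided)
```

Three stacked 3 × 3 layers cover 7 input pixels per axis. The same three layers with dilation rates 1, 2 and 4 cover 15, although each neuron still reads only 3 values per axis from the layer below, which is the sparser sampling described above.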
    To manipulate the receptive field size as desired, there are some alternatives to the standard convolutional layer. For example, atrous or dilated convolution expands the receptive field size without increasing the number of parameters by interleaving visible and blind regions. Moreover, a single dilated convolutional layer can comprise filters with multiple dilation ratios, thus having a variable receptive field size.

    === Weights ===

    Each neuron in a neural network computes an output value by applying a specific function to the input values received from the receptive field in the previous layer. The function that is applied to the input values is determined by a vector of weights and a bias (typically real numbers). Learning consists of iteratively adjusting these biases and weights.

    The vectors of weights and biases are called filters and represent particular features of the input (e.g., a particular shape). A distinguishing feature of CNNs is that many neurons can share the same filter. This reduces the memory footprint because a single bias and a single vector of weights are used across all receptive fields that share that filter, as opposed to each receptive field having its own bias and vector

  • Perry Rhodan

    Perry Rhodan

    Perry Rhodan is a German space opera franchise, named after its hero. It commenced in 1961 and has been ongoing for decades, written by an ever-changing team of authors. Having sold approximately two billion copies (in novella format) worldwide (including over one billion in Germany alone), it is the most successful science fiction book series ever written. The first billion of worldwide sales was celebrated in 1986. The series has spun off into comic books, audio dramas, video games and the like. A reboot, Perry Rhodan NEO, was launched in 2011 and began publication in English in April 2021. == Print publication == The series has spun off into many different forms of media, but originated as a serial novella published weekly since 8 September 1961 in the Romanheft (Meaning "Magazine novel") format. These are digest-sized booklets, usually containing 66 pages, the German equivalent of the now-defunct (and generally longer) American pulp magazine. They are published by Pabel-Moewig Verlag, a subsidiary of Bauer Media Group headquartered in Hamburg. As of February 2019, 3000 booklet novels of the original series, 850 spinoff novels of the sister series Atlan and over 400 paperbacks and 200 hardcover editions have been published, totalling over 300,000 pages. == English translation == The first 126 novels (plus five novels of the spinoff series Atlan) were translated into English and published by Ace Books between 1969 and 1978, with the same translations used for the British edition published by Futura Publications which issued only 39 novels. When Ace cancelled its translation of the series, translator Wendayne Ackerman self-published the following 19 novels (under the business name 'Master Publications') and made them available by subscription only. Financial disputes with the German publishers led to the cancellation of the American translation in 1979. 
An attempt to revive the series in English was made in 1997–1998 by Vector Publications of the US, which published translations of four issues (1800–1803) from the current storyline being published in Germany at the time. The series and its spin-offs have captured a substantial fraction of the original German science fiction output and exert influence on many German writers in the field. == Structure == The series is told in an arc storyline structure. An arc—called a "cycle"—would have anywhere from 25 to 100 issues devoted to it. Similar subsequent cycles are referred to as a "grand-cycle". == History == ‘Perry Rhodan, der Erbe des Universums’ (Eng: ‘The Heir to the Universe’, though the American/British editions instead used the subtitle 'Peacelord of the Universe') was created by German science fiction authors K. H. Scheer and Walter Ernsting and launched in 1961 by German publishing house Arthur Moewig Verlag (now Pabel-Moewig Verlag). Originally planned as a 30 to 50 volume series, it has been published continuously every week since, celebrating the 3000th issue in 2019. Written by an ever-changing team of authors, many of whom, however, remained with the series for decades or life, Perry Rhodan is issued in weekly novella-size installments in the traditional German Heftroman (pulp booklet) format. Unlike most German Heftromane, Perry Rhodan consists not of unconnected novels but is a series with a continuous, increasingly complex plotline, with frequent back references to events. In addition to its original Heftroman form, the series now also appears in hardcovers, paperbacks, e-books, comics and audiobooks. Over the decades there have also been comic strips, numerous collectibles, several encyclopedias, audio plays, inspired music, etc. The series has seen partial translations into several languages. 
It also spawned the German-Italian-Spanish 1967 movie Mission Stardust, which is widely considered so terrible that many fans of the series pretend it never existed. Coinciding with the 50th-anniversary World Con, on 30 September 2011, a new series named Perry Rhodan Neo began publication, attracting new readers with a reboot of the story, starting in the year 2036 instead of 1971, and a related but independent story-line. On 2 April 2021, light novel and manga publisher J-Novel Club announced Perry Rhodan NEO as a launch title for its new J-Novel Pulp imprint, making this the first ongoing English release of new Perry Rhodan serials in over 20 years. It has become the most popular science fiction book series of all time. == Overview == === Fictional history === The story begins in 1971. During the first human Moon landing by US Space Force Major Perry Rhodan and his crew, they discover a marooned extraterrestrial space ship from the fictional planet Arkon, located in the (real) M13 cluster. Appropriating the Arkonide technology, they proceed to unify Terra and carve out a place for humanity in the galaxy and the cosmos. Two of the accomplishments that enable them to do so are positronic brains and starship drives for near-instantaneous hyperspatial translation. These were directly borrowed from Isaac Asimov's science fiction. As the series progresses, major characters, including the title character, are granted relative immortality. They are immune to age and disease, but not to violent death. The story continues over the course of millennia and includes flashbacks thousands and even millions of years into the past. The scope widens to encompass other galaxies, even more remote regions of space, parallel universes and cosmic structures, time travel, paranormal powers, a variety of aliens ranging from threatening to endearing, and bodiless entities, some of which have godlike powers. 
=== Multiverse === The universe in which the main plot generally takes place is called the Einstein Universe (or "Meekorah"). Its laws are for the most part identical to those of the real universe, as known by late 20th century science. Newer theories about dark matter and dark energy are currently not used in the series. The laws of nature follow old theories that have been disproven, in order to protect series continuity. There are many other universes, each to a greater or lesser extent different from the familiar one: for example, one in which time runs slower, an anti-matter universe, a shrinking universe, etc. Each universe possesses its own timelines, which are for the most part unreachable from each other but may be accessed by special means, thereby itself creating many more parallel timelines. The Einstein Universe is embedded in a high-dimensional manifold, called Hyperspace. This hyperspace consists of several subspaces used for faster-than-light travel by technological means. The exact traits of those higher dimensions are for the most part unexplained. The border of the universe is a dimension called the deep, once used for construction of the gigantic disc-shaped world Deepland. === Psionic Web and Moral Code === The Psionic Web crosses the whole universe, constantly emitting "vital energy" and "psionic energy", guaranteeing normal (organic among others) life and the wellbeing of higher entities. The Moral Code crosses through all universes, and is linked to the Psionic Web. It is subdivided into the Cosmogenes, which are again subdivided into the Cosmonucleotids. The Cosmonucleotids determine reality and fate for their respective parts of a given universe, via messengers. Higher beings are trying to gain control of this Code to rule reality. The Moral Code itself was not installed by the higher beings; the higher powers themselves have no clue why or by whom the Code was made.
Once the Cosmocrats ordered Perry Rhodan to find the answer to the third ultimate question: "Who initiated the LAW and what does it accomplish?" Perry Rhodan had the chance to receive the answer at the mountain of creation, but refused, as he knew that the answer would destroy his mind. The negative Superintelligence Koltoroc had received the answer to the last ultimate question, 69 million years BC at Negane Mountain, but it is not known if it made any use of the information. === Onion-shell model === An evolutionary schema, similar to the Great Chain of Being, called the "onion-shell model" is employed in relationship to all life. Here, continuous evolution is from lower to higher lifeforms, culminating in bodiless entities. Later in the series, further lifeforms, representing stages between the known shells, were introduced. The main shells are: Lifeless matter; Bacteria; Higher animals; Intelligent species; Intelligent species that have contacted other species; Superintelligences (SI); Matter sources / Matter sinks; Cosmocrats / Chaotarchs (High Powers); Powers close to the "Horizon of the LAW", the essence of the Multiverse. The Superintelligences are the next step above normal minds. They can be born, for example, when a species collectively gives up its bodies and unites its spirits. Such Superintelligences may claim as their domain areas consisting of up to several galaxies (the entity known as "E

  • Terminator (franchise)

    Terminator (franchise)

    Terminator is an American media franchise created by James Cameron and Gale Anne Hurd. It is considered to be of the cyberpunk subgenre of science fiction. The franchise primarily focuses on the events leading to a future post-apocalyptic war between a synthetic intelligence known as Skynet, and a surviving resistance of humans led by John Connor. In this future, Skynet uses an arsenal of cyborgs known as Terminators, designed to mimic humans and infiltrate the resistance. Much of the franchise takes place in time periods prior to the Skynet takeover, with both humans and Terminators using time travel to attempt to alter the past and change the outcome of the future. A prominent Terminator model throughout the films is the T-800, commonly known as "the Terminator", with instances of this model portrayed by Arnold Schwarzenegger. The franchise began with the 1984 film The Terminator, written and directed by Cameron, with Hurd as producer. They would return for the 1991 sequel Terminator 2: Judgment Day (or T2). Both films were critical and commercial successes. Terminator 3: Rise of the Machines (or T3) was released in 2003 to positive reviews, followed by Terminator Salvation in 2009 to more negative reviews. Salvation was intended as the first in a new trilogy, which was later scrapped after the film rights were sold. Cameron was consulted for the 2015 film Terminator Genisys, a reboot branching off from the timeline of the original film. It was negatively received and performed poorly at the box-office. Cameron had a larger role as a producer of the 2019 film Terminator: Dark Fate, a direct sequel to T2 that ignores the three preceding films. As with Salvation, both Genisys and Dark Fate were planned as first installments of new trilogies, with the plans scrapped each time due to the films' poor box-office performances. Outside of the theatrical films, Cameron co-directed T2-3D: Battle Across Time, a 1996 theme park film-based attraction. 
It was produced as the original sequel to T2 and reunited its main cast. A television series, Terminator: The Sarah Connor Chronicles, was developed without Cameron's involvement and aired for two seasons in 2008 and 2009. It was also produced as a T2 sequel, taking place in an alternate timeline that ignores the third film and subsequent events. Terminator Zero, an anime series, premiered in August 2024. The franchise has also inspired several lines of comic books since 1988, and numerous video games since 1991. By 2010, the franchise had generated $3 billion in revenue. == Themes and setting == The central theme of the franchise is the battle for survival between the nearly-extinct human race and the world-spanning, synthetic intelligence that is Skynet. Skynet is positioned in the first film, The Terminator (1984), as a U.S. strategic "Global Digital Defense Network" computer system by Cyberdyne Systems which becomes self-aware. Shortly after activation, Skynet seemingly perceives all humans as a threat to its existence and formulates a plan to systematically wipe out humanity itself. The system initiates a nuclear first strike against Russia, thereby ensuring a devastating second strike and a nuclear holocaust which wipes out much of humanity in the resulting nuclear war. In the post-apocalyptic aftermath, Skynet later builds up its own autonomous machine-based military capability which includes the Terminators used against individual human targets and thereafter proceeds to wage a persistent total war against the surviving elements of humanity, some of whom have militarily organized themselves into a Resistance. At some point in this future, Skynet develops the capability of time travel and both it and the Resistance seek to use this technology in order to win the war; either by altering or accelerating past events or by preventing the apocalyptic timeline. 
=== Judgment Day === In the franchise, Judgment Day (a reference to the biblical Day of Judgment) is the date on which Skynet becomes self-aware, in which case its creators panic and attempt to deactivate the network. As a result, Skynet perceives humanity as a threat and attempts to exterminate them. Skynet launches an all-out nuclear attack on Russia in order to provoke a nuclear counter-strike against the United States, knowing this will eliminate its human enemies. Due to time travel and the consequent ability to change the future, several differing dates are given for Judgment Day. In Terminator 2: Judgment Day (1991), Sarah Connor states that Judgment Day will occur on August 29, 1997. However, this date is delayed following the attack on Cyberdyne Systems in the same film. Judgment Day has various different dates in different timelines of the subsequent films, as well as the television series, creating a multiverse of temporal phenomena. In Terminator 3: Rise of the Machines (2003) and Terminator Salvation (2009), Judgment Day was postponed to July 2003. In Terminator: The Sarah Connor Chronicles (2008–2009), the attack on Cyberdyne Systems in the second film delayed Judgment Day to April 21, 2011. In Terminator Genisys (2015), the fifth film in the franchise, Judgment Day was postponed to an unspecified day in October 2017, attributed to altered events in both the future and the past. Sarah and Kyle Reese travel through time to the year 2017 and seemingly defeat Skynet, but the system core, contained inside a subterranean blast shelter, survives unknown to them, thus further delaying, rather than preventing, Judgment Day. In Terminator: Dark Fate (2019), the direct sequel to Terminator 2: Judgment Day, a date is not given for the new Judgment Day though it is named as such by Grace. 
Since Grace is a ten-year-old in 2020 and shown as a teenager in the post-Judgment Day world in flash-forwards throughout the film, Judgment Day occurs sometime in the early 2020s in this timeline. == Franchise rights == Before the first film was created, director James Cameron sold the rights for $1 to Gale Anne Hurd, his future wife, who produced the film, under the strict provision that he be allowed to direct it. Hemdale Film Corporation also became a 50-percent owner of the franchise rights, until its share was sold in 1990 to Carolco Pictures, a company founded by Andrew G. Vajna and Mario Kassar. Terminator 2: Judgment Day was released a year later. Carolco filed for bankruptcy in 1995 and its library was subsequently acquired by StudioCanal, which continues to own the franchise today. However, the rights to future Terminator films were ultimately put up for auction. By that time, Cameron had become interested in making a Terminator 3 film. The rights were ultimately auctioned to Vajna in 1997, for $8 million. Vajna and Kassar spent another $8 million to purchase Hurd's half of the rights in 1998, becoming the full owners of the franchise. Hurd was initially opposed to the sale of the rights, while Cameron had lost interest in the franchise and a third film. After the 2003 release of Terminator 3: Rise of the Machines, the franchise rights were sold in 2007 for about $25 million to The Halcyon Company, which produced Terminator Salvation in 2009. Later that year, the company faced legal issues and filed for bankruptcy, putting the franchise rights up for sale. The rights were valued at about $70 million. In 2010, the rights were sold for $29.5 million to Pacificor, a hedge fund that was Halcyon's largest creditor. In 2012, the rights were sold to Megan Ellison and her production company Annapurna Pictures for less than $20 million, a lower price than what was previously offered. 
The low price was because of the possibility of Cameron regaining the rights in 2019, as a result of new North American copyright laws. Megan's brother David Ellison and Skydance Productions produced Terminator Genisys in 2015. Cameron worked together with David Ellison to produce the 2019 film Terminator: Dark Fate. As the film neared its release, Hurd filed to terminate a copyright grant made 35 years earlier. Under this move, Hurd would again become a 50-percent owner of the rights with Cameron and Skydance could lose the rights to make any additional Terminator films beginning in November 2020, unless a new deal is worked out. Skydance responded that it had a deal in place with Cameron and that it "controls the rights to the Terminator franchise for the foreseeable future". == Films == === The Terminator (1984) === The Terminator is a 1984 science fiction action film released by Orion Pictures, co-written and directed by James Cameron and starring Arnold Schwarzenegger, Linda Hamilton and Michael Biehn. It is the first work in the Terminator franchise. In the film, robots take over the world in the near future, directed by the artificial intelligence Skynet. With its sole mission to completely annihilate humanity, it develops android assassins called Terminators that outwardly appear human. A man named John Connor starts the Tech-Com resistance to fight the machi

  • Question (short story)

    Question (short story)

    "Question" is a science fiction short story by American writer Isaac Asimov. The story first appeared in the March 1955 issue of Computers and Automation (thought to be the first computer magazine), and was reprinted in the April 30, 1957, issue of Science World. It is the first of a loosely connected series of stories concerning a fictional supercomputer called Multivac. The story concerns two technicians who are servicing Multivac, and their argument over whether or not the machine is truly intelligent and able to think. Multivac, however, supplies the answer on its own. After the reprint, another author, Robert Sherman Townes, noticed the climax in the last sentence was very similar to one of his own stories, "Problem for Emmy" (Startling Stories, June 1952), and wrote to Asimov about it. After searching in his library, Asimov did find the original story and, although he did not recall having read it, admitted that the endings were pretty similar. He then replied to Townes, apologizing and promising the story would never again be published, and it never was. Asimov mentioned "Question" in an editorial called "Plagiarism" which appeared in the August 1985 issue of Asimov's Science Fiction (although he did not mention Townes' name or the title of either story). "Plagiarism" was reprinted in Asimov's collection Gold (1995).

  • Springpad

    Springpad

    Springpad was a free online application and web service that allowed its registered users to save, organize and share collected ideas and information. As users added content to their Springpad accounts, the application automatically identified and categorized it, then generated additional snippets based on the types of objects added—for example, listing price comparisons for products and showtimes for movies. Springpad was also available as apps on the iPad, iPhone and Android that synchronized with the Web interface. Springpad was bundled on new Toshiba notebook computers through a Web application subscription service. On May 23, 2014, Springpad announced that it would cease operations on June 25, 2014. The company then allowed users to export their data (as JSON and read-only HTML formats), or to automatically migrate it to Evernote accounts before the expiration date. == Features == Springpad users could use the main site interface which uses HTML5 from most browsers or use the smartphone app to capture notes, tasks, or lists which were then added to the user's "My Stuff", the user's personal database or collection. Additionally Springpad let users look up items of interest which were then automatically categorized based on type or manually categorized by the user. Category types included recipes, movies, products, restaurants and wine. Events could also be added to Springpad, and if the user used Google Calendar, they could opt to sync the event to it. In addition to the smartphone app and site, Springpad could be used via browser extension for Google Chrome, or the Springpad Clipper, a bookmarklet to analyze webpages and clip relevant information from them—for example, the ingredients needed for a recipe—or to add the site as a normal bookmark. Another way users could add content to their Springpad "My Stuff" was by emailing entries to an email address specified on Springpad registration. 
Springpad's smartphone apps could be used to scan barcodes to identify products, save them to the user's "My Stuff", and automatically generate additional product information and links. The mobile app could also save images taken with the phone's camera, and locate nearby businesses. With most of the content added to a user's "My Stuff", relevant news, useful links and other helpful information could be viewed. Users could also attach additional notes and images to content they had already saved, and could add reminders and alerts which could be emailed to the user or texted to their phone. Springpad also added alerts to its own Alerts section for relevant news, deals or coupons for specific products users added. For additional organization, anything added to Springpad could also be tagged. Users could also add entries to "Notebooks" to separate content by projects, or any other way they wished. Each Notebook included a section called a "Board", which acted as a pin board where users could "pin" content they'd added to the Notebook, allowing them to visually lay out items. If the user added a map to the Board and had entries that included an address, Springpad could automatically point out entries on the map. By default, everything added to Springpad was private. However, users could change the privacy settings for each of the types of items added, decide to make specific items public and shareable on Facebook and Twitter, add them to their public page, or keep them private but share links to them with specific people.

  • Google AI Studio

    Google AI Studio

    Google AI Studio is a web-based integrated development environment developed by Google for prototyping applications using generative AI models. Released in December 2023 alongside the Gemini API, the platform provides access to Google's Gemini family of models and related tools for image, video, and audio generation. The service targets both developers and non-technical users for testing prompts and generating code for the Gemini API. == History == Google launched AI Studio on December 13, 2023, as the successor to Google MakerSuite. MakerSuite, introduced at Google I/O in May 2023, had provided similar functionality for Google's PaLM language models. The AI Studio was launched alongside the public release of the Gemini API. == Features == AI Studio's interface consists of a central prompt area and a settings panel for model selection and parameter adjustment. The platform supports chat prompts for multi-turn conversations and includes system instructions for defining model behavior, tone, or specific rules. Users can employ zero-shot and few-shot prompting techniques to guide the model's output format. The platform processes various media types including video, audio, and documents, and can generate images through Imagen models, videos through Veo models, and audio through text-to-speech functionality. Additional tools include real-time streaming for screen sharing and live analysis, code execution in a sandboxed Python environment, grounding with Google Search for current information, URL context for analyzing specific web pages, and a thinking mode for complex reasoning tasks. == Available models == The platform provides access to several Google AI models including the Gemini language models, Imagen for image generation, Veo for video generation, LearnLM for educational applications, and Gemma, Google's open-source model family. == Privacy and data usage == Google AI Studio's data handling differs between free and paid users. 
For free tier users, Google uses submitted prompts, uploaded files, and generated responses to improve its products and services, with human reviewers potentially reading and annotating the data after disconnection from user accounts. Google advises against submitting sensitive information on the free tier. Users who enable Google Cloud Billing are considered paid service users, and their data is not used for product improvement. Data is processed according to Google's Data Processing Addendum and retained temporarily for abuse monitoring. == Availability == The platform is available at no cost, with API usage subject to a free tier with daily and per-minute rate limits. Access is restricted to users aged 18 and older in specific countries and territories. The service was initially unavailable in the United Kingdom and European Economic Area due to regulatory concerns, which drew user complaints. == Reception == Reviews have noted the platform's accessibility and integration with Gemini models, with features such as real-time screen sharing and large context windows cited as notable capabilities. However, reviewers have raised concerns about the privacy implications for free tier users, whose data is used for model training. Some users have reported inconsistent performance with features like screen streaming and issues with folder uploads for large datasets. The initial geographic restrictions were a point of criticism among developers in affected regions.

  • Fuzzy relation

    Fuzzy relation

    A fuzzy relation is the Cartesian product of mathematical fuzzy sets. Two fuzzy sets are taken as input; the fuzzy relation is then equal to the cross product of the sets, which is created by vector multiplication. Usually, a rule base is stored in a matrix notation which allows the fuzzy controller to update its internal values. From a historical perspective, the first fuzzy relation was mentioned in the year 1971 by Lotfi A. Zadeh. A practical approach to describing a fuzzy relation is based on a 2D table. First, a table is created which consists of fuzzy values in the range 0 to 1. The next step is to apply the if-then rules to the values. The resulting numbers are stored in the table as an array. Fuzzy relations can be utilized in fuzzy databases.
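The table-based description above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `fuzzy_relation` and the sample membership values are hypothetical, and the choice of combining operator is an assumption. The minimum is a common choice for the Cartesian product of fuzzy sets, while the plain product corresponds to the "vector multiplication" (outer product) reading of the text.

```python
def fuzzy_relation(a, b, combine=min):
    """Build a 2-D table relating two fuzzy sets.

    `a` and `b` are lists of membership degrees in [0, 1]. Entry [i][j] of
    the result combines a[i] with b[j]. `combine` defaults to min; passing
    a product function gives the outer-product variant instead.
    """
    return [[combine(x, y) for y in b] for x in a]


# Hypothetical membership degrees for two small fuzzy sets.
temperature_hot = [0.2, 0.8]
fan_fast = [0.5, 1.0]

# Relation using the minimum operator.
table_min = fuzzy_relation(temperature_hot, fan_fast)

# Relation using multiplication (outer product of the two vectors).
table_prod = fuzzy_relation(temperature_hot, fan_fast, combine=lambda x, y: x * y)

for row in table_min:
    print(row)
```

Each row of the table corresponds to one element of the first set and each column to one element of the second, which matches the matrix storage mentioned in the text.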

  • Deep learning speech synthesis

    Deep learning speech synthesis

    Deep learning speech synthesis refers to the application of deep learning models to generate natural-sounding human speech from written text (text-to-speech) or spectrum (vocoder). Deep neural networks are trained using large amounts of recorded speech and, in the case of a text-to-speech system, the associated labels and/or input text. == Formulation == Given an input text or some sequence of linguistic units $Y$, the target speech $X$ can be derived by $X=\arg\max P(X|Y,\theta)$, where $\theta$ is the set of model parameters. Typically, the input text will first be passed to an acoustic feature generator, then the acoustic features are passed to the neural vocoder. For the acoustic feature generator, the loss function is typically L1 loss (Mean Absolute Error, MAE) or L2 loss (Mean Square Error, MSE). These loss functions impose a constraint that the output acoustic feature distributions must be Gaussian or Laplacian. In practice, since the human voice band ranges from approximately 300 to 4000 Hz, the loss function will be designed to have more penalty on this range: $\text{loss}=\alpha\,\text{loss}_{\text{human}}+(1-\alpha)\,\text{loss}_{\text{other}}$, where $\text{loss}_{\text{human}}$ is the loss from the human voice band and $\alpha$ is a scalar, typically around 0.5. The acoustic feature is typically a spectrogram or a Mel-scale spectrogram. These features capture the time-frequency relation of the speech signal, and thus are sufficient to generate intelligible outputs. The Mel-frequency cepstrum feature used in the speech recognition task is not suitable for speech synthesis, as it discards too much information.
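To make the band-weighted loss concrete, here is a small sketch that computes an L1 loss separately inside and outside the 300 to 4000 Hz band and mixes the two with α. It works on a single frame of spectral bins in plain Python; the function name `band_weighted_l1`, its parameters, and the sample values are assumptions for illustration and are not taken from any specific system.

```python
def band_weighted_l1(pred, target, freqs_hz, alpha=0.5, band=(300.0, 4000.0)):
    """alpha * L1(inside band) + (1 - alpha) * L1(outside band).

    `pred` and `target` hold one magnitude per frequency bin and `freqs_hz`
    gives the centre frequency of each bin. All names are hypothetical.
    """
    in_band, out_band = [], []
    for p, t, f in zip(pred, target, freqs_hz):
        bucket = in_band if band[0] <= f <= band[1] else out_band
        bucket.append(abs(p - t))

    def mean(values):
        # An empty group contributes zero loss rather than raising an error.
        return sum(values) / len(values) if values else 0.0

    return alpha * mean(in_band) + (1.0 - alpha) * mean(out_band)


# One toy frame with four bins at 100 Hz, 1 kHz, 3 kHz and 8 kHz.
freqs = [100.0, 1000.0, 3000.0, 8000.0]
predicted = [1.0, 2.0, 3.0, 4.0]
reference = [0.0, 2.0, 1.0, 4.0]

loss = band_weighted_l1(predicted, reference, freqs, alpha=0.5)
print(loss)
```

Because each group is averaged separately, the in-band error keeps its weight α even when the voice band covers only a small fraction of the bins.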
== History == In September 2016, DeepMind released WaveNet, which demonstrated that deep learning-based models are capable of modeling raw waveforms and generating speech from acoustic features like spectrograms or mel-spectrograms. Although WaveNet was initially considered too computationally expensive and slow to be used in consumer products at the time, a year after its release DeepMind unveiled a modified version of WaveNet known as "Parallel WaveNet," a production model 1,000 times faster than the original. This was followed by Google AI's Tacotron 2 in 2018, which demonstrated that neural networks could produce highly natural speech synthesis but required substantial training data (typically tens of hours of audio) to achieve acceptable quality. Tacotron 2 used an autoencoder architecture with attention mechanisms to convert input text into mel-spectrograms, which were then converted to waveforms using a separate neural vocoder. When trained on smaller datasets, such as 2 hours of speech, the output quality degraded while still being able to maintain intelligible speech, and with just 24 minutes of training data, Tacotron 2 failed to produce intelligible speech. In 2019, Microsoft Research introduced FastSpeech, which addressed speed limitations in autoregressive models like Tacotron 2. FastSpeech utilized a non-autoregressive architecture that enabled parallel sequence generation, significantly reducing inference time while maintaining audio quality. Its feedforward transformer network with length regulation allowed for one-shot prediction of the full mel-spectrogram sequence, avoiding the sequential dependencies that bottlenecked previous approaches. The same year saw the release of HiFi-GAN, a generative adversarial network (GAN)-based vocoder that improved the efficiency of waveform generation while producing high-fidelity speech.
In 2020, the release of Glow-TTS introduced a flow-based approach that allowed for fast inference and voice style transfer capabilities. In March 2020, the free text-to-speech website 15.ai was launched. 15.ai gained widespread international attention in early 2021 for its ability to synthesize emotionally expressive speech of fictional characters from popular media with a minimal amount of data. The creator of 15.ai (known pseudonymously as 15) stated that 15 seconds of training data is sufficient to perfectly clone a person's voice (hence its name, "15.ai"), a significant reduction from the previously known data requirement of tens of hours. 15.ai is credited as the first platform to popularize AI voice cloning in memes and content creation. 15.ai used a multi-speaker model that enabled simultaneous training of multiple voices and emotions, implemented sentiment analysis using DeepMoji, and supported precise pronunciation control via ARPABET. The 15-second data efficiency benchmark was later corroborated by OpenAI in 2024. == Semi-supervised learning == Currently, self-supervised learning has gained much attention through better use of unlabelled data. Research has shown that, with the aid of self-supervised loss, the need for paired data decreases. == Zero-shot speaker adaptation == Zero-shot speaker adaptation is promising because a single model can generate speech with various speaker styles and characteristics. In June 2018, Google proposed to use pre-trained speaker verification models as speaker encoders to extract speaker embeddings. The speaker encoders then become part of the neural text-to-speech model, so that the model can determine the style and characteristics of the output speech. This procedure has shown the community that it is possible to use only a single model to generate speech with multiple styles.
== Neural vocoder == In deep learning-based speech synthesis, neural vocoders play an important role in generating high-quality speech from acoustic features. The WaveNet model proposed in 2016 achieves excellent performance on speech quality. WaveNet factorised the joint probability of a waveform $\mathbf{x} = \{x_1, \ldots, x_T\}$ as a product of conditional probabilities as follows: $$p_\theta(\mathbf{x}) = \prod_{t=1}^{T} p(x_t \mid x_1, \ldots, x_{t-1})$$ where $\theta$ denotes the model parameters, including many dilated convolution layers. Thus, each audio sample $x_t$ is conditioned on the samples at all previous timesteps. However, the auto-regressive nature of WaveNet makes the inference process dramatically slow. To solve this problem, Parallel WaveNet was proposed. Parallel WaveNet is an inverse autoregressive flow-based model which is trained by knowledge distillation with a pre-trained teacher WaveNet model. Since such inverse autoregressive flow-based models are non-auto-regressive when performing inference, the inference speed is faster than real-time. Meanwhile, Nvidia proposed a flow-based WaveGlow model, which can also generate speech faster than real-time. However, despite the high inference speed, Parallel WaveNet has the limitation of needing a pre-trained WaveNet model, and WaveGlow takes many weeks to converge with limited computing devices. This issue has been solved by Parallel WaveGAN, which learns to produce speech through multi-resolution spectral loss and GAN learning strategies.
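The autoregressive factorisation, and the reason it makes generation slow, can be seen in a toy sketch. This is not WaveNet: the function `conditional_probs` is a hypothetical stand-in for the network, using a hand-written rule over four quantised sample values instead of dilated convolutions. It shows only that the joint log-probability is a sum of conditional log-probabilities, and that sampling must proceed one step at a time because each step needs the previous outputs.

```python
import math
import random


def conditional_probs(history, num_levels=4):
    """Hypothetical stand-in for p(x_t | x_1, ..., x_{t-1}).

    Toy rule (an assumption, not WaveNet): favour repeating the previous
    sample. Returns a probability for each quantised level.
    """
    prev = history[-1] if history else 0
    logits = [2.0 if level == prev else 0.0 for level in range(num_levels)]
    normaliser = sum(math.exp(v) for v in logits)
    return [math.exp(v) / normaliser for v in logits]


def log_likelihood(x, num_levels=4):
    """log p(x) = sum over t of log p(x_t | x_1, ..., x_{t-1})."""
    total = 0.0
    for t, sample in enumerate(x):
        total += math.log(conditional_probs(x[:t], num_levels)[sample])
    return total


def generate(length, num_levels=4, seed=0):
    """Sample sequentially: step t cannot start before step t-1 has finished."""
    rng = random.Random(seed)
    x = []
    for _ in range(length):
        probs = conditional_probs(x, num_levels)
        x.append(rng.choices(range(num_levels), weights=probs)[0])
    return x


waveform = generate(8)
score = log_likelihood(waveform)
```

Evaluating `log_likelihood` on a known waveform could be parallelised across timesteps, since every history is available up front. The loop in `generate` cannot, which is the bottleneck the non-autoregressive models above avoid.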

  • UpScrolled

    UpScrolled

    UpScrolled is an Australian social media platform for microblogging and short-form online video sharing that was launched in June 2025 by Recursive Methods Pty Ltd. It was founded by Issam Hijazi. == History == UpScrolled was launched in June 2025 by Recursive Methods Pty Ltd. It was founded by Issam Hijazi, a Palestinian-Australian app developer. UpScrolled is backed by the Tech for Palestine incubator. In January 2026, UpScrolled saw increased attention and downloads after the acquisition of TikTok by a group of pro-Donald Trump US investors, including Larry Ellison, which led to calls to boycott TikTok and migrate to other apps. TikTok was alleged to be suppressing pro-Palestinian content, as well as news surrounding the killing of Alex Pretti in Minneapolis, on the platform. UpScrolled subsequently climbed to the top 10 of Apple's App Store list of free apps. The app saw a reported 2,850% increase in downloads between 22 and 24 January 2026. As of 27 January 2026, UpScrolled "had been downloaded about 400,000 times in the US and 700,000 globally since launching in June 2025". The app became the most downloaded app in the Apple App Store on 29 January 2026, following allegations that TikTok was suppressing videos and content opposed to Immigration and Customs Enforcement (ICE) under its new ownership. By 2 February 2026, UpScrolled had reached 2.5 million users. According to the Google Play Store and the Apple App Store, it has become the most downloaded social media app in the United States and Canada, with rising interest in the United Kingdom, France, Germany and Italy. On 14 February, UpScrolled was suspended from the Google Play Store; the suspension was reverted by 15 February. == Founder == Hijazi was born in Jordan. His parents and grandparents are from Safad, a northern Israeli city near the Lebanese border. He worked for IBM and Oracle prior to starting UpScrolled. 
Hijazi told Rest of World that he launched UpScrolled in response to Israel's genocide in Gaza which followed the October 7 attacks. He said, "I couldn't take it anymore. I lost family members in Gaza, and I didn't want to be complicit. So I was like, I'm done with this, I want to feel useful. I found this gap in the market, with a lot of people asking why there is no alternative to the Big Tech platforms for their content, which was getting censored." Hijazi also alleges that social media accounts that were posting pro-Palestinian content were getting shadow banned on larger platforms, and alleges that even his account was not exempt from being targeted by censors. Hijazi has further elaborated on the importance of social media independence to further the Palestinian cause. In January 2026, Web Summit Qatar announced that Hijazi would be an opening night speaker. Following the announcement, there was a surge in ticket sales for the summit. Hijazi lives in Sydney with his wife and daughter. He lost 60 family members during the Gaza war. == Features == UpScrolled's algorithm allows users to discover posts based on likes, comments, and shares with time decay and some randomness, all chronologically, with "no manipulation" according to the app's website. UpScrolled has an interface resembling a mix of Instagram and Twitter, allowing users to post and view text posts, photos, and videos. It also lets users send private messages to each other. The app is currently available for iOS and Android devices, with plans to upscale. UpScrolled does not include Israel as an option in its location selection menu. Cities such as Tel Aviv are included under "Occupied Territories of Palestine", and Palestine can also be set as the location. UpScrolled says that it is against censorship and shadow banning, and describes itself as "belong[ing] to the people who use it — not to hidden algorithms or outside agendas". Hijazi said, "The other platforms claim to be free speech platforms. 
But when it comes to anything on Palestine, that's a different story." UpScrolled states that it "does not tolerate hate speech, propaganda, or bad-faith behaviour, but it also refuses to silence voices quietly or without explanation". == User base and content == Al Jazeera reported that posts expressing pro-Palestinian sentiment or depicting the continued suffering in the Gaza Strip were "flooding" the app. Political and global issues such as the Gaza war are prominent. Content includes updates from the Gaza Freedom Flotilla, posts by doctors working in Gaza, video essays about Palantir’s influence within the military and calls for boycotts of Israel. It has been used by Gazans to crowdfund and record daily life. Celebrity users of UpScrolled include American labour activist Chris Smalls and actor Jacob Berger, both of whom were on the July 2025 Gaza Freedom Flotilla. Political figures have also joined UpScrolled, such as South African politician and Economic Freedom Fighters leader Julius Malema, and Islamic Revolutionary Guard Corps commander Esmail Qaani. One user said that most early users were attracted to the platform for the opportunity to criticize Zionism. The Jewish Telegraphic Agency (JTA) reported that UpScrolled was observed to be "flooded" with antisemitic and anti-Israel content, including Holocaust denial and accusations that Israel carried out the 9/11 attacks. In a statement, UpScrolled said, "Our content moderation hasn't been able to keep up with the massive rise of users this week. We're working with digital rights experts to grow our Trust & Safety team and are beefing up our content moderation to prevent this. We apologise to all impacted users, thank you for being part of Upscrolled." 
The Times reported in February 2026 that UpScrolled was hosting content that could potentially breach UK law, including antisemitic content and posts promoting Hamas, Hezbollah, Islamic State and Al-Qaeda, as well as footage of the 2019 Christchurch mosque shootings and content praising the perpetrators of the 2019 Halle synagogue shooting and 2018 Pittsburgh synagogue shooting. Antisemitic influencers Lucas Gage, Jake Shields, Stew Peters and Anastasia Maria Loupis have accounts on UpScrolled. UpScrolled’s policies prohibit threats, glorification of harm or support for terrorist or violent groups. Hijazi said harmful content was being uploaded to UpScrolled and the company had expanded its content moderation team and upgraded its technology infrastructure to deal with the issue. In May 2026, Moment magazine said that users had identified some antisemitic content, pornography and extremist videos on the platform. The magazine said there were gaps in content moderation due to the small size of the developer team. == Reception == In January 2026, the Council on American–Islamic Relations (CAIR) praised UpScrolled for "pledging to protect the free flow of ideas on its platform, including both support for and opposition to the Israeli government's human rights abuses." Guy Christensen, a pro-Palestinian social media celebrity, has encouraged his audience to download UpScrolled. Christensen characterized UpScrolled as having "no censorship, no ownership by billionaires who put their interests and biases onto you to control you". He compared the platform to others like TikTok, saying that Israel is behind censorship that wouldn't happen on UpScrolled. Jaigris Hodson, an associate professor of Interdisciplinary Studies at Royal Roads University in Canada, has argued that "Network effects mean that unless UpScrolled continues its explosive growth, people are unlikely to continue to choose it over the more established TikTok. 
At best, we might see a Twitter/X effect, which is where TikTok will host more pro-U.S. government content creators and those people who want to follow them, and UpScrolled will host more critical content creators and their followers."

  • CogX Festival

    CogX Festival

    CogX Festival is a global festival focusing on the impact of artificial intelligence (AI) and emerging technology on industry, government, and society. It takes place annually, usually in September, in London, England. Founded by Charlie Muirhead and Tabitha Goldstaub in 2017, CogX aims to facilitate dialogue and understanding about AI and its implications across various sectors. CogX Festival 2023 was held from September 12 to September 14 across multiple sites in London. == History == The inaugural CogX event took place in 2017, with the aim of bringing together experts from diverse fields to discuss the role and impact of AI and emerging technologies. Since then, it has evolved to include a broader range of topics and attract a diverse audience. In 2018, the first CogX Awards festival was hosted. That year, over 50 awards were presented in front of 300 guests. In 2021, CogX and Hopin, a video conferencing software company, signed a four-year agreement to make CogX a hybrid conference due to the COVID-19 pandemic. CogX 2021 attracted over 5,000 attendees in person and over 100,000 virtually. In 2022, CogX returned to a live event format after two years of hybrid events and controlled physical attendance. It also launched the CogX app, which curated insights from the world's top podcasts. In 2023, guest speaker Stephen Fry fell off the stage after delivering the keynote address, breaking his leg, hip, pelvis and a "bunch of ribs". A court filing in 2026 revealed that Fry was seeking £100,000 in damages from CogX Festival Ltd and creative agency Blonstein Events. == Programming == The festival features sessions, discussions, workshops, and exhibitions, encompassing various domains of AI and technology. Recent CogX Festivals have featured summits on topics such as global leadership and industry transformation.

  • Fuzzy differential equation

    Fuzzy differential equation

    A fuzzy differential equation is a generalisation of the ordinary differential equation in mathematics, defined as a differential inclusion for non-uniform upper hemicontinuous convex sets with compactness in fuzzy sets: $$\frac{dx(t)}{dt} = F(t, x(t), \alpha)$$ for all $\alpha \in [0,1]$. == First order fuzzy differential equation == A first order fuzzy differential equation with real constant or variable coefficients is $$x'(t) + p(t)x(t) = f(t),$$ where $p(t)$ is a real continuous function and $f(t)\colon [t_0, \infty) \rightarrow R_F$ is a fuzzy continuous function, with initial condition $x(t_0) = x_0$ such that $x_0 \in R_F$. == Linear systems of fuzzy differential equations == A linear system of fuzzy differential equations has the form $$x_n'(t) = a_{n1}(t)x_1(t) + \cdots + a_{nn}(t)x_n(t) + f_n(t),$$ where $a_{ij}$ are real functions and $f_i$ are fuzzy functions, or in compact form $$x_n'(t) = \sum_{j=1}^{n} a_{nj}(t)x_j(t) + f_n(t).$$ == Fuzzy partial differential equations == A fuzzy differential equation with a partial differential operator is $$\nabla x(t) = F(t, x(t), \alpha)$$ for all $\alpha \in [0,1]$. == Fuzzy fractional differential equation == A fuzzy differential equation with a fractional differential operator is $$\frac{d^n x(t)}{dt^n} = F(t, x(t), \alpha)$$ for all $\alpha \in [0,1]$, where $n$ is a rational number.
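    Under the differential-inclusion reading given in the definition, each level $\alpha$ can be handled by solving the ordinary equation from every point of the corresponding level set of the initial value. The following is a minimal sketch of that idea for the first order case. It assumes $p(t) = -1$ and $f(t) = 0$, so the equation reads $x'(t) = x(t)$, and a hypothetical triangular fuzzy initial value $(0, 1, 2)$ at $t_0 = 0$; these choices are illustrative and do not come from the text. Because the crisp solution $x(t) = x_0 e^{t}$ is increasing in $x_0$, the endpoints of each level set map to the endpoints of the solution's level set.

```python
import math


def alpha_cut_triangular(a, b, c, alpha):
    """Return the alpha-cut [lo, hi] of the triangular fuzzy number (a, b, c)."""
    return a + alpha * (b - a), c - alpha * (c - b)


def solve_level(t, initial_cut):
    """Propagate one alpha-cut through x'(t) = x(t), starting at t0 = 0.

    Assumption: p(t) = -1 and f(t) = 0, so each crisp solution is
    x(t) = x0 * exp(t), which preserves the order of the endpoints.
    """
    lo, hi = initial_cut
    growth = math.exp(t)
    return lo * growth, hi * growth


# Hypothetical fuzzy initial value "about 1", with support [0, 2].
for alpha in (0.0, 0.5, 1.0):
    cut0 = alpha_cut_triangular(0.0, 1.0, 2.0, alpha)
    print(alpha, solve_level(1.0, cut0))
```

    At $\alpha = 1$ the interval collapses to the single crisp solution through $x_0 = 1$, and lower values of $\alpha$ give wider intervals that contain it.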
