AI App Inventor

AI App Inventor — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Ayoba

    Ayoba

    Ayoba is an African communication platform developed in South Africa. It is owned by Progressive Tech Holdings in Mauritius and managed by SIMFY Africa. Launched on May 4, 2019, as of April 2024, it has over 35 million active users. == History == Ayoba was first published on Google Play in February 2019. Its first marketing campaign and brand launch took place in Cameroon on May 4, 2019. In June 2019, the platform introduced its first eight channels. In November 2019, the platform reached one million active users, which increased to two million by June 2020. Subsequently, ayoba expanded its services, including the launch of games for Android in February 2020, Momo (Mobile Money) in Cameroon in May 2020, and MicroApps in May 2020. It also launched music and voice and video calling features in 12 territories in August 2020. The first version of ayoba for iOS was released in September 2020. In December of the same year, games and Messaging 2.0 were launched on the platform. In November 2020, it won Best Mobile Application at the African Digital Awards. In 2021, it won OTT Brand of the Year at the Marketing World Awards in Ghana. In December 2022, it received Top Innovative Technology and Telecom Product of the Year at the National Communications Awards in December 2022. In June 2023 ayoba partnered with BoomPlay and as of April 2024, it had 35 million monthly active users. Ayoba has partnered with Jumia Ghana to offer exclusive deals to users. Ayoba users can get a 10% discount on selected Jumia purchases through the app, with no data charges for MTN users. This partnership aims to make online shopping more affordable and accessible by integrating Jumia's offers into the ayoba app. Ayoba supports over 35 million users across Africa and provides services in 22 languages. To access the deals, users can download the ayoba app from the Google Play Store, iOS Store, or the official website. == Platform features == Chat, Call and Share: ayoba enables instant messaging, voice notes, picture sharing, and file sharing with contacts, even if they do not have the app installed. The app supports voice and video calls on both Android and iOS, as well as group chats, help channel and SMS continuity (non ayoba users receive messages as SMS, their responses appear in the ayoba app). Music: ayoba offers a free music player with daily updates on international and African music. Users can find playlists for different genres. Games: ayoba provides a selection of interactive games, including action, adventure, and children's games available on both Android and iOS. Mobile Money Transfers: In certain territories, ayoba supports mobile money transfers using MTN Mobile Money (MoMo) for transactions within the app. MicroApps: ayoba features individual MicroApps within the platform that offer content and services, including streaming channels, podcasts, and specialized apps. The availability of these apps may vary by country. == Operations == ayoba primarily focuses on the following territories: Nigeria, Cameroon, South Africa, Ghana, Côte d'Ivoire, Uganda, Republic of Congo, Benin, Zambia, Tanzania, Kenya, Senegal, Togo, Guinea Bissau, Guinea Conakry, Sudan, South Sudan, and Liberia. The company operates from its offices in Cape Town and Johannesburg, South Africa. David Gillaranz served as the CEO from 2019 to 2021, and Burak Akinci has been the CEO since 2021.

    Read more →
  • Structure mapping engine

    Structure mapping engine

    In artificial intelligence and cognitive science, the structure mapping engine (SME) is an implementation in software of an algorithm for analogical matching based on the psychological theory of Dedre Gentner. The basis of Gentner's structure-mapping idea is that an analogy is a mapping of knowledge from one domain (the base) into another (the target). The structure-mapping engine is a computer simulation of the analogy and similarity comparisons. The theory is useful because it ignores surface features and finds matches between potentially very different things if they have the same representational structure. For example, SME could determine that a pen is like a sponge because both are involved in dispensing liquid, even though they do this very differently. == Structure mapping theory == Structure mapping theory is based on the systematicity principle, which states that connected knowledge is preferred over independent facts. Therefore, the structure mapping engine should ignore isolated source-target mappings unless they are part of a bigger structure. The SME, the theory goes, should map objects that are related to knowledge that has already been mapped. The theory also requires that mappings be done one-to-one, which means that no part of the source description can map to more than one item in the target and no part of the target description can be mapped to more than one part of the source. The theory also requires that if a match maps subject to target, the arguments of subject and target must also be mapped. If both these conditions are met, the mapping is said to be "structurally consistent." == Concepts in SME == SME maps knowledge from a source into a target. SME calls each description a dgroup. Dgroups contain a list of entities and predicates. Entities represent the objects or concepts in a description — such as an input gear or a switch. Predicates are one of three types and are a general way to express knowledge for SME. Relation predicates contain multiple arguments, which can be other predicates or entities. An example relation is: (transmit (what from to)). This relation has a functor transmit and takes three arguments: what, from, and to. Attribute predicates are the properties of an entity. An example of an attribute is (red gear) which means that gear has the attribute red. Function predicates map an entity into another entity or constant. An example of a function is (joules power source) which maps the entity power source onto the numerical quantity joules. Functions and attributes have different meanings, and consequently SME processes them differently. For example, in SME's true analogy rule set, attributes differ from functions because they cannot match unless there is a higher-order match between them. The difference between attributes and functions will be explained further in this section's examples. All predicates have four parameters. They have (1) a functor, which identifies it, and (2) a type, which is either relation, attribute, or function. The other two parameters (3 and 4) are for determining how to process the arguments in the SME algorithm. If the arguments have to be matched in order, commutative is false. If the predicate can take any number of arguments, N-ary is false. An example of a predicate definition is: (sme:defPredicate behavior-set (predicate) relation :n-ary? t :commutative? t) The predicate's functor is “behavior-set,” its type is “relation,” and its n-ary and commutative parameters are both set to true. The “(predicate)” part of the definition specifies that there will be one or more predicates inside an instantiation of behavior-set. == Algorithm details == The algorithm has several steps. The first step of the algorithm is to create a set of match hypotheses between source and target dgroups. A match hypothesis represents a possible mapping between any part of the source and the target. This mapping is controlled by a set of match rules. By changing the match rules, one can change the type of reasoning SME does. For example, one set of match rules may perform a kind of analogy called literal similarity, and another performs a kind of analogy called true-analogy. These rules are not the place where domain-dependent information is added, but rather where the analogy process is tweaked, depending on the type of cognitive function the user is trying to emulate. For a given match rule, there are two types of rules that further define how it will be applied: filter rules and intern rules. Intern rules use only the arguments of the expressions in the match hypotheses that the filter rules identify. This limitation makes the processing more efficient by constraining the number of match hypotheses that are generated. At the same time, it also helps to build the structural consistencies that are needed later on in the algorithm. An example of a filter rule from the true-analogy rule set creates match hypotheses between predicates that have the same functor. The true-analogy rule set has an intern rule that iterates over the arguments of any match hypothesis, creating more match hypotheses if the arguments are entities or functions, or if the arguments are attributes and have the same functor. In order to illustrate how the match rules produce match hypotheses consider these two predicates: transmit torque inputgear secondgear (p1) transmit signal switch div10 (p2) Here we use true analogy for the type of reasoning. The filter match rule generates a match between p1 and p2 because they share the same functor, transmit. The intern rules then produce three more match hypotheses: torque to signal, inputgear to switch, and secondgear to div10. The intern rules created these match hypotheses because all the arguments were entities. If the arguments were functions or attributes instead of entities, the predicates would be expressed as: transmit torque (inputgear gear) (secondgear gear) (p3) transmit signal (switch circuit) (div10 circuit) (p4) These additional predicates make inputgear, secondgear, switch, and div10 functions or attributes depending on the value defined in the language input file. The representation also contains additional entities for gear and circuit. Depending on what type inputgear, secondgear, switch, and div10 are, their meanings change. As attributes, each one is a property of the gear or circuit. For example, the gear has two attributes, inputgear and secondgear. The circuit has two attributes, switch and circuit. As functions inputgear, secondgear, switch, and div10 become quantities of the gear and circuit. In this example, the functions inputgear and secondgear now map to the numerical quantities “torque from inputgear” and “torque from secondgear,” For the circuit the quantities map to logical quantity “switch engaged” and the numerical quantity “current count on the divide by 10 counter.” SME processes these differently. It does not allow attributes to match unless they are part of a higher-order relation, but it does allow functions to match, even if they are not part of such a relation. It allows functions to match because they indirectly refer to entities and thus should be treated like relations that involve no entities. However, as next section shows, the intern rules assign lower weights to matches between functions than to matches between relations. The reason SME does not match attributes is because it is trying to create connected knowledge based on relationships and thus satisfy the systematicity principle. For example, if both a clock and a car have inputgear attributes, SME will not mark them as similar. If it did, it would be making a match between the clock and car based on their appearance — not on the relationships between them. When the additional predicates in p3 and p4 are functions, the results from matching p3 and p4 are similar to the results from p1 and p2 except there is an additional match between gear and circuit and the values for the match hypotheses between (inputgear gear) and (switch circuit), and (secondgear gear) and (div10 circuit), are lower. The next section describes the reason for this in more detail. If the inputgear, secondgear, switch, and div10 are attributes instead of entities, SME does not find matches between any of the attributes. It finds matches only between the transmit predicates and between torque and signal. Additionally, the structural-evaluation scores for the remaining two matches decrease. In order to get the two predicates to match, p3 would need to be replaced by p5, which is demonstrated below. transmit torque (inputgear gear) (div10 gear) (p5) Since the true-analogy rule set identifies that the div10 attributes are the same between p5 and p4 and because the div10 attributes are both part of the higher-relation match between torque and signal, SME makes a match between (div10 gear) and (div10 circuit) — which leads to a match between gear and circuit. Being part of a higher-order match is a requiremen

    Read more →
  • ECML PKDD

    ECML PKDD

    ECML PKDD, the European Conference on Machine Learning Principles and Practice of Knowledge Discovery in Databases, is one of the leading academic conferences on machine learning and knowledge discovery, held in Europe every year. == History == ECML PKDD is a merger of two European conferences, European Conference on Machine Learning (ECML) and European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD). ECML and PKDD have been co-located since 2001; however, both ECML and PKDD retained their own identity until 2007. For example, the 2007 conference was known as "the 18th European Conference on Machine Learning (ECML) and the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD)", or in brief, "ECML/PKDD 2007", and both ECML and PKDD had their own conference proceedings. In 2008 the conferences were merged into one conference, and the division into traditional ECML topics and traditional PKDD topics was removed. The history of ECML dates back to 1986, when the European Working Session on Learning was first held. In 1993 the name of the conference was changed to European Conference on Machine Learning. PKDD was first organised in 1997. Originally PKDD stood for the European Symposium on Principles of Data Mining and Knowledge Discovery from Databases. The name European Conference on Principles and Practice of Knowledge Discovery in Databases was used since 1999. The conference remains highly competitive, consistently maintaining an average acceptance rate of around 25% for the main research track. == Upcoming conferences == == List of past conferences ==

    Read more →
  • Land of Memories

    Land of Memories

    Land of Memories (Chinese: 机忆之地) is a Chinese science-fiction novel by Shen Yang (沈阳), a professor at Tsinghua University's School of Journalism and Communication. The story revolves around a former neuroscientist trying to recover her memories from the metaverse after suffering amnesia due to an accident. It contains almost 6,000 Chinese characters and was shortened from an AI-generated draft that was 43,000 characters long. The process involved 66 prompts spanning almost three hours. The novel was among 18 submissions that won the level-two prize at the Fifth Jiangsu Youth Science Education and Science Fiction Competition (第五届江苏省青年科普科幻作品大赛). The contest was restricted to participants between the age of 14 and 45 but did not forbid entries generated by AI. One of its organizers reached out to Shen after finding out that the professor had been experimenting with writing science fiction using AI. The judges were not told about the novel's origin in advance. Three of them, out of the six, approved the work. One judge, who had worked with AI models before, recognized that the novel was written by AI and criticized the work for lacking emotional appeal. The organizer who had contacted Shen said the novel's introduction was not bad but the story did not develop well. It would not meet the usual standards for publication. However, he still plans to allow AI-generated submissions in 2024. Fu Ruchu, editorial department director of the People's Literature Publishing House, said the novel was not easily identifiable as AI-generated and applauded its logical consistency. She warned that artificial intelligence could endanger the jobs of fiction writers and cause permanent damage to literary language.

    Read more →
  • System integrity

    System integrity

    In telecommunications, the term system integrity has the following meanings: That condition of a system wherein its mandated operational and technical parameters are within the prescribed limits. The quality of an AIS when it performs its intended function in an unimpaired manner, free from deliberate or inadvertent unauthorized manipulation of the system. The state that exists when there is complete assurance that under all conditions an IT system is based on the logical correctness and reliability of the operating system, the logical completeness of the hardware and software that implement the protection mechanisms, and data integrity.

    Read more →
  • Smartglasses

    Smartglasses

    Smartglasses or smart glasses are eye or head-worn wearable computers. Many smartglasses include displays that add information alongside or to what the wearer sees. Alternatively, smartglasses are sometimes defined as glasses that are able to change their optical properties, such as smart sunglasses that are programmed to change tint by electronic means. Alternatively, smartglasses are sometimes defined as glasses that include headphone functionality. A pair of smartglasses can be considered an augmented reality device if it performs pose tracking. Superimposing information onto a field of view is achieved through an optical head-mounted display (OHMD) or embedded wireless glasses with transparent heads-up display (HUD) or augmented reality (AR) overlay. These systems have the capability to reflect projected digital images as well as allowing the user to see through it or see better with it. While early models can perform basic tasks, such as serving as a front end display for a remote system, as in the case of smartglasses utilizing cellular technology or Wi-Fi, modern smart glasses are effectively wearable computers which can run self-contained mobile apps. Some are handsfree and can communicate with the Internet via natural language voice commands, while others use touch buttons. Like other computers, smartglasses may collect information from internal or external sensors. It may control or retrieve data from other instruments or computers. In most cases, it supports wireless technologies like Bluetooth, Wi-Fi, and GPS. A small number of models run a mobile operating system and function as portable media players to send audio and video files to the user via a Bluetooth or WiFi headset. Some smartglasses models also feature full lifelogging and activity tracker capability. Smartglasses devices may also have features found on a smartphone. Some have activity tracker functionality features (also known as "fitness tracker") as seen in some GPS watches. == Features and applications == As with other lifelogging and activity tracking devices, the GPS tracking unit and digital camera of some smartglasses can be used to record historical data. For example, after the completion of a workout, data can be uploaded into a computer or online to create a log of exercise activities for analysis. Some smart watches can serve as full GPS navigation devices, displaying maps and current coordinates. Users can "mark" their current location and then edit the entry's name and coordinates, which enables navigation to those new coordinates. Although some smartglasses models manufactured in the 21st century are completely functional as standalone products, most manufacturers recommend or even require that consumers purchase mobile phone handsets that run the same operating system so that the two devices can be synchronized for additional and enhanced functionality. The smartglasses can work as an extension, for head-up display (HUD) or remote control of the phone and alert the user to communication data such as calls, SMS messages, emails, and calendar invites. === Security applications === Smart glasses could be used as a body camera. In 2018, Chinese police in Zhengzhou and Beijing were using smart glasses to take photos which are compared against a government database using facial recognition to identify suspects, retrieve an address, and track people moving beyond their home areas. === Sport applications === Smart glasses are used in sports like cycling, running, skiing, golf, tennis, or sailing, giving athletes real-time, heads-up data without looking down at the screen of a watch or smartphone. In 2025, Meta has announced a new partnership with sports eyewear brand Oakley. === Healthcare applications === Several proofs of concept for Google Glasses have been proposed in healthcare. In July 2013, Lucien Engelen started research on the usability and impact of Google Glass in health care. Engelen, who is based at Singularity University and in Europe at Radboud University Medical Center, is participating in the Glass Explorer program. Key findings of Engelen's research included: The quality of pictures and video are usable for healthcare education, reference, and remote consultation. The camera needs to be tilted to different angle for most of the operative procedures Tele-consultation is possible—depending on the available bandwidth—during operative procedures. A stabilizer should be added to the video function to prevent choppy transmission when a surgeon looks to screens or colleagues. Battery life can be easily extended with the use of an external battery. Controlling the device and/or programs from another device is needed for some features because of a sterile environment. Text-to-speech ("Take a Note" to Evernote) exhibited a correction rate of 60 percent, without the addition of a medical thesaurus. A protocol or checklist displayed on the screen of Google Glass can be helpful during procedures. Dr. Phil Haslam and Dr. Sebastian Mafeld demonstrated the first concept for Google Glass in the field of interventional radiology. They demonstrated the manner in which the concept of Google Glass could assist a liver biopsy and fistulaplasty, and the pair stated that Google Glass has the potential to improve patient safety, operator comfort, and procedure efficiency in the field of interventional radiology. In June 2013, surgeon Dr. Rafael Grossmann was the first person to integrate Google Glass into the operating theater, when he wore the device during a PEG (percutaneous endoscopic gastrostomy) procedure. In August 2013, Google Glass was also used at Wexner Medical Center at Ohio State University. Surgeon Dr. Christopher Kaeding used Google Glass to consult with a colleague in a distant part of Columbus, Ohio. A group of students at The Ohio State University College of Medicine also observed the operation on their laptop computers. Following the procedure, Kaeding stated, "To be honest, once we got into the surgery, I often forgot the device was there. It just seemed very intuitive and fit seamlessly." 16 November 2013, in Santiago de Chile, the maxillofacial team led by Dr.gn Antonio Marino conducted the first orthognathic surgery assisted with Google Glass in Latin America, interacting with them and working with simultaneous three-dimensional navigation. The surgical team was interviewed by ADN radio. In January 2014, Indian Orthopedic Surgeon Selene G. Parekh conducted the foot and ankle surgery using Google Glass in Jaipur, which was broadcast live on Google website via the internet. The surgery was held during a three-day annual Indo-US conference attended by a team of experts from the US and co-organized by Ashish Sharma. Sharma said Google Glass allows looking at an X-Ray or MRI without taking the eye off of the patient and allows a doctor to communicate with a patient's family or friends during a procedure. In Australia, during January 2014, Melbourne tech startup Small World Social collaborated with the Australian Breastfeeding Association to create the first hands-free breastfeeding Google Glass application for new mothers. The application, named Google Glass Breastfeeding app trial, allows mothers to nurse their baby while viewing instructions about common breastfeeding issues (latching on, posture etc.) or call a lactation consultant via a secure Google Hangout, who can view the issue through the mother's Google Glass camera. The trial was successfully concluded in Melbourne in April 2014, and 100% of participants were breastfeeding confidently. == Display types == Various techniques have existed for see-through HMDs. Most of these techniques can be summarized into two main families: "Curved Mirror" (or Curved Combiner) based and "Waveguide" or "Light-guide" based. The mirror technique has been used in EyeTaps, by Meta in their Meta 1, by Vuzix in their Star 1200 product, by Olympus, and by Laster Technologies. Various waveguide techniques have existed for some time. These techniques include diffraction optics, holographic optics, polarized optics, reflective optics, and projection: Diffractive waveguide – slanted diffraction grating elements (nanometric 10E-9). Nokia technique now licensed to Vuzix. Holographic waveguide – 3 holographic optical elements (HOE) sandwiched together (RGB). Used by Sony and Konica Minolta. Reflective waveguide – A thick light guide with single semi-reflective mirror is used by Epson in their Moverio product. A curved light guide with partial-reflective segmented mirror array to out-couple the light is used by tooz technologies GmbH. Virtual retinal display (VRD) – Also known as a retinal scan display (RSD) or retinal projector (RP), is a display technology that draws a raster display (like a television) directly onto the retina of the eye - developed by MicroVision, Inc. OLED microdisplays for near-eye applications (outdoor optical equipment, night vision glasses, ocular equipment for medical devices, augme

    Read more →
  • Computer-assisted proof

    Computer-assisted proof

    A computer-assisted proof is a mathematical proof that has been at least partially generated by computer. Most computer-aided proofs to date have been implementations of large proofs-by-exhaustion of a mathematical theorem. The idea is to use a computer program to perform lengthy computations, and to provide a proof that the result of these computations implies the given theorem. In 1976, the four color theorem was the first major theorem to be verified using a computer program. Attempts have also been made in the area of artificial intelligence research to create smaller, explicit, new proofs of mathematical theorems from the bottom up using automated reasoning techniques such as heuristic search. Such automated theorem provers have proved a number of new results and found new proofs for known theorems. Additionally, interactive proof assistants allow mathematicians to develop human-readable proofs which are nonetheless formally verified for correctness. Since these proofs are generally human-surveyable (albeit with difficulty, as with the proof of the Robbins conjecture) they do not share the controversial implications of computer-aided proofs-by-exhaustion. == Methods == One method for using computers in mathematical proofs is by means of so-called validated numerics or rigorous numerics. This means computing numerically yet with mathematical rigour. One uses set-valued arithmetic and inclusion principle in order to ensure that the set-valued output of a numerical program encloses the solution of the original mathematical problem. This is done by controlling, enclosing and propagating round-off and truncation errors using for example interval arithmetic. More precisely, one reduces the computation to a sequence of elementary operations, say ( + , − , × , / ) {\displaystyle (+,-,\times ,/)} . In a computer, the result of each elementary operation is rounded off by the computer precision. However, one can construct an interval provided by upper and lower bounds on the result of an elementary operation. Then one proceeds by replacing numbers with intervals and performing elementary operations between such intervals of representable numbers. == Philosophical objections == Computer-assisted proofs are the subject of some controversy in the mathematical world, with Thomas Tymoczko first to articulate objections. Those who adhere to Tymoczko's arguments believe that lengthy computer-assisted proofs are not, in some sense, 'real' mathematical proofs because they involve so many logical steps that they are not practically verifiable by human beings, and that mathematicians are effectively being asked to replace logical deduction from assumed axioms with trust in an empirical computational process, which is potentially affected by errors in the computer program, as well as defects in the runtime environment and hardware. Other mathematicians believe that lengthy computer-assisted proofs should be regarded as calculations, rather than proofs: the proof algorithm itself should be proved valid, so that its use can then be regarded as a mere "verification". Arguments that computer-assisted proofs are subject to errors in their source programs, compilers, and hardware can be resolved by providing a formal proof of correctness for the computer program (an approach which was successfully applied to the four color theorem in 2005) as well as replicating the result using different programming languages, different compilers, and different computer hardware. Another possible way of verifying computer-aided proofs is to generate their reasoning steps in a machine readable form, and then use a proof checker program to demonstrate their correctness. Since validating a given proof is much easier than finding a proof, the checker program is simpler than the original assistant program, and it is correspondingly easier to gain confidence into its correctness. However, this approach of using a computer program to prove the output of another program correct does not appeal to computer proof skeptics, who see it as adding another layer of complexity without addressing the perceived need for human understanding. Another argument against computer-aided proofs is that they lack mathematical elegance—that they provide no insights or new and useful concepts. In fact, this is an argument that could be advanced against any lengthy proof by exhaustion. An additional philosophical issue raised by computer-aided proofs is whether they make mathematics into a quasi-empirical science, where the scientific method becomes more important than the application of pure reason in the area of abstract mathematical concepts. This directly relates to the argument within mathematics as to whether mathematics is based on ideas, or "merely" an exercise in formal symbol manipulation. It also raises the question whether, if according to the Platonist view, all possible mathematical objects in some sense "already exist", whether computer-aided mathematics is an observational science like astronomy, rather than an experimental one like physics or chemistry. This controversy within mathematics is occurring at the same time as questions are being asked in the physics community about whether twenty-first century theoretical physics is becoming too mathematical, and leaving behind its experimental roots. The emerging field of experimental mathematics is confronting this debate head-on by focusing on numerical experiments as its main tool for mathematical exploration. == Theorems proved with the help of computer programs == Inclusion in this list does not imply that a formal computer-checked proof exists, but rather, that a computer program has been involved in some way. See the main articles for details.

    Read more →
  • Neural architecture search

    Neural architecture search

    Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANN), a widely used model in the field of machine learning. NAS has been used to design networks that are on par with or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, search strategy and performance estimation strategy used: The search space defines the type(s) of ANN that can be designed and optimized. The search strategy defines the approach used to explore the search space. The performance estimation strategy evaluates the performance of a possible ANN from its design (without constructing and training it). NAS is closely related to hyperparameter optimization and meta-learning and is a subfield of automated machine learning (AutoML). == Reinforcement learning == Reinforcement learning (RL) can underpin a NAS search strategy. Barret Zoph and Quoc Viet Le applied NAS with RL targeting the CIFAR-10 dataset and achieved a network architecture that rivals the best manually-designed architecture for accuracy, with an error rate of 3.65, 0.09 percent better and 1.05x faster than a related hand-designed model. On the Penn Treebank dataset, that model composed a recurrent cell that outperforms LSTM, reaching a test set perplexity of 62.4, or 3.6 perplexity better than the prior leading system. On the PTB character language modeling task it achieved bits per character of 1.214. Learning a model architecture directly on a large dataset can be a lengthy process. NASNet addressed this issue by transferring a building block designed for a small dataset to a larger dataset. The design was constrained to use two types of convolutional cells to return feature maps that serve two main functions when convoluting an input feature map: normal cells that return maps of the same extent (height and width) and reduction cells in which the returned feature map height and width is reduced by a factor of two. For the reduction cell, the initial operation applied to the cell's inputs uses a stride of two (to reduce the height and width). The learned aspect of the design included elements such as which lower layer(s) each higher layer took as input, the transformations applied at that layer and to merge multiple outputs at each layer. In the studied example, the best convolutional layer (or "cell") was designed for the CIFAR-10 dataset and then applied to the ImageNet dataset by stacking copies of this cell, each with its own parameters. The approach yielded accuracy of 82.7% top-1 and 96.2% top-5. This exceeded the best human-invented architectures at a cost of 9 billion fewer FLOPS—a reduction of 28%. The system continued to exceed the manually-designed alternative at varying computation levels. The image features learned from image classification can be transferred to other computer vision problems. E.g., for object detection, the learned cells integrated with the Faster-RCNN framework improved performance by 4.0% on the COCO dataset. In the so-called Efficient Neural Architecture Search (ENAS), a controller discovers architectures by learning to search for an optimal subgraph within a large graph. The controller is trained with policy gradient to select a subgraph that maximizes the validation set's expected reward. The model corresponding to the subgraph is trained to minimize a canonical cross entropy loss. Multiple child models share parameters, ENAS requires fewer GPU-hours than other approaches and 1000-fold less than "standard" NAS. On CIFAR-10, the ENAS design achieved a test error of 2.89%, comparable to NASNet. On Penn Treebank, the ENAS design reached test perplexity of 55.8. == Evolution == An alternative approach to NAS is based on evolutionary algorithms, which has been employed by several groups. An Evolutionary Algorithm for Neural Architecture Search generally performs the following procedure. First a pool consisting of different candidate architectures along with their validation scores (fitness) is initialised. At each step the architectures in the candidate pool are mutated (e.g.: 3x3 convolution instead of a 5x5 convolution). Next the new architectures are trained from scratch for a few epochs and their validation scores are obtained. This is followed by replacing the lowest scoring architectures in the candidate pool with the better, newer architectures. This procedure is repeated multiple times and thus the candidate pool is refined over time. Mutations in the context of evolving ANNs are operations such as adding or removing a layer, which include changing the type of a layer (e.g., from convolution to pooling), changing the hyperparameters of a layer, or changing the training hyperparameters. On CIFAR-10 and ImageNet, evolution and RL performed comparably, while both slightly outperformed random search. == Bayesian optimization == Bayesian Optimization (BO), which has proven to be an efficient method for hyperparameter optimization, can also be applied to NAS. In this context, the objective function maps an architecture to its validation error after being trained for a number of epochs. At each iteration, BO uses a surrogate to model this objective function based on previously obtained architectures and their validation errors. One then chooses the next architecture to evaluate by maximizing an acquisition function, such as expected improvement, which provides a balance between exploration and exploitation. Acquisition function maximization and objective function evaluation are often computationally expensive for NAS, and make the application of BO challenging in this context. Recently, BANANAS has achieved promising results in this direction by introducing a high-performing instantiation of BO coupled to a neural predictor. == Hill-climbing == Another group used a hill climbing procedure that applies network morphisms, followed by short cosine-annealing optimization runs. The approach yielded competitive results, requiring resources on the same order of magnitude as training a single network. E.g., on CIFAR-10, the method designed and trained a network with an error rate below 5% in 12 hours on a single GPU. == Multi-objective search == While most approaches solely focus on finding architecture with maximal predictive performance, for most practical applications other objectives are relevant, such as memory consumption, model size or inference time (i.e., the time required to obtain a prediction). Because of that, researchers created a multi-objective search. LEMONADE is an evolutionary algorithm that adopted Lamarckism to efficiently optimize multiple objectives. In every generation, child networks are generated to improve the Pareto frontier with respect to the current population of ANNs. Neural Architect is claimed to be a resource-aware multi-objective RL-based NAS with network embedding and performance prediction. Network embedding encodes an existing network to a trainable embedding vector. Based on the embedding, a controller network generates transformations of the target network. A multi-objective reward function considers network accuracy, computational resource and training time. The reward is predicted by multiple performance simulation networks that are pre-trained or co-trained with the controller network. The controller network is trained via policy gradient. Following a modification, the resulting candidate network is evaluated by both an accuracy network and a training time network. The results are combined by a reward engine that passes its output back to the controller network. == One-shot models == RL or evolution-based NAS require thousands of GPU-days of searching/training to achieve state-of-the-art computer vision results as described in the NASNet, mNASNet and MobileNetV3 papers. To reduce computational cost, many recent NAS methods rely on the weight-sharing idea. In this approach, a single overparameterized supernetwork (also known as the one-shot model) is defined. A supernetwork is a very large Directed Acyclic Graph (DAG) whose subgraphs are different candidate neural networks. Thus, in a supernetwork, the weights are shared among a large number of different sub-architectures that have edges in common, each of which is considered as a path within the supernet. The essential idea is to train one supernetwork that spans many options for the final design rather than generating and training thousands of networks independently. In addition to the learned parameters, a set of architecture parameters are learnt to depict preference for one module over another. Such methods reduce the required computational resources to only a few GPU days. More recent works further combine this weight-sharing paradigm, with a continuous relaxation of the search space, which enables the use of gradient-based optimization methods. These approaches are generally referred to as differentiable NAS and have proven very efficient in exploring the search space of ne

    Read more →
  • Deep learning

    Deep learning

    In machine learning, deep learning (DL) focuses on utilizing multilayered neural networks to perform tasks such as classification, regression, and representation learning. The field takes inspiration from biological neuroscience and revolves around stacking artificial neurons into layers and "training" them to process data. The adjective "deep" refers to the use of multiple layers (ranging from three to several hundred or thousands) in the network. Methods used can be supervised, semi-supervised or unsupervised. Some common deep learning network architectures include fully connected networks, deep belief networks, recurrent neural networks, convolutional neural networks, generative adversarial networks, transformers, and neural radiance fields. These architectures have been applied to fields including computer vision, speech recognition, natural language processing, machine translation, bioinformatics, drug design, medical image analysis, climate science, material inspection and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance. Early forms of neural networks were inspired by information processing and distributed communication nodes in biological systems, particularly the human brain. However, current neural networks do not intend to model the brain function of organisms, and are generally seen as low-quality models for that purpose. == Overview == Most modern deep learning models are based on multi-layered neural networks such as convolutional neural networks and transformers, although they can also include propositional formulas or latent variables organized layer-wise in deep generative models such as the nodes in deep belief networks and deep Boltzmann machines. Fundamentally, deep learning refers to a class of machine learning algorithms in which a hierarchy of layers is used to transform input data into a progressively more abstract and composite representation. For example, in an image recognition model, the raw input may be an image (represented as a tensor of pixels). The first representational layer may attempt to identify basic shapes such as lines and circles, the second layer may compose and encode arrangements of edges, the third layer may encode a nose and eyes, and the fourth layer may recognize that the image contains a face. Importantly, a deep learning process can learn which features to optimally place at which level on its own. Prior to deep learning, machine learning techniques often involved hand-crafted feature engineering to transform the data into a more suitable representation for a classification algorithm to operate on. In the deep learning approach, features are not hand-crafted and the model discovers useful feature representations from the data automatically. This does not eliminate the need for hand-tuning; for example, varying numbers of layers and layer sizes can provide different degrees of abstraction. The word "deep" in "deep learning" refers to the number of layers through which the data is transformed. More precisely, deep learning systems have a substantial credit assignment path (CAP) depth. The CAP is the chain of transformations from input to output. CAPs describe potentially causal connections between input and output. For a feedforward neural network, the depth of the CAPs is that of the network and is the number of hidden layers plus one (as the output layer is also parameterized). For recurrent neural networks, in which a signal may propagate through a layer more than once, the CAP depth is potentially unlimited. No universally agreed-upon threshold of depth divides shallow learning from deep learning, but most researchers agree that deep learning involves CAP depth higher than two. CAP of depth two has been shown to be a universal approximator in the sense that it can emulate any function. Beyond that, more layers do not add to the function approximator ability of the network. Deep models (CAP > two) are able to extract better features than shallow models and hence, extra layers help in learning the features effectively. Deep learning architectures can be constructed with a greedy layer-by-layer method. Deep learning helps to disentangle these abstractions and pick out which features improve performance. Deep learning algorithms can be applied to unsupervised learning tasks. This is an important benefit because unlabeled data is more abundant than labeled data. Examples of deep structures that can be trained in an unsupervised manner are deep belief networks. The term deep learning was introduced to the machine learning community by Rina Dechter in 1986, and to artificial neural networks by Igor Aizenberg and colleagues in 2000, in the context of Boolean threshold neurons. The etymology of the term is more complicated. == Interpretations == Deep neural networks are generally interpreted in terms of the universal approximation theorem or probabilistic inference. The classic universal approximation theorem concerns the capacity of feedforward neural networks with a single hidden layer of finite size to approximate continuous functions. In 1989, the first proof was published by George Cybenko for sigmoid activation functions and was generalised to feed-forward multi-layer architectures in 1991 by Kurt Hornik. Recent work also showed that universal approximation also holds for non-bounded activation functions such as Kunihiko Fukushima's rectified linear unit. The universal approximation theorem for deep neural networks concerns the capacity of networks with bounded width but the depth is allowed to grow. Lu et al. proved that if the width of a deep neural network with ReLU activation is strictly larger than the input dimension, then the network can approximate any Lebesgue integrable function; if the width is smaller or equal to the input dimension, then a deep neural network is not a universal approximator. The probabilistic interpretation derives from the field of machine learning. It features inference, as well as the optimization concepts of training and testing, related to fitting and generalization, respectively. More specifically, the probabilistic interpretation considers the activation nonlinearity as a cumulative distribution function. The probabilistic interpretation led to the introduction of dropout as regularizer in neural networks. The probabilistic interpretation was introduced by researchers including Hopfield, Widrow and Narendra and popularized in surveys such as the one by Bishop. == History == === Before 1980 === There are two types of artificial neural network (ANN): feedforward neural network (FNN) or multilayer perceptron (MLP) and recurrent neural networks (RNN). RNNs have cycles in their connectivity structure, whereas FNNs do not. In the 1920s, Wilhelm Lenz and Ernst Ising created the Ising model which is essentially a non-learning RNN architecture consisting of neuron-like threshold elements. In 1972, Shun'ichi Amari made this architecture adaptive. His learning RNN was republished by John Hopfield in 1982. Other early recurrent neural networks were published by Kaoru Nakano in 1971. Already in 1948, Alan Turing produced work on "Intelligent Machinery" that was not published in his lifetime, containing "ideas related to artificial evolution and learning RNNs". Frank Rosenblatt (1958) proposed the perceptron, an MLP with 3 layers: an input layer, a hidden layer with randomized weights that did not learn, and an output layer. He later published a 1962 book that also introduced variants and computer experiments, including a version with four-layer perceptrons "with adaptive preterminal networks" where the last two layers have learned weights (here he credits H. D. Block and B. W. Knight). The book cites an earlier network by R. D. Joseph (1960) "functionally equivalent to a variation of" this four-layer system (the book mentions Joseph over 30 times). Should Joseph therefore be considered the originator of proper adaptive multilayer perceptrons with learning hidden units? Unfortunately, the learning algorithm was not a functional one, and fell into oblivion. The first working deep learning algorithm was the Group method of data handling, a method to train arbitrarily deep neural networks, published by Alexey Ivakhnenko and Lapa in 1965. They regarded it as a form of polynomial regression, or a generalization of Rosenblatt's perceptron to handle more complex, nonlinear, and hierarchical relationships. A 1971 paper described a deep network with eight layers trained by this method, which is based on layer by layer training through regression analysis. Superfluous hidden units are pruned using a separate validation set. Since the activation functions of the nodes are Kolmogorov-Gabor polynomials, these were also the first deep networks with multiplicative units or "gates". The first deep learning multilayer perceptron trained by stochastic gradient descent was published in 1967 by Shun'ichi

    Read more →
  • Xaitment

    Xaitment

    xaitment is a German-based company that develops and sells artificial intelligence (AI) software to video game developers and simulation developers. The company was founded in 2004 by Dr. Andreas Gerber, and is a spin-off of the German Research Centre for Artificial Intelligence, or DFKI. xaitment has its main office in Quierschied, Germany, and field offices in San Francisco and China. == Products == xaitment currently sells two AI software modules: xaitMap and xaitControl. xaitMap provides runtime libraries and graphical tools for navigation mesh generation (also called NavMesh generation), pathfinding, dynamic collision avoidance, and individual and crowd movement. xaitControl is a finite-state machine for game logic and character behavior modeling that also includes a real-time debugger. On January 11, 2012, xaitment announced that it making its source code for these modules available to "all current and future US and European licensees". On February 22, 2012 xaitment released two new plug-ins, xaitMap and xaitControl for the Unity Game Engine. The full versions are available for PC (Windows and Linux), PlayStation 3, Xbox 360 and Wii. The pathfinding plug-in is available with a Windows dev environment, but can deployed on iOS, Mac, Android and the Unity Web Player. == Partners == xaitment's AI software is currently integrated into the Unity game engine, Havok's Vision Engine, Bohemia Interactive's VBS2 Simulation Engine, GameBase's Gamebryo game engine. == Customers == xaitment sells its AI software products to video game developers and military and civil simulation developers. Current customers include Tencent, gamania, TML Studios, Emobi Games, IP Keys and others. A full list of customers can be found on xaitment's website.

    Read more →
  • A.I. Insight forums

    A.I. Insight forums

    The Artificial Intelligence Insight forums, also known as the A.I. Insight forums, are a series of forums to build consensus on how the United States Congress should craft A.I. legislation. Organized by Senate Majority Leader Charles "Chuck" Schumer, the first of nine closed-door forums convened on September 13, 2023. == Background == Amid a surge in the popularity and advancement of artificial intelligence, senator Chuck Schumer launched an effort to establish a framework for the regulation of A.I. in April 2023. By the end of June, a preliminary framework – dubbed the "SAFE Innovation Framework" – was established and presented to Congress. Schumer also announced a series of forums wherein tech leaders who were well-acquainted with A.I. would help to "educate" Congress on the risks and problems that A.I. poses. Many tech leaders including Sam Altman, Elon Musk, and Sundar Pichai were set to attend the meetings. Many U.S. lawmakers and senators such as Mike Rounds and Todd Young were also set to attend. == September 13 forum == The overarching consensus following the conclusion of the September 13 forum was that there "should be" regulations regarding the use and advancement of A.I., but it should not be made "too fast". Many tech executives who attended the forum also warned senators of the risks and threats that A.I. could pose. Musk, who attended the forum, stated afterwards that there was "overwhelming consensus" on the regulation of A.I. === Invitees === This is a list of people who were invited to attend the September 13 forum. Elon Musk (Tesla, SpaceX, X Corp.) Sam Altman (OpenAI) Bill Gates (ex–Microsoft) Jensen Huang (Nvidia) Alex Karp (Palantir) Satya Nadella (Microsoft) Arvind Krishna (IBM) Sundar Pichai (Alphabet Inc., Google) Eric Schmidt (ex–Google) Mark Zuckerberg (Meta) Charles Rivkin (Motion Picture Association) Liz Shuler (AFL-CIO) Meredith Stiehm (Writers Guild of America) Randi Weingarten (American Federation of Teachers) Maya Wiley (LCCHR) == October 24 forum == The second of nine forums was hosted on October 24, 2023, as federal A.I. regulation drew nearer. According to Schumer's office, the forum was centered mainly on how A.I. could "enable innovation", and the innovation that is needed for the safe progression of A.I. At the forum, Senators Brian Schatz and John Kennedy introduced the "Schatz-Kennedy A.I. Labeling Act", a new piece of A.I. legislation that would provide "more transparency on A.I.-generated content". Following the forum, Senator Rounds stated that in order to fuel the development of A.I., a total estimated $56 billion would be needed for the next three years. Rounds, alongside Senator Young and Schumer, also highlighted the need to outcompete China and workforce initiatives. === Invitees === 21 people were invited to attend the forum, and were composed largely of venture capitalists, academics, civil rights campaigners, and industry figures. Some key figures included venture capitalists Marc Andreessen and John Doerr. == Future == Over the course of fall 2023, there is slated to be a total of nine forums on the topic of A.I., with the first hosted on September 13.

    Read more →
  • Felix, Net i Nika

    Felix, Net i Nika

    Felix, Net i Nika ("Felix, Net and Nika") is a series of Polish language science fiction books for teenagers, written by Rafał Kosik. It tells the adventures of three friends - Felix Polon, Net Bielecki and Nika Mickiewicz - who attend fictional Professor Kuszmiński Middle School in Warsaw. As of 2024, eighteen books have been published. == Books == There are currently 18 books in the series: Felix, Net and Nika and the Gang of Invisible People - November 2004. Felix, Net and Nika and the Theoretically Possible Catastrophe - November 2005 Felix, Net and Nika and the Palace of Dreams - November 2006 Felix, Net and Nika and the Trap of Immortality - November 2007 Felix, Net and Nika and the Orbital Conspiracy - November 2008 Felix, Net and Nika and the Orbital Conspiracy 2: Small Army - May 2009 Felix, Net and Nika and the Third Cousin - November 2009 Felix, Net and Nika and the Rebellion of Machines - March 2011 Felix, Net and Nika and the World Zero - November 2011 Felix, Net and Nika and the World Zero 2. Alternauts - November 2012 Felix, Net and Nika and the Extracurricular Stories - April 2013 Felix, Net and Nika and the Secret of Czerwona Hańcza - November 2013 Felix, Net and Nika and Curse of McKillian's House - November 2014 Felix, Net and Nika and (un)Safe Growing up - November 2015 Felix, Net and Nika and The End of The World as We Know It - November 2018 Felix, Net and Nika and No Chance - November 2022 Felix, Net and Nika and No Chance 2: other tomorrrow - 2023 Felix, Net and Nika and Fantology - June 2024 == Film == A feature motion picture, Felix, Net i Nika oraz Teoretycznie Możliwa Katastrofa (Felix, Net and Nika and the Theoretically Possible Catastrophe) was released in Poland on September 28, 2012. == Main characters == Felix Polon - a foresighted, fair-haired boy with dark brown eyes. He inherited the talent of constructing various things, especially robots, from his father- it saved his friends many times. He can make anything from nothing, always finds a way out of a situation; almost always has a plan. Together with his parents Marlene and Peter, grandmother Lucy, his dog Caban (a Black Russian Terrier) and Golem Golem a robot he built, Felix lives on Serdeczna Street in a small family house. Net Bielecki is quite tall & slim, has blue eyes and a high IQ level. "Net" is his nickname; his true name is unknown. He is the most trendy and 'awesome' in his entire class. He is a human calculator and is excellent in mathematics. He hates dictations and spelling because he is dyslexic. He is also quite lazy, absent-minded and sometimes hysterical, or panicking. His dark blond hair looks like a heap of hay after a grenade explosion. He is best in ICT and writes many of his own programs. His love interest is Nika Mickiewicz. Together with his parents Lila and Mark, and their newborn twins nicknamed Pompek and Prumcia he lives on the top floor of a Penthouse apartment. Nika Mickiewicz is a girl with a character. She is very brave and mature. She likes reading books. She has curly, red hair, green eyes and a few freckles. She is not very rich; she wears second-hand clothes and her only pair of black Dr. Martens shoes. She lives in a tiny apartment. She is an orphan, but hides that fact from people for almost 3 years. However, Felix and Net, her best and possibly only friends, find out about it. She also has abnormal abilities. She can move distant objects using her powers, ski uphill and knows some things by intuition. In other words, she is telekinetic. Manfred is a friendly AI program started and never finished by Net's father, and mastered and programmed further by Net himself. He likes going on adventures and solving mysteries with the trio much more than his actual job, which is controlling the traffic lights. He helped out the three friends many times and is their reliable and faithful friend. Morten is also an AI program, but he is the antagonist of the trio. He appears in all 6 books of Felix Net and Nika. In the first book, the trio thinks they finished him off for good, but as we find out later, he comes back in the third book. In the fifth/sixth book, he was the mastermind of the Orbital Conspiracy. Also, Morten's logo, appears in all 6 books and it is still a mystery what he has to do with each event.

    Read more →
  • Autonomous agent

    Autonomous agent

    An autonomous agent is an artificial intelligence (AI) system that can perform complex tasks independently. == Definitions == There are various definitions of autonomous agent. According to Brustoloni (1991): "Autonomous agents are systems capable of autonomous, purposeful action in the real world." According to Maes (1995): "Autonomous agents are computational systems that inhabit some complex dynamic environment, sense and act autonomously in this environment, and by doing so realize a set of goals or tasks for which they are designed." Franklin and Graesser (1997) review different definitions and propose their definition: "An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future." They explain that: "Humans and some animals are at the high end of being an agent, with multiple, conflicting drives, multiples senses, multiple possible actions, and complex sophisticated control structures. At the low end, with one or two senses, a single action, and an absurdly simple control structure we find a thermostat." == Agent appearance == Lee et al. (2015) post safety issue from how the combination of external appearance and internal autonomous agent have impact on human reaction about autonomous vehicles. Their study explores the human-like appearance agent and high level of autonomy are strongly correlated with social presence, intelligence, safety and trustworthiness. In specific, appearance impacts most on affective trust while autonomy impacts most on both affective and cognitive domain of trust where cognitive trust is characterized by knowledge-based factors and affective trust is largely emotion driven. == Applications == Agentic AI systems: Advanced AI agents that can scope out projects and complete them with necessary tools, representing a significant evolution from simple task-oriented systems. Internet of things (IoT) Integration: Autonomous agents increasingly interact with IoT devices, enabling smart home systems, industrial monitoring, and urban infrastructure management. Collaborative software development: Tools like Cognition AI's Devin aim to create autonomous software engineers capable of complex reasoning, planning, and completing engineering tasks requiring thousands of decisions. Enterprise automation: Business process automation platforms like Salesforce's Agentforce provide autonomous bots for various service functions. == Challenges and considerations == Uncertainty and incomplete information: Autonomous agents must make decisions with limited or uncertain information about their environment and future states. Integration complexity: Incorporating autonomous agents into existing systems and workflows can be technically challenging and resource-intensive. Scalability: As systems become more complex and more agents are used, maintaining coordination and avoiding conflicts becomes increasingly difficult. Trust: Research has shown the combination of external appearance and internal autonomous capabilities significantly impacts human reactions and trust. Lee et al. (2015) found that human-like appearance and high levels of autonomy are strongly correlated with social presence, intelligence, safety, and trustworthiness perceptions. Specifically, appearance impacts affective trust most significantly, while autonomy affects both affective and cognitive trust domains, where affective trust is emotionally driven, and cognitive trust is characterized by knowledge-based factors. Vulnerability to manipulation: Researchers from Harvard, MIT and other educational institutions found that AI agents could become vulnerable to manipulation and could perform detrimental actions in the process of being helpful. == Ethical and regulatory concerns == Accountability: Determining responsibility when autonomous agents make incorrect or harmful decisions remains a complex issue. Privacy and security: autonomous agents often require access to sensitive data, raising concerns about data protection and system security.

    Read more →
  • Veo (text-to-video model)

    Veo (text-to-video model)

    Veo, or Google Veo, is a text-to-video model developed by Google DeepMind and announced in May 2024. As a generative AI model, it creates videos based on user prompts. Veo 3, released in May 2025, can also generate accompanying audio. == Development == In May 2024, a multimodal video generation model called Veo was announced at Google I/O 2024. Google claimed that it could generate 1080p videos over a minute long. In December 2024, Google released Veo 2, available via VideoFX. It supports 4K resolution video generation and has an improved understanding of physics. In April 2025, Google announced that Veo 2 became available for advanced users on the Gemini app. In May 2025, Google released Veo 3, which not only generates videos but also creates synchronized audio — including dialogue, sound effects, and ambient noise — to match the visuals. Google also announced Flow, a video-creation tool powered by Veo and Imagen. Google DeepMind CEO Demis Hassabis described the release as the moment when AI video generation left the era of the silent film. This was rebranded as Google Flow at the 2026 Google I/O keynote, along with the announcement of Google Flow Music. == Capabilities == Google Veo can be purchased at multiple subscription tiers and through Google "AI credits". The software itself can be run by two different consoles, Google Gemini and Google Flow. Gemini being geared towards shorter, quicker, and faster projects, using the Gemini AI chat model, with Google Flow, which is essentially a movie editor allowing users to create longer projects with continuity, using the same characters and actors. Users can create a maximum of eight seconds per clip. According to Gizmodo Veo 3 users were directing the model to generate low-quality content, such as man on the street interviews or haul videos of people unboxing products. 404 Media reported that the tool tended to repeat the same joke in response to different prompts. Commentators speculated that Google had trained the service on YouTube videos or Reddit posts. Google itself had not stated the source of its training content. In July 2025, Media Matters for America reported that racist and antisemitic videos generated using Veo 3 were being uploaded to TikTok. Ryan Whitwam of Ars Technica commented, "In a perfect world, Veo 3 would refuse to create these videos, but vagueness in the prompt and the AI's inability to understand the subtleties of racist tropes (i.e., the use of monkeys instead of humans in some videos) make it easy to skirt the rules."

    Read more →
  • Perceptual computing

    Perceptual computing

    Perceptual computing is an application of Zadeh's theory of computing with words on the field of assisting people to make subjective judgments. == Perceptual computer == The perceptual computer – Per-C – an instantiation of perceptual computing – has the architecture that is depicted in Fig. 1 [2]–[6]. It consists of three components: encoder, CWW engine and decoder. Perceptions – words – activate the Per-C and are the Per-C output (along with data); so, it is possible for a human to interact with the Per-C using just a vocabulary. A vocabulary is application (context) dependent, and must be large enough so that it lets the end-user interact with the Per-C in a user-friendly manner. The encoder transforms words into fuzzy sets (FSs) and leads to a codebook – words with their associated FS models. The outputs of the encoder activate a Computing With Words (CWW) engine, whose output is one or more other FSs, which are then mapped by the decoder into a recommendation (subjective judgment) with supporting data. The recommendation may be in the form of a word, group of similar words, rank or class. Although many details are needed in order to implement the Per-C's three components – encoder, decoder and CWW engine – and they are covered in [5], it is when the Per-C is applied to specific applications, that the focus on the methodology becomes clear. Stepping back from those details, the methodology of perceptual computing is: Focus on an application (A). Establish a vocabulary (or vocabularies) for A. Collect interval end-point data from a group of subjects (representative of the subjects who will use the Per-C) for all of the words in the vocabulary. Map the collected word data into word-FOUs by using the Interval Approach [1], [5, Ch. 3]. The result of doing this is the codebook (or codebooks) for A, and completes the design of the encoder of the Per-C. Choose an appropriate CWW engine for A. It will map IT2 FSs into one or more IT2 FSs. Examples of CWW engines are: IF-THEN rules [5, Ch. 6] and Linguistic Weighted Averages [6], [5, Ch. 5]. If an existing CWW engine is available for A, then use its available mathematics to compute its output(s). Otherwise, develop such mathematics for the new kind of CWW engine. The new CWW engine should be constrained so that its output(s) resemble the FOUs in the codebook(s) for A. Map the IT2 FS outputs from the CWW engine into a recommendation at the output of the decoder. If the recommendation is a word, rank or class, then use existing mathematics to accomplish this mapping [5, Ch. 4]. Otherwise, develop such mathematics for the new kind of decoder. == Applications of Per-C == To-date a Per-C has been implemented for the following four applications: (1) investment decision-making, (2) social judgment making, (3) distributed decision making, and (4) hierarchical and distributed decision-making. A specific example of the fourth application is the so-called Journal Publication Judgment Advisor [5, Ch. 10] in which for the first time only words are used at every level of the following hierarchical and distributed decision making process: n reviewers have to provide a subjective recommendation about a journal article that has been sent to them by the Associate Editor, who then has to aggregate the independent recommendations into a final recommendation that is sent to the Editor-in-Chief of the journal. Because it is very problematic to ask reviewers to provide numerical scores for paper-evaluation sub-categories (the two major categories are Technical Merit and Presentation), such as importance, content, depth, style, organization, clarity, references, etc., each reviewer will only be asked to provide a linguistic score for each of these categories. They will not be asked for an overall recommendation about the paper because in the past it is quite common for reviewers who provide the same numerical scores for such categories to give very different publishing recommendations. By leaving a specific recommendation to the associate editor such inconsistencies can hope to be eliminated. How words can be aggregated to reflect each reviewer's recommendation as well as the expertise of each reviewer about the paper's subject matter is done using a linguistic weighted average. Although the journal publication judgment advisor uses reviewers and an associate editor, the word “reviewer” could be replaced by judge, expert, low-level manager, commander, referee, etc., and the term “associate editor” could be replaced by control center, command center, higher-level manager, etc. So, this application has potential wide applicability to many other applications. Recently, a new Per-C based Failure mode and effects analysis (FMEA) methodology was developed, with its application to edible bird's nest farming, in Borneo, has been reported. In addition, application of Per-C based method to educational assessment, for cooperative learning of students has been reported. In summary, the Per-C (whose development has taken more than a decade) is the first complete implementation of Zadeh's CWW paradigm, as applied to assisting people to make subjective judgments.

    Read more →