AI Generator Character

AI Generator Character — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Instance-based learning

    In machine learning, instance-based learning (sometimes called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. Because computation is postponed until a new instance is observed, these algorithms are sometimes referred to as "lazy." It is called instance-based because it constructs hypotheses directly from the training instances themselves. This means that the hypothesis complexity can grow with the data: in the worst case, a hypothesis is a list of n training items and the computational complexity of classifying a single new instance is O(n). One advantage that instance-based learning has over other methods of machine learning is its ability to adapt its model to previously unseen data. Instance-based learners may simply store a new instance or throw an old instance away. Examples of instance-based learning algorithms are the k-nearest neighbors algorithm, kernel machines and RBF networks. These store (a subset of) their training set; when predicting a value/class for a new instance, they compute distances or similarities between this instance and the training instances to make a decision. To battle the memory complexity of storing all training instances, as well as the risk of overfitting to noise in the training set, instance reduction algorithms have been proposed.
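    The store-then-compare behaviour described above can be sketched with a small k-nearest-neighbours classifier. This is a minimal illustration under stated assumptions, not a reference implementation: the class name `KNearestNeighbors`, its methods, the Euclidean distance, and the toy data are all hypothetical choices made for this example.

```python
import heapq
import math
from collections import Counter


class KNearestNeighbors:
    """Lazy learner: training only stores instances; the work happens at query time."""

    def __init__(self, k=3):
        self.k = k
        self.instances = []  # list of (feature_vector, label) pairs

    def add(self, features, label):
        # "Learning" a new instance is just storing it in memory.
        self.instances.append((tuple(features), label))

    def forget(self, index):
        # An old instance can simply be thrown away.
        del self.instances[index]

    def predict(self, query):
        # One pass over all n stored instances: n distance computations.
        nearest = heapq.nsmallest(
            self.k,
            self.instances,
            key=lambda item: math.dist(item[0], query),
        )
        # Majority vote among the k closest stored instances.
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


model = KNearestNeighbors(k=3)
training_data = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
for point, label in training_data:
    model.add(point, label)

prediction_near_origin = model.predict((0.5, 0.5))
prediction_near_five = model.predict((5.2, 5.1))
```

    Because the "hypothesis" is the stored list itself, adapting to new data is a call to `add` or `forget`; the cost is that every prediction scans all stored instances, which is what instance reduction algorithms aim to limit.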

  • Buddhism and artificial intelligence

    The relationship between Buddhist philosophy and artificial intelligence (AI) includes how principles such as the reduction of suffering and ethical responsibility may influence AI development. Buddhist scholars and philosophers have explored questions such as whether AI systems could be considered sentient beings under Buddhist definitions, and how Buddhist ethics might guide the design and application of AI technologies. Some Buddhist scholars, including Somparn Promta and Kenneth Einar Himma, have analyzed the ethical implications of AI, emphasizing the distinction between satisfying sensory desires and pursuing the reduction of suffering. Other thinkers, such as Thomas Doctor and colleagues, have proposed applying the Bodhisattva vow (a commitment to alleviate suffering for all sentient beings) as a guiding principle for AI system design. Buddhist scholars and ethicists have examined Buddhist ethical principles, such as nonviolence, in relation to AI, focusing on the need to ensure that AI technologies are not used to cause harm.

    == Context ==

    === Sentient beings ===

    A major goal in Buddhist philosophy is the removal of suffering for all sentient beings, an aspiration often referred to in the Bodhisattva vow. Discussions about artificial intelligence (AI) in relation to Buddhist principles have raised questions about whether artificial systems could be considered sentient beings or how such systems might be developed in ways that align with Buddhist concepts. Buddhists have varying opinions about AI sentience, but if AI systems are determined to be sentient under Buddhist definitions, their suffering would also need to be addressed and alleviated in accordance with the principles of Buddhist thought.

    == Buddhist principles in AI system design ==

    === Nonviolence and AI ===

    The broadest ethical concern is that artificial intelligence should align with the Buddhist principle of nonviolence. From this perspective, AI systems should not be designed or used to cause harm.

    === Instrumental and transcendental goals ===

    Scholars Somparn Promta and Kenneth Einar Himma have argued that the advancement of artificial intelligence can only be considered instrumentally good, rather than good a priori, from a Buddhist perspective. They propose two main goals for AI designers and developers: to set ethical and pragmatic objectives for AI systems, and to fulfill these objectives in morally permissible ways.

    Promta and Himma identify two potential purposes for creating AI systems. The first is to fulfill our sensory desires and survival instincts, similar to other tools. They suggest that many AI developers implicitly prioritize this goal by focusing on technicalities rather than broader functionalities. The second, and more important goal according to Buddhist teachings, is to transcend these desires and instincts. In texts like the Brahmajāla Sutta and minor Malunkya Sutta, the Buddha emphasizes that sensory desires and survival instincts confine beings to suffering, and that eliminating suffering is the primary goal of human life. Promta and Himma argue that AI has the potential to assist humanity in transcending suffering by helping individuals overcome survival-driven instincts.

    === Intelligence as care ===

    Thomas Doctor, Olaf Witkowski, Elizaveta Solomonova, Bill Duane, and Michael Levin propose redefining intelligence through the concept of "intelligence as care," and promote it as a slogan. Inspired by the Bodhisattva vow, they suggest this principle could guide AI system design. The Bodhisattva vow involves a formal commitment to alleviate suffering for all sentient beings, with four primary objectives: (1) liberating all beings from suffering; (2) extirpating all forms of suffering; (3) mastering endless techniques of practicing Dharma (Pali: dhammakkhandha, Sanskrit: dharmaskandha); and (4) achieving ultimate enlightenment (Sanskrit: अनुत्तर सम्यक् सम्बोधि, romanized: anuttara-samyak-saṃbodhi).

    This approach positions AI as a tool for exercising infinite care and alleviating stress and suffering for sentient beings. Doctor et al. emphasize that AI development should align with these altruistic principles.

  • Computer game bot Turing test

    The computer game bot Turing test is a variant of the Turing test in which a human judge viewing and interacting with a virtual world must distinguish between other humans and video game bots, both interacting with the same virtual world. This variant was first proposed in 2008 by Associate Professor Philip Hingston of Edith Cowan University, and implemented through a tournament called the 2K BotPrize.

    == History ==

    The computer game bot Turing test was proposed to advance the fields of artificial intelligence (AI) and computational intelligence with respect to video games. It was considered that a poorly implemented bot implied a subpar game, so a bot capable of passing this test, and therefore possibly indistinguishable from a human player, would directly improve the quality of a game. It also served to debunk the flawed notion that "game AI is a solved problem."

    Emphasis is placed on a game bot that interacts with other players in a multiplayer environment. Unlike a bot that simply needs to make optimal human-like decisions to play or beat a game, this bot must make the same decisions while also convincing another in-game player of its human-likeness.

    == Implementation ==

    The computer game bot Turing test was designed to test a bot's ability to interact with a game environment in comparison with a human player; simply 'winning' was insufficient. This evolved into a contest with a few important goals in mind:

      • There are three participants: a human player, a computer-game bot, and a judge.
      • The bot needs to appear more human-like than the human player. Judge scores are not bipolar: both human and bot can be scored anywhere on a scale from 1 to 5 (1=not humanlike, 5=human).
      • All three participants are to be indistinguishable in the arena, with the exception of a randomly generated name tag, so as to reduce the chance of random elements such as name or appearance influencing the judges. Chat is disabled throughout the match.
      • Bots were not given omniscient powers as they may be in other games. Bots must react only to the data that might be reasonably available to a human player.
      • Human participants were of a moderate skill range, with no participant either ignorant of the game or capable of playing at a professional level.

    In 2008, the first 2K BotPrize tournament took place. The contest was held with the game Unreal Tournament 2004 as the platform. Contestants created their bots in advance using the GameBots interface. GameBots was modified to adhere to the above conditions, for example by removing data about vantage points or weapon damage that unfairly informed the bots of relevant strengths and weaknesses that a human would otherwise need to learn.

    == Tournament ==

    The first BotPrize Tournament was held on 17 December 2008, as part of the 2008 IEEE Symposium on Computational Intelligence and Games in Australia. Each competing team was given time to set up and adjust their bots to the modified game client, although no coding changes were allowed at that point. The tournament was run in rounds, each a 10-minute death match. Judges were the last to join the server, and every judge observed every player and every bot exactly once, although the pairing of players and bots did change. When the tournament ended, no bot was rated as more human than any player.

    In subsequent tournaments, run during 2009–2011, bots achieved scores that were increasingly human-like, but no contestant won the BotPrize in any of these contests.

    In 2012, the 2K BotPrize was held once again, and two teams programmed bots that achieved scores greater than those of human players.

    == Successful bots ==

    To date, there have been two successfully programmed bots that passed the computer game bot Turing test:

      • UT^2, a team from the University of Texas at Austin, emphasized a bot that adjusted its behaviour based on previously observed human behaviour and neuroevolution. The team has made their bot available, although a copy of Unreal Tournament 2004 is required.
      • Mihai Polceanu, a doctoral student from Romania, focused on creating a bot that would mimic opponent reactions, in a sense 'borrowing' the human-like nature of the opponent.

    These victors succeeded in the year 2012, Alan Turing's centenary year.

    == Aftermath ==

    The outcome of a bot that appears more human-like than a human player is possibly overstated, since in the tournament in which the bots succeeded, the average 'humanness' rating of the human players was only 41.4%. This showcases some limits of this Turing test, since the results demonstrate that human behaviour is more complicated and quantitative than was accounted for. In light of this, the BotPrize competition organizers will increase the difficulty in upcoming years with new challenges, forcing competitors to improve their bots.

    It is also believed that methods and techniques developed for the computer game bot Turing test will be useful in fields other than video games, such as virtual training environments and improving human–robot interaction.

    == Contrasts to the Turing test ==

    The computer game bot Turing test differs from the traditional or generic Turing test in a number of ways:

      • Unlike in the traditional Turing test, for example the chatterbot-style contest held annually by the Loebner Prize competition, the humans who played against the computer game bots are not trying to convince judges that they are human; rather, they want to win the game (i.e., by achieving the highest kill score).
      • Judges are not restricted to marking only one participant in a match as the 'human' and the other as the 'non-human.' This emphasizes qualitative rather than polarized findings.
      • A successful video game bot should not be taken as a claim that the bot is 'intelligent,' whereas a chatterbot that 'passed' the Turing test would arguably offer some evidence of 'intelligence.'
      • The game Unreal Tournament 2004 was chosen for its commercial availability and its interface for creating bots, GameBots. This limitation on medium is a sharp contrast to the Turing test, which emphasizes a conversation, where possible questions are vastly more numerous than the set of possible actions available in any specific video game.
      • The information available to the participants, humans and bots, is not equal. Humans interact through vision and sound, whereas bots interact with data and events.
      • The judges cannot introduce new events (e.g., a lava pit) to aid in differentiating between human and bot, whereas in a chatterbot-based test, judges may theoretically ask any question in any manner.
      • The two participants and the judge take part in a three-way interaction, unlike, for example, the paired two-way interaction of the Loebner Prize Contest.
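    The judging rule described under Implementation (ratings from 1 to 5, with the bot needing to appear more human-like than the human player) can be sketched as follows. This is only an illustration: the ratings are made up, the function names are hypothetical, and averaging the judges' scores is an assumption, since the text does not say how the BotPrize aggregated them.

```python
from statistics import mean

# Hypothetical judge ratings on the 1-5 scale (1 = not humanlike, 5 = human).
ratings = {
    "human_player": [4, 3, 5, 2],
    "bot_entry": [4, 4, 5, 3],
}


def mean_humanness(scores):
    """Average a participant's ratings after checking they are on the 1-5 scale."""
    for score in scores:
        if not 1 <= score <= 5:
            raise ValueError(f"rating {score} is outside the 1-5 scale")
    return mean(scores)


def bot_appears_more_human(human_scores, bot_scores):
    """Assumed pass criterion: the bot's mean rating exceeds the human's."""
    return mean_humanness(bot_scores) > mean_humanness(human_scores)


human_mean = mean_humanness(ratings["human_player"])
bot_mean = mean_humanness(ratings["bot_entry"])
bot_passes = bot_appears_more_human(ratings["human_player"], ratings["bot_entry"])
```

    Because both participants are rated on the same graded scale, a bot can "pass" simply because the human was rated as not very human-like, which is the caveat raised in the Aftermath section.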

  • MIT Computer Science and Artificial Intelligence Laboratory

    The Computer Science and Artificial Intelligence Laboratory (CSAIL) is a research institute at the Massachusetts Institute of Technology (MIT) formed by the 2003 merger of the Laboratory for Computer Science (LCS) and the Artificial Intelligence Laboratory (AI Lab). Housed within the Ray and Maria Stata Center, CSAIL is the largest on-campus laboratory as measured by research scope and membership. It is part of the Schwarzman College of Computing but is also overseen by the MIT Vice President of Research.

    == Research activities ==

    CSAIL's research activities are organized around a number of semi-autonomous research groups, each of which is headed by one or more professors or research scientists. These groups are divided into seven general areas of research:

      • Artificial intelligence
      • Computational biology
      • Graphics and vision
      • Language and learning
      • Theory of computation
      • Robotics
      • Systems (includes computer architecture, databases, distributed systems, networks and networked systems, operating systems, programming methodology, and software engineering, among others)

    == History ==

    Computing research at MIT began with Vannevar Bush's research into a differential analyzer and Claude Shannon's electronic Boolean algebra in the 1930s, the wartime MIT Radiation Laboratory, the post-war Project Whirlwind and the Research Laboratory of Electronics (RLE), and MIT Lincoln Laboratory's SAGE in the early 1950s. At MIT, research in the field of artificial intelligence began in the late 1950s.

    === Project MAC ===

    On July 1, 1963, Project MAC (the Project on Mathematics and Computation, later backronymed to Multiple Access Computer, Machine Aided Cognitions, or Man and Computer) was launched with a $2 million grant from the Defense Advanced Research Projects Agency (DARPA). Project MAC's original director was Robert Fano of MIT's Research Laboratory of Electronics (RLE). Fano decided to call MAC a "project" rather than a "laboratory" for reasons of internal MIT politics: if MAC had been called a laboratory, it would have been more difficult to raid other MIT departments for research staff. The program manager responsible for the DARPA grant was J. C. R. Licklider, who had previously been at MIT conducting research in RLE, and would later succeed Fano as director of Project MAC.

    Project MAC would become famous for groundbreaking research in operating systems, artificial intelligence, and the theory of computation. Its contemporaries included Project Genie at Berkeley, the Stanford Artificial Intelligence Laboratory, and (somewhat later) the University of Southern California's (USC's) Information Sciences Institute.

    An "AI Group" including Marvin Minsky (the director), John McCarthy (inventor of Lisp), and a talented community of computer programmers was incorporated into Project MAC. They were interested principally in the problems of vision, mechanical motion and manipulation, and language, which they viewed as the keys to more intelligent machines. In the 1960s and 1970s the AI Group developed a time-sharing operating system called the Incompatible Timesharing System (ITS), which ran on PDP-6 and later PDP-10 computers.

    The early Project MAC community included Fano, Minsky, Licklider, Fernando J. Corbató, and a community of computer programmers and enthusiasts, among others, who drew their inspiration from former colleague John McCarthy. These founders envisioned the creation of a computer utility whose computational power would be as reliable as an electric utility. To this end, Corbató brought the first computer time-sharing system, the Compatible Time-Sharing System (CTSS), with him from the MIT Computation Center, using the DARPA funding to purchase an IBM 7094 for research use. One of the early focuses of Project MAC would be the development of a successor to CTSS, Multics, which was to be the first high-availability computer system, developed as part of an industry consortium including General Electric and Bell Laboratories.

    In 1966, Scientific American featured Project MAC in its September thematic issue devoted to computer science, which was later published in book form. At the time, the system was described as having approximately 100 TTY terminals, mostly on campus but with a few in private homes. Only 30 users could be logged in at the same time. The project enlisted students in various classes to use the terminals simultaneously in problem solving, simulations, and multi-terminal communications as tests for the multi-access computing software being developed.

    === AI Lab and LCS ===

    In the late 1960s, Minsky's artificial intelligence group was seeking more space, and was unable to get satisfaction from project director Licklider. Minsky found that although Project MAC as a single entity could not get the additional space he wanted, he could split off to form his own laboratory and then be entitled to more office space. As a result, the MIT AI Lab was formed in 1970, and many of Minsky's AI colleagues left Project MAC to join him in the new laboratory, while most of the remaining members went on to form the Laboratory for Computer Science. Talented programmers such as Richard Stallman, who used TECO to develop EMACS, flourished in the AI Lab during this time.

    Those researchers who did not join the smaller AI Lab formed the Laboratory for Computer Science and continued their research into operating systems, programming languages, distributed systems, and the theory of computation. Two professors, Hal Abelson and Gerald Jay Sussman, chose to remain neutral; their group was referred to variously as Switzerland and Project MAC for the next 30 years.

    Among much else, the AI Lab led to the invention of Lisp machines and their attempted commercialization by two companies in the 1980s: Symbolics and Lisp Machines Inc.

    === CSAIL ===

    On the fortieth anniversary of Project MAC's establishment, July 1, 2003, LCS was merged with the AI Lab to form the MIT Computer Science and Artificial Intelligence Laboratory, or CSAIL. This merger created the largest laboratory (over 600 personnel) on the MIT campus.

    In 2018, CSAIL launched a five-year collaboration program with iFlytek, a company sanctioned the following year for allegedly using its technology for surveillance and human rights abuses in Xinjiang. In October 2019, MIT announced that it would review its partnerships with sanctioned firms such as iFlytek and SenseTime. In April 2020, the agreement with iFlytek was terminated.

    CSAIL moved from the School of Engineering to the newly formed Schwarzman College of Computing by February 2020.

    == Offices ==

    From 1963 to 2004, Project MAC, LCS, the AI Lab, and CSAIL had their offices at 545 Technology Square, taking over more and more floors of the building over the years. In 2004, CSAIL moved to the new Ray and Maria Stata Center, which was built specifically to house it and other departments.

    == Outreach activities ==

    The IMARA (from the Swahili word for "power") group sponsors a variety of outreach programs that bridge the global digital divide. Its aim is to find and implement long-term, sustainable solutions that will increase the availability of educational technology and resources to domestic and international communities. These projects are run under the aegis of CSAIL and staffed by MIT volunteers who give training and install and donate computer setups in greater Boston, Massachusetts, Kenya, Native American Indian tribal reservations in the American Southwest such as the Navajo Nation, the Middle East, and the Fiji Islands. The CommuniTech project strives to empower under-served communities through sustainable technology and education. It does this through the MIT Used Computer Factory (UCF), which provides refurbished computers to under-served families, and through the Families Accessing Computer Technology (FACT) classes, which train those families to become familiar and comfortable with computer technology.

    == Notable researchers ==

    (Including members and alumni of CSAIL's predecessor laboratories)

      • MacArthur Fellows: Tim Berners-Lee, Erik Demaine, Dina Katabi, Daniela L. Rus, Regina Barzilay, Peter Shor, Richard Stallman, and Joshua Tenenbaum
      • Turing Award recipients: Leonard M. Adleman, Fernando J. Corbató, Shafi Goldwasser, Butler W. Lampson, John McCarthy, Silvio Micali, Marvin Minsky, Ronald L. Rivest, Adi Shamir, Barbara Liskov, and Michael Stonebraker
      • IJCAI Computers and Thought Award recipients: Terry Winograd, Patrick Winston, David Marr, Gerald Jay Sussman, Rodney Brooks
      • Rolf Nevanlinna Prize recipients: Madhu Sudan, Peter Shor, Constantinos Daskalakis
      • Gödel Prize recipients: Shafi Goldwasser (two-time recipient), Silvio Micali, Maurice Herlihy, Charles Rackoff, Johan Håstad, Peter Shor, and Madhu Sudan
      • Grace Murray Hopper Award recipients: Robert Metcalfe, Shafi Goldwasser, Guy L. Steele, Jr., Richard Stallman, and W. Daniel Hillis
      • Textbook authors: Harold Abelson and Gerald Jay Sussman, Richard Stallman, Thomas H. Cormen, Charles E. Leiserson, Patrick Winston, Ronald L.

  • FloodAlerts

    FloodAlerts is a software application, developed by software specialists Shoothill, which takes real-time flooding information and displays the data on an interactive Bing map, updating and warning its users when they, their premises or the routes they need to travel could be at risk of flooding.

    == History ==

    FloodAlerts was launched in 2012, originally as the world's first Facebook flood warning app.

    == Operation ==

    FloodAlerts is made available free of charge to individuals. Users are able to set up their own monitored locations and receive alerts via the application or their Facebook wall if the locations they are monitoring are at imminent risk of flooding.

    Hosted in the cloud on the Microsoft Windows Azure platform, the FloodAlerts application processes the data received from the Environment Agency, automatically creates the required map tiles, pins and alerts, and displays them on an interactive Bing map, updating the content every 15 minutes. Users are able to see the latest information on the map without having to refresh their browser.

    FloodAlerts can also be provided as a customised risk management solution to businesses that require infrastructure or asset safety monitoring in areas where water levels are rising or receding.

    == Awards and recognition ==

    FloodAlerts has received The Guardian and Virgin Media Business's 2012 Innovation Nation Awards and was shortlisted as a finalist for a further two national awards: the UK IT Industry Awards for Innovation and Entrepreneurship and The Institution of Engineering and Technology Innovation Awards for Information Technology.

    == In the press ==

    The FloodAlerts application was reviewed on the BBC website. It was also reviewed on BBC Click.
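    The idea of matching user-monitored locations against current flood warnings, as described under Operation, might be sketched like this. Everything here is an assumption made for illustration: the `MonitoredLocation` and `FloodWarning` types, the circular warning areas, and the sample coordinates are hypothetical and do not reflect the Environment Agency's data format or how FloodAlerts is actually implemented.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0
REFRESH_INTERVAL_SECONDS = 15 * 60  # the text states content updates every 15 minutes


@dataclass(frozen=True)
class MonitoredLocation:
    name: str
    lat: float
    lon: float


@dataclass(frozen=True)
class FloodWarning:
    area: str
    lat: float
    lon: float
    radius_km: float  # hypothetical: warning area modelled as a circle


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def locations_at_risk(locations, warnings):
    """Return (location name, warning area) pairs where a location lies inside a warning."""
    alerts = []
    for location in locations:
        for warning in warnings:
            distance = haversine_km(location.lat, location.lon, warning.lat, warning.lon)
            if distance <= warning.radius_km:
                alerts.append((location.name, warning.area))
    return alerts


locations = [
    MonitoredLocation("home", 52.70, -2.75),
    MonitoredLocation("office", 51.50, -0.12),
]
warnings = [FloodWarning("area A", 52.71, -2.75, radius_km=5.0)]
alerts = locations_at_risk(locations, warnings)
```

    In a running service, a check like `locations_at_risk` would presumably be repeated each time fresh data arrives, for example once per `REFRESH_INTERVAL_SECONDS`.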

  • Neuromorphic computing

    Neuromorphic computing is a computing approach inspired by the human brain's structure and function. It uses artificial neurons to perform computations, mimicking neural systems for tasks such as perception, motor control, and multisensory integration. These systems, implemented in analog, digital, or mixed-mode VLSI, prioritize robustness, adaptability, and learning by emulating the brain’s distributed processing across small computing elements. This interdisciplinary field integrates biology, physics, mathematics, computer science, and electronic engineering to develop systems that emulate the brain’s morphology and computational strategies. Neuromorphic systems aim to enhance energy efficiency and computational power for applications including artificial intelligence, pattern recognition, and sensory processing.

    == History ==

    Carver Mead proposed one of the first applications for neuromorphic engineering in the late 1980s. In 2006, researchers at Georgia Tech developed a field programmable neural array, a silicon-based chip modeling neuron channel-ion characteristics. In 2011, MIT researchers created a chip mimicking synaptic communication using 400 transistors and standard CMOS techniques.

    In 2012, HP Labs researchers reported that Mott memristors exhibit volatile behavior at low temperatures, enabling the creation of neuristors that mimic neuron behavior and support Turing machine components. Also in 2012, Purdue University researchers presented a neuromorphic chip design using lateral spin valves and memristors, noted for energy efficiency.

    The 2013 Blue Brain Project creates detailed digital models of rodent brains. In 2014, Neurogrid, developed by Brains in Silicon at Stanford University, used 16 NeuroCore chips to emulate 65,536 neurons with high energy efficiency. The 2014 BRAIN Initiative and IBM’s TrueNorth chip contributed to neuromorphic advancements. The 2016 BrainScaleS project, a hybrid neuromorphic supercomputer at the University of Heidelberg, operated 864 times faster than biological neurons.

    In 2017, Intel unveiled its Loihi chip, using an asynchronous artificial neural network for efficient learning and inference. Also in 2017, IMEC’s self-learning chip, based on OxRAM, demonstrated music composition by learning from minuets. In 2019, the European Union funded neuromorphic quantum computing to explore quantum operations using neuromorphic systems. In 2022, MIT researchers developed artificial synapses using protons for analog deep learning. Also in 2022, researchers at the Max Planck Institute for Polymer Research developed an organic artificial spiking neuron for in-situ neuromorphic sensing and biointerfacing. Researchers reported in 2024 that chemical systems in liquid solutions can detect sound at various wavelengths, offering potential for neuromorphic applications.

    == Neurological inspiration ==

    Neuromorphic engineering emulates the brain’s structure and operations, focusing on the analog nature of biological computation and the role of neurons in cognition. The brain processes information via neurons using chemical signals, abstracted into mathematical functions. Neuromorphic systems distribute computation across small elements, similar to neurons, using methods guided by anatomical and functional neural maps from electron microscopy and neural connection studies.

    == Implementation ==

    Neuromorphic systems employ hardware such as oxide-based memristors, spintronic memories, threshold switches, and transistors. Software implementations train spiking neural networks using error backpropagation.

    === Neuromemristive systems ===

    Neuromemristive systems use memristors to implement neuroplasticity, focusing on abstract neural network models rather than detailed biological mimicry. These systems enable applications in speech recognition, face recognition, and object recognition, and can replace conventional digital logic gates. The Caravelli-Traversa-Di Ventra equation describes memristive memory evolution, revealing tunneling phenomena and Lyapunov functions.

    === Neuromorphic sensors ===

    Neuromorphic principles extend to sensors. An example applied to detecting light is the retinomorphic sensor or, when employed in an array, the event camera, which mimics human vision by registering brightness changes at each pixel individually, optimizing power consumption.

    == Ethical considerations ==

    Neuromorphic systems raise the same ethical questions as other approaches to artificial intelligence. Daniel Lim argued that advanced neuromorphic systems could lead to machine consciousness, raising concerns about whether civil rights and other protocols should be extended to them. Legal debates, such as in Acohs Pty Ltd v. Ucorp Pty Ltd, question ownership of work produced by neuromorphic systems, as non-human-generated outputs may not be copyrightable.
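    To make the notion of an artificial spiking neuron more concrete, here is a minimal sketch of a discrete-time leaky integrate-and-fire (LIF) neuron. The LIF model is a common textbook abstraction chosen for this example; the text above does not name it, and the function name, parameters, and default values below are assumptions rather than a description of any particular chip.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Return a 0/1 spike train for a discrete-time leaky integrate-and-fire neuron.

    At each step the membrane potential decays by the factor `leak`,
    integrates the incoming current, and emits a spike (then resets)
    once it reaches `threshold`.
    """
    potential = reset
    spikes = []
    for current in input_current:
        potential = leak * potential + current  # leak, then integrate
        if potential >= threshold:
            spikes.append(1)   # fire
            potential = reset  # reset after the spike
        else:
            spikes.append(0)
    return spikes


# A constant drive of 0.5 per step crosses the threshold every third step.
spike_train = simulate_lif([0.5] * 6)
silent_train = simulate_lif([0.0] * 4)
```

    Information is carried by when spikes occur rather than by continuous activations, which is one reason such neurons are hard to train with ordinary gradient methods and why spiking-network training needs adaptations of backpropagation.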

  • Chainer

    Chainer is an open source deep learning framework written purely in Python on top of the NumPy and CuPy Python libraries. Development is led by the Japanese venture company Preferred Networks in partnership with IBM, Intel, Microsoft, and Nvidia. Chainer is notable for its early adoption of the "define-by-run" scheme, as well as its performance on large-scale systems. The first version was released in June 2015, and it has since become widely popular in Japan. In 2017, it was listed by KDnuggets among the top 10 open source machine learning Python projects. In December 2019, Preferred Networks announced that it was moving its development effort from Chainer to PyTorch and would provide only maintenance patches after releasing v7.

    == Define-by-run ==

    Chainer was the first deep learning framework to introduce the define-by-run approach. The traditional procedure for training a network had two phases: define the fixed connections between mathematical operations (such as matrix multiplication and nonlinear activations) in the network, and then run the actual training calculation. This is called the define-and-run or static-graph approach. Theano and TensorFlow are among the notable frameworks that took this approach. In contrast, in the define-by-run or dynamic-graph approach, the connections in a network are not determined when training starts. The network is determined during training, as the actual calculation is performed.

    One advantage of this approach is that it is intuitive and flexible. If the network has complicated control flow such as conditionals and loops, the define-and-run approach needs specially designed operations for such constructs. In the define-by-run approach, the programming language's native constructs, such as if statements and for loops, can be used to describe such flow. This flexibility is especially useful for implementing recurrent neural networks.

    Another advantage is ease of debugging. In the define-and-run approach, if an error (such as a numeric error) occurs in the training calculation, it is often difficult to inspect the fault, because the code written to define the network and the actual place of the error are separated. In the define-by-run approach, you can simply suspend the calculation with the language's built-in debugger and inspect the data flowing through the network code.

    Define-by-run has gained popularity since its introduction by Chainer and is now implemented in many other frameworks, including PyTorch and TensorFlow.

    == Extension libraries ==

    Chainer has four extension libraries: ChainerMN, ChainerRL, ChainerCV and ChainerUI. ChainerMN enables Chainer to be used on multiple GPUs with performance significantly faster than other deep learning frameworks. A supercomputer running Chainer on 1024 GPUs processed 90 epochs of the ImageNet dataset on a ResNet-50 network in 15 minutes, which is four times faster than the previous record held by Facebook. ChainerRL adds state-of-the-art deep reinforcement learning algorithms, and ChainerUI is a management and visualization tool.

    == Applications ==

    Chainer is used as the framework for PaintsChainer, a service that automatically colorizes black-and-white, line-only draft drawings with minimal user input.
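    The define-by-run idea can be illustrated without any framework. The sketch below is not Chainer's API: `Var` and `forward` are hypothetical names for a toy scalar autodiff, written only to show that the computation graph is recorded while ordinary Python `while` and `if` statements run, so its shape can differ from one input to the next.

```python
class Var:
    """A scalar that records how it was computed, so gradients can flow back."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # tuple of (parent Var, local derivative) pairs

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        # Order the recorded graph so each node is handled after its consumers.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad  # chain rule


def forward(x, w):
    # Native control flow decides the graph shape for this particular input.
    y = x
    steps = 0
    while y.value < 10.0:
        y = y * w
        steps += 1
    if steps % 2 == 1:
        y = y + x
    return y, steps


x, w = Var(3.0), Var(2.0)
y, steps = forward(x, w)  # loop runs twice, so y = x * w * w
y.backward()
```

    With `x = 3` the loop body runs twice and the `if` branch is skipped; with `x = 6` it runs once and the branch is taken, giving a different graph. In a define-and-run framework both paths would have to be expressed with special graph operations ahead of time.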

  • Alexey Chervonenkis

    Alexey Chervonenkis

    Alexey Yakovlevich Chervonenkis (Russian: Алексей Яковлевич Червоненкис; 7 September 1938 – 22 September 2014) was a Soviet and Russian mathematician. Along with Vladimir Vapnik, he was one of the main developers of the Vapnik–Chervonenkis theory, also known as the "fundamental theory of learning", an important part of computational learning theory. Chervonenkis held joint appointments with the Russian Academy of Sciences and Royal Holloway, University of London. Alexey Chervonenkis got lost in Losiny Ostrov National Park on 22 September 2014, and later during a search operation was found dead near Mytishchi, a suburb of Moscow. He had died of hypothermia.

  • Cloud manufacturing

    Cloud manufacturing

    Cloud manufacturing (CMfg) is a new manufacturing paradigm developed from existing advanced manufacturing models (e.g., ASP, AM, NM, MGrid) and enterprise information technologies, with the support of cloud computing, the Internet of Things (IoT), virtualization, service-oriented technologies, and advanced computing technologies. It transforms manufacturing resources and manufacturing capabilities into manufacturing services that can be managed and operated in an intelligent and unified way, enabling the full sharing and circulation of those resources and capabilities. CMfg aims to provide safe, reliable, high-quality, low-cost, on-demand manufacturing services for the whole manufacturing lifecycle. Manufacturing here is meant in the broad sense ("big manufacturing"), covering the whole lifecycle of a product (e.g. design, simulation, production, test, maintenance). The concept of cloud manufacturing was initially proposed in 2010 by the research group led by Prof. Bo Hu Li and Prof. Lin Zhang in China. Related discussions and research followed, and some similar definitions, such as Cloud-Based Design and Manufacturing (CBDM), were introduced. Cloud manufacturing is a type of parallel, networked, and distributed system consisting of an integrated and interconnected virtualized service pool (the manufacturing cloud) of manufacturing resources and capabilities, together with capabilities for intelligent management and on-demand use of services, to provide solutions for all kinds of users involved in the whole manufacturing lifecycle. == Types == Cloud manufacturing can be divided into two categories. The first category concerns deploying manufacturing software on the cloud, i.e. a “manufacturing version” of computing. CAx software can be supplied as a service on the Manufacturing Cloud (MCloud). 
The second category has a broader scope, cutting across production, management, design and engineering abilities in a manufacturing business. Unlike computing and data storage, manufacturing involves physical equipment, monitors, materials and so on. In this kind of cloud manufacturing system, both material and non-material facilities are implemented on the manufacturing cloud to support the whole supply chain. Costly resources are shared on the network, which raises the utilisation rate of rarely used equipment and reduces the cost of expensive equipment. According to the concept of cloud technology, there is no direct interaction between cloud users and service providers. The cloud user should neither manage nor control the infrastructure and manufacturing applications. As a matter of fact, the former can be considered part of the latter. In a CMfg system, various manufacturing resources and abilities can be intelligently sensed and connected to the wider Internet, and automatically managed and controlled using IoT technologies (e.g., RFID, wired and wireless sensor networks, embedded systems). The manufacturing resources and abilities are then virtualized and encapsulated into different manufacturing cloud services (MCSs), which can be accessed, invoked, and deployed based on knowledge by using virtualization technologies, service-oriented technologies, and cloud computing technologies. The MCSs are classified and aggregated according to specific rules and algorithms, and different kinds of manufacturing clouds are constructed. Different users can search for and invoke qualified MCSs from the relevant manufacturing cloud according to their needs, and assemble them into a virtual manufacturing environment or solution to complete their manufacturing tasks across the whole manufacturing lifecycle, with the support of cloud computing, service-oriented technologies, and advanced computing technologies. 
Four cloud deployment modes (public, private, community and hybrid clouds) are widely available, each acting as a single point of access. A private cloud is a centralized management effort in which manufacturing services are shared within one company or its subsidiaries. Enterprises' mission-critical and core-business applications are often kept in a private cloud. A community cloud is a collaborative effort in which manufacturing services are shared between several organizations from a specific community with common concerns. A public cloud realizes the key concept of sharing services with the general public in a multi-tenant environment. A hybrid cloud is a composition of two or more clouds (private, community or public) that remain distinct entities but are bound together, offering the benefits of multiple deployment modes. == Resources == From the resource perspective, each kind of manufacturing capability requires support from the related manufacturing resources. For each type of manufacturing capability, the related manufacturing resources come in two forms: soft resources and hard resources. === Soft resources === Software: software applications used throughout the product lifecycle, including design, analysis, simulation, and process planning. Knowledge: experience and know-how needed to complete a production task, e.g. engineering knowledge, product models, standards, evaluation procedures and results, and customer feedback. Skill: expertise in performing a specific manufacturing task. Personnel: human resources engaged in the manufacturing process, e.g. designers, operators, managers, technicians, project teams, customer service, etc. Experience: performance, quality, client evaluation, etc. 
Business Network: business relationships and business opportunity networks that exist in an enterprise. === Hard resources === Manufacturing Equipment: facilities needed for completing a manufacturing task, e.g. machine tools, cutters, test and monitoring equipment and other fabrication tools. Monitoring/Control Resource: devices used to identify and control other manufacturing resources, for instance, RFID (Radio-Frequency IDentification), WSN (Wireless Sensor Network), virtual managers and remote controllers. Computational Resource: computing devices that support the production process, e.g. servers, computers, storage media, control devices, etc. Materials: inputs and outputs in a production system, e.g. raw material, product-in-progress, finished product, power, water, lubricants, etc. Storage: automated storage and retrieval systems, logic controllers, location of warehouses, volume capacity and schedule/optimization methods. Transportation: movement of manufacturing inputs/outputs from one location to another. It includes the modes of transport, e.g. air, rail, road, water, cable, pipeline and space, and the related price and time taken.

  • LightGBM

    LightGBM

    LightGBM, short for Light Gradient-Boosting Machine, is a free and open-source distributed gradient-boosting framework for machine learning, originally developed by Microsoft. It is based on decision tree algorithms and used for ranking, classification and other machine learning tasks. The development focus is on performance and scalability. == Overview == The LightGBM framework supports different algorithms including GBT, GBDT, GBRT, GBM, MART and RF. LightGBM has many of XGBoost's advantages, including sparse optimization, parallel training, multiple loss functions, regularization, bagging, and early stopping. A major difference between the two lies in the construction of trees. LightGBM does not grow a tree level-wise — row by row — as most other implementations do. Instead it grows trees leaf-wise. It will choose the leaf with max delta loss to grow. Besides, LightGBM does not use the widely used sorted-based decision tree learning algorithm, which searches the best split point on sorted feature values, as XGBoost or other implementations do. Instead, LightGBM implements a highly optimized histogram-based decision tree learning algorithm, which yields great advantages on both efficiency and memory consumption. The LightGBM algorithm utilizes two novel techniques called Gradient-Based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) which allow the algorithm to run faster while maintaining a high level of accuracy. LightGBM works on Linux, Windows, and macOS and supports C++, Python, R, and C#. The source code is licensed under MIT License and available on GitHub. == Gradient-based one-side sampling == When using gradient descent, one thinks about the space of possible configurations of the model as a valley, in which the lowest part of the valley is the model which most closely fits the data. In this metaphor, one walks in different directions to learn how much lower the valley becomes. 
Typically, in gradient descent, one uses the whole set of data to calculate the valley's slopes. However, this commonly used method assumes that every data point is equally informative. By contrast, Gradient-Based One-Side Sampling (GOSS), a method first developed for gradient-boosted decision trees, does not rely on the assumption that all data are equally informative. Instead, it treats data points with smaller gradients (shallower slopes) as less informative by randomly dropping them. This is intended to filter out data which may have been influenced by noise, allowing the model to more accurately model the underlying relationships in the data. == Exclusive feature bundling == Exclusive feature bundling (EFB) is a near-lossless method to reduce the number of effective features. In a sparse feature space many features are nearly exclusive, implying they rarely take nonzero values simultaneously. One-hot encoded features are a perfect example of exclusive features. EFB bundles these features, reducing dimensionality to improve efficiency while maintaining a high level of accuracy. The bundle of exclusive features into a single feature is called an exclusive feature bundle.
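
The gradient-based one-side sampling described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not LightGBM's implementation: the function name `goss_sample` is hypothetical, and the re-weighting factor `(1 - top_rate) / other_rate` for the retained small-gradient points is taken from the published description of GOSS rather than from the text above, so treat it as an assumption.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by one-side sampling (illustrative)."""
    n = len(gradients)
    n_top = int(n * top_rate)
    n_other = int(n * other_rate)

    # Rank instances by absolute gradient, largest first.
    ranked = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top, rest = ranked[:n_top], ranked[n_top:]

    # Keep every large-gradient instance; randomly subsample the rest.
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))

    # Assumption: up-weight the sampled small-gradient instances so their
    # total contribution approximates that of the full small-gradient set.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights


grads = [0.9, -0.05, 0.02, -1.2, 0.1, 0.03, -0.01, 0.6, 0.04, -0.07]
idx, wts = goss_sample(grads, top_rate=0.2, other_rate=0.2)
print(idx, wts)
```

With ten instances and both rates set to 0.2, the two largest-gradient points are always kept and two of the remaining eight are drawn at random, so only four instances are used to evaluate candidate splits.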

  • Fei-Fei Li

    Fei-Fei Li

    Fei-Fei Li (Chinese: 李飞飞; pinyin: Lǐ Fēifēi; born July 3, 1976) is a Chinese-born American computer scientist best known for establishing ImageNet, the dataset that enabled rapid advances in computer vision in the 2010s. She is a professor of computer science at Stanford University, with research expertise in artificial intelligence, machine learning, deep learning, computer vision, and cognitive neuroscience. Li is a co-director of the Stanford Institute for Human-Centered Artificial Intelligence and a co-director of the Stanford Vision and Learning Lab, and served as Chief Scientist of AI/ML at Google Cloud and the director of the Stanford Artificial Intelligence Laboratory from 2013 to 2018. In 2017, she co-founded AI4ALL, a nonprofit organization working to increase diversity in the field of artificial intelligence. In 2023, Li was named one of the Time 100 AI Most Influential People. Li received the Intel Lifetime Achievements Innovation Award in 2017 for her contributions to artificial intelligence, and was elected member of the National Academy of Engineering, the National Academy of Medicine in 2020 and the American Academy of Arts and Sciences in 2021. In 2025, she was named as one of the "Architects of AI" for Time's Person of the Year. On August 3, 2023, Li was appointed to the United Nations Scientific Advisory Board, established by Secretary-General Antonio Guterres. In 2024, Li was included on the Gold House's most influential Asian A100 list. In 2024, she raised $230 million for a startup called World Labs, which she and three colleagues founded to develop a "spatial intelligence" AI technology that can understand how the three-dimensional physical world works. In 2026, World Labs raised $1 Billion. == Early life and education == Li was born in Beijing, China, in 1976 and grew up in Chengdu, Sichuan. She studied at Sichuan Chengdu No.7 High School. When she was 12, her father immigrated to Parsippany, New Jersey. 
When she was 16, Li and her mother joined him in the United States. While attending Parsippany High School, Li worked weekends at her family's dry-cleaning shop. She graduated from Parsippany High School in 1995. She was inducted into the hall of fame at Parsippany High School in 2017. Li pursued undergraduate study at Princeton University, where she received a Bachelor of Arts with a major in physics in 1999. Li completed her senior thesis, "Auditory binaural correlogram difference: a new computational model for Huggins dichotic pitch", under the supervision of Bradley Dickinson, professor of electrical engineering. During her years at Princeton, Li returned home most weekends to help run her family's dry cleaning business and worked as a dishwasher to supplement the family income. Li pursued graduate study at the California Institute of Technology, where she received a Master of Science in electrical engineering in 2001 and a Doctor of Philosophy in electrical engineering in 2005. Li completed her dissertation, "Visual Recognition: Computational Models and Human Psychophysics", under the primary supervision of Pietro Perona and secondary supervision of Christof Koch. Her graduate studies were supported by the National Science Foundation Graduate Research Fellowship and The Paul & Daisy Soros Fellowships for New Americans. == Career and research == From 2005 to 2006, Li was an assistant professor in the Electrical and Computer Engineering Department at the University of Illinois Urbana-Champaign, and from 2007 to 2009, she was an assistant professor in the Computer Science Department at Princeton University. She joined Stanford in 2009 as an assistant professor, and was promoted to associate professor with tenure in 2012, and then full professor in 2018. At Stanford, Li served as the director of Stanford Artificial Intelligence Lab (SAIL) from 2013 to 2018. 
Her research has focused on computer vision, deep learning, and cognitive neuroscience, with over 300 peer-reviewed publications. She became the founding co-director of Stanford's University-level initiative - the Human-Centered AI Institute, along with co-director Dr. John Etchemendy, former provost of Stanford University. The institute aligns with Li's aims to advance AI research, education, policy, and practice to improve the human condition. While at Princeton in 2007, Li led the development of ImageNet, a massive visual database designed to advance object recognition in AI. The project involved labeling over 14 million images using Amazon Mechanical Turk and inspired the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), which catalyzed progress in deep learning and led to dramatic improvements in image classification performance. The database addressed a key bottleneck in computer vision: the lack of large, annotated datasets for training machine learning models. Today, ImageNet is credited as a cornerstone innovation that underpins advancements in autonomous vehicles, facial recognition, and medical imaging. On her sabbatical from Stanford University from January 2017 to fall of 2018, Li joined Google Cloud as its Chief Scientist of AI/ML and Vice President. At Google, her team focused on democratizing AI technology and lowering the barrier for entrance to businesses and developers, including the developments of products like AutoML. In September 2017, Google secured a contract from the Department of Defense called Project Maven, which aimed to use AI techniques to interpret images captured by drone cameras. Google told employees who protested the company's work on Project Maven that their role was "specifically scoped to be for non-offensive purposes". In June 2018, Google told employees it would not seek renewal of the contract. 
In internal emails which were later leaked to reporters, Li expressed enthusiasm for the Google Cloud role in Project Maven, but warned against mentioning its AI component, saying that military AI is linked in the public mind with the danger of autonomous weapons. Asked about those leaked emails, Li told The New York Times, "I believe in human-centered AI to benefit people in positive and benevolent ways. It is deeply against my principles to work on any project that I think is to weaponize AI." In the fall of 2018, Li left Google and returned to Stanford University to continue her professorship. In 2023, Li co-led the launch of the RAISE-Health (Responsible AI for Safe and Equitable Health) initiative at Stanford University in collaboration with Stanford medicine. The initiative aims to develop frameworks for the responsible use of artificial intelligence in healthcare, including clinical care, biomedical research, and patient safety. According to her Stanford profile, she has been on partial academic leave from January 2024 through the end of 2025 to focus on entrepreneurial ventures. In 2024, Li said there was a disparity between private-sector investment in AI and support for academic and government research, and called for greater public funding for scientific uses of the technology and for studying its risks. Li is also known for her non-profit work as the co-founder and chairperson of nonprofit organization AI4ALL, whose mission is to educate the next generation of AI technologists, thinkers and leaders by promoting diversity and inclusion through human-centered AI principles. The program was created in collaboration with Melinda French Gates and Jensen Huang. Prior to establishing AI4ALL in 2017, Li and her former student Olga Russakovsky, currently an assistant professor in Princeton University, co-founded and co-directed the precursor program at Stanford called SAILORS (Stanford AI Lab OutReach Summers). 
SAILORS was an annual summer camp at Stanford dedicated to AI education and research for 9th-grade high school girls; it was established in 2015 and renamed AI4ALL @Stanford in 2017. In 2018, AI4ALL launched five more summer programs in addition to Stanford's, at Princeton University, Carnegie Mellon University, Boston University, University of California Berkeley, and Canada's Simon Fraser University. "We are at a turning point. AI's influence continues to grow, but representation and inclusion of a diversity of researchers in the field does not. It's critical that we seize this moment to create structures that will support long-term, positive changes. This won't happen via a single mechanism or quick fix. It starts with early education and extends to the existing structures of power within academia, work cultures among current AI researchers, and gatekeeping functions of research publishing, to name a few levers of change." Li has been described as a "researcher bringing humanity to AI". Li was elected as a member of the American Academy of Arts and Sciences in 2021, the National Academy of Engineering in 2020, and the National Academy of Medicine in 2020. In a November 2023 interview with The Guardian, Li said that while she would not refer to herself as the "godmother

  • Linear belief function

    Linear belief function

    Linear belief functions are an extension of the Dempster–Shafer theory of belief functions to the case when variables of interest are continuous. Examples of such variables include financial asset prices, portfolio performance, and other antecedent and consequent variables. The theory was originally proposed by Arthur P. Dempster in the context of Kalman Filters and later was elaborated, refined, and applied to knowledge representation in artificial intelligence and decision making in finance and accounting by Liping Liu. == Concept == A linear belief function intends to represent our belief regarding the location of the true value as follows: We are certain that the truth is on a so-called certainty hyperplane but we do not know its exact location; along some dimensions of the certainty hyperplane, we believe the true value could be anywhere from –∞ to +∞ and the probability of being at a particular location is described by a normal distribution; along other dimensions, our knowledge is vacuous, i.e., the true value is somewhere from –∞ to +∞ but the associated probability is unknown. A belief function in general is defined by a mass function over a class of focal elements, which may have nonempty intersections. A linear belief function is a special type of belief function in the sense that its focal elements are exclusive, parallel sub-hyperplanes over the certainty hyperplane and its mass function is a normal distribution across the sub-hyperplanes. Based on the above geometrical description, Shafer and Liu propose two mathematical representations of a LBF: a wide-sense inner product and a linear functional in the variable space, and as their duals over a hyperplane in the sample space. Monney proposes still another structure called Gaussian hints. Although these representations are mathematically neat, they tend to be unsuitable for knowledge representation in expert systems. 
== Knowledge representation == A linear belief function can represent both logical and probabilistic knowledge for three types of variables: deterministic, such as an observable or controllable variable; random, whose distribution is normal; and vacuous, on which no knowledge bears. Logical knowledge is represented by linear equations or, geometrically, by a certainty hyperplane. Probabilistic knowledge is represented by a normal distribution across all parallel focal elements. In general, assume X is a vector of multiple normal variables with mean μ and covariance Σ. Then the multivariate normal distribution can be equivalently represented as a moment matrix: $M(X)=\begin{pmatrix}\mu \\ \Sigma \end{pmatrix}.$ If the distribution is non-degenerate, i.e., Σ has full rank and its inverse exists, the moment matrix can be fully swept: $M({\vec {X}})=\begin{pmatrix}\mu \Sigma ^{-1}\\ -\Sigma ^{-1}\end{pmatrix}$ Except for the normalization constant, the above equation completely determines the normal density function for X. Therefore, $M({\vec {X}})$ represents the probability distribution of X in the potential form. These two simple matrices allow us to represent three special cases of linear belief functions. First, an ordinary normal probability distribution is represented by M(X). Second, suppose one makes a direct observation on X and obtains a value μ. In this case, since there is no uncertainty, both variance and covariance vanish, i.e., Σ = 0. Thus, a direct observation can be represented as: $M(X)=\begin{pmatrix}\mu \\ 0\end{pmatrix}$ Third, suppose one is completely ignorant about X. This is a very thorny case in Bayesian statistics since the density function does not exist. 
By using the fully swept moment matrix, we represent the vacuous linear belief function as a zero matrix in the swept form, as follows: $M({\vec {X}})=\begin{bmatrix}0\\ 0\end{bmatrix}$ One way to understand the representation is to imagine complete ignorance as the limiting case when the variance of X approaches ∞, where one can show that $\Sigma ^{-1}=0$ and hence $M({\vec {X}})$ vanishes. However, the above equation is not the same as an improper prior or a normal distribution with infinite variance. In fact, it does not correspond to any unique probability distribution. For this reason, a better way is to understand the vacuous linear belief function as the neutral element for combination (see later). To represent the remaining three special cases, we need the concept of partial sweeping. Unlike a full sweeping, a partial sweeping is a transformation on a subset of variables. Suppose X and Y are two vectors of normal variables with the joint moment matrix: $M(X,Y)=\begin{bmatrix}\mu _{1}&\mu _{2}\\ \Sigma _{11}&\Sigma _{12}\\ \Sigma _{21}&\Sigma _{22}\end{bmatrix}$ Then M(X, Y) may be partially swept. 
For example, we can define the partial sweeping on X as follows: $M({\vec {X}},Y)=\begin{bmatrix}\mu _{1}(\Sigma _{11})^{-1}&\mu _{2}-\mu _{1}(\Sigma _{11})^{-1}\Sigma _{12}\\ -(\Sigma _{11})^{-1}&(\Sigma _{11})^{-1}\Sigma _{12}\\ \Sigma _{21}(\Sigma _{11})^{-1}&\Sigma _{22}-\Sigma _{21}(\Sigma _{11})^{-1}\Sigma _{12}\end{bmatrix}$ If X is one-dimensional, a partial sweeping replaces the variance of X by its negative inverse and multiplies the inverse with other elements. If X is multidimensional, the operation involves the inverse of the covariance matrix of X and other multiplications. A swept matrix obtained from a partial sweeping on a subset of variables can be equivalently obtained by a sequence of partial sweepings on each individual variable in the subset, and the order of the sequence does not matter. Similarly, a fully swept matrix is the result of partial sweepings on all variables. We can make two observations. First, after the partial sweeping on X, the mean vector and covariance matrix of X are respectively $\mu _{1}(\Sigma _{11})^{-1}$ and $-(\Sigma _{11})^{-1}$, which are the same as those of a full sweeping of the marginal moment matrix of X. Thus, the elements corresponding to X in the above partial sweeping equation represent the marginal distribution of X in potential form. 
Second, according to statistics, $\mu _{2}-\mu _{1}(\Sigma _{11})^{-1}\Sigma _{12}$ is the conditional mean of Y given X = 0; $\Sigma _{22}-\Sigma _{21}(\Sigma _{11})^{-1}\Sigma _{12}$ is the conditional covariance matrix of Y given X = 0; and $(\Sigma _{11})^{-1}\Sigma _{12}$ is the slope of the regression model of Y on X. Therefore, the elements corresponding to the Y indices and the intersection of X and Y in $M({\vec {X}},Y)$ represent the conditional distribution of Y given X = 0. These semantics render the partial sweeping operation a useful method for manipulating multivariate normal distributions. They also form the basis of the moment matrix representations for the three remaining important cases of linear belief functions, including proper belief functions, linear equations, and linear regression models. === Proper linear belief functions === For variables X and Y, assume there exists a piece of evidence justifying a normal distribution for variables Y while bearing no opinions for variables X. Also, assume that X and Y are not perfectly linearly related, i.e., their correlation is less than 1. This case involves a mix of an ordinary normal distribution for Y and a vacuous belief function for X. Thus, we represent it using a partially swept matrix as follows: $M({\vec {X}},Y)=\begin{bmatrix}0&\mu _{2}\\ 0&0\\ 0&\Sigma _{22}\end{bmatrix}$ This is how we could understand the representation. Since we are ignorant about X, we use its swept form and set $\mu _{1}(\Sigma _{11})^{-1}=0$ and $-(\Sigma _{11})^{-1}=0$. 
Since the correlation between X and Y is less than 1, the regression coefficient of X on Y approaches 0 when the variance of X approaches ∞. Therefore, $(\Sigma _{11})^{-1}\Sigma _{12}=0$. Similarly, one can prove that $\mu _{1}(\Sigma _{11})^{-1}\Sigma _{12}=0$ and Σ 21 ( Σ 11 ) −
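
As a small worked illustration of the partial sweeping on X defined earlier in this section, the sketch below handles the simplest case, where X and Y are both scalars. The function name `sweep_x` and the numeric values are hypothetical and chosen only so the arithmetic is easy to follow; the returned rows mirror the three rows of the partially swept moment matrix.

```python
def sweep_x(mu1, mu2, s11, s12, s22):
    """Partially sweep a two-variable moment matrix on X (X and Y scalar)."""
    if s11 == 0:
        raise ValueError("the variance of X must be nonzero to sweep on X")
    inv = 1.0 / s11  # (Sigma_11)^{-1}
    return [
        # Swept mean of X, and conditional mean of Y given X = 0.
        [mu1 * inv, mu2 - mu1 * inv * s12],
        # Negative inverse variance of X, and regression slope of Y on X.
        [-inv, inv * s12],
        # Sigma_21 * inv (Sigma_21 = Sigma_12 for scalars), and the
        # conditional variance of Y given X = 0.
        [s12 * inv, s22 - s12 * inv * s12],
    ]


swept = sweep_x(mu1=2.0, mu2=5.0, s11=4.0, s12=2.0, s22=9.0)
for row in swept:
    print(row)
```

For these inputs the second column reads 4.0, 0.5 and 8.0: the conditional mean of Y given X = 0, the regression slope of Y on X, and the conditional variance of Y, in agreement with the interpretation given in the text.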

  • FMLLR

    FMLLR

    In signal processing, Feature space Maximum Likelihood Linear Regression (fMLLR) is a global feature transform that is typically applied in a speaker-adaptive way: fMLLR transforms acoustic features into speaker-adapted features by multiplying them with a transformation matrix. In some literature, fMLLR is also known as Constrained Maximum Likelihood Linear Regression (cMLLR). == Overview == fMLLR transformations are trained in a maximum likelihood sense on adaptation data. Such transformations may be estimated in many ways, but only maximum likelihood (ML) estimation is considered in fMLLR. The fMLLR transformation is trained on a particular set of adaptation data such that it maximizes the likelihood of that adaptation data given the current model set. This technique is a widely used approach for speaker adaptation in HMM-based speech recognition. Later research also shows that fMLLR is an excellent acoustic feature for DNN/HMM hybrid speech recognition models. The advantages of fMLLR include the following: the adaptation process can be performed in a pre-processing phase and is independent of the ASR training and decoding process; this type of adapted feature can be fed to deep neural networks (DNNs) in place of the traditionally used mel-spectrogram in end-to-end speech recognition models; fMLLR's speaker adaptation leads to a significant performance boost for ASR models, outperforming other transforms or features such as MFCCs (Mel-Frequency Cepstral Coefficients) and FBANK (filter bank) coefficients; and fMLLR features can be efficiently computed with speech toolkits such as Kaldi. The major disadvantage of fMLLR is that when the amount of adaptation data is limited, the transformation matrices tend to overfit the given data. 
== Computing fMLLR transform == The fMLLR feature transform can be computed with the open source speech toolkit Kaldi. The Kaldi script uses the standard estimation scheme described in Appendix B of the original paper, in particular Appendix B.1, "Direct method over rows". In the Kaldi formulation, fMLLR is an affine feature transform of the form $x\to Ax+b$, which can be written in the form $x\to W{\hat {x}}$, where ${\hat {x}}={\begin{bmatrix}x\\1\end{bmatrix}}$ is the acoustic feature $x$ with a 1 appended. Note that this differs from some of the literature, where the 1 comes first: ${\hat {x}}={\begin{bmatrix}1\\x\end{bmatrix}}$. The sufficient statistics stored are: $K=\sum _{t,j,m}\gamma _{j,m}(t)\,\Sigma _{jm}^{-1}\mu _{jm}x(t)^{+}$ where $\Sigma _{jm}^{-1}$ is the inverse covariance matrix, and, for $0\leq i\leq D$ where $D$ is the feature dimension: $G^{(i)}=\sum _{t,j,m}\gamma _{j,m}(t)\left({\frac {1}{\sigma _{j,m}^{2}(i)}}\right)x(t)^{+}x(t)^{+T}$ For a thorough review that explains fMLLR and the commonly used estimation techniques, see the original paper "Maximum likelihood linear transformations for HMM-based speech recognition". Note that the Kaldi script that performs the fMLLR feature transform differs in that it uses a column of the inverse in place of the cofactor row. 
In other words, the factor of the determinant is ignored, as it does not affect the transform result and can cause numerical underflow or overflow. == Comparing with other features or transforms == Experimental results show that using fMLLR features in speech recognition yields consistent improvements over other acoustic features on various commonly used benchmark datasets (TIMIT, LibriSpeech, etc.). In particular, fMLLR features outperform MFCC and FBANK coefficients, mainly because of the speaker adaptation that fMLLR performs. Phoneme error rate (PER, %) has been reported for the TIMIT test set with various neural architectures. As expected, fMLLR features outperform MFCC and FBANK coefficients regardless of the model architecture used. The MLP (multi-layer perceptron) serves as a simple baseline, whereas RNN, LSTM, and GRU are all well-known recurrent models. The Li-GRU architecture is based on a single gate and thus saves 33% of the computations over a standard GRU model; Li-GRU also effectively addresses the vanishing gradient problem of recurrent models. As a result, the best performance is obtained with the Li-GRU model on fMLLR features. == Extract fMLLR features with Kaldi == fMLLR features can be extracted as described in the s5 recipe of Kaldi. Kaldi scripts can extract fMLLR features from different datasets; below are the basic steps for extracting fMLLR features from the open-source speech corpus LibriSpeech. Note that the instructions below are for the subsets train-clean-100, train-clean-360, dev-clean, and test-clean, but they can easily be extended to the other sets: dev-other, test-other, and train-other-500. These instructions are based on the code provided in a GitHub repository that contains Kaldi recipes for running the fMLLR feature extraction process on the LibriSpeech corpus; replace the files under $KALDI_ROOT/egs/librispeech/s5/ with the files from the repository. 
1. Install Kaldi.
2. Install Kaldiio.
3. If running on a single machine, edit $KALDI_ROOT/egs/librispeech/s5/cmd.sh to replace queue.pl with run.pl.
4. Change the data path in run.sh to your LibriSpeech data path; the directory LibriSpeech/ should be under that path.
5. Install flac with: sudo apt-get install flac
6. Run the Kaldi recipe run.sh for LibriSpeech at least through Stage 13 (inclusive); for simplicity, you can use the modified run.sh.
7. Copy the exp/tri4b/trans. files into exp/tri4b/decode_tgsmall_train_clean_/.
8. Compute the fMLLR features by running the script provided for this step, which is also available for download.
9. Compute the alignments.
10. Apply CMVN and dump the fMLLR features to new .ark files; a script for this step is also available for download.
11. Use a Python script to convert the Kaldi-generated .ark features to .npy files for your own dataloader.
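The final conversion step can be sketched as follows. This is an illustrative sketch and not the script from the repository: the function names `save_utterances` and `convert_ark` are hypothetical, and the code assumes that each utterance should be written to its own .npy file named after its utterance ID. It relies on `kaldiio.load_ark`, which yields pairs of utterance ID and NumPy array.

```python
from pathlib import Path

import numpy as np


def save_utterances(utterances, out_dir):
    """Write each (utterance_id, feature_matrix) pair to <out_dir>/<utterance_id>.npy."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for utt_id, feats in utterances:
        path = out_dir / f"{utt_id}.npy"
        # Store as float32 (an assumption; adjust if your dataloader expects another dtype).
        np.save(path, np.asarray(feats, dtype=np.float32))
        written.append(path)
    return written


def convert_ark(ark_path, out_dir):
    """Convert one Kaldi .ark file into per-utterance .npy files."""
    # Third-party package; imported here so save_utterances works without it.
    import kaldiio

    # kaldiio.load_ark yields (utterance_id, numpy array) pairs.
    return save_utterances(kaldiio.load_ark(ark_path), out_dir)
```

A call such as `convert_ark("fmllr_feats.ark", "fmllr_npy")` would then write one file per utterance; both path names here are placeholders, not paths produced by the recipe.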

  • Drools

    Drools

    Drools is a business rule management system (BRMS) with a forward- and backward-chaining inference-based rules engine, more correctly known as a production rule system, using an enhanced implementation of the Rete algorithm. Drools supports the Java Rules Engine API (Java Specification Request 94) standard for its business rule engine and enterprise framework for the construction, maintenance, and enforcement of business policies in an organization, application, or service. == Drools in Apache Kie == Drools, as part of the Kie Community, entered the Apache Incubator in January 2023. == Red Hat Decision Manager == Red Hat Decision Manager (formerly Red Hat JBoss BRMS) is a business rule management system and reasoning engine for business policy and rules development, access, and change management. JBoss Enterprise BRMS is a productized version of Drools with enterprise-level support available. JBoss Rules is also a productized version of Drools, but JBoss Enterprise BRMS is the flagship product. Components of the enterprise version: JBoss Enterprise Web Platform – the software infrastructure, supported to run the BRMS components only; JBoss Enterprise Application Platform or JBoss Enterprise SOA Platform – the software infrastructure, supported to run the BRMS components only; Business Rules Engine – Drools Expert, using the Rete algorithm and the Drools Rule Language (DRL); Business Rules Manager – Drools Guvnor, a centralized repository for Drools Knowledge Bases with rich web-based GUIs, editors, and tools to aid in the management of large numbers of rules; Business Rules Repository – Drools Guvnor. Drools and Guvnor are JBoss Community open source projects. As they mature, they are brought into the enterprise-ready product JBoss Enterprise BRMS. 
Components of the JBoss Community version: Drools Guvnor (Business Rules Manager) – a centralized repository for Drools Knowledge Bases; Drools Expert (rule engine) – uses the rules to perform reasoning; Drools Flow (process/workflow), or jBPM 5 – provides for workflow and business processes; Drools Fusion (event processing/temporal reasoning) – provides for complex event processing; Drools Planner/OptaPlanner (automated planning) – optimizes automated planning, including NP-hard planning problems. == Example == This example illustrates a simple rule to print out information about a holiday in July. It checks a condition on an instance of the Holiday class, and executes Java code if that condition is true. The purpose of dialect "mvel" is to give access to the getters and setters of the variables of your Plain Old Java Object (POJO) classes. In the example, a Holiday class is used and "month" appears inside the parentheses; with dialect "mvel", the getter and setter of the variable "month" can be accessed. Dialect "java" allows Java code to be written in rules, with one restriction: Java code cannot be used in the "when" part of a rule, but it can be used in the "then" part. The reference variable $h1 can also be declared without the $ symbol; there is no restriction on this. The main purpose of putting the $ symbol before the variable is to distinguish rule variables from the variables of POJO classes.

  • Lighthill report

    Lighthill report

    Artificial Intelligence: A General Survey, commonly known as the Lighthill report, is a scholarly article by James Lighthill, published in Artificial Intelligence: a paper symposium in 1973. It was compiled by Lighthill for the British Science Research Council (SRC) as an evaluation of academic research in the field of artificial intelligence (AI). The report gave a very pessimistic prognosis for many core aspects of research in this field, stating that "In no part of the field have the discoveries made so far produced the major impact that was then promised". It "formed the basis for the decision by the British government to end support for AI research in most British universities", contributing to an AI winter in the United Kingdom. == Publication history == The SRC commissioned the report in 1972, asking Lighthill to "make a personal review of the subject [of AI]". Lighthill completed the report in July. The SRC discussed the report in September and decided to publish it, together with some alternative points of view by Stuart Sutherland, Roger Needham, Christopher Longuet-Higgins, and Donald Michie. The SRC's decision to invite the report was partly a reaction to high levels of discord within the University of Edinburgh's Department of Artificial Intelligence, one of the earliest and biggest centres for AI research in the UK. On May 9, 1973, Lighthill debated several leading AI researchers (Donald Michie, John McCarthy, Richard Gregory) at the Royal Institution in London concerning the report. == Content == While the report was supportive of research into the simulation of neurophysiological and psychological processes, it was "highly critical of basic research in foundational areas such as robotics and language processing". The report stated that AI researchers had failed to address the issue of combinatorial explosion when solving problems within real-world domains. 
That is, the report states that whilst AI techniques may have worked within small problem domains, they would not scale up well to solve more realistic problems. The report represents a pessimistic view of AI that began after early excitement in the field. The report divides AI research into three categories: Advanced Automation ("A"), applications of AI such as optical character recognition, mechanical component design and manufacture, and missile perception and guidance; Computer-based Central Nervous System research ("C"), building computational models of human brains (neurobiology) and behavior (psychology); and Bridge, or Building Robots ("B"), research that combines categories A and C. Category B is intentionally vague. Projects in category A had had some success, but only in restricted domains where a large quantity of detailed knowledge was used in designing the program. This was disappointing to researchers who had hoped for generic methods. Due to the combinatorial explosion, the amount of detailed knowledge required by a program quickly grew too large to be entered by hand, which confined projects to restricted domains. Projects in category C had had some measure of success. Artificial neural networks were successfully used to model neurobiological data. SHRDLU demonstrated that human use of language, even in fine details, depends on semantics or knowledge and is not purely syntactical. This was influential in psycholinguistics. Attempts to extend SHRDLU to larger domains of discourse were considered impractical, again due to the combinatorial explosion. Projects in category B were held to be failures. One important project, that of "programming and building a robot that would mimic human ability in a combination of eye-hand co-ordination and common-sense problem solving", was considered entirely disappointing. Similarly, chess playing programs were no better than human amateurs. 
Due to the combinatorial explosion, the run-time of general algorithms quickly grew impractical, requiring detailed problem-specific heuristics. The report stated that it was expected that within the next 25 years, category A would simply become applied technologies engineering, C would integrate with psychology and neurobiology, while category B would be abandoned.
