AI Assistant Grok

  • Coalition for App Fairness

    The Coalition for App Fairness (CAF) is a coalition of companies that aim to secure fairer terms for the inclusion of their apps in the Apple App Store or the Google Play Store. The organization's executive director is Meghan DiMuzio, and its headquarters are located in Washington, D.C.

    == Background ==

    In July 2015, Spotify launched an email campaign urging its App Store subscribers to cancel their subscriptions and start new ones through its website, bypassing the 30% transaction fee that Apple Inc. requires for in-app purchases in iOS applications. A later update to the Spotify app on iOS was rejected by Apple, prompting Spotify's general counsel Horacio Gutierrez to write a letter to Apple's then-general counsel Bruce Sewell, stating: "This latest episode raises serious concerns under both U.S. and EU competition law. It continues a troubling pattern of behavior by Apple to exclude and diminish the competitiveness of Spotify on iOS and as a rival to Apple Music, particularly when seen against the backdrop of Apple's previous anticompetitive conduct aimed at Spotify … we cannot stand by as Apple uses the App Store approval process as a weapon to harm competitors."

    In August 2020, Epic Games updated its Fortnite Battle Royale app on both Apple's App Store and Google Play to include its own storefront, which offered a 20% discount on V-Bucks, the in-game currency, to players who bought through it rather than through the app stores' storefronts, both of which take a 30% revenue cut of each sale. Both Apple and Google removed the Fortnite app within hours, as this alternate storefront violated their terms of use, which required all in-app purchases to be made through their storefronts.
    Epic immediately filed lawsuits against both companies, challenging their storefront policies on antitrust grounds and arguing that the non-negotiable 30% revenue cut is too high and that the restrictions on alternate storefronts are anticompetitive. Apple countersued Epic over its behavior, leading to a highly publicized 2021 bench trial. Ultimately, Epic largely lost its lawsuit against Apple, though the court did order Apple to allow developers to point users to alternative payment methods. Conversely, Epic won its antitrust lawsuit against Google in late 2023.

    == Foundation ==

    On 24 September 2020, Epic Games joined forces with thirteen other prominent companies (including the music streaming platform Spotify, Tinder owner Match Group, the encrypted mail service Proton Mail, and the cryptocurrency website Blockchain.com) to establish the Coalition for App Fairness. Its members also include Basecamp. The coalition criticizes the 30% fee that the app stores of both Apple and Google charge on purchases made through their stores. Apple and Google defended themselves by arguing that the 30% transaction fee is an industry standard, while the Coalition for App Fairness states that no other transaction fee comes even close to 30%. In October 2020, it was reported that the coalition had grown from 13 to 40 members since its foundation and had received more than 400 applications for membership.

    In October 2025, X (formerly Twitter) joined CAF. This was seen as part of a larger industry pushback against Apple and Google, and as a step towards, it was hoped, passing the Bipartisan Open App Markets Act.

    == Aims ==

    The group has broadened its demands for the app stores and now also seeks better treatment for the apps available in the App Store. It claims that Apple favors its own services over other services available on the market and unjustifiably excludes other apps from its App Store.
    The group has also compared the fee with other transaction fees, such as the 5% fee charged by credit card companies, and states that Apple charges up to 600% more. It would like the 30% fee, which Apple introduced only in 2011, to be brought in line with the percentages charged by other providers of payment solutions. Its demands are mainly directed at Apple's strict control over its App Store, and to a lesser extent at Google. Google allows apps to be downloaded through an independent web link or through another app store, such as the Epic Games Store. The organization emphasizes that no app developer should be discriminated against or denied the rights granted to the app store owner's own developers.

    == Reactions ==

    In October 2020, Microsoft presented a new framework governing access to its Windows 10 operating system by app stores other than the one offered by Microsoft. The new framework is based on the demands of the Coalition for App Fairness. Microsoft emphasized, though, that these principles would not apply to the Xbox.

    In December 2020, Apple announced that it would lower its revenue cut from 30% to 15% for app developers making $1 million or less, provided they fill out an application for the lowered rate. In March 2021, Google followed suit by lowering the Play Store's revenue cut from 30% to 15% for the first million in revenue earned by a developer each year.

    == Notable members ==

    Notable companies listed as members on the group's website include: Blockchain.com, Deezer, Epic Games, European Digital SME Alliance, Fanfix, Life360, Masimo, Nium, Proton Mail, Spotify, TapTap, Threema, Vipps.

  • Wayve

    Wayve Technologies Ltd is a British autonomous driving technology company focused on developing self-driving vehicle systems through end-to-end deep learning. Founded in 2017 by researchers from the University of Cambridge, Wayve eschews detailed 3D maps and hand-coded rules in favor of a self-learning “AI driver” that learns from camera data and driving experience. The London-headquartered startup has garnered significant attention and funding for its vision-based method.

    == History ==

    Wayve was founded in Cambridge, England, on August 21, 2017, by Amar Shah and Alex Kendall, two machine learning PhD students at the University of Cambridge. Shah initially served as CEO while Kendall was CTO, and the pair set out to develop an unconventional self-driving car system using machine learning at every layer of the driving task. In May 2018, Wayve emerged from stealth mode with backing from early-stage investors. At this time the company had around 10 employees, and its advisory investors included Uber’s Chief Scientist, Zoubin Ghahramani, who shared Wayve’s vision of a learning-centric driving AI.

    In 2019, Wayve achieved a milestone by training a car to drive autonomously on public roads it had never seen before, using only cameras, a basic GPS map, and end-to-end deep learning control. The company moved its base to London and secured a $20 million Series A funding round in November 2019. This investment enabled Wayve to launch a pilot fleet of autonomous electric vehicles in central London for real-world testing. During these trials, Wayve’s cars (such as retrofitted Jaguar I-Pace SUVs) began navigating the complex, narrow streets of London to prove the system’s ability to adapt to challenging urban scenarios. In 2020, co-founder Amar Shah departed the company, and Alex Kendall assumed the role of CEO.
    The startup joined the Microsoft for Startups: Autonomous Driving program in 2020, leveraging Microsoft Azure’s cloud computing for training its machine learning models at scale. It also committed to testing exclusively on electric vehicles, with the goal of reducing carbon emissions.

    In 2021, Wayve entered pilot programs with major UK retailers. It launched a 12-month autonomous delivery trial with supermarket chain Asda, and received a £10 million ($13.6 million) investment from online grocer Ocado Group as part of a partnership to develop self-driving grocery delivery vans. Ocado’s backing gave Wayve access to a fleet of delivery vans for data collection and testing on busy London routes (with human safety drivers present) to train its AI in urban traffic.

    In 2022, after a successful Series B funding round, the company extended road testing beyond the UK and, by 2023, was testing in multiple countries. The company had begun operating in the United States and in continental Europe, in preparation for larger commercial deployments. In 2023, Wayve announced a collaboration with Nissan to integrate Wayve’s AI-driven software into its ProPilot ADAS system, slated to launch in fiscal year 2027.

    In 2024, Wayve received strategic investment from Uber to jointly develop autonomous ride-hailing services. The two companies plan to trial a fully driverless robotaxi service in London, supported by a UK government program to accelerate commercial self-driving pilots to as early as 2026. To demonstrate the scalability of its technology, Wayve conducted an “AI-500” roadshow project, driving in dozens of cities across Asia, Europe, and North America using the same AI model. By mid-2025, it had completed autonomous driving demos in 90 cities without prior HD mapping. In April 2025, Wayve opened its first Asian research hub in Japan, with investment by SoftBank, to improve its model’s generalization using local driving data.
    That year, the company conducted driving tests in over 500 cities in Europe, North America and Japan without city-specific programming. In February 2026, Nissan, Uber and Wayve announced their collaboration on robotaxi development, with the aim of launching a pilot programme in Tokyo by late 2026. Wayve also formed a strategic alliance with Mercedes-Benz and Stellantis on personal vehicle and robotaxi applications.

    == Financing and investors ==

    Wayve has been backed by a mix of venture capital (VC) firms, corporate investors, and individuals. Its initial seed funding came in 2018 from funds such as Compound (NYC) and Firstminute Capital (London), as well as Cambridge-based angel investors. Academic Pieter Abbeel and Uber’s chief scientist, Zoubin Ghahramani, were early backers.

    In November 2019, Wayve raised a $20 million Series A led by Eclipse Ventures, with participation from Balderton Capital and other prior investors. The Series A financing was used to fund the company’s first autonomous trials in London, and marked the first time a European self-driving car startup had secured a U.S. VC as lead investor. In October 2021, Ocado Group invested £10 million (approximately $13.6 million) in Wayve as a strategic partner in autonomous grocery delivery. This brought Wayve’s total funding to around $60 million at that time.

    The Series B round followed in January 2022, when Wayve announced $200 million in new funding led by Eclipse Ventures, with D1 Capital Partners, Moore Strategic Ventures, and Linse Capital. Balderton, Microsoft and Virgin Group joined as strategic backers. Baillie Gifford and Compound also participated; Ocado increased its stake as a strategic investor; and Meta AI head Yann LeCun and Richard Branson also became investors. Wayve’s Series C round closed in May 2024 at $1.05 billion, led by Japan’s SoftBank Group.
    The funding round was the largest ever for a UK AI company, and included new investor Nvidia and returning investors Microsoft and Eclipse Ventures, among others. Uber also joined as a strategic partner and stakeholder. The Series C round increased Wayve’s total funding to about $1.3 billion, from investors including SoftBank, Microsoft and Nvidia, and lifted Wayve’s valuation into “unicorn” status. In February 2026, Wayve announced a $1.2 billion Series D funding round; later that month, the company reported that $1.5 billion had been raised, primarily from Mercedes-Benz, Stellantis, Nissan, and existing backers Uber, Microsoft and Nvidia, increasing Wayve's overall valuation to $8.6 billion.

    == Technology ==

    Wayve’s self-driving approach centers on end-to-end deep learning and a vision-based AI system. Unlike conventional autonomous vehicles that depend on high-definition maps, hand-coded rules, and arrays of expensive lidar sensors, Wayve’s platform learns to drive predominantly using camera data and machine learning algorithms. The company refers to its AI-driven driving software as an “Embodied AI” or AI Driver, emphasizing that the system learns from experience (both real and simulated) to handle complex or novel situations rather than following pre-programmed instructions, not unlike Tesla's approach.

    The Wayve hardware-agnostic autonomy stack consists of a suite of video cameras and basic automotive sensors mounted on the vehicle, paired with onboard compute units that are powered by GPUs to run the AI models. This vision-only philosophy is similar to Tesla’s Autopilot/FSDB model, but Wayve’s solution is vehicle-agnostic and mapless. Wayve’s strategy is to provide its driving AI as an OEM-ready platform; it plans to license or embed its technology into vehicles made by established automakers rather than build its own cars.
    Wayve’s development vehicles currently use Nvidia’s Orin system-on-chip as the onboard computer for running the AI model, but CEO Kendall has noted that the software can run on “whatever GPU [an automaker] already has in their vehicles.” Wayve has built a cloud infrastructure, largely on Microsoft Azure, to process petabytes of driving data, and uses simulation tools (known internally as the “Wayve Infinity” simulator) to synthetically generate and practice rare or dangerous scenarios for the AI to learn from.

    == Corporate affairs ==

    Wayve is a privately held company headquartered in London, England, with its primary research and development office in the Kings Cross area of London. The company was initially incorporated as Wayve Technologies Ltd in the UK. Wayve has also established a presence in the U.S. (in Silicon Valley); in Canada, with a research hub in Vancouver; in Yokohama, Japan; in Leonberg, Germany; and in Herzliya, Israel.

    The leadership team includes research scientists and engineers with backgrounds in computer vision, robotics, and automotive systems. President Erez Dagan was hired in 2024, following two decades at Mobileye; chief scientist Jamie Shotton was formerly at Microsoft Research; Alex Kendall, originally from New Zealand and holding a PhD in computer vision from Cambridge, took over as CEO in 2020 after the departure of his co-founder Amar Shah.

  • Probabilistic database

    Most real databases contain data whose correctness is uncertain. To work with such data, the integrity of the data needs to be quantified. This is achieved by using probabilistic databases. A probabilistic database is an uncertain database in which the possible worlds have associated probabilities. Probabilistic database management systems are currently an active area of research. "While there are currently no commercial probabilistic database systems, several research prototypes exist..."

    Probabilistic databases distinguish between the logical data model and the physical representation of the data, much as relational databases do in the ANSI-SPARC Architecture. In probabilistic databases this distinction is even more crucial, since such databases have to succinctly represent very large numbers of possible worlds, often exponential in the size of one world (a classical database).

    == Terminology ==

    In a probabilistic database, each tuple is associated with a probability between 0 and 1, with 0 representing that the data is certainly incorrect, and 1 representing that it is certainly correct.

    === Possible worlds ===

    A probabilistic database could exist in multiple states. For example, if there is uncertainty about the existence of a tuple in the database, then the database could be in two different states with respect to that tuple: the first state contains the tuple, while the second one does not. Similarly, if an attribute can take one of the values x, y or z, then the database can be in three different states with respect to that attribute. Each of these states is called a possible world.

    Consider the following database: (Here {b3, b3′, b3′′} denotes that the attribute can take any of the values b3, b3′ or b3′′) Assume that there is uncertainty about the first tuple, certainty about the second tuple, and uncertainty about the value of attribute B in the third tuple.
    Then the actual state of the database may or may not contain the first tuple (depending on whether it is correct or not). Similarly, the value of the attribute B may be b3, b3′ or b3′′. Consequently, the possible worlds corresponding to the database are as follows:

    === Types of Uncertainties ===

    There are essentially two kinds of uncertainties that could exist in a probabilistic database, as described in the table below: By assigning values to random variables associated with the data items, different possible worlds can be represented.

    == History ==

    The first published use of the term "probabilistic database" was probably in the 1987 VLDB conference paper "The theory of probabilistic databases", by Cavallo and Pittarelli. The title (of the 11-page paper) was intended as a bit of a joke, since David Maier's 600-page monograph, The Theory of Relational Databases, would have been familiar at that time to many of the conference participants and readers of the conference proceedings.
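
    The possible-worlds example described under "Possible worlds" (an uncertain first tuple, a certain second tuple, and a third tuple whose attribute B may be b3, b3′ or b3′′) can be enumerated with a short sketch. The attribute names a1, a2, a3, b1, b2, the probabilities, and the independence assumption between the two uncertain items are all hypothetical; the source gives none of them.

```python
from itertools import product

# Hypothetical probabilities (not from the source text).
P_TUPLE1_EXISTS = 0.7
B3_DISTRIBUTION = {"b3": 0.5, "b3'": 0.3, "b3''": 0.2}


def possible_worlds():
    """Enumerate every possible world together with its probability.

    Assumes the existence of tuple 1 and the value of attribute B in
    tuple 3 are independent, so world probabilities multiply.
    """
    worlds = []
    for t1_present, (b_value, p_b) in product((True, False), B3_DISTRIBUTION.items()):
        p_t1 = P_TUPLE1_EXISTS if t1_present else 1.0 - P_TUPLE1_EXISTS
        tuples = []
        if t1_present:
            tuples.append(("a1", "b1"))   # uncertain tuple
        tuples.append(("a2", "b2"))       # certain tuple, present in every world
        tuples.append(("a3", b_value))    # attribute-level uncertainty
        worlds.append((tuples, p_t1 * p_b))
    return worlds


worlds = possible_worlds()
for tuples, prob in worlds:
    print(f"{prob:.2f}  {tuples}")
```

    Two choices for the first tuple times three choices for attribute B give six worlds, and their probabilities sum to 1. With n independent uncertain items the count grows exponentially, which is why a succinct physical representation matters.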

  • Libby Heaney

    Libby Heaney is a British artist and quantum physicist known for her pioneering work on AI and quantum computing. She works on the impact of future technologies and is widely known to be the first artist to use quantum computing as a functioning artistic medium. Her work has been featured internationally, including in the Victoria and Albert Museum, Tate Modern and the Science Gallery.

    == Early life and scientific career ==

    Heaney is from Tamworth, Staffordshire. She lived in Amington, and went to Greenacres Primary School and Woodhouse High School, now called Landau Forte Academy Amington. She took her GCSEs in 1999. She studied physics at Imperial College London, graduating in 2005 with first class honours. Heaney pursued a successful career in quantum physics, completing a PhD thesis on mode entanglement in ultra-cold atomic gases at the University of Leeds, and pursued her own research as a postdoctoral fellow at the University of Oxford and at the National University of Singapore. In 2008, Heaney was awarded the Institute of Physics Very Early Career Woman in Physics Award (now the Jocelyn Bell Burnell Medal and Prize).

    == Artistic career ==

    In 2013 Heaney returned to the UK and completed a master's degree at the University of the Arts London. She studied arts and science at Central Saint Martins and graduated in 2015. She then became a lecturer at the Royal College of Art, teaching Information Experience Design. In 2016, she created Lady Chatterley's Tinderbot, which presented Tinder conversations between real users and AI bots programmed using Lady Chatterley's Lover. Lady Chatterley's Tinderbot was covered by BBC News, TheJournal.ie and the Irish Examiner and was exhibited internationally.

    In 2017, Heaney was commissioned by Sky Arts and the Barbican Centre to design Britbot, an internet bot built using artificial intelligence and the citizenship book Life in the UK: a guide for new residents.
    The book, a manual for the citizenship test, has been described by Heaney as being "largely a white male privileged version of British history and culture". The bot spoke to the public about what it meant to be British and learnt from their responses to become an ever-changing, plural version of Britishness. She was awarded an Arts Council England grant to widen participation in Britbot through social media. Heaney has exhibited Britbot at the Victoria and Albert Museum, at CogX, the Sheffield Documentary Festival, the Edinburgh TV festival, and Art Ai in Leicester.

    She has been creating with quantum computing since 2019, and has created artworks using quantum computing for Light Art Space (LAS) in Berlin, Somerset House and arebyte in London. Using quantum code, storytelling, and immersive installations and performances, Heaney's works such as Ent- and slimeqore explore and warn against the double-edged potential of quantum computing and its exploitation by private companies. In 2022, Ent- received the Lumen Prize immersive environment award.

    == Major works ==

    === Ent- and The Evolution of Ent-: QX (2022) ===

    In 2022, Heaney was commissioned by Light Art Space to create Ent-, a 360-degree immersive installation that revisits Bosch's Garden of Earthly Delights through quantum computing. The work uses quantum computing as both a medium and a paradigm through which to conceive human and non-human relations. Ent- was exhibited at LAS, Ars Electronica, and arebyte gallery in London. The work was also modified to fit a full dome projection at the Deutsches Museum in Munich, projected onto a public facade in Seoul, and turned into a playable version for an exhibition at Nahmad Contemporary in New York. In 2022, Ent- was a winner in the Art Science Category of the Falling Walls prize and received the Lumen Prize immersive environment award.
    The Evolution of Ent-: QX, first displayed at arebyte gallery in London, builds on Ent- and imagines a fictional quantum computing company (QX) that appropriates, parodies and subverts the language of big tech in order to educate the viewer on current profit-oriented uses of quantum computing and to propose new ways to think about and use the technology. In 2023, Ent- was acquired by the 0xCollection, a new media arts institution based in Basel, and displayed in its inaugural exhibition in Prague.

    === Touch is response-ability (2020) ===

    Touch is response-ability is an Instagram performance and touch-screen installation in which participants activate animations by flicking through Instagram stories. The performance investigates representations of the female body in art history and through computer vision to see how stereotypes are socially constructed and maintained. Images of the body are passed through a quantum algorithm, and as users interact with them they progressively become fragmented and dissolve beyond recognition. The work was originally commissioned by Hervisions at LUX in 2020 and performed on the LUX Instagram account. It was also exhibited at Etopia Zaragoza in 2021 and at Art SG with Gazelli Art House in 2023.

    === Lady Chatterley's Tinderbot (2016) ===

    In Lady Chatterley's Tinderbot, Heaney programmed a bot to engage in conversations on Tinder by using lines from the 1928 novel Lady Chatterley's Lover, by D. H. Lawrence. The work was first shown as an interactive installation in 2016 at the Dublin Science Gallery, allowing visitors to swipe left or right to navigate through various conversations. Lady Chatterley's Tinderbot was also exhibited at Sonar+D in Barcelona (2017), the Telefonica Fundacion in Lima (2017), the Lowry in Salford (2018), RMIT gallery in Melbourne (2021), and Microwave Festival in Hong Kong (2022), and was shortlisted for the HEK-Basel Net-based art award in 2018.
    == Selected exhibitions ==

    2023 - Synesthetic Immersion, 0xCollection, Prague
    2023 - slimeQrawl, Shoreditch Arts Club, London
    2023 - ...and that's only (half) the story, PLUS ONE Gallery, Antwerp
    2023 - Present Futures Festival, Centre of Contemporary Art, Glasgow
    2023 - Realtime: Lilypads: Mediating Exponential Systems, NXT Museum, Amsterdam
    2023 - My Rhino is not a Myth, Art Encounters Biennial, Timisoara
    2023 - Ent-er the Garden of Forking Paths, Gazelli Art House, London
    2023 - Energeia, Etopia, Zaragoza
    2022 - Every Kind of Wind: Calder and the 21st Century, Nahmad Contemporary, New York
    2022 - remiQXing still, Fiumano Clase, London
    2022 - the Evolution of Ent-: QX, arebyte, London
    2022 - Ent-, Light Art Space x Schering Stiftung, Berlin
    2022 - Among the Machines, Zabludowicz Collection, London
    2022 - BioMedia, ZKM, Karlsruhe
    2021 - CASCADE, Southbank Centre, London
    2021 - Agency is the Ability to Act, Holden Gallery, Manchester
    2021 - BIAS, Science Gallery, Dublin
    2021 - Ars Electronica, Linz
    2021 - AI & Music, S+T+ARTS & Sonar Festival, CCCB, Barcelona
    2020 - Real Time Constraints, arebyte, London
    2019 - Euro(re)visions, Goethe Institut, London
    2019 - Higher Resolutions with Hyphen Labs, Tate Modern, London
    2019 - Open Fest with Sky Arts, Barbican, London
    2018 - Digital Design Weekend, V&A, London
    2018 - FAKE, Science Gallery, Dublin
    2017 - Ars Electronica, Linz
    2017 - Entangled: Quantum Computer Art, Royal College of Art, London
    2017 - Humans Need Not Apply, Science Gallery, Dublin

    == Awards and honours ==

    Her awards include:

    2022 - Lumen Prize, BCS Immersive Environment Award (for Ent-)
    2022 - Mozilla Foundation Creative Media Award, USA
    2022 - nominated for the S+T+ARTS prize
    2021 - Adaptation Award, Artquest, London
    2021 - British Council Amplify Collaboration Award
    2018 - Arts Council England, National Lottery Project Grant
    2018 - HeK Basel Net Based Art Award (shortlisted for Tinderbot)

  • Physics-informed neural networks

    In machine learning, physics-informed neural networks (PINNs), also referred to as theory-trained neural networks (TTNs), are a type of universal function approximator that can embed into the learning process knowledge of any physical laws that govern a given data set and can be described by partial differential equations (PDEs). Low data availability for some biological and engineering problems limits the robustness of conventional machine learning models used for these applications. The prior knowledge of general physical laws acts in the training of neural networks (NNs) as a regularization agent that limits the space of admissible solutions, increasing the generalizability of the function approximation. In this way, embedding the prior information into a neural network enhances the information content of the available data, helping the learning algorithm to capture the right solution and to generalize well even with a small number of training examples. Because they process continuous spatial and time coordinates and output continuous PDE solutions, they can be categorized as neural fields.

    == Function approximation ==

    Most of the physical laws that govern the dynamics of a system can be described by partial differential equations. For example, the Navier–Stokes equations are a set of partial differential equations derived from the conservation laws (i.e., conservation of mass, momentum, and energy) that govern fluid mechanics. The solution of the Navier–Stokes equations with appropriate initial and boundary conditions allows the quantification of flow dynamics in a precisely defined geometry. However, these equations cannot be solved exactly and therefore numerical methods must be used (such as finite differences, finite elements and finite volumes). In this setting, these governing equations must be solved while accounting for prior assumptions, linearization, and adequate time and space discretization.
Recently, solving the governing partial differential equations of physical phenomena using deep learning has emerged as a new field of scientific machine learning (SciML), leveraging the universal approximation theorem and high expressivity of neural networks. In general, deep neural networks could approximate any high-dimensional function given that sufficient training data are supplied. However, such networks do not consider the physical characteristics underlying the problem, and the level of approximation accuracy provided by them is still heavily dependent on careful specifications of the problem geometry as well as the initial and boundary conditions. Without this preliminary information, the solution is not unique and may lose physical correctness. To remedy this, Physics-Informed Neural Networks (PINNs) leverage governing physical equations in neural network training. Namely, PINNs are designed to be trained to satisfy the given training data as well as the imposed governing equations. In this fashion, a neural network can be guided with training datasets that do not necessarily need to be large or complete. An accurate solution of partial differential equations can potentially be found without knowing the boundary conditions. Therefore, with some knowledge about the physical characteristics of the problem and some form of training data (even sparse and incomplete), PINNs may be used for finding an optimal solution with high fidelity. PINNs can be applied to a wide range of problems in computational science, and are a pioneering technology leading to the development of new classes of numerical solvers for PDEs. PINNs can be thought of as a mesh-free alternative to traditional approaches (e.g., CFD for fluid dynamics), and new data-driven approaches for model inversion and system identification. Notably, a trained PINN network can be used to predict values on simulation grids of different resolutions without needing to be retrained. 
    Additionally, the derivatives used in the partial differential equations can be computed using automatic differentiation (AD), which is assessed to be superior to numerical or symbolic differentiation.

    == Modeling and computation ==

    A general nonlinear partial differential equation can be written as:

    $$u_{t} + \mathcal{N}[u;\lambda] = 0, \quad x \in \Omega, \quad t \in [0,T]$$

    where $u(t,x)$ denotes the solution, $\mathcal{N}[\cdot;\lambda]$ is a nonlinear operator parameterized by $\lambda$, and $\Omega$ is a subset of $\mathbb{R}^{D}$. This general form of governing equations summarizes a wide range of problems in mathematical physics, such as conservation laws, diffusion processes, advection-diffusion systems, and kinetic equations. Given noisy measurements of a generic dynamic system described by the equation above, PINNs can be designed to solve two classes of problems: the data-driven solution of partial differential equations, and the data-driven discovery of partial differential equations.

    === Data-driven solution of partial differential equations ===

    The data-driven solution of a PDE computes the hidden state $u(t,x)$ of the system given boundary data and/or measurements $z$, and fixed model parameters $\lambda$. We solve:

    $$u_{t} + \mathcal{N}[u] = 0, \quad x \in \Omega, \quad t \in [0,T]$$

    by defining the residual $f(t,x)$ as:

    $$f := u_{t} + \mathcal{N}[u]$$

    and approximating $u(t,x)$ by a deep neural network. This network can be differentiated using automatic differentiation.
    The parameters of $u(t,x)$ and $f(t,x)$ can then be learned by minimizing the following loss function $L_{\text{tot}}$:

    $$L_{\text{tot}} = L_{u} + L_{f}$$

    where:

    $L_{u} = \Vert u - z \Vert_{\Gamma}$ is the error between the PINN $u(t,x)$ and the set of boundary conditions and measured data on the set of points $\Gamma$ where the boundary conditions and data are defined.

    $L_{f} = \Vert f \Vert_{\Gamma}$ is the mean-squared error of the residual function. This second term encourages the PINN to learn the structural information expressed by the PDE during the training process.

    This approach has been used to yield computationally efficient physics-informed surrogate models with applications in the forecasting of physical processes, model predictive control, multi-physics and multi-scale modeling, and simulation. It has been shown to converge to the solution of the PDE.

    === Data-driven discovery of partial differential equations ===

    Given noisy and incomplete measurements $z$ of the state of the system, the data-driven discovery of PDEs consists of computing the unknown state $u(t,x)$ and learning the model parameters $\lambda$ that best describe the observed data:

    $$u_{t} + \mathcal{N}[u;\lambda] = 0, \quad x \in \Omega, \quad t \in [0,T]$$

    By defining $f(t,x)$ as:

    $$f := u_{t} + \mathcal{N}[u;\lambda] = 0$$

    and approximating $u(t,x)$ by a deep neural network, $f(t,x)$ results in a PINN. This network can be differentiated using automatic differentiation.
The parameters of \(u(t,x)\) and \(f(t,x)\), together with the parameter \(\lambda\) of the differential operator, can then be learned by minimizing the following loss function \(L_{\text{tot}}\):

\[L_{\text{tot}} = L_u + L_f\]

where \(L_u = \Vert u - z \Vert_{\Gamma}\), with \(u\) and \(z\) the state solutions and measurements at the sparse locations \(\Gamma\), respectively, and \(L_f = \Vert f \Vert_{\Gamma}\) is the residual term. This second term requires the structured information represented by the partial differential equations to be satisfied in the training process. This strategy allows for discovering dynamic models described by nonlinear PDEs, assembling computationally efficient and fully differentiable surrogate models that may find application in predictive forecasting, control, and data assimilation.

== Extensions and applications ==

=== For piece-wise function approximation ===

PINNs are unable to approximate PDEs that have strong non-linearity or sharp gradients (such as those that commonly occur in practical fluid flow problems). Piecewise approximation has been an old practic

  • Type-1 OWA operators

    Type-1 OWA operators are a set of aggregation operators that generalise Yager's OWA (ordered weighted averaging) operators in order to aggregate fuzzy sets rather than crisp values in soft decision making and data mining. These operators provide a mathematical technique for directly aggregating uncertain information with uncertain weights via the OWA mechanism, where the uncertain objects are modelled by fuzzy sets. The two definitions for type-1 OWA operators are based on Zadeh's Extension Principle and on the \(\alpha\)-cuts of fuzzy sets, respectively. The two definitions lead to equivalent results.

== Definitions ==

=== Definition 1 ===

Let \(F(X)\) be the set of fuzzy sets with domain of discourse \(X\). Given \(n\) linguistic weights \(\{W^i\}_{i=1}^{n}\) in the form of fuzzy sets defined on the domain of discourse \(U = [0,1]\), a type-1 OWA operator is a mapping \(\Phi\),

\[\Phi \colon F(X) \times \cdots \times F(X) \longrightarrow F(X), \qquad (A^1, \cdots, A^n) \mapsto Y\]

such that

\[\mu_Y(y) = \sup_{\sum_{i=1}^{n} \bar{w}_i a_{\sigma(i)} = y} \left( \mu_{W^1}(w_1) \wedge \cdots \wedge \mu_{W^n}(w_n) \wedge \mu_{A^1}(a_1) \wedge \cdots \wedge \mu_{A^n}(a_n) \right)\]

where \(\bar{w}_i = \frac{w_i}{\sum_{j=1}^{n} w_j}\), and \(\sigma \colon \{1, \cdots, n\} \longrightarrow \{1, \cdots, n\}\) is a permutation function such that \(a_{\sigma(i)} \geq a_{\sigma(i+1)},\ \forall i = 1, \cdots, n-1\), i.e., \(a_{\sigma(i)}\) is the \(i\)th highest element in the set \(\{a_1, \cdots, a_n\}\).

=== Definition 2 ===

Using the alpha-cuts of fuzzy sets: given the \(n\) linguistic weights \(\{W^i\}_{i=1}^{n}\) in the form of fuzzy sets defined on the domain of discourse \(U = [0,1]\), then for each \(\alpha \in [0,1]\), an \(\alpha\)-level type-1 OWA operator with \(\alpha\)-level sets \(\{W_\alpha^i\}_{i=1}^{n}\) to aggregate the \(\alpha\)-cuts of fuzzy sets \(\{A^i\}_{i=1}^{n}\) is:

\[\Phi_\alpha\left(A_\alpha^1, \ldots, A_\alpha^n\right) = \left\{ \frac{\sum_{i=1}^{n} w_i a_{\sigma(i)}}{\sum_{i=1}^{n} w_i} \;\middle|\; w_i \in W_\alpha^i,\; a_i \in A_\alpha^i,\; i = 1, \ldots, n \right\}\]

where \(W_\alpha^i = \{w \mid \mu_{W^i}(w) \geq \alpha\}\), \(A_\alpha^i = \{x \mid \mu_{A^i}(x) \geq \alpha\}\), and \(\sigma \colon \{1, \cdots, n\} \to \{1, \cdots, n\}\) is a permutation function such that \(a_{\sigma(i)} \geq a_{\sigma(i+1)},\ \forall i = 1, \cdots, n-1\), i.e., \(a_{\sigma(i)}\) is the \(i\)th largest element in the set \(\{a_1, \cdots, a_n\}\).
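Definition 2 can be illustrated with a small brute-force sketch for the case where every alpha-cut is a closed interval. It is not the fast method from the literature, and the function names are hypothetical. It relies on two assumptions stated in the comments: the normalised OWA is non-decreasing in each argument, and as a ratio of linear functions of the weights it attains its extremes at interval end-points.

```python
from itertools import product


def owa(weights, values):
    """Normalised OWA: weights are applied to the values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * a for w, a in zip(weights, ordered)) / sum(weights)


def alpha_level_type1_owa(weight_cuts, value_cuts):
    """Return (lo, hi), the end-points of the alpha-level type-1 OWA interval.

    weight_cuts, value_cuts: lists of (left, right) alpha-cut intervals,
    with non-negative weights.
    Assumption: the extremes are attained at interval end-points, so the
    search enumerates the 2**n weight corners (illustration only, exponential cost).
    """
    lows = [lo for lo, _ in value_cuts]    # smallest admissible arguments
    highs = [hi for _, hi in value_cuts]   # largest admissible arguments
    lo_candidates, hi_candidates = [], []
    for w in product(*weight_cuts):
        if sum(w) == 0:  # normalisation undefined at this corner; skip it
            continue
        lo_candidates.append(owa(w, lows))
        hi_candidates.append(owa(w, highs))
    return min(lo_candidates), max(hi_candidates)


if __name__ == "__main__":
    print(alpha_level_type1_owa([(0.2, 0.4), (0.6, 0.8)], [(1.0, 2.0), (3.0, 5.0)]))
```

For the two weight cuts and two argument cuts in the usage line, the result is approximately the interval [1.4, 3.2]. Repeating the computation over a grid of alpha levels gives the intervals from which the aggregated fuzzy set is assembled.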
== Representation theorem of Type-1 OWA operators ==

Given the \(n\) linguistic weights \(\{W^i\}_{i=1}^{n}\) in the form of fuzzy sets defined on the domain of discourse \(U = [0,1]\), and the fuzzy sets \(A^1, \cdots, A^n\), we have \(Y = G\), where \(Y\) is the aggregation result obtained by Definition 1 and \(G\) is the result obtained by Definition 2.

== Programming problems for Type-1 OWA operators ==

According to the Representation Theorem of Type-1 OWA Operators, a general type-1 OWA operator can be decomposed into a series of \(\alpha\)-level type-1 OWA operators. In practice, this series of \(\alpha\)-level type-1 OWA operators is used to construct the resulting aggregation fuzzy set. So we only need to compute the left end-points and right end-points of the intervals \(\Phi_\alpha\left(A_\alpha^1, \cdots, A_\alpha^n\right)\).
Then, the resulting aggregation fuzzy set is constructed with the membership function as follows:

\[\mu_G(x) = \bigvee_{\alpha \,:\, x \in \Phi_\alpha\left(A_\alpha^1, \cdots, A_\alpha^n\right)} \alpha\]

For the left end-points, we need to solve the following programming problem:

\[\Phi_\alpha\left(A_\alpha^1, \cdots, A_\alpha^n\right)_{-} = \min_{\substack{W_{\alpha-}^i \leq w_i \leq W_{\alpha+}^i \\ A_{\alpha-}^i \leq a_i \leq A_{\alpha+}^i}} \frac{\sum_{i=1}^{n} w_i a_{\sigma(i)}}{\sum_{i=1}^{n} w_i}\]

while for the right end-points, we need to solve the following programming problem:

\[\Phi_\alpha\left(A_\alpha^1, \cdots, A_\alpha^n\right)_{+} = \max_{\substack{W_{\alpha-}^i \leq w_i \leq W_{\alpha+}^i \\ A_{\alpha-}^i \leq a_i \leq A_{\alpha+}^i}} \frac{\sum_{i=1}^{n} w_i a_{\sigma(i)}}{\sum_{i=1}^{n} w_i}\]

A fast method has been presented to solve these two programming problems so that the type-1 OWA aggregation operation can be performed efficiently; for details, see the paper.

== Alpha-level approach to Type-1 OWA operation ==

Three-step process:

Step 1: Set up the \(\alpha\)-level resolution in [0, 1].

Step 2: For each \(\alpha \in [0,1]\):

Step 2.1: Calculate \(\rho_{\alpha+}^{i_0^{\ast}}\).
Step 2.1-1: Let \(i_0 = 1\).
Step 2.1-2: If \(\rho_{\alpha+}^{i_0} \geq A_{\alpha+}^{\sigma(i_0)}\), stop; \(\rho_{\alpha+}^{i_0}\) is the solution. Otherwise go to Step 2.1-3.
Step 2.1-3: \(i_0 \leftarrow i_0 + 1\); go to Step 2.1-2.

Step 2.2: Calculate \(\rho_{\alpha-}^{i_0^{\ast}}\).
Step 2.2-1: Let \(i_0 = 1\).
Step 2.2-2: If \(\rho_{\alpha-}^{i_0} \geq A_{\alpha-}^{\sigma(i_0)}\), stop; \(\rho_{\alpha-}^{i_0}\) is the solution. Otherwise go to Step 2.2-3.
Step 2.2-3: \(i_0 \leftarrow i_0 + 1\); go to Step 2.2-2.

Step 3: Construct the resulting aggregation fuzzy set \(G\) based on all the available intervals \(\left[\rho_{\alpha-}^{i_0^{\ast}},\; \rho_{\alpha+}^{i_0^{\ast}}\right]\):

\[\mu_G(x) = \bigvee_{\alpha \,:\, x \in \left[\rho_{\alpha-}^{i_0^{\ast}},\; \rho_{\alpha+}^{i_0^{\ast}}\right]} \alpha\]

== Some Examples ==

The type-1 OWA operator with the weights shown in the top figure is used to aggregate the fuzzy sets (solid lines) in the bottom figure, and the dashed line is the aggregation result.

== Special cases ==

Any OWA operators, like maximum, minimum, mean operators; Join operators of (type-1) fuzzy sets, i.e., fuzzy maximum operators; Meet operators of (type-1) fuzzy sets, i.e., fuzzy minimum operators; Join-like operators of (type-1) fuzzy sets; Meet-like operators of (type-1) fuzzy sets.

== Generalizations ==

Type-2 OWA operators have been suggested to aggregate the type-2 fuzzy sets for soft decision making.
== Applications ==

Type-1 OWA operators have been applied to different domains for soft decision making: Improved efficiency of computing approach; Type reduction of type-2 fuzzy sets; Group decision making; Credit risk evaluation; Information fusion; Linguistic expressions and symbolic translation; Sentiment analysis; Ro

  • Hindsight optimization

    Hindsight optimisation (HOP) is a computer science technique used in artificial intelligence for analysis of actions which have stochastic results. HOP is used in combination with a deterministic planner. By creating sample results for each of the possible actions from the given state (i.e. determinising the actions), and using the deterministic planner to analyse those sample results, HOP yields an estimate of the value of each actual action.
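The idea can be sketched on a toy problem. Everything in the block below is a hypothetical illustration rather than a reference implementation: two actions "A" and "B" pay 1 with assumed probabilities 0.4 and 0.6 at each of three steps, a sampled future fixes all payoffs in advance, and the deterministic planner is a plain exhaustive search.

```python
import random
from itertools import product

ACTIONS = ("A", "B")
WIN_PROB = {"A": 0.4, "B": 0.6}  # hypothetical payoff probabilities
HORIZON = 3                      # number of decisions, including the current one


def sample_future(rng):
    """Determinise the problem: fix every random payoff in advance."""
    return [{a: int(rng.random() < WIN_PROB[a]) for a in ACTIONS}
            for _ in range(HORIZON)]


def deterministic_plan_value(future, start):
    """Deterministic planner: best total payoff from step `start` onwards,
    found by exhaustive search over action sequences."""
    steps = future[start:]
    if not steps:
        return 0
    return max(sum(step[a] for step, a in zip(steps, plan))
               for plan in product(ACTIONS, repeat=len(steps)))


def hindsight_optimisation(num_samples=2000, seed=0):
    """Estimate each first action's value by averaging hindsight-optimal returns."""
    rng = random.Random(seed)
    totals = {a: 0.0 for a in ACTIONS}
    for _ in range(num_samples):
        future = sample_future(rng)                 # one determinised future
        rest = deterministic_plan_value(future, 1)  # planner sees that future
        for a in ACTIONS:
            totals[a] += future[0][a] + rest
    return {a: totals[a] / num_samples for a in ACTIONS}


if __name__ == "__main__":
    estimates = hindsight_optimisation()
    print(estimates, "-> choose", max(estimates, key=estimates.get))
```

Because the planner sees each sampled future in full, the averaged values are optimistic estimates. They are used to rank the candidate actions, not to predict the return exactly.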

  • Fuzzy electronics

    Fuzzy electronics is an electronic technology that uses fuzzy logic instead of the two-state Boolean logic more commonly used in digital electronics. Fuzzy electronics is fuzzy logic implemented on dedicated hardware, as opposed to fuzzy logic implemented in software running on a conventional processor. Fuzzy electronics has a wide range of applications, including control systems and artificial intelligence.

== History ==

The first fuzzy electronic circuit was built by Takeshi Yamakawa et al. in 1980 using discrete bipolar transistors. The first industrial fuzzy application was in a cement kiln in Denmark in 1982. The first VLSI fuzzy electronics were developed by Masaki Togai and Hiroyuki Watanabe in 1984. In 1987, Yamakawa built the first analog fuzzy controller. The first digital fuzzy processors were introduced in 1988 by Togai (Russo, pp. 2–6). In the early 1990s, the first fuzzy logic chips were presented to the public. Two companies, Omron and NEC, announced the development of dedicated fuzzy electronic hardware in 1991. Two years later, the Japanese Omron Corporation showed a working fuzzy chip at a technical fair.

  • Reparameterization trick

    The reparameterization trick (also known as the "reparameterization gradient estimator") is a technique used in statistical machine learning, particularly in variational inference, variational autoencoders, and stochastic optimization. It allows for the efficient computation of gradients through random variables, enabling the optimization of parametric probability models using stochastic gradient descent, and the variance reduction of estimators. It was developed in the 1980s in operations research, under the name of "pathwise gradients" or "stochastic gradients". Its use in variational inference was proposed in 2013.

== Mathematics ==

Let \(z\) be a random variable with distribution \(q_\phi(z)\), where \(\phi\) is a vector containing the parameters of the distribution.

=== REINFORCE estimator ===

Consider an objective function of the form:

\[L(\phi) = \mathbb{E}_{z \sim q_\phi(z)}[f(z)]\]

Without the reparameterization trick, estimating the gradient \(\nabla_\phi L(\phi)\) can be challenging, because the parameter appears in the random variable itself.
In more detail, we have to statistically estimate:

\[\nabla_\phi L(\phi) = \nabla_\phi \int dz\; q_\phi(z) f(z)\]

The REINFORCE estimator, widely used in reinforcement learning and especially in policy gradient methods, uses the following equality:

\[\nabla_\phi L(\phi) = \int dz\; q_\phi(z) \nabla_\phi(\ln q_\phi(z)) f(z) = \mathbb{E}_{z \sim q_\phi(z)}[\nabla_\phi(\ln q_\phi(z)) f(z)]\]

This allows the gradient to be estimated:

\[\nabla_\phi L(\phi) \approx \frac{1}{N} \sum_{i=1}^{N} \nabla_\phi(\ln q_\phi(z_i)) f(z_i)\]

The REINFORCE estimator has high variance, and many methods have been developed to reduce it.

=== Reparameterization estimator ===

The reparameterization trick expresses \(z\) as:

\[z = g_\phi(\epsilon), \quad \epsilon \sim p(\epsilon)\]

Here, \(g_\phi\) is a deterministic function parameterized by \(\phi\), and \(\epsilon\) is a noise variable drawn from a fixed distribution \(p(\epsilon)\).
This gives:

\[L(\phi) = \mathbb{E}_{\epsilon \sim p(\epsilon)}[f(g_\phi(\epsilon))]\]

Now, the gradient can be estimated as:

\[\nabla_\phi L(\phi) = \mathbb{E}_{\epsilon \sim p(\epsilon)}[\nabla_\phi f(g_\phi(\epsilon))] \approx \frac{1}{N} \sum_{i=1}^{N} \nabla_\phi f(g_\phi(\epsilon_i))\]

== Examples ==

For some common distributions, the reparameterization trick takes specific forms:

Normal distribution: For \(z \sim \mathcal{N}(\mu, \sigma^2)\), we can use:

\[z = \mu + \sigma\epsilon, \quad \epsilon \sim \mathcal{N}(0,1)\]

Exponential distribution: For \(z \sim \text{Exp}(\lambda)\), we can use:

\[z = -\frac{1}{\lambda}\log(\epsilon), \quad \epsilon \sim \text{Uniform}(0,1)\]

Discrete distributions can be reparameterized by the Gumbel distribution (the Gumbel-softmax trick or "concrete distribution") and diffusion models. In general, any distribution that is differentiable with respect to its parameters can be reparameterized by inverting the multivariable CDF function and then applying the implicit method. This approach has been described with applications to the Gamma, Beta, Dirichlet, and von Mises distributions.
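The normal-distribution case can be checked numerically. The sketch below takes \(f(z) = z^2\) as an assumed test function, so that \(L = \mu^2 + \sigma^2\) and the exact gradient with respect to \(\mu\) is \(2\mu\). It computes per-sample values of both the reparameterization estimator and the REINFORCE estimator. The function name and parameter values are illustrative.

```python
import random
import statistics


def gradient_samples(mu, sigma, n, seed=0):
    """Per-sample estimates of d/dmu E[z**2] for z ~ N(mu, sigma**2)."""
    rng = random.Random(seed)
    reparam, reinforce = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps  # reparameterised sample z = g(eps)
        # Pathwise estimator: f'(z) * dz/dmu = 2 z * 1
        reparam.append(2.0 * z)
        # Score-function estimator: f(z) * d ln q(z)/dmu = z**2 * (z - mu) / sigma**2
        reinforce.append(z * z * (z - mu) / sigma ** 2)
    return reparam, reinforce


if __name__ == "__main__":
    rp, rf = gradient_samples(mu=1.0, sigma=0.5, n=20000)
    print("exact gradient:", 2.0)
    print("reparameterization:", statistics.mean(rp), "variance", statistics.variance(rp))
    print("REINFORCE:", statistics.mean(rf), "variance", statistics.variance(rf))
```

Both sample means should land close to the exact value, while the per-sample variance of the REINFORCE terms is noticeably larger for this choice of \(f\), \(\mu\), and \(\sigma\).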
== Applications ==

=== Variational autoencoder ===

In Variational Autoencoders (VAEs), the VAE objective function, known as the Evidence Lower Bound (ELBO), is given by:

\[\text{ELBO}(\phi, \theta) = \mathbb{E}_{z \sim q_\phi(z|x)}[\log p_\theta(x|z)] - D_{\text{KL}}(q_\phi(z|x) \,||\, p(z))\]

where \(q_\phi(z|x)\) is the encoder (recognition model), \(p_\theta(x|z)\) is the decoder (generative model), and \(p(z)\) is the prior distribution over latent variables. The gradient of the ELBO with respect to \(\theta\) is simply

\[\mathbb{E}_{z \sim q_\phi(z|x)}[\nabla_\theta \log p_\theta(x|z)] \approx \frac{1}{L} \sum_{l=1}^{L} \nabla_\theta \log p_\theta(x|z_l)\]

but the gradient with respect to \(\phi\) requires the trick. Express the sampling operation \(z \sim q_\phi(z|x)\) as:

\[z = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon, \quad \epsilon \sim \mathcal{N}(0, I)\]

where \(\mu_\phi(x)\) and \(\sigma_\phi(x)\) are the outputs of the encoder network, and \(\odot\) denotes element-wise multiplication. Since \(-D_{\text{KL}}(q_\phi(z|x) \,||\, p(z)) = \mathbb{E}_{z \sim q_\phi(z|x)}[\log p(z) - \log q_\phi(z|x)]\), we have

\[\nabla_\phi \text{ELBO}(\phi, \theta) = \mathbb{E}_{\epsilon \sim \mathcal{N}(0,I)}[\nabla_\phi \log p_\theta(x|z) - \nabla_\phi \log q_\phi(z|x) + \nabla_\phi \log p(z)]\]

where \(z = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon\). This allows us to estimate the gradient using Monte Carlo sampling:

\[\nabla_\phi \text{ELBO}(\phi, \theta) \approx \frac{1}{L} \sum_{l=1}^{L} [\nabla_\phi \log p_\theta(x|z_l) - \nabla_\phi \log q_\phi(z_l|x) + \nabla_\phi \log p(z_l)]\]

where \(z_l = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon_l\) and \(\epsilon_l \sim \mathcal{N}(0, I)\) for \(l = 1, \ldots, L\). This formulation enables backpropagation through the sampling process, allowing for end-to-end training of the VAE model using stochastic gradient descent or its variants.

=== Variational inference ===

More generally, the trick allows using stochastic gradient descent for variational inference. Let the variational objective (ELBO) be of the form:

\[\text{ELBO}(\phi) = \mathbb{E}_{z \sim q_\phi(z)}[\log p(x, z) - \log q_\phi(z)]\]

Using the reparameterization trick, we can estimate the gradient of this objective with respect to \(\phi\):

\[\nabla_\phi \text{ELBO}(\phi) \approx \frac{1}{L} \sum_{l=1}^{L} \nabla_\phi [\log p(x, g_\phi(\epsilon_l)) - \log q_\phi(g_\phi(\epsilon_l))], \quad \epsilon_l \sim p(\epsilon)\]

=== Dropout ===

The reparameterization trick has been applied to reduce the variance in dropout, a regularization technique in neural networks.
The original dropout can be reparameterized with Bernoulli distributions:

\[y = (W \odot \epsilon)x, \quad \epsilon_{ij} \sim \text{Bernoulli}(\alpha_{ij})\]

where \(W\) is the weight matrix, \(x\) is the input, and \(\alpha_{ij}\) are the (fixed) dropout rates. More generally, distributions other than the Bernoulli distribution can be used, such as Gaussian noise:

\[y_i = \mu_i + \sigma_i \odot \epsilon_i, \quad \epsilon_i \sim \mathcal{N}(0, I)\]

where \(\mu_i = \mathbf{m}_i^{\top} x\) and \(\sigma_i^2 = \mathbf{v}_i^{\top} x^2\), with \(\mathbf{m}_i\) and \(\mathbf{v}_i\) being the mean and variance of the \(i\)-th output neuron. The reparameterization trick can be applied to all such cases, resulting in the variational dropout method.

  • Batch normalization

    In artificial neural networks, batch normalization (also known as batch norm) is a normalization technique used to make training faster and more stable by adjusting the inputs to each layer—re-centering them around zero and re-scaling them to a standard size. It was introduced by Sergey Ioffe and Christian Szegedy in 2015. Experts still debate why batch normalization works so well. It was initially thought to tackle internal covariate shift, a problem where parameter initialization and changes in the distribution of the inputs of each layer affect the learning rate of the network. However, newer research suggests it doesn’t fix this shift but instead smooths the objective function—a mathematical guide the network follows to improve—enhancing performance. In very deep networks, batch normalization can initially cause a severe gradient explosion—where updates to the network grow uncontrollably large—but this is managed with shortcuts called skip connections in residual networks. Another theory is that batch normalization adjusts data by handling its size and path separately, speeding up training. == Internal covariate shift == Each layer in a neural network has inputs that follow a specific distribution, which shifts during training due to two main factors: the random starting values of the network’s settings (parameter initialization) and the natural variation in the input data. This shifting pattern affecting the inputs to the network’s inner layers is called internal covariate shift. While a strict definition isn’t fully agreed upon, experiments show that it involves changes in the means and variances of these inputs during training. Batch normalization was first developed to address internal covariate shift. During training, as the parameters of preceding layers adjust, the distribution of inputs to the current layer changes accordingly, such that the current layer needs to constantly readjust to new distributions. 
This issue is particularly severe in deep networks, because small changes in shallower hidden layers will be amplified as they propagate within the network, resulting in significant shift in deeper hidden layers. Batch normalization was proposed to reduce these unwanted shifts to speed up training and produce more reliable models. Beyond possibly tackling internal covariate shift, batch normalization offers several additional advantages. It allows the network to use a higher learning rate (a setting that controls how quickly the network learns) without causing problems like vanishing or exploding gradients, where updates become too small or too large. It also appears to have a regularizing effect, improving the network’s ability to generalize to new data and reducing the need for dropout, a technique used to prevent overfitting (when a model learns the training data too well and fails on new data). Additionally, networks using batch normalization are less sensitive to the choice of starting settings or learning rates, making them more robust and adaptable.

== Procedures ==

=== Transformation ===

In a neural network, batch normalization is achieved through a normalization step that fixes the means and variances of each layer's inputs. Ideally, the normalization would be conducted over the entire training set, but to use this step jointly with stochastic optimization methods, it is impractical to use the global information. Thus, normalization is restricted to each mini-batch in the training process. Let \(B\) denote a mini-batch of size \(m\) of the entire training set. The empirical mean and variance of \(B\) can thus be denoted as

\[\mu_B = \frac{1}{m} \sum_{i=1}^{m} x_i \quad \text{and} \quad \sigma_B^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_B)^2.\]

For a layer of the network with \(d\)-dimensional input, \(x = (x^{(1)}, \ldots, x^{(d)})\), each dimension of its input is then normalized (i.e. re-centered and re-scaled) separately,

\[\hat{x}_i^{(k)} = \frac{x_i^{(k)} - \mu_B^{(k)}}{\sqrt{\left(\sigma_B^{(k)}\right)^2 + \epsilon}},\]

where \(k \in [1, d]\) and \(i \in [1, m]\); \(\mu_B^{(k)}\) and \(\sigma_B^{(k)}\) are the per-dimension mean and standard deviation, respectively. \(\epsilon\) is an arbitrarily small positive constant added in the denominator for numerical stability. The resulting normalized activations \(\hat{x}^{(k)}\) have zero mean and unit variance, if \(\epsilon\) is not taken into account. To restore the representation power of the network, a transformation step then follows as

\[y_i^{(k)} = \gamma^{(k)} \hat{x}_i^{(k)} + \beta^{(k)},\]

where the parameters \(\gamma^{(k)}\) and \(\beta^{(k)}\) are subsequently learned in the optimization process. Formally, the operation that implements batch normalization is a transform \(BN_{\gamma^{(k)}, \beta^{(k)}} : x_{1 \ldots m}^{(k)} \rightarrow y_{1 \ldots m}^{(k)}\) called the Batch Normalizing transform. The output of the BN transform \(y^{(k)} = BN_{\gamma^{(k)}, \beta^{(k)}}(x^{(k)})\) is then passed to other network layers, while the normalized output \(\hat{x}_i^{(k)}\) remains internal to the current layer.
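The training-time transform above can be sketched in a few lines of plain Python. This is a minimal illustration, not a framework implementation: the function name, the list-of-lists layout, and the default `eps=1e-5` are assumptions made for the example.

```python
import math


def batch_norm_forward(batch, gamma, beta, eps=1e-5):
    """Apply the batch normalizing transform to one mini-batch.

    batch: list of m samples, each a list of d features.
    gamma, beta: per-feature scale and shift (lists of length d).
    eps: small constant for numerical stability (assumed default).
    Returns (y, x_hat), both shaped like `batch`.
    """
    m, d = len(batch), len(batch[0])
    # Per-dimension mini-batch mean
    mean = [sum(x[k] for x in batch) / m for k in range(d)]
    # Per-dimension biased (1/m) variance, matching the definition of sigma_B^2
    var = [sum((x[k] - mean[k]) ** 2 for x in batch) / m for k in range(d)]
    # Normalize each dimension separately
    x_hat = [[(x[k] - mean[k]) / math.sqrt(var[k] + eps) for k in range(d)]
             for x in batch]
    # Learned scale and shift restore representation power
    y = [[gamma[k] * xh[k] + beta[k] for k in range(d)] for xh in x_hat]
    return y, x_hat


if __name__ == "__main__":
    data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
    y, x_hat = batch_norm_forward(data, gamma=[1.0, 2.0], beta=[0.0, 1.0])
    print(x_hat)
    print(y)
```

Each column of `x_hat` has mean zero and variance just below one (the gap is due to `eps`), regardless of the very different scales of the two input features.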
=== Backpropagation ===

The described BN transform is a differentiable operation, and the gradient of the loss \(l\) with respect to the different parameters can be computed directly with the chain rule. Specifically, \(\frac{\partial l}{\partial y_i^{(k)}}\) depends on the choice of activation function, and the gradients with respect to the other parameters can be expressed as functions of \(\frac{\partial l}{\partial y_i^{(k)}}\):

\[\frac{\partial l}{\partial \hat{x}_i^{(k)}} = \frac{\partial l}{\partial y_i^{(k)}} \gamma^{(k)},\]

\[\frac{\partial l}{\partial \gamma^{(k)}} = \sum_{i=1}^{m} \frac{\partial l}{\partial y_i^{(k)}} \hat{x}_i^{(k)},\]

\[\frac{\partial l}{\partial \beta^{(k)}} = \sum_{i=1}^{m} \frac{\partial l}{\partial y_i^{(k)}},\]

\[\frac{\partial l}{\partial \left(\sigma_B^{(k)}\right)^2} = \sum_{i=1}^{m} \frac{\partial l}{\partial y_i^{(k)}} \left(x_i^{(k)} - \mu_B^{(k)}\right) \left(-\frac{\gamma^{(k)}}{2} \left(\left(\sigma_B^{(k)}\right)^2 + \epsilon\right)^{-3/2}\right),\]

\[\frac{\partial l}{\partial \mu_B^{(k)}} = \sum_{i=1}^{m} \frac{\partial l}{\partial y_i^{(k)}} \frac{-\gamma^{(k)}}{\sqrt{\left(\sigma_B^{(k)}\right)^2 + \epsilon}} + \frac{\partial l}{\partial \left(\sigma_B^{(k)}\right)^2} \frac{1}{m} \sum_{i=1}^{m} (-2) \cdot \left(x_i^{(k)} - \mu_B^{(k)}\right),\]

and

\[\frac{\partial l}{\partial x_i^{(k)}} = \frac{\partial l}{\partial \hat{x}_i^{(k)}} \frac{1}{\sqrt{\left(\sigma_B^{(k)}\right)^2 + \epsilon}} + \frac{\partial l}{\partial \left(\sigma_B^{(k)}\right)^2} \frac{2\left(x_i^{(k)} - \mu_B^{(k)}\right)}{m} + \frac{\partial l}{\partial \mu_B^{(k)}} \frac{1}{m}.\]

=== Inference ===

During the training stage, the normalization steps depend on the mini-batches to ensure efficient and reliable training. However, in the inference stage, this dependence is no longer useful. Instead, the normalization step in this stage is computed with the population statistics, so that the output depends on the input in a deterministic manner. The population mean, \(E[x^{(k)}]\), and variance, \(\operatorname{Var}[x^{(k)}]\), are computed as:

\[E[x^{(k)}] = E_B[\mu_B^{(k)}] \quad \text{and} \quad \operatorname{Var}[x^{(k)}] = \frac{m}{m-1} E_B\left[\left(\sigma_B^{(k)}\right)^2\right].\]

The population statistics are thus a complete representation of the mini-batches. The BN transform in the inference step thus becomes \(y^{(k)} = BN_{\gamma^{(k)}, \beta^{(k)}}^{\text{inf}}(x^{(k)}) = \gamma^{(k)} \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{\operatorname{Var}[x^{(k)}] + \epsilon}} + \beta\)

  • Competitions and prizes in artificial intelligence

    There are a number of competitions and prizes to promote research in artificial intelligence.

== General machine intelligence ==

The David E. Rumelhart Prize is an annual award for making a "significant contemporary contribution to the theoretical foundations of human cognition". The prize is $100,000. The Human-Competitive Award is an annual challenge started in 2004 to reward results "competitive with the work of creative and inventive humans". The prize is $10,000. Entries are required to use evolutionary computing. The Intel AI Global Impact Festival is an annual international competition on artificial intelligence technology, held by Intel Corporation for school and college students, with prizes upwards of $15,000. There are two age brackets in this competition: the 13-18 Age Group and the 18 and Above Age Group. The IJCAI Award for Research Excellence is a biannual award given at the International Joint Conference on Artificial Intelligence (IJCAI) to researchers in artificial intelligence in recognition of the excellence of their careers. The 2011 Federal Virtual World Challenge, advertised by The White House and sponsored by the U.S. Army Research Laboratory's Simulation and Training Technology Center, offered a total of US$52,000 in cash prize awards for general artificial intelligence applications, including "adaptive learning systems, intelligent conversational bots, adaptive behavior (objects or processes)" and more. The Machine Intelligence Prize is awarded annually by the British Computer Society for progress towards machine intelligence. Kaggle is described as "the world's largest community of data scientists compete to solve most valuable problems".

== Conversational behaviour ==

The Loebner prize is an annual competition to determine the best Turing test competitors.
The winner is the computer system that, in the judges' opinions, demonstrates the "most human" conversational behaviour, they have an additional prize for a system that in their opinion passes a Turing test. This second prize has not yet been awarded. == Automatic control == === Pilotless aircraft === The International Aerial Robotics Competition is a long-running event begun in 1991 to advance the state of the art in fully autonomous air vehicles. This competition is restricted to university teams (although industry and governmental sponsorship of teams is allowed). Key to this event is the creation of flying robots which must complete complex missions without any human intervention. Successful entries are able to interpret their environment and make real-time decisions based only on a high-level mission directive (e.g., "find a particular target inside a building having certain characteristics which is among a group of buildings 3 kilometers from the aerial robot launch point"). In 2000, a $30,000 prize was awarded during the 3rd Mission (search and rescue), and in 2008, $80,000 in prize money was awarded at the conclusion of the 4th Mission (urban reconnaissance). === Driverless cars === The DARPA Grand Challenge is a series of competitions to promote driverless car technology, aimed at a congressional mandate stating that by 2015 one-third of the operational ground combat vehicles of the US Armed Forces should be unmanned. While the first race had no winner, the second awarded a $2 million prize for the autonomous navigation of a hundred-mile trail, using GPS, computers and a sophisticated array of sensors. In November 2007, DARPA introduced the DARPA Urban Challenge, a sixty-mile urban area race requiring vehicles to navigate through traffic. 
In November 2010 the US Armed Forces extended the competition with the $1.6 million prize Multi Autonomous Ground-robotic International Challenge to consider cooperation between multiple vehicles in a simulated-combat situation. Roborace will be a global motorsport championship with autonomously driving, electric vehicles. The series will be run as a support series during the Formula E championship for electric vehicles. This will be the first global championship for driverless cars. == Data-mining and prediction == The Netflix Prize was a competition for the best collaborative filtering algorithm that predicts user ratings for films, based on previous ratings. The competition was held by Netflix, an online DVD-rental service. The prize was $1,000,000. The Pittsburgh Brain Activity Interpretation Competition will reward analysis of fMRI data "to predict what individuals perceive and how they act and feel in a novel Virtual Reality world involving searching for and collecting objects, interpreting changing instructions, and avoiding a threatening dog." The prize in 2007 was $22,000. The Face Recognition Grand Challenge (May 2004 to March 2006) aimed to promote and advance face recognition technology. The American Meteorological Society's artificial intelligence competition involves learning a classifier to characterise precipitation based on meteorological analyses of environmental conditions and polarimetric radar data. == Cooperation and coordination == === Robot football === The RoboCup and Federation of International Robot-soccer Association (FIRA) are annual international robot soccer competitions. The International RoboCup Federation challenge is by 2050 "a team of fully autonomous humanoid robot soccer players shall win the soccer game, comply with the official rule of the FIFA, against the winner of the most recent World Cup." 
== Logic, reasoning and knowledge representation == The Herbrand Award is a prize given by the Conference on Automated Deduction (CADE) Inc. to honour persons or groups for important contributions to the field of automated deduction. The prize is $1000. The CADE ATP System Competition (CASC) is a yearly competition of fully automated theorem provers for classical first-order logic, associated with the Conference on Automated Deduction (CADE) and the International Joint Conference on Automated Reasoning (IJCAR). The competition was part of the Alan Turing Centenary Conference in 2012, with total prizes of 9000 GBP given by Google. The SUMO prize is an annual prize for the best open source ontology extension of the Suggested Upper Merged Ontology (SUMO), a formal theory of terms and logical definitions describing the world. The prize is $3000. The Hutter Prize for lossless compression of human knowledge is a cash prize which rewards compression improvements on a specific 100 MB English text file. The prize awards 500 euros for each one percent improvement, up to €50,000. The organizers believe that text compression and AI are equivalent problems. Three prizes have been awarded, at around €2,000. The Cyc TPTP Challenge is a competition to develop reasoning methods for the Cyc comprehensive ontology and database of everyday common sense knowledge. The prize is 100 euros for "each winner of two related challenges". The Eternity II challenge was a constraint satisfaction problem very similar to the Tetravex game. The objective was to lay 256 tiles on a 16x16 grid while satisfying a number of constraints. The problem is known to be NP-complete. The prize was US$2,000,000. The competition ended in December 2010. == Games == The World Computer Chess Championship has been held since 1970. The International Computer Games Association continues to hold an annual Computer Olympiad which includes this event plus computer competitions for many other games. 
The Ing Prize was a substantial money prize attached to the World Computer Go Congress, starting from 1985 and expiring in 2000. It was a graduated set of handicap challenges against young professional players with increasing prizes as the handicap was lowered. At the time it expired in 2000, the unclaimed prize was 400,000 NT dollars for winning a 9-stone handicap match. The AAAI General Game Playing Competition is a competition to develop programs that are effective at general game playing. Given a definition of a game, the program must play it effectively without human intervention. Since the game is not known in advance, competitors cannot tailor their programs to a particular scenario. The prize in 2006 and 2007 was $10,000. The General Video Game AI Competition (GVGAI) poses the problem of creating artificial intelligence that can play a wide, and in principle unlimited, range of games. Concretely, it tackles the problem of devising an algorithm that is able to play any game it is given, even if the game is not known a priori. Additionally, the contest poses the challenge of creating level and rule generators for any given game. This area of study can be seen as an approximation of General Artificial Intelligence, with very little room for game-dependent heuristics. The competition runs yearly in different tracks: single player planning, two-player planning, single player learning, and level and rule generation, with prizes in each track ranging from 200 to 500 US dollars for winners and runners-up. The 2007 Ultimate Computer Ches

    Read more →
  • Squirrel AI

    Squirrel AI

    Squirrel Ai Learning is an international educational technology company that specializes in intelligent adaptive learning and was one of the first companies in the world to offer large-scale AI-powered adaptive education solutions. == Methodology == Squirrel Ai Learning uses artificial intelligence to tailor lesson plans to each individual student. The company's AI researchers have access to the world's largest student databases, which are used to train the AI algorithms. Squirrel Ai Learning works with teachers to identify the most fine-grained possible concepts ("knowledge points") for a course in order to precisely target learning gaps. For example, middle school mathematics is broken into over 10,000 points such as rational numbers, the properties of a triangle, and the Pythagorean theorem. Each point is linked to related items, forming a "knowledge graph". Each knowledge point is addressed by videos, examples and practice problems. A textbook might address 3,000 points; ALEKS, another adaptive learning platform, uses 1,000. Each student begins with a diagnostic test to identify where to begin their learning. The system continues to refine its graph as more students proceed. Learning is not student-directed. The system decides the order of topics. == History and milestones == Squirrel Ai Learning was founded by Derek Haoyang Li in 2014. In March 2017, the Squirrel Ai Intelligent Adaptive Learning System (IALS) was launched. IALS utilizes artificial intelligence to customize lessons, practice and evaluations for each individual student. In 2018, Squirrel Ai Learning established a joint research lab for AI adaptive learning with the Institute of Automation of the Chinese Academy of Sciences. By 2019, Squirrel Ai Learning had opened 2,000 learning centers in 200 cities and registered over a million students in Asia. In 2019, Squirrel Ai Learning opened a research lab in partnership with Carnegie Mellon University. 
As of 2019, Squirrel Ai Learning had raised over $180 million in funding, and in 2018 it surpassed $1 billion in valuation. In 2020, Squirrel Ai Learning launched the $1 million AAAI Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity in partnership with AAAI. The inaugural award was given to Regina Barzilay for her work developing machine learning models to address drug synthesis and early-stage breast cancer diagnosis. In 2020, Squirrel Ai Learning established a strategic partnership with Alibaba Group's DingTalk. As of 2021, Squirrel Ai Learning had served over 60,000 public schools in over 1,200 cities in Asia. Squirrel Ai plans to start offering its services in the United States in 2026. The American arm is separate from the Chinese company to avoid regulatory hurdles. As of January 2026, it had set up an "independent technology platform" in the US. == Recognition == Squirrel Ai Learning has gained recognition both in Asia and internationally, including: Squirrel Ai Learning was named one of the world's top 30 AI application cases in the 2018 Synced Machine Intelligence Awards. In June 2019, Squirrel Ai Learning was named one of the 50 smartest companies in China by MIT Technology Review. Squirrel Ai Learning won the GITEX 2019 Best Education Technology Award. In 2020, Squirrel Ai Learning won the UNESCO AI Innovation Award. Squirrel Ai Learning was listed in the 2020 CB Insights AI 100, CB Insights' annual ranking of the 100 most promising AI startups in the world. Squirrel Ai Learning won Edtech Review's Best AI in Education Company of the Year award 2020.

    Read more →
  • ImageNet

    ImageNet

    The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million images have been hand-annotated by the project to indicate what objects are pictured and in at least one million of the images, bounding boxes are also provided. ImageNet contains more than 20,000 categories, with a typical category, such as "balloon" or "strawberry", consisting of several hundred images. The database of annotations of third-party image URLs is freely available directly from ImageNet, though the actual images are not owned by ImageNet. Since 2010, the ImageNet project runs an annual software contest, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where software programs compete to correctly classify and detect objects and scenes. The challenge uses a "trimmed" list of one thousand non-overlapping classes. == History == AI researcher Fei-Fei Li began working on the idea for ImageNet in 2006. At a time when most AI research focused on models and algorithms, Li wanted to expand and improve the data available to train AI algorithms. In 2007, Li met with Princeton professor Christiane Fellbaum, one of the creators of WordNet, to discuss the project. As a result of this meeting, Li went on to build ImageNet starting from the roughly 22,000 nouns of WordNet and using many of its features. She was also inspired by a 1987 estimate that the average person recognizes roughly 30,000 different kinds of objects. As an assistant professor at Princeton, Li assembled a team of researchers to work on the ImageNet project. They used Amazon Mechanical Turk to help with the classification of images. Labeling started in July 2008 and ended in April 2010. It took 49K workers from 167 countries filtering and labeling over 160M candidate images. They had enough budget to have each of the 14 million images labelled three times. 
The original plan called for 10,000 images per category across 40,000 categories, or 400 million images in total, each verified 3 times. They found that humans can classify at most 2 images/sec. At this rate, the resulting 1.2 billion verifications would take about 600 million seconds, which was estimated at 19 human-years of labor (without rest). They presented their database for the first time as a poster at the 2009 Conference on Computer Vision and Pattern Recognition (CVPR) in Florida, titled "ImageNet: A Preview of a Large-scale Hierarchical Dataset". The poster was reused at Vision Sciences Society 2009. In 2009, Alex Berg suggested adding object localization as a task. Li approached the PASCAL Visual Object Classes contest in 2009 for a collaboration. It resulted in the subsequent ImageNet Large Scale Visual Recognition Challenge starting in 2010, which has 1000 classes and object localization, as compared to PASCAL VOC which had just 20 classes and 19,737 images (in 2010). === Significance for deep learning === On 30 September 2012, a convolutional neural network (CNN) called AlexNet achieved a top-5 error of 15.3% in the ImageNet 2012 Challenge, more than 10.8 percentage points lower than that of the runner-up. Using convolutional neural networks was feasible due to the use of graphics processing units (GPUs) during training, an essential ingredient of the deep learning revolution. According to The Economist, "Suddenly people started to pay attention, not just within the AI community but across the technology industry as a whole." In 2015, AlexNet was outperformed by Microsoft's very deep CNN with over 100 layers, which won the ImageNet 2015 contest, having 3.57% error on the test set. Andrej Karpathy estimated in 2014 that with concentrated effort, he could reach a 5.1% error rate, and ~10 people from his lab reached ~12-13% with less effort. It was estimated that with maximal effort, a human could reach 2.4%. == Dataset == ImageNet crowdsources its annotation process. 
Image-level annotations indicate the presence or absence of an object class in an image, such as "there are tigers in this image" or "there are no tigers in this image". Object-level annotations provide a bounding box around the (visible part of the) indicated object. ImageNet uses a variant of the broad WordNet schema to categorize objects, augmented with 120 categories of dog breeds to showcase fine-grained classification. In 2012, ImageNet was the world's largest academic user of Mechanical Turk. The average worker identified 50 images per minute. The original plan of the full ImageNet would have roughly 50M clean, diverse and full resolution images spread over approximately 50K synsets. This was not achieved. The summary statistics given on April 30, 2010: total number of non-empty synsets: 21,841; total number of images: 14,197,122; number of images with bounding box annotations: 1,034,908; number of synsets with SIFT features: 1000; number of images with SIFT features: 1.2 million. === Categories === The categories of ImageNet were filtered from the WordNet concepts. Because each concept can have multiple synonyms (for example, "kitty" and "young cat"), each concept is called a "synonym set" or "synset". There were more than 100,000 synsets in WordNet 3.0, the majority of them nouns (80,000+). The ImageNet dataset filtered these to 21,841 synsets that are countable nouns that can be visually illustrated. Each synset in WordNet 3.0 has a "WordNet ID" (wnid), which is a concatenation of part of speech and an "offset" (a unique identifying number). Every wnid starts with "n" because ImageNet only includes nouns. For example, the wnid of synset "dog, domestic dog, Canis familiaris" is "n02084071". The categories in ImageNet fall into 9 levels, from level 1 (such as "mammal") to level 9 (such as "German shepherd"). 
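The wnid format described above can be illustrated with a short Python sketch that splits an identifier into its part-of-speech letter and numeric offset. The helper names `parse_wnid` and `make_wnid` are hypothetical, and the fixed width of eight zero-padded digits is an assumption based on the example "n02084071" rather than something the text states.

```python
import re

# One lowercase part-of-speech letter followed by eight digits
# (the eight-digit width is assumed from the example wnid above).
WNID_PATTERN = re.compile(r"([a-z])(\d{8})")

def parse_wnid(wnid):
    """Split a wnid into (part_of_speech, offset)."""
    match = WNID_PATTERN.fullmatch(wnid)
    if match is None:
        raise ValueError(f"not a valid wnid: {wnid!r}")
    pos, offset = match.groups()
    return pos, int(offset)

def make_wnid(pos, offset):
    """Build a wnid by zero-padding the offset to eight digits."""
    return f"{pos}{offset:08d}"

pos, offset = parse_wnid("n02084071")  # the "dog" synset from the text
```

Keeping the offset as an integer and re-padding it on output means a round trip through `parse_wnid` and `make_wnid` gives back the original string.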
=== Image format === The images were scraped from online image search (Google, Picsearch, MSN, Yahoo, Flickr, etc.) using synonyms in multiple languages. For example: German shepherd, German police dog, German shepherd dog, Alsatian, ovejero alemán, pastore tedesco, 德国牧羊犬. ImageNet consists of images in RGB format with varying resolutions. For example, in ImageNet 2012, "fish" category, the resolution ranges from 4288 x 2848 to 75 x 56. In machine learning, these are typically preprocessed into a standard constant resolution, and whitened, before further processing by neural networks. For example, in PyTorch, ImageNet images are by default normalized by dividing the pixel values so that they fall between 0 and 1, then subtracting [0.485, 0.456, 0.406], then dividing by [0.229, 0.224, 0.225]. These are the per-channel means and standard deviations for ImageNet, so this standardizes each channel of the input data. === Labels and annotations === Each image is labelled with exactly one wnid. Dense SIFT features (raw SIFT descriptors, quantized codewords, and coordinates of each descriptor/codeword) for ImageNet-1K were available for download, designed for bag-of-visual-words models. The bounding boxes of objects were available for about 3000 popular synsets with on average 150 images in each synset. Furthermore, some images have attributes. They released 25 attributes for ~400 popular synsets: Color: black, blue, brown, gray, green, orange, pink, red, violet, white, yellow; Pattern: spotted, striped; Shape: long, round, rectangular, square; Texture: furry, smooth, rough, shiny, metallic, vegetation, wooden, wet. === ImageNet-21K === The full original dataset is referred to as ImageNet-21K. ImageNet-21k contains 14,197,122 images divided into 21,841 classes. Some papers round this up and name it ImageNet-22k. The full ImageNet-21k was released in Fall of 2011, as fall11_whole.tar. There is no official train-validation-test split for ImageNet-21k. 
Some classes contain only 1-10 samples, while others contain thousands. === ImageNet-1K === There are various subsets of the ImageNet dataset used in various contexts, sometimes referred to as "versions". One of the most widely used subsets of ImageNet is the "ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012–2017 image classification and localization dataset". This is also referred to in the research literature as ImageNet-1K or ILSVRC2017, reflecting the original ILSVRC challenge that involved 1,000 classes. ImageNet-1K contains 1,281,167 training images, 50,000 validation images and 100,000 test images. Each category in ImageNet-1K is a leaf category, meaning that there are no child nodes below it, unlike ImageNet-21K. For example, in ImageNet-21K, there are some images categorized as simply "mammal", whereas in ImageNet-1K, there are only images categorized as things like "German shepherd", since there are no child-words below "German shepherd". === Later developments === In the WordNet they built ImageNet on, there were 2832 synsets in the "person" subtree. During the 2018–2020 period, the download of ImageNet-21k was removed while these person synsets went through extensive filtering. Out of these 2832 synsets, 1593 were deemed "potentially offensive". Out of the remaining 1239, 1081 were deemed not really "visual". The result was that only 158 syn
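The per-channel normalization mentioned under Image format can be sketched in plain Python as follows. This is only an illustration of the arithmetic, assuming 8-bit RGB input that is scaled by 255; the function name `normalize_pixel` is hypothetical and is not a PyTorch API.

```python
# Per-channel statistics quoted in the text (R, G, B order).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Normalize one 8-bit (R, G, B) pixel.

    Each channel is scaled to [0, 1] (assuming a 0..255 input range),
    then the channel mean is subtracted and the result is divided by
    the channel standard deviation.
    """
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )

black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
```

A full preprocessing pipeline would apply the same arithmetic to every pixel after resizing the image to a constant resolution.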

    Read more →
  • Interim Measures for the Management of Generative AI Services

    Interim Measures for the Management of Generative AI Services

    The Interim Measures for the Management of Generative AI Services (Chinese: 生成式人工智能服务管理暂行办法; pinyin: Shēngchéng shì réngōng zhìnéng fúwù guǎnlǐ zànxíng bànfǎ) are a set of regulations governing public-facing generative artificial intelligence services in China. Issued on 10 July 2023 and effective from 15 August 2023, they were China's first binding regulation specifically targeting generative AI. They have been described as among the earliest such regulations adopted by any country. The measures were jointly issued by the Cyberspace Administration of China (CAC) and six other national bodies: the National Development and Reform Commission, the Ministry of Education, the Ministry of Science and Technology, the Ministry of Industry and Information Technology, the Ministry of Public Security, and the National Radio and Television Administration. Among the measures' most prominent requirements is that generative AI services must uphold Core Socialist Values and must not generate content that could subvert state power, harm national security, or undermine social stability. The measures also require providers of public-facing generative AI services to undergo security assessments and register their algorithms with the CAC. As of December 2025, 748 generative AI services had completed the filing process at the national level. == Background == The Interim Measures build on two earlier sets of regulations targeting specific algorithm applications. The Administrative Provisions on Algorithm Recommendation for Internet Information Services, effective from March 2022, established China's algorithm registry and required providers of recommendation algorithms with "public opinion properties or social mobilization capabilities" to file with the CAC and undergo security assessments. 
The Administrative Provisions on Deep Synthesis of Internet Information Services, effective from January 2023, extended similar requirements to algorithms used for generating synthetic media such as deepfakes. In April 2023, the CAC released a draft of the generative AI regulation for public comment. The draft included several requirements that attracted attention, including that generated content should "embody Core Socialist Values" and that training data should be "true and accurate". The public consultation period ran until May 2023. The final version, published in July 2023, was substantially revised from the draft. According to an analysis by the Future of Privacy Forum, changes appeared to reflect feedback from industry stakeholders including Baidu, Xiaomi, SenseTime, and others, as well as input from government-affiliated research institutes. The final measures adopted a more permissive tone, with the CAC describing its approach as "inclusive and prudent" (包容审慎) and emphasising "classified and graded" (分类分级) supervision. == Scope == The measures apply to services that use generative AI technology to provide text, images, audio, video, or other content to the public within mainland China (Article 2). They do not apply to organisations that develop or use generative AI internally without offering services to the domestic public, such as industry associations, enterprises, and research institutions. Overseas providers whose services are accessible to users in China are also subject to the measures. == Key provisions == === Content requirements === Article 4 sets out the core content obligations. Providers and users of generative AI services must uphold the Core Socialist Values. 
The measures prohibit generating content that incites subversion of national sovereignty or the socialist system, endangers national security or the nation's image, incites separatism, promotes terrorism or extremism, promotes ethnic hatred or discrimination, or contains violence, obscenity, or false information prohibited by law. These content prohibitions largely mirror those in Article 12 of the Cybersecurity Law and in prior regulations governing online content. Article 4 also requires that models be designed and trained to avoid discrimination, that services respect intellectual property rights, and that providers take effective measures to improve the transparency and accuracy of generated content. === Training data and labelling === Article 7 requires providers to ensure that training data is of high quality and legitimately sourced, and that it does not infringe upon intellectual property rights. Where personal information is used, consent must be obtained. The final version of this provision removed language from the draft that would have held providers responsible for the "legitimacy" of all pretraining data, replacing it with a requirement to "employ effective measures to improve the quality of training data". Article 8 requires providers to establish labelling rules for training data and to conduct quality assessments of data annotations. Article 12 requires that generated images, videos, and other synthetic content be labelled as AI-generated. === User rights and privacy === Article 11 requires providers to protect user privacy, to minimise the collection and retention of personal data, and to refrain from unlawfully sharing user information. Users have the right to request review, correction, or deletion of their personal information. Article 10 requires providers to take measures to prevent excessive dependence on or addiction to generative AI services by minors. 
=== Security assessment and algorithm filing === Article 17 requires that providers of generative AI services with "public opinion properties or the capacity for social mobilization" (具有舆论属性或者社会动员能力) carry out security assessments and complete algorithm filing procedures in accordance with the Administrative Provisions on Algorithm Recommendation for Internet Information Services. == Implementation == === Algorithm filing process === In practice, the filing requirements under the Interim Measures have developed into a two-tier process. The first tier is the standard algorithm filing (算法备案) under the pre-existing Algorithm Recommendation Provisions, which involves submitting information about an algorithm's design, purpose, and data sources to the CAC. This process is primarily a registration mechanism. For public-facing generative AI products, there is an additional, more rigorous process commonly referred to as the "large model filing" (大模型备案). This involves submitting a security self-assessment report, data annotation rules, a keyword blocking list, and evaluation test question sets. The process includes technical testing at the provincial level, followed by review at the national CAC level. The algorithm filing targets specific algorithms, while the large model filing evaluates the broader system architecture, training data, model parameters, and potential social impact. The CAC publishes lists of generative AI services that have successfully completed the filing process. The first such list was published on 2 April 2024. According to the CAC's year-end announcements, 302 generative AI services had completed national-level filing by the end of 2024 (of which 238 were new that year), alongside 105 applications that completed local-level registration. By the end of 2025, the cumulative total had risen to 748 national-level filings and 435 local-level registrations. 
=== Content compliance and testing === According to the Carnegie Endowment, the CAC has conducted compliance audits of generative AI services with a particular focus on ensuring appropriate responses to queries about politically sensitive topics. The large model filing process requires providers to pass both provincial-level and national-level technical testing before their services can be made available to the public. On 1 March 2024, the National Technical Committee 260 on Cybersecurity (TC260) published TC260-003, the Basic Security Requirements for Generative AI Services (生成式人工智能服务安全基本要求), a technical standard that provides detailed guidance on the security assessments required under the Interim Measures. The standard covers requirements for training data safety, model security, and content safety evaluation, and is used as a reference for the filing process. == Analysis == === Relationship to broader Chinese internet regulation === The content requirements in the Interim Measures extend China's existing framework for online information control to generative AI. Legal scholars have noted that the "Core Socialist Values" provision and the specific content prohibitions are consistent with longstanding requirements imposed on internet platforms under the Cybersecurity Law and related regulations. The Asia Society Policy Institute has described the Chinese government's highest regulatory priority in this area as retaining control of information, noting that content-related obligations receive stricter enforcement than other provisions. === Nature of the filing system === The character of the filing system has been debated by scholars. Angela Huyue Zh

    Read more →