AI Articles

AI Articles — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Cinema 4D

    Cinema 4D is a 3D software suite developed by the German company Maxon.

    == Overview ==

    As of R21, only a single version of Cinema 4D is available. It replaces all previous variants, including BodyPaint 3D, and includes all features of the past 'Studio' variant. With R21, all binaries were unified. There is no technical difference between commercial, educational, or demo versions. The difference is now only in licensing.

    2014 saw the release of Cinema 4D Lite, which came packaged with Adobe After Effects Creative Cloud 2014. "Lite" acts as an introductory version, with many features withheld. This is part of a partnership between the two companies, where a Maxon-produced plug-in, called Cineware, allows any variant to create a seamless workflow with After Effects. The "Lite" variant is dependent on After Effects CC, needing the latter application running to launch, and is only sold as a package component included with After Effects CC through Adobe.

    Initially, Cinema 4D was developed for Amiga computers in the early 1990s, and the first three versions of the program were available exclusively for that platform. With v4, however, Maxon began to develop the application for Windows and Macintosh computers as well, citing the wish to reach a wider audience and the growing instability of the Amiga market following Commodore's bankruptcy. It was also released for BeOS. On Linux, Cinema 4D is available as a command-line rendering version.

    == Modules and older variants ==

    From R12 to R20, Cinema 4D was available in four variants: a core Cinema 4D 'Prime' application; a 'Broadcast' version with additional motion-graphics features; 'Visualize', which adds functions for architectural design; and 'Studio', which includes all modules.

    From Release 8 until Release 11.5, Cinema 4D had a modular approach, with the ability to expand the core application with various modules. This ended with Release 12, though the functionality of these modules remains in the different flavors of Cinema 4D (Prime, Broadcast, Visualize, Studio).

    The old modules were:

      • Advanced Render (global illumination/HDRI, caustics, ambient occlusion and sky simulation)
      • BodyPaint 3D (direct painting on UVW meshes; now included in the core. In essence Cinema 4D Core/Prime and the BodyPaint 3D products are identical. The only difference between the two is the splash screen that is shown at startup and the default user interface.)
      • Dynamics (for simulating soft body and rigid body dynamics)
      • Hair (simulates hair, fur, grass, etc.)
      • MOCCA (character animation and cloth simulation)
      • MoGraph (Motion Graphics procedural modelling and animation toolset)
      • NET Render (to render animations over a TCP/IP network in render farms)
      • PyroCluster (simulation of smoke and fire effects)

    The later variants were:

      • Prime (the core application)
      • Broadcast (adds MoGraph2)
      • Visualize (adds Virtual Walkthrough, Advanced Render, Sky, Sketch and Toon, data exchange, camera matching)
      • Studio (the complete package)

    == Use in industry ==

    A number of films and related works have been modeled and rendered in Cinema 4D.

    == Cinebench ==

    Cinebench is a cross-platform test suite which tests a computer's hardware capabilities. It can be used as a test for Cinema 4D's 3D modeling, animation, motion graphic and rendering performance on multiple CPU cores. The program "target[s] a certain niche and [is] better suited for high-end desktop and workstation platforms". Cinebench is commonly used to demonstrate CPU performance at tech shows, especially by tech YouTubers and review sites.

  • Simple Knowledge Organization System

    Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data.

    == History ==

    === DESIRE II project (1997–2000) ===

    The most direct ancestor to SKOS was the RDF Thesaurus work undertaken in the second phase of the EU DESIRE project. Motivated by the need to improve the user interface and usability of multi-service browsing and searching, a basic RDF vocabulary for thesauri was produced. As noted later in the SWAD-Europe workplan, the DESIRE work was adopted and further developed in the SOSIG and LIMBER projects. A version of the DESIRE/SOSIG implementation was described in W3C's QL'98 workshop, motivating early work on RDF rule and query languages: A Query and Inference Service for RDF.

    === LIMBER (1999–2001) ===

    SKOS built upon the output of the Language Independent Metadata Browsing of European Resources (LIMBER) project, funded by the European Community as part of the Information Society Technologies programme. In the LIMBER project, CCLRC further developed an RDF thesaurus interchange format, which was demonstrated on the European Language Social Science Thesaurus (ELSST) at the UK Data Archive as a multilingual version of the English-language Humanities and Social Science Electronic Thesaurus (HASSET), which was planned to be used by the Council of European Social Science Data Archives (CESSDA).

    === SWAD-Europe (2002–2004) ===

    SKOS as a distinct initiative began in the SWAD-Europe project, bringing together partners from DESIRE, SOSIG (ILRT) and LIMBER (CCLRC) who had worked with earlier versions of the schema. It was developed in the Thesaurus Activity Work Package of the Semantic Web Advanced Development for Europe (SWAD-Europe) project. SWAD-Europe was funded by the European Community as part of the Information Society Technologies programme. The project was designed to support W3C's Semantic Web Activity through research, demonstrators and outreach efforts conducted by the five project partners: ERCIM, the ILRT at Bristol University, HP Labs, CCLRC and Stilo. The first releases of SKOS Core and SKOS Mapping were published at the end of 2003, along with other deliverables on RDF encoding of multilingual thesauri and thesaurus mapping.

    === Semantic web activity (2004–2005) ===

    Following the termination of SWAD-Europe, the SKOS effort was supported by the W3C Semantic Web Activity in the framework of the Best Practice and Deployment Working Group. During this period, the focus was both on consolidating SKOS Core and on developing practical guidelines for porting and publishing thesauri for the Semantic Web.

    === Development as W3C Recommendation (2006–2009) ===

    The main published SKOS documents (the SKOS Core Guide, the SKOS Core Vocabulary Specification, and the Quick Guide to Publishing a Thesaurus on the Semantic Web) were developed through the W3C Working Draft process. Principal editors of SKOS were Alistair Miles, initially Dan Brickley, and Sean Bechhofer. The Semantic Web Deployment Working Group, chartered for two years (May 2006 – April 2008), committed in its charter to pushing SKOS forward on the W3C Recommendation track. The roadmap projected SKOS as a Candidate Recommendation by the end of 2007, and as a Proposed Recommendation in the first quarter of 2008. The main issues to solve were determining its precise scope of use, and its articulation with other RDF languages and standards used in libraries (such as Dublin Core).
    === Formal release (2009) ===

    On August 18, 2009, W3C released the new standard that builds a bridge between the world of knowledge organization systems (including thesauri, classifications, subject headings, taxonomies, and folksonomies) and the linked data community, bringing benefits to both. Libraries, museums, newspapers, government portals, enterprises, social networking applications, and other communities that manage large collections of books, historical artifacts, news reports, business glossaries, blog entries, and other items can now use SKOS to leverage the power of linked data.

    === Historical view of components ===

    SKOS was originally designed as a modular and extensible family of languages, organized as SKOS Core, SKOS Mapping, SKOS Extensions, and a Metamodel. The entire specification is now complete within the namespace http://www.w3.org/2004/02/skos/core#.

    == Overview ==

    In addition to the reference itself, the SKOS Primer (a W3C Working Group Note) summarizes the Simple Knowledge Organization System.

    SKOS defines the classes and properties sufficient to represent the common features found in a standard thesaurus. It is based on a concept-centric view of the vocabulary, where primitive objects are not terms, but abstract notions represented by terms. Each SKOS concept is defined as an RDF resource. Each concept can have RDF properties attached, including:

      • one or more preferred index terms (at most one in each natural language)
      • alternative terms or synonyms
      • definitions and notes, with specification of their language

    Concepts can be organized in hierarchies using broader-narrower relationships, or linked by non-hierarchical (associative) relationships. Concepts can be gathered in concept schemes, to provide consistent and structured sets of concepts, representing whole or part of a controlled vocabulary.
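The concept-centric model described in the overview can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `Concept` class, its method names and the `example.org` URIs are hypothetical and not part of SKOS or of any RDF library. Only the namespace and the property names `prefLabel`, `altLabel` and `broader` come from the SKOS vocabulary itself.

```python
from dataclasses import dataclass, field

SKOS = "http://www.w3.org/2004/02/skos/core#"  # SKOS namespace


@dataclass
class Concept:
    """Hypothetical in-memory stand-in for a SKOS concept (not a real RDF API)."""
    uri: str
    pref_labels: dict = field(default_factory=dict)  # language tag -> label
    alt_labels: list = field(default_factory=list)   # (label, language tag) pairs
    broader: list = field(default_factory=list)      # URIs of broader concepts

    def set_pref_label(self, label, lang):
        # Enforce "at most one preferred term in each natural language".
        if lang in self.pref_labels:
            raise ValueError(f"{self.uri} already has a prefLabel for '{lang}'")
        self.pref_labels[lang] = label

    def triples(self):
        """Yield (subject, predicate, object) tuples describing this concept."""
        yield (self.uri, "rdf:type", SKOS + "Concept")
        for lang, label in self.pref_labels.items():
            yield (self.uri, SKOS + "prefLabel", f'"{label}"@{lang}')
        for label, lang in self.alt_labels:
            yield (self.uri, SKOS + "altLabel", f'"{label}"@{lang}')
        for parent in self.broader:
            yield (self.uri, SKOS + "broader", parent)


# Illustrative data: the example.org URIs are made up for this sketch.
animals = Concept("http://example.org/animals")
cats = Concept("http://example.org/cats", broader=[animals.uri])
cats.set_pref_label("cats", "en")
cats.set_pref_label("chats", "fr")
cats.alt_labels.append(("felines", "en"))

for triple in cats.triples():
    print(triple)
```

Keying preferred labels by language tag makes the one-per-language rule a property of the data structure, while alternative labels stay an unrestricted list. A real application would more likely use an RDF library than hand-built tuples.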
    === Element categories ===

    The principal element categories of SKOS are concepts, labels, notations, documentation, semantic relations, mapping properties, and collections. The associated elements are listed in the table below.

    === Concepts ===

    The SKOS vocabulary is based on concepts. Concepts are the units of thought (ideas, meanings, or objects and events, whether instances or categories) which underlie many knowledge organization systems. As such, concepts exist in the mind as abstract entities which are independent of the terms used to label them. In SKOS, a Concept (based on the OWL Class) is used to represent items in a knowledge organization system (terms, ideas, meanings, etc.) or such a system's conceptual or organizational structure. A ConceptScheme is analogous to a vocabulary, thesaurus, or other way of organizing concepts. SKOS does not constrain a concept to be within a particular scheme, nor does it provide any way to declare a complete scheme: there is no way to say the scheme consists only of certain members. A topConcept is (one of) the upper concept(s) in a hierarchical scheme.

    === Labels and notations ===

    Each SKOS label is a string of Unicode characters, optionally with a language tag, that is associated with a concept. The prefLabel is the preferred human-readable string (maximum one per language tag), while altLabel can be used for alternative strings, and hiddenLabel can be used for strings that are useful to associate, but not meant for humans to read. A SKOS notation is similar to a label, but this literal string has a datatype, like integer, float, or date; the datatype can even be made up (see 6.5.1 Notations, Typed Literals and Datatypes in the SKOS Reference). The notation is useful for classification codes and other strings not recognizable as words.

    === Documentation ===

    The Documentation or Note properties provide basic information about SKOS concepts.
    All the properties are considered a type of skos:note; they just provide more specific kinds of information. The property definition, for example, should contain a full description of the subject resource. More specific note types can be defined in a SKOS extension, if desired. A query for <A> skos:note ? will obtain all the notes about <A>, including definitions, examples, and scope, history and change, and editorial documentation. Any of these SKOS Documentation properties can refer to several object types: a literal (e.g., a string); a resource node that has its own properties; or a reference to another document, for example using a URI. This enables the documentation to have its own metadata, like creator and creation date. Specific guidance on SKOS documentation properties can be found in the SKOS Primer Documentary Notes.

    === Semantic relations ===

    SKOS semantic relations are intended to provide ways to declare relationships between concepts within a concept scheme. While there are no restrictions precluding their use with two concepts from separate schemes, this is discouraged because it is likely to overstate what can be known about the two schemes, and perhaps link them inappropriately. The property related simply makes an association relationship between two concepts; no hierarchy or generality relation is implied. The properties broader and narrower are used to assert a direct hierarchical link between two concepts. The meaning may be unexpected; the relat

  • Moral Machine

    Moral Machine is an online platform, developed by Iyad Rahwan's Scalable Cooperation group at the Massachusetts Institute of Technology, that generates moral dilemmas and collects information on the decisions that people make between two destructive outcomes. The platform is the idea of Iyad Rahwan and social psychologists Azim Shariff and Jean-François Bonnefon, who conceived of the idea ahead of the publication of their article about the ethics of self-driving cars. The key contributors to building the platform were MIT Media Lab graduate students Edmond Awad and Sohan Dsouza.

    The presented scenarios are often variations of the trolley problem, and the information collected would be used for further research regarding the decisions that machine intelligence must make in the future. For example, as artificial intelligence plays an increasingly significant role in autonomous driving technology, research projects like Moral Machine help to find solutions for challenging life-and-death decisions that will face self-driving vehicles.

    Moral Machine was active from January 2016 to July 2020. The Moral Machine continues to be available on its website for people to experience.

    == The experiment ==

    The Moral Machine was an ambitious project; it was the first attempt at using such an experimental design to test a large number of humans in over 200 countries worldwide. The study was approved by the Institutional Review Board (IRB) at the Massachusetts Institute of Technology (MIT).

    The setup of the experiment asks the viewer to make a decision on a single scenario in which a self-driving car is about to hit pedestrians. The user can decide to have the car either swerve to avoid hitting the pedestrians or keep going straight to preserve the lives it is transporting.

    Participants can complete as many scenarios as they want to; however, the scenarios themselves are generated in groups of thirteen. Within each group of thirteen, a single scenario is entirely random while the other twelve are generated from a space in a database of 26 million different possibilities. They are chosen with two dilemmas focused on each of six dimensions of moral preferences: character gender, character age, character physical fitness, character social status, character species, and character number.

    The experiment setup remains the same throughout multiple scenarios but each scenario tests a different set of factors. Most notably, the characters involved in the scenario are different in each one. Characters may include a stroller, girl, boy, pregnant woman, male doctor, female doctor, female athlete, female executive, male athlete, male executive, large woman, large man, homeless person, old man, old woman, dog, criminal, and cat.

    Through these different characters researchers were able to understand how a wide variety of people will judge scenarios based on those involved.

    == Analysis ==

    The Moral Machine collected 40 million moral decisions from 4 million participants in 233 countries, analysis of which revealed trends within individual countries and humanity as a whole. It tested for nine factors: preference for sparing humans versus pets, passengers versus pedestrians, men versus women, young versus elderly, fit versus overweight, higher versus lower social status, jaywalkers versus law abiders, larger versus smaller groups, and inaction (i.e. staying on course) versus swerving.

    Globally, participants favored human lives over lives of animals like dogs and cats. They preferred to spare more lives if possible, and younger lives as opposed to older. Babies were most often spared with cats being the least spared. In terms of gender variations, people tended to spare men over women for doctors and the elderly. All countries generally shared the preference to spare pedestrians over passengers and law-abiders over criminals.

    Participants from less wealthy countries showed a greater tendency to spare pedestrians who crossed illegally compared to those from more wealthy and developed countries. This is most likely due to their experience living in a society where individuals are more likely to deviate from rules due to less stringent enforcement of laws. Countries of higher economic inequality overwhelmingly prefer to save wealthier individuals over poorer ones.

    === Cultural differences ===

    Researchers subdivided 130 countries with similar results into three ‘cultural clusters’. North American and European countries with significant Christian populations had a higher preference for inaction on the part of the driver and thus had less of a preference for sparing pedestrians as compared to other clusters. East Asian and Islamic countries, together constituting the second cluster, did not have as much preference to spare younger humans compared to the other two clusters and had a higher preference for sparing law-abiding humans. Latin America and Francophone countries had a higher preference for sparing women, the young, the fit, and those of higher status, but a lower preference for sparing humans over pets or other animals.

    Individualistic cultures tended to spare larger groups, and collectivist cultures had a stronger preference for sparing the lives of older people. For instance, China ranked far below the world average for preference to spare the younger over elderly, while the average respondent from the US exhibited a much higher tendency to save younger lives and larger groups.

    == Applications of the data ==

    The findings from the Moral Machine can help decision makers when designing self-driving automotive systems. Designers must make sure that these vehicles are able to solve problems on the road in a way that aligns with the moral values of the humans around them. This is a challenge because of the complex nature of humans, who may all make different decisions based on their personal values. However, by collecting a large number of decisions from humans all over the world, researchers can begin to understand patterns in the context of a particular culture, community, and people.

    == Other features ==

    The Moral Machine was deployed in June 2016. In October 2016, a feature was added that offered users the option to fill out a survey about their demographics, political views, and religious beliefs. Between November 2016 and March 2017, the website was progressively translated into nine languages in addition to English (Arabic, Chinese, French, German, Japanese, Korean, Portuguese, Russian, and Spanish).

    Overall, the Moral Machine offers four different modes, with the focus being on the data-gathering feature of the website, called the Judge mode. In addition to providing its own scenarios for users to judge, the Moral Machine also invites users to create their own scenarios, which are submitted and approved so that other people may judge them. Data is also open sourced for anyone to explore via an interactive map that is featured on the Moral Machine website.

    == In the literature ==

    Studies and research on the Moral Machine have taken a wide variety of approaches. However, theological examinations of the topic are still scarce: two bodies of work currently take such a perspective, one Buddhist and the other Christian.

  • ComfyUI

    ComfyUI is an open source, node-based program that allows users to generate images from a series of text prompts. It uses free diffusion models such as Stable Diffusion as the base model for its image capabilities, combined with other tools such as ControlNet and LCM Low-rank adaptation, with each tool being represented by a node in the program.

    == History ==

    ComfyUI was released on GitHub in January 2023. According to comfyanonymous, the creator, a major goal of the project was to improve on existing software designs in terms of the user interface. The creator had been involved with Stability AI, but by 3 June 2024 that involvement had ended and an organization called Comfy Org had been created along with the core developers.

    In July 2024, Nvidia announced support for ComfyUI within its RTX Remix modding software. In August 2024, support was added for the Flux diffusion model developed by Black Forest Labs, and Comfy Org joined the Open Model Initiative created by the Linux Foundation.

    As of September 2025, the project has 89.2k stars on GitHub. ComfyUI is one of the most popular user interfaces for Stable Diffusion, along with Automatic1111.

    == Features ==

    ComfyUI's main feature is that it is node based. Each node has a function such as "load a model" or "write a prompt". The nodes are connected to form a control-flow graph called a workflow. When a prompt is queued, a highlighted frame appears around the currently executing node, starting from "load checkpoint" and ending with the final image and its save location. Workflows commonly consist of tens of nodes, forming a complex directed acyclic graph.

    Node types include loading a model, specifying prompts, samplers, schedulers, VAE decoders, face restoration and upscaling models, LoRAs, embeddings, and ControlNets. Several samplers are supported, such as Euler, Euler_a, dpmpp_2m_sde and dpmpp_3m_sde.

    Workflows can be saved to a file, allowing users to re-use node workflows and share them with other users.
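The idea of a workflow as a directed acyclic graph that can be saved and reloaded can be sketched with Python's standard library. The node names and the `inputs` layout below are hypothetical and deliberately simplified. They are not ComfyUI's actual workflow schema or API; they only illustrate how an execution order follows from the connections between nodes.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical, simplified workflow: each node lists the nodes it depends on.
# This is NOT ComfyUI's real file schema, only an illustration of the idea.
workflow = {
    "load_checkpoint": {"inputs": []},
    "positive_prompt": {"inputs": ["load_checkpoint"]},
    "negative_prompt": {"inputs": ["load_checkpoint"]},
    "sampler": {"inputs": ["load_checkpoint", "positive_prompt", "negative_prompt"]},
    "vae_decode": {"inputs": ["sampler", "load_checkpoint"]},
    "save_image": {"inputs": ["vae_decode"]},
}


def execution_order(graph):
    """Return node names ordered so that every node comes after its inputs."""
    sorter = TopologicalSorter({name: node["inputs"] for name, node in graph.items()})
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    return list(sorter.static_order())


# Round-trip through JSON text, as saving and reloading a workflow file would.
restored = json.loads(json.dumps(workflow))
order = execution_order(restored)
print(order)
```

In this sketch the model-loading node always runs first and the save node last, because they are the only node without inputs and the only node nothing depends on. The relative order of the two prompt nodes is left to the sorter.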
    The file format for the workflows is JSON, and workflows can be embedded in the generated images. Users have also created custom extensions to the base system which are exposed as new nodes, such as the extension for AnimateDiff, which aims to create videos. ComfyUI has been described as more complex than other diffusion UIs such as Automatic1111. A default node group is also included with the program. As of December 2024, 1,674 nodes were supported.

    ComfyUI supports multiple text-to-image models, including Stable Diffusion, Flux and Tencent's Hunyuan-DiT, as well as custom models from Civitai like Pony.

    == LLMVision extension compromise ==

    In June 2024, a hacker group called "Nullbulge" compromised an extension of ComfyUI to add malicious code to it. The compromised extension, called ComfyUI_LLMVISION, was used for integrating the interface with AI language models GPT-4 and Claude 3, and was hosted on GitHub. Nullbulge hosted a list of hundreds of ComfyUI users' login details across multiple services on its website, while users of the extension reported receiving numerous login notifications. vpnMentor conducted security research on the extension and claimed it could "steal crypto wallets, screenshot the user’s screen, expose device information and IP addresses, and steal files that contain certain keywords or extensions".

    Nullbulge's website claims they targeted users who committed "one of our sins", which included AI-art generation, art theft, promoting cryptocurrency, and any other kind of theft from artists such as from Patreon. They claimed that they were "a collective of individuals who believe in the importance of protecting artists' rights and ensuring fair compensation for their work" and that they believed that "AI-generated artwork is detrimental to the creative industry and should be discouraged".

  • Sports Card Investor

    Sports Card Investor is an American sports collectibles media platform and mobile application founded by Geoff Wilson. The platform provides market data, analysis, and editorial content focused on sports trading cards and related collectibles. It operates a website, mobile app, and digital media channels covering developments in the sports card industry. The company posted its first YouTube video in July 2019, shortly before a period of rapid growth in sports card collecting in the early 2020s, which was marked by increased trading volumes and mainstream media attention.

    == History ==

    Sports Card Investor was founded by Geoff Wilson, an entrepreneur and collector who began publishing sports card–related content online before launching the platform's dedicated app and subscription tools. In February 2020, the company launched Market Movers, the first website and app to chart sports card prices and track card collections.

    The platform expanded its media presence through partnerships and distribution agreements. In 2023, Yahoo Sports announced a new collectibles coverage initiative that included additional content from Sports Card Investor. In February 2024, the Sports Card Investor studio relocated to CardsHQ in Atlanta, Georgia, and visitors to the facility can watch Sports Card Investor videos being filmed.

    == Platform and content ==

    The Sports Card Investor app provides users with pricing data, portfolio-tracking tools, and market-trend analysis for trading cards. The company also produces video and editorial content discussing market developments, grading trends, and major card releases. Coverage in industry publications has referenced Sports Card Investor in discussions about shifts in sports card licensing rights and hobby market reactions.

    == Industry context ==

    The growth of Sports Card Investor coincided with a broader resurgence in trading card markets, including record sales and expanded retail presence. Mainstream outlets have cited the company and its founder in reporting on collectibles investing trends, grading practices, and market volatility. The Sports Card Investor app has attracted over 37,000 reviews on the Apple App Store, reflecting its strong user engagement within the sports card community.

  • Artificial intelligence safety institute

    An artificial intelligence safety institute is a type of state-backed organization aiming to evaluate and ensure the safety of advanced artificial intelligence (AI) models, also called frontier AI models. AI safety gained prominence in 2023, notably with public declarations about potential existential risks from AI. During the AI Safety Summit in November 2023, the United Kingdom and the United States both created their own AISI. During the AI Seoul Summit in May 2024, international leaders agreed to form a network of AI Safety Institutes, comprising institutes from the UK, the US, Japan, France, Germany, Italy, Singapore, South Korea, Australia, Canada and the European Union. In 2025, the UK's AI Safety Institute was renamed the "AI Security Institute", and its US counterpart became the Center for AI Standards and Innovation (CAISI). == Timeline == In 2023, Rishi Sunak, the Prime Minister of the United Kingdom, expressed his intention to "make the UK not just the intellectual home but the geographical home of global AI safety regulation" and unveiled plans for an AI Safety Summit. He emphasized the need for independent safety evaluations, stating that AI companies cannot "mark their own homework". During the summit in November 2023, the UK AISI was officially established as an evolution of the Frontier AI Taskforce, and the US AISI as part of the National Institute of Standards and Technology. Japan followed by launching an AI safety institute in February 2024. Politico reported in April 2024 that many AI companies had not shared pre-deployment access to their most advanced AI models for evaluation. Meta's president of global affairs Nick Clegg said that many AI companies were waiting for the UK and the US AI Safety Institutes to work out common evaluation rules and procedures. An agreement was indeed concluded between the UK and the US in April 2024 to collaborate on at least one joint safety test. 
Initially established in London, the UK AI Safety Institute announced in May 2024 that it would open an office in San Francisco, where many AI companies are located. This is part of a plan to "set new, international standards on AI safety", according to UK's technology minister Michele Donelan. == International network == At the AI Seoul Summit in May 2024, the European Union and other countries agreed to create their own AI safety institutes, forming an international network. In July 2025, the international network held an exercise to explore issues with evaluating AI agents, especially when it came to leaking sensitive information or cybersecurity. Network members also met at NeurIPS 2025 in the city of San Diego. == Specific institutes == === Australia === The Albanese government announced the creation of the Australian AI Safety Institute on 25 November 2025. === Canada === Canada announced in April 2024 that it would create an AI safety institute, and such an institute was officially founded in November 2024. The institute is housed under Innovation, Science and Economic Development Canada, though it also partners with the Canadian Institute for Advanced Research (CIFAR). It is supported by a budget of CA$50,000,000 for a five-year timespan. === European Union === The EU AI office, founded in May 2024, is a member of the international network of AI safety institutes. === France === On 31 January 2025, the government of France created the Institut national pour l'évaluation et la sécurité de l'intelligence artificielle (INESIA), or the National Institute for AI Evaluation and Security. === India === The Ministry of Electronics and Information Technology held consultations with Meta Platforms, Google, Microsoft, IBM, OpenAI, NASSCOM, Broadband India Forum, Software Alliance, Indian Institutes of Technology (IITs), The Quantum Hub, Digital Empowerment Foundation, and Access Now on October 7, 2024, in relation to the establishment of the AI Safety Institute. 
The decision was made to shift focus from regulation to standards-setting, risk identification, and damage detection—all of which require interoperable technologies. The AISI may spend the ₹20 crore allotted to the Safe and Trusted Pillar of the IndiaAI Mission for the initial budget. Future funding may come from other components of the IndiaAI Mission. UNESCO and MeitY began consulting on AI Readiness Assessment Methodology under Safety and Ethics in Artificial Intelligence from 2024. It is to encourage the ethical and responsible use of AI in industries. The study will find areas where government can become involved, especially in attempts to strengthen institutional and regulatory capabilities. Minister for Electronics & Information Technology Ashwini Vaishnaw announced the creation of an IndiaAI Safety Institute on January 30, 2025, to ensure the ethical and safe application of AI models. The institute will promote domestic R&D that is grounded in India's social, economic, cultural, and linguistic diversity and is based on Indian datasets. With the help of academic and research institutions, as well as private sector partners, the institute will follow the hub-and-spoke approach to carry out projects within Safe and Trusted Pillar of the IndiaAI Mission. It operates under a "hub-and-spoke" model with collaboration from academic institutions (e.g., IITs), tech firms, and international organizations like UNESCO. === Japan === The Japan AISI (or J-AISI) was founded in February 2024. Part of the Information Technology Promotion Agency, it employs about 23 people. The institute consists of the Council of AISI, the AISI Steering Committee, and a secretariat with six teams. Akiko Murakami (previously of IBM Japan and Sompo Japan) serves as the institute's executive director, and Kenji Hiramoto and Suguru Nishimura serve as the institute's two deputy executive directors. 
=== Kenya === Kenya agreed to join the international network of AI safety institutes, but the country has not announced any details yet. It is the only African state in the network. === Singapore === The Digital Trust Centre was initially founded in June 2022. In May 2024, it was renamed to the Singapore AISI. Part of Nanyang Technological University, the institute partners with Infocomm Media Development Authority and is supported by an investment of S$10,000,000 per year. === South Korea === South Korea announced in May 2024 that it would create an AI safety institute under the umbrella of the Electronics and Telecommunications Research Institute. It will be supported by a tentative investment of somewhere between 10 and 20 million South Korean won per year, and employ at least 30 people. The institute was founded in November 2024 and is based in Bundang District within the city of Seongnam. === United Kingdom === The United Kingdom founded in April 2023 a safety organisation called Frontier AI Taskforce, with an initial budget of £100 million. In November 2023, it evolved into the AI Safety Institute, and continued to be led by Ian Hogarth. The AISI is part of the United Kingdom's Department for Science, Innovation and Technology. The United Kingdom's AI strategy aims to balance safety and innovation. Unlike the European Union which adopted the AI Act, the UK is reluctant to legislate early, considering that it may lower the sector's growth, and that laws might be rendered obsolete by technological progress. In May 2024, the institute open-sourced an AI safety tool called "Inspect", which evaluates AI model capabilities such as reasoning and their degree of autonomy. In February 2025, the UK body was renamed the AI Security Institute. Observers saw the name change as a signal that the institute will not focus on ethical issues such as algorithmic bias or freedom of speech in AI applications. 
=== United States === The US AISI was founded in November 2023 as part of the National Institute of Standards and Technology (NIST). This happened the day after the signing of Executive Order 14110. In February 2024, Joe Biden's former economic policy adviser Elizabeth Kelly was appointed to lead it. In February 2024, the US government created the US AI Safety Institute Consortium (AISIC), bringing together more than 200 organizations such as Google, Anthropic and Microsoft. In March 2024, a budget of $10 million was allocated. Observers noted that this investment is relatively small, especially considering the presence of many big AI companies in the US. NIST itself, which hosts the AISI, is also known for its chronic lack of funding. The Biden administration's request for additional funding was met with further budget cuts from congressional appropriators. Under President Trump, plans for members of the agency to attend the February 2025 AI Action Summit in Paris were scrapped. The US and the UK refused to sign the summit's final communiqué. US Vice President JD Vance said "pro-growth AI policies" should be prioritised over safety. The name of the agency was changed in June 2025 to the Center for AI Standards and Innovation

    Read more →
  • HiLog

    HiLog

    HiLog is a programming logic with higher-order syntax, which allows arbitrary terms to appear in predicate and function positions. However, the model theory of HiLog is first-order. Although syntactically HiLog strictly extends first order logic, HiLog can be embedded into this logic. HiLog was first described in 1989. It was later extended in the direction of many-sorted logic. The XSB system parses HiLog syntax, but the integration of HiLog into XSB is only partial. In particular, HiLog is not integrated with the XSB module system. A full implementation of HiLog is available in the Flora-2 system. It has been shown that HiLog can be embedded into first-order logic through a fairly simple transformation. For instance, p(X)(Y,Z(V)(W)) gets embedded as the following first-order term: apply(p(X),Y,apply(apply(Z,V),W)). The Framework for Logic-Based Dialects (RIF-FLD) of the Rule Interchange Format (RIF) is largely based on the ideas underlying HiLog and F-logic. == Examples == In all the examples below, capitalized symbols denote variables and the comma denotes logical conjunction, as in most logic programming languages. The first and the second examples show that variables can appear in predicate positions. Predicates can even be complex terms, such as closure(P) or maplist(F) below. The third example shows that variables can also appear in place of atomic formulas, while the fourth example illustrates the use of variables in place of function symbols. The first example defines a generic transitive closure operator, which can be applied to an arbitrary binary predicate. The second example is similar. It defines a LISP-like mapping operator, which applies to an arbitrary binary predicate. The third example shows that the Prolog meta-predicate call/1 can be expressed in HiLog in a natural way and without the use of extra-logical features. The last example defines a predicate that traverses arbitrary binary trees represented as first-order terms.
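
    The embedding into first-order logic described above can be sketched in a few lines of Python. This is only an illustration, not the transformation used by XSB or Flora-2: the `App` class, the `encode` function, and the convention that lowercase strings are atoms and capitalized strings are variables are all assumptions made for this sketch. It also assumes, as the article's example suggests, that a term whose head is a plain atom is left as an ordinary first-order term, while any other head is wrapped in `apply`.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class App:
    """A HiLog application head(args...), where the head is itself any term."""
    head: "Term"
    args: Tuple["Term", ...]


# A str is an atom (lowercase initial) or a variable (capitalized initial).
Term = Union[str, App]


def is_atom(t: Term) -> bool:
    """True for plain symbols such as p; False for variables and applications."""
    return isinstance(t, str) and t[:1].islower()


def encode(t: Term) -> str:
    """Render a HiLog term as a first-order term (hypothetical helper)."""
    if isinstance(t, str):
        return t
    args = [encode(a) for a in t.args]
    if is_atom(t.head):
        # Atom-headed terms are already first-order: p(X) stays p(X).
        return f"{t.head}({','.join(args)})"
    # Variable or compound heads move into the first argument of apply.
    return f"apply({','.join([encode(t.head)] + args)})"


# The article's example: p(X)(Y, Z(V)(W))
term = App(App("p", ("X",)), ("Y", App(App("Z", ("V",)), ("W",))))
print(encode(term))  # prints apply(p(X),Y,apply(apply(Z,V),W))
```

    The output matches the first-order term given in the text, which is a useful sanity check on the assumed rule for atom-headed terms.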

    Read more →
  • No Fakes Act

    No Fakes Act

    The NO FAKES Act or the Nurture Originals, Foster Art, and Keep Entertainment Safe Act, is proposed United States federal legislation concerning digital replicas. The bill was first introduced in 2023 as a discussion draft, formally introduced in 2024, and reintroduced in 2025. If enacted, the bill would establish a federal right of publicity, giving public figures and private individuals greater control over the creation and use of digital replicas of their likenesses, including artificial intelligence (AI)-generated content. If passed, the NO FAKES Act would create a legal framework for licensing digital replicas, including provisions for liability, safe harbors, and statutory exceptions. The proposal has received broad support from the entertainment and technology industries. However, digital rights organizations have raised concerns that the Act risks chilling protected speech. == Background == === Entertainment industry concerns === Actors’ concerns over studios' use of their digital likeness were one of the primary drivers of the Screen Actors Guild–American Federation of Television and Radio Artists (SAG-AFTRA) strike in 2023. Negotiators for SAG-AFTRA alleged that the Alliance of Motion Picture and Television Producers (AMPTP) sought to use the digital likenesses of actors in perpetuity and would try to replace union members, especially background actors. The AMPTP denied SAG-AFTRA's interpretation of its proposal. In November 2023, AMPTP and SAG-AFTRA reached an agreement on the use of actors’ digital replicas, which included requirements for consent and compensation. Recording labels have also expressed concerns over unauthorized digital replicas of their performers' likeness. In 2023, TikTok user Ghostwriter977 released "Heart on My Sleeve," an AI-produced song in the styles of Drake and the Weeknd. 
After the song received millions of streams, the Universal Music Group (UMG) initiated takedown requests to TikTok and YouTube, which removed the song from their platforms. The legal arguments attorneys made were not disclosed; however, commentators noted that they likely used the Digital Millennium Copyright Act (DMCA). This presented a novel scenario, since UMG did not have licensing rights to "Heart on My Sleeve." According to The Verge, UMG based its DMCA takedown request on an unauthorized sample used at the start of the song for the producer tag. While legal commentators noted that UMG could have asserted a violation of the artists’ rights of publicity, existing state right of publicity laws do not provide notice-and-takedown mechanisms comparable to those under the DMCA. === Legal landscape === Legal scholars have observed that AI-generated digital replicas raise questions under existing copyright and intellectual property law. U.S. copyright law generally requires that original authorship be attributable to a human; however, the extent of human intervention needed to satisfy this requirement is not clear. Copyright holders have filed lawsuits against AI companies alleging unauthorized usage of copyrighted material to train their models, though many of these cases remain pending. In terms of outputs, record labels often hold rights to artists’ musical works but do not necessarily control the artists’ voice, appearance, or likeness in the same way. As a result, AI-generated recordings such as "Heart on My Sleeve" may fall outside the scope of certain traditional copyright protections. Individuals' likenesses have historically been governed under the Lanham Act, the Federal Trade Commission Act, and right of publicity laws. The right of publicity, recognized in many state-level statutes and common law, allows individuals to bring legal claims against unauthorized commercial use of their identities. 
It has often, but not exclusively, been applied to celebrities or other recognizable individuals. There is no federal-level right to publicity, and state-level protections vary, especially on issues relating to digital replicas and posthumous rights, which makes it difficult for creators or other individuals to prevent unauthorized use of their likenesses. In July 2024, the U.S. Copyright Office released a report on digital replicas and recommended that Congress create a federal law to protect individuals from unauthorized uses of their digital replicas, noting the inadequacy, narrowness, and inconsistency of existing laws. == Provisions == Under the NO FAKES Act of 2025, a digital replica is defined as "a newly created, computer-generated, highly realistic electronic representation that is readily identifiable as the voice or visual likeness of an individual," living or dead. A digital replica can be embodied in sound recordings, images, or audiovisual works in which the individual did not perform or in which the individual did perform but the "fundamental character of the performance or appearance has been materially altered." The Act specifies that digital replicas do not include reproduced samples of works authorized by the copyright holder. The Act defines a "right holder" as either the individual who is the subject of a digital replica or an entity that has acquired the rights to that individual’s likeness. The Act grants right holders the exclusive right to authorize the use of an individual’s likeness in a digital replica. This right is not assignable during the individual’s lifetime; however, it can be licensed to a living individual for up to 10 years under certain conditions. Postmortem rights The Act provides that the right does not automatically expire upon an individual’s death. It may be transferred to executors, heirs, or other parties designated by the individual. The right is held by the right holder for 10 years following the individual’s death. 
If the right holder demonstrates active use of the digital replica within the 2 years preceding the end of the 10-year term, the right may be extended for an additional 5-year period. These five-year extensions may be renewed for up to 70 years after the individual’s death. Liability The Act establishes liability for individuals who knowingly distribute a digital replica without authorization from the right holder, as well as for entities that make available a service primarily designed to produce unlawful digital replicas. Safe harbor provisions Similar to the Communications Decency Act and the DMCA, the Act establishes safe harbor provisions for online service providers. Providers are shielded from liability if they adopt and inform users of a policy for terminating accounts that repeatedly violate the Act. The NO FAKES Act does not require online services to proactively monitor content. Instead, it creates a notice-and-takedown mechanism under which providers must promptly respond to notifications seeking the removal of unauthorized digital replicas. These safe harbor protections apply only if the online service provider designates an agent with the U.S. Copyright Office to receive notifications of alleged violations. Remedies The NO FAKES Act provides remedies that are similar to those available under U.S. copyright law. Under the Act, individuals may be held liable for either statutory damages of $5,000 or actual damages for creating or distributing an unauthorized digital replica. The legislation also establishes a tiered liability framework for online service providers. Those that make good faith efforts to comply with the Act may face statutory damages of up to $25,000 per work for violations or actual damages. Providers that do not undertake such compliance efforts may be liable for $5,000 per unauthorized display or transmission of a digital replica, with damages capped at $750,000 per work. 
Exclusions The Act includes several exceptions to liability that are modeled in part on fair use principles. Digital replicas are excluded from liability when "used in a bona fide news, public affairs, or sports broadcast or account;" in a documentary or historical context; or in a way that is "consistent with the public interest." These exclusions do not apply to de minimis uses or to digital replicas that are sexually explicit in nature. The Act further states that licensing requirements do not apply to licenses established through collective bargaining agreements that contain provisions governing the use of digital replicas. The Act does not impose secondary liability on providers of generative artificial intelligence tools or services whose primary purpose is not the creation of unauthorized digital replicas. Preemption The NO FAKES Act preempts laws that protect "an individual's voice and visual likeness rights in connection with a digital replica, as defined in this Act, in an expressive work." However, the Act preserves state laws governing digital replicas enacted before January 2, 2025, as well as state laws addressing digital replicas that portray sexually explicit conduct. == History == In 2023, Senators Marsha Blackburn, Chris Coons, Amy Klobuchar, and Th

    Read more →
  • Electronic sell-through

    Electronic sell-through

    Electronic sell-through (EST) is a method of media distribution whereby consumers pay a one-time fee to download a media file for storage on a hard drive. Although EST is often described as a transaction that grants content "ownership" to the consumer, the content may become unusable after a certain period and may not be viewable using competing platforms. EST is used by a wide array of digital media products, including movies, television, music, games, and mobile applications. The term is sometimes used interchangeably with download to own (DTO). == Film and television == The film and television industry's $18.8 billion home entertainment market consists of rental and sell-through segments, the latter of which includes the electronic sell-through of digital content. In 2010, EST generated $683 million of total home entertainment revenues, putting it behind the more lucrative revenue streams of cable video-on-demand (VOD) and internet video-on-demand (iVOD), which brought in a combined $1.8 billion in the same period. In 2010, Apple's iTunes Store accounted for three quarters of the U.S. EST business. The rest of the EST market was captured by Microsoft (via its Zune Video platform), Sony, Amazon VOD (now Amazon Video), and Walmart (via its VUDU service). A number of industry trends indicate the future expansion of EST's share of digital distribution revenues. David Bishop, worldwide president of Sony Pictures Home Entertainment, describes the following outlook: "With the launch of UltraViolet (the cloud-based digital copy locker system) establishing a common digital distribution platform later this year, prices potentially coming down on digital sales, more marketing devoted to digital sellthrough, and studios adding more value to the sellthrough product by making HD available and building in smarter extra features, we see the balance tilting even more toward owning and collecting digital movies."

    Read more →
  • IEEE Transactions on Pattern Analysis and Machine Intelligence

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    IEEE Transactions on Pattern Analysis and Machine Intelligence (sometimes abbreviated as IEEE PAMI or simply PAMI) is a monthly peer-reviewed scientific journal published by the IEEE Computer Society. == Background == The journal covers research in computer vision and image understanding, pattern analysis and recognition, machine intelligence, machine learning, search techniques, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, and face and gesture recognition. The editor-in-chief is Kyoung Mu Lee (Seoul National University). According to the Journal Citation Reports, the journal has a 2023 impact factor of 20.8.

    Read more →
  • Lukas Biewald

    Lukas Biewald

    Lukas Biewald (born 1981) is an American entrepreneur and a prominent figure in artificial intelligence. He is recognized for his contributions to machine learning and as the CEO and co-founder of Weights & Biases, a company that builds developer tools for AI and was sold to CoreWeave in 2025 for $1.7 billion. He previously founded and was CEO of Figure Eight, a human-in-the-loop machine learning platform. He has co-authored 26 AI research papers from 2004 through 2018. == Early life and education == Biewald was born in Boston, Massachusetts in 1981. He attended Cambridge Rindge and Latin School and later earned both a bachelor's and a master's degree in computer science from Stanford University. == Early Career and Founding Figure Eight == After graduation, Biewald joined Yahoo! as an engineer, working on machine translation to improve search results, and eventually led the Search Relevance Team for Yahoo! Japan. He later joined Powerset, a natural language search technology company, as a senior scientist; Powerset was acquired by Microsoft in 2008 for an estimated $100 million. In 2007, Biewald co-founded Figure Eight (formerly CrowdFlower), a data labeling and crowdsourcing company that created datasets for training machine learning models. Figure Eight was acquired by Appen in 2019 for $300 million. == Weights and Biases == In 2017, Biewald co-founded Weights & Biases with Chris Van Pelt and Shawn Lewis. The company provides tools for tracking machine learning experiments, model management, and collaborative AI and LLM app development. The platform has been adopted by organizations such as OpenAI, Salesforce, and Microsoft. In March 2025, CoreWeave agreed to acquire Weights & Biases for $1.7 billion; the transaction closed on May 5, 2025. == Gradient Dissent == Biewald hosts the bi-weekly podcast Gradient Dissent. Guests have included: Anthony Goldbloom – Co-founder & CEO of Kaggle. “How to Win Kaggle Competitions” (podcast, Sep. 9, 2020). 
Shared tips on data-science competitions from the founder of the largest ML community. Richard Socher – Founder & CEO of You.com; former Chief Scientist at Salesforce. “The Challenges of Making ML Work in the Real World” (podcast, September 28, 2020). A leading NLP researcher, he spoke on multimodal search engines powered by large language models. Jensen Huang – Founder & CEO of NVIDIA. “NVIDIA’s CEO on the Next Generation of AI and MLOps” (podcast, March 3, 2022). Huang’s GPUs power modern ML research and production. Emad Mostaque – Co-founder & CEO of Stability AI. “Stable Diffusion, Stability AI, and What’s Next” (podcast, Nov. 15, 2022). Leads the company behind Stable Diffusion, which helped spark the generative-AI imaging boom. Drago Anguelov – Head of Research at Waymo. “Robustness, Safety, and Scalability at Waymo” (podcast, July 14, 2022). Covered Waymo’s self-driving AI advances and deployment challenges. Jeremy Howard – Co-founder of fast.ai. “The Simple but Profound Insight Behind Diffusion” (podcast, Jan. 5, 2023). Known for democratizing deep-learning education; discussed diffusion models and accessible AI tooling. Aidan Gomez – Co-founder & CEO of Cohere. “Scaling LLMs and Accelerating Adoption” (podcast, April 20, 2023). Co-author of “Attention Is All You Need,” he shared how Cohere delivers large-scale NLP models as a service. Chelsea Finn – Stanford Assistant Professor (AI & Robotics). “Shaping the World of Robotics with Chelsea Finn” (podcast, February 15, 2024). A pioneer in meta-learning and robotics, she detailed robots learning complex tasks like cooking. Andrew Feldman – Co-founder & CEO of Cerebras Systems. "Launching the Fastest AI Inference Solution" (podcast, August 27, 2024). Described wafer-scale AI chips achieving new training performance records. Thomas Dohmke – CEO of GitHub. “GitHub CEO on Copilot and the Future of Software Development” (podcast, June 10, 2025). Discussed building Copilot and the future of AI-assisted coding. 
Martin Shkreli – Founder of Godel Terminal. “From Pharma to AGI Hype, and Developing AI in Finance: Martin Shkreli’s Journey” (podcast, May 20, 2025). Shkreli reflects on his pharma controversies, prison experience, and his new AI-driven trading platform. Jarek Kutylowski – Founder & CEO of DeepL. “How DeepL Built a Translation Powerhouse with AI” (podcast, July 8, 2025). Shared how DeepL’s neural-MT rivals Google Translate through model and infrastructure innovation. == Awards and recognition == In 2010, Lukas Biewald won the Netexplorateur Award for creating the GiveWork iPhone app, which allows users to perform small tasks that assist refugees and people in developing countries. In 2010, Inc Magazine included Biewald and Van Pelt on its list of the Top 30 Entrepreneurs Under 30. == Publications == Ensuring quality in crowdsourced search relevance evaluation: The effects of training question distribution by John Le, Andy Edmonds, Vaughn Hester, Lukas Biewald. SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation, July 2010. Superficial Data Analysis: Exploring Millions of Social Stereotypes by Lukas Biewald, Brendan O’Connor. O’Reilly July 2009

    Read more →
  • OpenAI Five

    OpenAI Five

    OpenAI Five is a computer program by OpenAI that plays the five-on-five video game Dota 2. Its first public appearance occurred in 2017, when it was demonstrated in a live one-on-one game against the professional player Dendi, who lost to it. The following year, the system had advanced to the point of performing as a full team of five, and began playing against and showing the capability to defeat professional teams. By choosing a game as complex as Dota 2 to study machine learning, OpenAI thought they could more accurately capture the unpredictability and continuity seen in the real world, thus constructing more general problem-solving systems. The algorithms and code used by OpenAI Five were eventually borrowed by another neural network in development by the company, one which controlled a physical robotic hand. OpenAI Five has been compared to other similar cases of artificial intelligence (AI) playing against and defeating humans, such as AlphaStar in the video game StarCraft II, AlphaGo in the board game Go, Deep Blue in chess, and Watson on the television game show Jeopardy!. == History == Development on the algorithms used for the bots began in November 2016. OpenAI decided to use Dota 2, a competitive five-on-five video game, as a base because it was popular on the live streaming platform Twitch, had native support for Linux, and had an application programming interface (API) available. Before the system became a team of five, its first public demonstration occurred at The International 2017 in August, the game's annual premier championship tournament, where Dendi, a Ukrainian professional player, lost against an OpenAI bot in a live one-on-one matchup. After the match, CTO Greg Brockman explained that the bot had learned by playing against itself for two weeks of real time, and that the learning software was a step in the direction of creating software that can handle complex tasks "like being a surgeon". 
OpenAI used a methodology called reinforcement learning: the bots learn over time by playing against themselves hundreds of times a day for months, and are rewarded for actions such as killing an enemy and destroying towers. By June 2018, the bots could play together as a full team of five and were able to defeat teams of amateur and semi-professional players. At The International 2018, OpenAI Five played in two games against professional teams, one against the Brazil-based paiN Gaming and the other against an all-star team of former Chinese players. Although the bots lost both matches, OpenAI still considered it a successful venture, stating that playing against some of the best players in Dota 2 allowed them to analyze and adjust their algorithms for future games. The bots' final public demonstration occurred in April 2019, when they won a best-of-three series against The International 2018 champions OG at a live event in San Francisco. A four-day online event to play against the bots, open to the public, occurred the same month. There, the bots played in 42,729 public games, winning 99.4% of those games. == Architecture == Each OpenAI Five bot is a neural network built around a single-layer, 4096-unit LSTM that observes the current game state extracted from the Dota developer's API. The network emits actions through several action heads (no human data is involved), and every head has a meaning: for instance, the number of ticks to delay an action, which action to select, and the X or Y coordinate of that action in a grid around the unit. In addition, action heads are computed independently. The AI system observes the world as a list of 20,000 numbers and takes an action by emitting a list of eight enumeration values. Also, it selects different actions and targets to understand how to encode every action and observe the world. 
OpenAI Five has been developed as a general-purpose reinforcement learning training system on the "Rapid" infrastructure. Rapid consists of two layers: the first spins up thousands of machines and helps them 'talk' to each other, and the second runs software. By 2018, OpenAI Five had played around 180 years' worth of games in reinforcement learning running on 256 GPUs and 128,000 CPU cores, using Proximal Policy Optimization, a policy gradient method. == Comparisons with other game AI systems == Before OpenAI Five, other AI systems had already competed successfully against humans, such as Watson on Jeopardy!, Deep Blue in chess, and AlphaGo in Go. In comparison with other games that have used AI systems to play against human players, Dota 2 differs as explained below: Long run view: The bots run at 30 frames per second for an average match time of 45 minutes, which results in 80,000 ticks per game. OpenAI Five observes every fourth frame, generating 20,000 moves. By comparison, chess usually ends before 40 moves, while Go ends before 150 moves. Partially observed state of the game: Players and their allies can only see the map directly around them. The rest of it is covered in a fog of war which hides enemy units and their movements. Thus, playing Dota 2 requires making inferences based on this incomplete data, as well as predicting what the opponent could be doing at the same time. By comparison, chess and Go are "full-information games", as they do not hide elements from the opposing player. Continuous action space: Each playable character in a Dota 2 game, known as a hero, can take dozens of actions that target either another unit or a position. The OpenAI Five developers discretized the space into 170,000 possible actions per hero. Without counting the continuous aspects of the game, there are an average of ~1,000 valid actions each tick. By comparison, the average number of actions is 35 in chess and 250 in Go. 
Continuous observation space: Dota 2 is played on a large map with ten heroes, five on each team, along with dozens of buildings and non-player character (NPC) units. The OpenAI system observes the state of a game through the developer's bot API, as 20,000 numbers that constitute all the information a human is allowed to access. By comparison, a chess board is represented as about 70 enumeration values, and a Go board as about 400. == Reception == OpenAI Five has received acknowledgement from the AI, tech, and video game community at large. Microsoft founder Bill Gates called it a "big deal", as the bots' victories "required teamwork and collaboration". Chess champion Garry Kasparov, who lost against the Deep Blue AI in 1997, stated that despite their losing performance at The International 2018, the bots would eventually "get there, and sooner than expected". In a conversation with MIT Technology Review, AI experts also considered the OpenAI Five system a significant achievement, noting that Dota 2 was an "extremely complicated game", so even beating non-professional players was impressive. PC Gamer wrote that the bots' wins against professional players were a significant event in machine learning. In contrast, Motherboard wrote that the victory was "basically cheating" due to the simplified hero pools on both sides, as well as the fact that bots were given direct access to the API, as opposed to using computer vision to interpret pixels on the screen. The Verge wrote that the bots were evidence that the company's approach to reinforcement learning and its general philosophy about AI was "yielding milestones". In 2019, DeepMind unveiled a similar bot for StarCraft II, AlphaStar. Like OpenAI Five, AlphaStar used reinforcement learning and self-play. The Verge reported that "the goal with this type of AI research is not just to crush humans in various games just to prove it can be done. 
Instead, it's to prove that — with enough time, effort, and resources — sophisticated AI software can best humans at virtually any competitive cognitive challenge, be it a board game or a modern video game." They added that the DeepMind and OpenAI victories were also a testament to the power of certain uses of reinforcement learning. It was OpenAI's hope that the technology could have applications outside of the digital realm. In 2018, they were able to reuse the same reinforcement learning algorithms and training code from OpenAI Five for Dactyl, a human-like robot hand with a neural network built to manipulate physical objects. In 2019, Dactyl solved the Rubik's Cube.
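
The time-horizon figures quoted in the comparison section can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the article (30 frames per second, a 45-minute match, one observation every fourth frame); the constant names are made up for illustration. It shows that the article's "80,000 ticks" and "20,000 moves" are rounded values.

```python
# Figures taken from the article; the names are illustrative only.
FRAMES_PER_SECOND = 30
MATCH_MINUTES = 45
FRAME_SKIP = 4  # the bots observe every fourth frame

ticks_per_game = FRAMES_PER_SECOND * 60 * MATCH_MINUTES
moves_per_game = ticks_per_game // FRAME_SKIP

print(ticks_per_game)  # prints 81000 (quoted as roughly 80,000)
print(moves_per_game)  # prints 20250 (quoted as roughly 20,000)

# Rough horizon ratio against the move counts quoted for chess and Go.
print(moves_per_game / 40)   # prints 506.25
print(moves_per_game / 150)  # prints 135.0
```

On these figures, a single Dota 2 game involves on the order of a hundred times more decisions than a game of Go and several hundred times more than a game of chess.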

    Read more →
  • Rademacher complexity

    Rademacher complexity

    In computational learning theory (machine learning and theory of computation), Rademacher complexity, named after Hans Rademacher, measures richness of a class of sets with respect to a probability distribution. The concept can also be extended to real valued functions. == Definitions == === Rademacher complexity of a set === Given a set A ⊆ R m {\displaystyle A\subseteq \mathbb {R} ^{m}} , the Rademacher complexity of A is defined as follows: Rad ⁡ ( A ) := 1 m E σ [ sup a ∈ A ∑ i = 1 m σ i a i ] {\displaystyle \operatorname {Rad} (A):={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{a\in A}\sum _{i=1}^{m}\sigma _{i}a_{i}\right]} where σ 1 , σ 2 , … , σ m {\displaystyle \sigma _{1},\sigma _{2},\dots ,\sigma _{m}} are independent random variables drawn from the Rademacher distribution i.e. Pr ( σ i = + 1 ) = Pr ( σ i = − 1 ) = 1 / 2 {\displaystyle \Pr(\sigma _{i}=+1)=\Pr(\sigma _{i}=-1)=1/2} for i ∈ { 1 , 2 , … , m } {\displaystyle i\in \{1,2,\dots ,m\}} , and a = ( a 1 , … , a m ) ∈ A {\displaystyle a=(a_{1},\ldots ,a_{m})\in A} . Some authors take the absolute value of the sum before taking the supremum, but if A {\displaystyle A} is symmetric this makes no difference. === Rademacher complexity of a function class === Let S = { z 1 , z 2 , … , z m } ⊆ Z {\displaystyle S=\{z_{1},z_{2},\dots ,z_{m}\}\subseteq Z} be a sample of points and consider a function class F {\displaystyle {\mathcal {F}}} of real-valued functions over Z {\displaystyle Z} . 
Then, the empirical Rademacher complexity of $\mathcal{F}$ given $S$ is defined as: $\operatorname{Rad}_{S}(\mathcal{F}) = \frac{1}{m} \mathbb{E}_{\sigma}\left[\sup_{f \in \mathcal{F}} \left|\sum_{i=1}^{m} \sigma_{i} f(z_{i})\right|\right]$. This can also be written using the previous definition: $\operatorname{Rad}_{S}(\mathcal{F}) = \operatorname{Rad}(\mathcal{F} \circ S)$, where $\mathcal{F} \circ S$ denotes function composition, i.e.: $\mathcal{F} \circ S := \{(f(z_{1}), \ldots, f(z_{m})) \mid f \in \mathcal{F}\}$. The worst case empirical Rademacher complexity is $\overline{\operatorname{Rad}}_{m}(\mathcal{F}) = \sup_{S = \{z_{1}, \dots, z_{m}\}} \operatorname{Rad}_{S}(\mathcal{F})$. Let $P$ be a probability distribution over $Z$. The Rademacher complexity of the function class $\mathcal{F}$ with respect to $P$ for sample size $m$ is: $\operatorname{Rad}_{P,m}(\mathcal{F}) := \mathbb{E}_{S \sim P^{m}}\left[\operatorname{Rad}_{S}(\mathcal{F})\right]$, where the above expectation is taken over an independent and identically distributed (i.i.d.) sample $S = (z_{1}, z_{2}, \dots, z_{m})$ generated according to $P$. == Intuition == The Rademacher complexity is typically applied to a function class of models that are used for classification, with the goal of measuring their ability to classify points drawn from a probability space under arbitrary labellings.
When the function class is rich enough, it contains functions that can appropriately adapt for each arrangement of labels, simulated by the random draw of $\sigma_{i}$ under the expectation, so that this quantity in the sum is maximized. The Rademacher complexity of a set $A$ can be rewritten as $\operatorname{Rad}(A) := \frac{1}{m} \mathbb{E}_{\sigma}\left[\sup_{a \in A} \sum_{i=1}^{m} \sigma_{i} a_{i}\right] = \frac{1}{\sqrt{m}\, 2^{m}} \sum_{\sigma \in \{-1/\sqrt{m}, +1/\sqrt{m}\}^{m}} \left[\sup_{a \in A} \langle \sigma, a \rangle\right].$ Each term in the summation is the farthest distance of the set $A$ from the origin, along a unit-length direction $\sigma$. The directions are along the vertices of a hypercube. Thus, we can also write it as $\operatorname{Rad}(A) = \frac{1}{2\sqrt{m}} \frac{1}{2^{m-1}} \sum_{\sigma \in \{-1/\sqrt{m}, +1/\sqrt{m}\}^{m} / \{-1, +1\}} \left[\sup_{a \in A} \langle \sigma, a \rangle - \inf_{a \in A} \langle \sigma, a \rangle\right].$ Here, the set $\{-1/\sqrt{m}, +1/\sqrt{m}\}^{m} / \{-1, +1\}$ denotes half of the vertices of a hypercube, selected so that each diagonal has exactly one vertex selected. In words, this states that $2\sqrt{m} \operatorname{Rad}(A)$ is precisely the average width of the set $A$ along all diagonal directions of a hypercube. == Examples == A singleton set has 0 width in any direction, so it has Rademacher complexity 0.
The set $A = \{(1,1), (1,2)\} \subseteq \mathbb{R}^{2}$ has average width $1/\sqrt{2}$ along the two diagonal directions of the square, so it has Rademacher complexity $1/4$. The unit cube $[0,1]^{m}$ has constant width $\sqrt{m}$ along the diagonal directions, so it has Rademacher complexity $1/2$. Similarly, the unit cross-polytope $\{x \in \mathbb{R}^{m} : \|x\|_{1} \leq 1\}$ has constant width $2/\sqrt{m}$ along the diagonal directions, so it has Rademacher complexity $1/m$. == Using the Rademacher complexity == The Rademacher complexity can be used to derive data-dependent upper bounds on the learnability of function classes. Intuitively, a function class with smaller Rademacher complexity is easier to learn. === Bounding the representativeness === In machine learning, it is desired to have a training set that represents the true distribution of some sample data $S$. This can be quantified using the notion of representativeness. Denote by $P$ the probability distribution from which the samples are drawn. Denote by $H$ the set of hypotheses (potential classifiers) and denote by $\mathcal{F}$ the corresponding set of error functions, i.e., for every hypothesis $h \in H$, there is a function $f_{h} \in \mathcal{F}$ that maps each training sample (features, label) to the error of the classifier $h$ (note that in this case hypothesis and classifier are used interchangeably). For example, in the case that $h$ represents a binary classifier, the error function is a 0–1 loss function, i.e. the error function $f_{h}$ returns 0 if $h$ correctly classifies a sample and 1 otherwise.
We omit the index and write $f$ instead of $f_{h}$ when the underlying hypothesis is irrelevant. Define: $L_{P}(f) := \mathbb{E}_{z \sim P}[f(z)]$, the expected error of some error function $f \in \mathcal{F}$ on the real distribution $P$; $L_{S}(f) := \frac{1}{m} \sum_{i=1}^{m} f(z_{i})$, the estimated error of some error function $f \in \mathcal{F}$ on the sample $S$. The representativeness of the sample $S$, with respect to $P$ and $\mathcal{F}$, is defined as: $\operatorname{Rep}_{P}(\mathcal{F}, S) := \sup_{f \in \mathcal{F}} (L_{P}(f) - L_{S}(f))$. Smaller representativeness is better, since it provides a way to avoid overfitting: it means that the true error of a classifier is not much higher than its estimated error, and so selecting a classifier that has low estimated error will ensure that the true error is also low. Note however that the concept of representativeness is relative and hence cannot be compared across distinct samples. The expected representativeness of a sample can be bounded above by the Rademacher complexity of the function class: If $\mathcal{F}$ is a set of functions with range within $[0,1]$, then $\operatorname{Rad}_{P,m}(\mathcal{F}) - \sqrt{\frac{\ln 2}{2m}} \leq \mathbb{E}_{S \sim P^{m}}[\operatorname{Rep}_{P}(\mathcal{F}, S)] \leq 2 \operatorname{Rad}_{P,m}(\mathcal{F})$.
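The definition of the Rademacher complexity of a set can be checked numerically on the small examples given earlier. The following is a minimal sketch, assuming a finite set of points in R^m: it enumerates all 2^m sign vectors and averages the supremum exactly. The function name `rademacher_complexity` is hypothetical and chosen for illustration only. For the unit cube and the unit cross-polytope, the sketch uses only their vertices, which is enough because a linear function over a convex polytope attains its supremum at a vertex.

```python
from itertools import product


def rademacher_complexity(points):
    """Exact Rad(A) for a finite set A of points in R^m (illustrative helper)."""
    points = [tuple(p) for p in points]
    m = len(points[0])
    total = 0.0
    # Enumerate every sign vector sigma in {-1, +1}^m; each has probability 2^-m.
    for sigma in product((-1, 1), repeat=m):
        # Supremum over A of the inner product <sigma, a>.
        total += max(sum(s * a for s, a in zip(sigma, p)) for p in points)
    # Average over the 2^m sign vectors, then apply the 1/m normalisation.
    return total / (m * 2 ** m)


# The two-point set from the examples section.
print(rademacher_complexity([(1, 1), (1, 2)]))  # 0.25

# Vertices of the unit cube and of the unit cross-polytope in R^3.
cube_vertices = list(product((0, 1), repeat=3))
cross_vertices = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
print(rademacher_complexity(cube_vertices))
print(rademacher_complexity(cross_vertices))
```

The enumeration costs 2^m supremum evaluations, so this approach is only practical for small m; for larger m one would typically estimate the expectation by sampling sign vectors instead.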

  • Learning Applied to Ground Vehicles

    Learning Applied to Ground Vehicles

    The Learning Applied to Ground Vehicles (LAGR) program, which ran from 2004 until 2008, had the goal of accelerating progress in autonomous, perception-based, off-road navigation in robotic unmanned ground vehicles (UGVs). LAGR was funded by DARPA, a research agency of the United States Department of Defense. == History and background == While mobile robots had been in existence since the 1960s (e.g. Shakey), progress in creating robots that could navigate on their own, outdoors, off-road, on irregular, obstacle-rich terrain had been slow. In fact, no clear metrics were in place to measure progress. A baseline understanding of off-road capabilities began to emerge with the DARPA PerceptOR program, in which independent research teams fielded robotic vehicles in unrehearsed Government tests that measured average speed and number of required operator interventions over a fixed course with widely spaced waypoints. These tests exposed the extreme challenges of off-road navigation. While the PerceptOR vehicles were equipped with sensors and algorithms that were state-of-the-art for the beginning of the 21st century, the limited range of their perception technology caused them to become trapped in natural cul-de-sacs. Furthermore, their reliance on pre-scripted behaviors did not allow them to adapt to unexpected circumstances. The overall result was that, except for essentially open terrain with minimal obstacles or along dirt roads, the PerceptOR vehicles were unable to navigate without numerous, repeated operator interventions. The LAGR program was designed to build on the methodology started in PerceptOR while seeking to overcome the technical challenges exposed by the PerceptOR tests. == LAGR goals == The principal goal of LAGR was to accelerate progress in off-road navigation of UGVs.
Additional, synergistic goals included (1) establishing benchmarking methodology for measuring progress for autonomous robots operating in unstructured environments, (2) advancing machine vision and thus enabling long-range perception, and (3) increasing the number of institutions and individuals who were able to contribute to forefront UGV research. == Structure and rationale of the LAGR program == The LAGR program was designed to focus on developing new science for robot perception and control rather than on new hardware. Thus, it was decided to create a fleet of identical, relatively simple robots that would be supplied to the LAGR researchers, who were members of competitive teams, freeing them to concentrate on algorithm development. The teams were each given two robots of the standard design. They developed new software on these robots, and then sent the code to a government test team that then tested that code on Government robots at various test courses. These courses were located throughout the US and were not previously known to the teams. In this way, the code from all teams could be tested in essentially identical circumstances. After an initial startup period, the code development/test cycle was repeated about once every month. The standard robot was designed and built by the Carnegie Mellon University National Robotics Engineering Center (CMU NREC). The vehicles’ computers were preloaded with a modular “Baseline” perception and navigation system that was essentially the same system that CMU NREC had created for the PerceptOR program and was considered to represent the state-of-the-art at the inception of LAGR. The modular nature of the Baseline system allowed the researchers to replace parts of the Baseline code with their own modules and still have a complete working system without having to create an entire navigation system from scratch. 
Thus, for example, they were able to compare the performance of their own obstacle detection module with that of the Baseline code, while holding everything else fixed. The Baseline code also served as a fixed reference: in any environment and at any time in the program, teams’ code could be compared to the Baseline code. This rapid cycle gave the Government team and the performer teams quick feedback and allowed the Government team to design test courses that challenged the performers in specific perception tasks and whose difficulty was likely to challenge, but not overwhelm, the performers’ current capabilities. Teams were not required to submit new code for every test, but usually did. Despite this leeway, some teams found the rapid test cycle distracting to their long-term progress and would have preferred a longer interval between tests. === Phase II === To advance to Phase II, each team had to modify the Baseline code so that on the final 3 tests of Phase I of the government tests, robots running the team's code averaged at least 10% faster than a vehicle running the original Baseline code. This rather modest “Go/No Go” metric was chosen to allow teams to choose risky but promising approaches that might not be fully developed in the first 18 months of the program. All 8 teams achieved this metric, with some scoring more than twice the speed of the Baseline on the later tests, which was the objective for Phase II. Note that the Phase I Go/No Go metric was such that teams were not in competition with each other for a limited number of slots in Phase II: any number of teams, from eight to zero, could make the grade. This strategy by DARPA was designed to encourage cooperation and even code sharing among the teams. == The LAGR teams == Eight teams were selected as performers in Phase I, the first 18 months of LAGR.
The teams were from Applied Perception (Principal Investigator [PI] Mark Ollis), Georgia Tech (PI Tucker Balch), Jet Propulsion Laboratory (PI Larry Matthies), Net-Scale Technologies (PI Urs Muller), NIST (PI James Albus), Stanford University (PI Sebastian Thrun), SRI International (PI Robert Bolles), and University of Pennsylvania (PI Daniel Lee). The Stanford team resigned at the end of Phase I to focus its efforts on the DARPA Grand Challenge; it was replaced by a team from the University of Colorado, Boulder (PI Greg Grudic). Also in Phase II, the NIST team suspended its participation in the competition and instead concentrated on assembling the best software elements from each team into a single system. Roger Bostelman became PI of that effort. == The LAGR vehicle == The LAGR vehicle, which was about the size of a supermarket shopping cart, was designed to be simple to control. (A companion DARPA program, Learning Locomotion, addressed complex motor control.) It was battery powered and had two independently driven wheelchair motors in the front and two caster wheels in the rear. When the front wheels were rotated in the same direction, the robot was driven either forward or in reverse. When these wheels were driven in opposite directions, the robot turned. The ~$30,000 cost of the LAGR vehicle meant that a fleet could be built and distributed to a number of teams, expanding on the field of researchers who had traditionally participated in DARPA robotics programs. The vehicle's top speed of about 3 miles per hour and relatively modest weight of ~100 kg meant that it posed a much reduced safety hazard compared to vehicles used in previous programs in unmanned ground vehicles and thus further reduced the budget required for each team to manage its robot. Nevertheless, the LAGR vehicles were sophisticated machines. Their sensor suite included 2 pairs of stereo cameras, an accelerometer, a bumper sensor, wheel encoders, and a GPS.
The vehicle also had three computers that were user-programmable. == Scientific results == A cornerstone of the program was incorporation of learned behaviors in the robots. In addition, the program used passive optical systems to accomplish long-range scene analysis. The difficulty of testing UGV navigation in unstructured, off-road environments made accurate, objective measurement of progress a challenging task. While no absolute measure of performance had been defined in LAGR, the relative comparison of a team's code to that of the Baseline code on a given course demonstrated whether progress was being made in that environment. By the conclusion of the program, testing showed that many of the performers had attained leaps in performance. In particular, average autonomous speeds were increased by a factor of 3 and useful visual perception was extended to ranges as far as 100 meters. While LAGR did succeed in extending the useful range of visual perception, this was primarily done by either pixel-based or patch-based color or texture analysis. Object recognition was not directly addressed. Even though the LAGR vehicle had a WAAS GPS, its position was never determined down to the width of the vehicle, so it was hard for the systems to re-use obstacle maps of areas the robots had previously traversed, since the GPS continually drifted. The drift was especially severe if there was a forest canopy. A few teams developed visual odometry algorithms that essentially eliminated this drift.

  • Alex Krizhevsky

    Alex Krizhevsky

    Alex Krizhevsky is a Canadian computer scientist most noted for his work on artificial neural networks and deep learning. In 2012, Krizhevsky, Ilya Sutskever and their PhD advisor Geoffrey Hinton, at the University of Toronto, developed a powerful visual-recognition network, AlexNet, using only two GeForce-branded GPU cards. This revolutionized research in neural networks. Previously, neural networks were trained on CPUs. The transition to GPUs opened the way to the development of advanced AI models. == AlexNet == Motivated by Sutskever and inspired by Hinton, Krizhevsky developed AlexNet to expand the limits in image recognition and classification. Building on Convolutional Neural Networks and Sutskever’s Deep Neural Network approach of deepening the neural layers far beyond the convention of the time, as well as adding Dropout for training resilience, AlexNet won the ImageNet challenge in 2012. The team presented their paper for AlexNet at NeurIPS (NIPS) 2012. Shortly after AlexNet’s debut, Krizhevsky and Sutskever sold their startup, DNN Research Inc., to Google. Krizhevsky left Google in September 2017 after losing interest in the work, and joined the company Dessa in support of new deep-learning techniques. Many of his numerous papers on machine learning and computer vision are frequently cited by other researchers. He is also the main author of the CIFAR-10 and CIFAR-100 datasets. == Legacy == AlexNet is widely credited with igniting the deep learning revolution. Its success demonstrated the effectiveness of deep neural networks trained on GPUs, leading to rapid progress across multiple domains of artificial intelligence beyond computer vision. The techniques and momentum generated by AlexNet helped shape the development of modern natural language processing models, including large-scale transformer-based models such as BERT and GPT, which power tools like ChatGPT.
