AI Coding Using Python

AI Coding Using Python — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Evntlive

    Evntlive

    Evntlive was an interactive digital concert venue that allowed music fans worldwide to stream concerts to their computer, tablet, or phone. Based in Redwood City, CA, EVNTLIVE Beta launched on April 15, 2013. EVNTLIVE provided users with the ability to switch camera angles, view All Access interviews and clips from artists, buy music, and chat with other online concert-goers in the in-app feature. Users could watch live and on-demand concerts with both free and pay-per-view concerts offered. In its first two months, EVNTLIVE streamed live performances of popular artists ranging from Bon Jovi to Wale, as well as music festivals such as Taste of Country and Mountain Jam; including performances by The Lumineers, Gary Clark Jr., Phil Lesh & Friends, Primus, and more. On December 6, 2013, Evntlive was acquired and absorbed by Yahoo!. The site ceased operations and redirected viewers to Yahoo! Music and Yahoo! Screen promptly afterwards. == About the Platform == EvntLive is an HTML5, web-based platform available on laptops, iPads, and mobile devices. Users must register for a free account on Evntlive’s website in order to reserve tickets and access live and on-demand content. Once they reserve tickets, they can view All Access features from their favorite artists or bands, purchase music, and interact with other online audience members using Buzz. Users can also switch between alternate camera angles as though they are on the concert floor - sharing the experience with their friends online in real-time. EvntLive was acquired by Yahoo in December 2013 == Artists == Bon Jovi Wale Escape the Fate The Parlotones === Taste of Country Music Festival === Trace Adkins Willie Nelson Justin Moore Montgomery Gentry Craig Campbell Blackberry Smoke Gloriana Dustin Lynch LoCash Cowboys Rachel Farley Parmalee Joe Nichols === Mountain Jam Music Festival === Source: The Lumineers Primus Widespread Panic Gov't Mule Phil Lesh The Avett Brothers Dispatch Rubblebucket Michael Franti Jackie Greene Deer Tick Gary Clark Jr. ALO The London Souls Nicki Bluhm Amy Helm The Lone Bellow The Revivalists Swear and Shake Roadkill Ghost Choir Michael Bernard Fitzgerald Michele Clark 's Sunset Sessions Semi Precious Weapons Dale Earnhardt Jr. Jr. DigiTour Media Pentatonix Allstar Weekend Tyler Ward === Launch Music Festival ===

    Read more →
  • Model

    Model

    A model is an informative representation of an object, person, or system. The term originally denoted the plans of a building in 16th-century English, and derived via French and Italian ultimately from Latin modulus, 'a measure'. Models can be divided into physical models (e.g. a ship model) and abstract models (e.g. a set of mathematical equations describing the workings of the atmosphere for the purpose of weather forecasting). Abstract or conceptual models are central to philosophy of science. In scholarly research and applied science, a model should not be confused with a theory: while a model seeks only to represent reality with the purpose of better understanding or predicting the world, a theory is more ambitious in that it claims to be an explanation of reality. == Types of model == === Model in specific contexts === As a noun, model has specific meanings in certain fields, derived from its original meaning of "structural design or layout": Model (art), a person posing for an artist, e.g. a 15th-century criminal representing the biblical Judas in Leonardo da Vinci's painting The Last Supper Model (person), a person who serves as a template for others to copy, as in a role model, often in the context of advertising commercial products; e.g. the first fashion model, Marie Vernet Worth in 1853, wife of designer Charles Frederick Worth. Model (product), a particular design of a product as displayed in a catalogue or show room (e.g. Ford Model T, an early car model) Model (organism) a non-human species that is studied to understand biological phenomena in other organisms, e.g. a guinea pig starved of vitamin C to study scurvy, an experiment that would be immoral to conduct on a person Model (mimicry), a species that is mimicked by another species Model (logic), a structure (a set of items, such as natural numbers 1, 2, 3,..., along with mathematical operations such as addition and multiplication, and relations, such as < {\displaystyle <} ) that satisfies a given system of axioms (basic truisms), i.e. that satisfies the statements of a given theory Model (CGI), a mathematical representation of any surface of an object in three dimensions via specialized software Model (MVC), the information-representing internal component of a software, as distinct from its user interface === Physical model === A physical model (most commonly referred to simply as a model but in this context distinguished from a conceptual model) is a smaller or larger physical representation of an object, person or system. The object being modelled may be small (e.g., an atom) or large (e.g., the Solar System) or life-size (e.g., a fashion model displaying clothes for similarly-built potential customers). The geometry of the model and the object it represents are often similar in the sense that one is a rescaling of the other. However, in many cases the similarity is only approximate or even intentionally distorted. Sometimes the distortion is systematic, e.g., a fixed scale horizontally and a larger fixed scale vertically when modelling topography to enhance a region's mountains. An architectural model permits visualization of internal relationships within the structure or external relationships of the structure to the environment. Another use is as a toy. Instrumented physical models are an effective way of investigating fluid flows for engineering design. Physical models are often coupled with computational fluid dynamics models to optimize the design of equipment and processes. This includes external flow such as around buildings, vehicles, people, or hydraulic structures. Wind tunnel and water tunnel testing is often used for these design efforts. Instrumented physical models can also examine internal flows, for the design of ductwork systems, pollution control equipment, food processing machines, and mixing vessels. Transparent flow models are used in this case to observe the detailed flow phenomenon. These models are scaled in terms of both geometry and important forces, for example, using Froude number or Reynolds number scaling (see Similitude). In the pre-computer era, the UK economy was modelled with the hydraulic model MONIAC, to predict for example the effect of tax rises on employment. === Conceptual model === A conceptual model is a theoretical representation of a system, e.g. a set of mathematical equations attempting to describe the workings of the atmosphere for the purpose of weather forecasting. It consists of concepts used to help understand or simulate a subject the model represents. Abstract or conceptual models are central to philosophy of science, as almost every scientific theory effectively embeds some kind of model of the physical or human sphere. In some sense, a physical model "is always the reification of some conceptual model; the conceptual model is conceived ahead as the blueprint of the physical one", which is then constructed as conceived. Thus, the term refers to models that are formed after a conceptualization or generalization process. === Examples === Conceptual model (computer science), an agreed representation of entities and their relationships, to assist in developing software Economic model, a theoretical construct representing economic processes Language model, a probabilistic model of a natural language, used for speech recognition, language generation, and information retrieval Large language models are artificial neural networks used for generative artificial intelligence (AI), e.g. ChatGPT Mathematical model, a description of a system using mathematical concepts and language Statistical model, a mathematical model that usually specifies the relationship between one or more random variables and other non-random variables Model (CGI), a mathematical representation of any surface of an object in three dimensions via specialized software Medical model, a proposed "set of procedures in which all doctors are trained" Mental model, in psychology, an internal representation of external reality Model (logic), a set along with a collection of finitary operations, and relations that are defined on it, satisfying a given collection of axioms Model (MVC), information-representing component of a software, distinct from the user interface (the "view"), both linked by the "controller" component, in the context of the model–view–controller software design Model act, a law drafted centrally to be disseminated and proposed for enactment in multiple independent legislatures Standard model (disambiguation) == Properties of models, according to general model theory == According to Herbert Stachowiak, a model is characterized by at least three properties: 1. Mapping A model always is a model of something—it is an image or representation of some natural or artificial, existing or imagined original, where this original itself could be a model. 2. Reduction In general, a model will not include all attributes that describe the original but only those that appear relevant to the model's creator or user. 3. Pragmatism A model does not relate unambiguously to its original. It is intended to work as a replacement for the original a) for certain subjects (for whom?) b) within a certain time range (when?) c) restricted to certain conceptual or physical actions (what for?). For example, a street map is a model of the actual streets in a city (mapping), showing the course of the streets while leaving out, say, traffic signs and road markings (reduction), made for pedestrians and vehicle drivers for the purpose of finding one's way in the city (pragmatism). Additional properties have been proposed, like extension and distortion as well as validity. The American philosopher Michael Weisberg differentiates between concrete and mathematical models and proposes computer simulations (computational models) as their own class of models. == Uses of models == According to Bruce Edmonds, there are at least 5 general uses for models: Prediction: reliably anticipating unknown data, including data within the domain of the training data (interpolation), and outside the domain (extrapolation) Explanation: establishing plausible chains of causality by proposing mechanisms that can explain patterns seen in data Theoretical exposition: discovering or proposing new hypotheses, or refuting existing hypotheses about the behaviour of the system being modelled Description: representing important aspects of the system being modelled Illustration: communicating an idea or explanation

    Read more →
  • Frame (artificial intelligence)

    Frame (artificial intelligence)

    Frames are an artificial intelligence data structure used to divide knowledge into substructures by representing "stereotyped situations". They were proposed by Marvin Minsky in his 1974 article "A Framework for Representing Knowledge". Frames are the primary data structure used in artificial intelligence frame languages; they are stored as ontologies of sets. Frames are also an extensive part of knowledge representation and reasoning schemes. They were originally derived from semantic networks and are therefore part of structure-based knowledge representations. According to Russell and Norvig's Artificial Intelligence: A Modern Approach, structural representations assemble "facts about particular object and event types and [arrange] the types into a large taxonomic hierarchy analogous to a biological taxonomy". == Frame structure == The frame contains information on how to use the frame, what to expect next, and what to do when these expectations are not met. Some information in the frame is generally unchanged while other information, stored in "terminals", usually change. Terminals can be considered as variables. Top-level frames carry information, that is always true about the problem in hand, however, terminals do not have to be true. Their value might change with the new information encountered. Different frames may share the same terminals. Each piece of information about a particular frame is held in a slot. The information can contain: Facts or Data Values (called facets) Procedures (also called procedural attachments) IF-NEEDED: deferred evaluation IF-ADDED: updates linked information Default Values For Data For Procedures Other Frames or Subframes == Features and advantages == A frame's terminals are already filled with default values, which is based on how the human mind works. For example, when a person is told "a boy kicks a ball", most people will visualize a particular ball (such as a familiar soccer ball) rather than imagining some abstract ball with no attributes. One particular strength of frame-based knowledge representations is that, unlike semantic networks, they allow for exceptions in particular instances. This gives frames a degree of flexibility that allows representations to reflect real-world phenomena more accurately. Like semantic networks, frames can be queried using spreading activation. Following the rules of inheritance, any value given to a slot that is inherited by subframes will be updated (IF-ADDED) to the corresponding slots in the subframes and any new instances of a particular frame will feature that new value as the default. Because frames are based on structures, it is possible to generate a semantic network given a set of frames even though it lacks explicit arcs. References to Noam Chomsky and his generative grammar of 1950 are generally missing from Minsky's work. The simplified structures of frames allow for easy analogical reasoning, a much prized feature in any intelligent agent. The procedural attachments provided by frames also allow a degree of flexibility that makes for a more realistic representation and gives a natural affordance for programming applications. == Example == Worth noticing here is the easy analogical reasoning (comparison) that can be done between a boy and a monkey just by having similarly named slots. Also notice that Alex, an instance of a boy, inherits default values like "Sex" from the more general parent object Boy, but the boy may also have different instance values in the form of exceptions such as the number of legs. == Frame language == A frame language is a technology used for knowledge representation in artificial intelligence. They are similar to class hierarchies in object-oriented languages although their fundamental design goals are different. Frames are focused on explicit and intuitive representation of knowledge whereas objects focus on encapsulation and information hiding. Frames originated in AI research and objects primarily in software engineering. However, in practice, the techniques and capabilities of frame and object-oriented languages overlap significantly. === Example === A simple example of concepts modeled in a frame language is the Friend of A Friend (FOAF) ontology defined as part of the Semantic Web as a foundation for social networking and calendar systems. The primary frame in this simple example is a Person. Example slots are the person's email, home page, phone, etc. The interests of each person can be represented by additional frames describing the space of business and entertainment domains. The slot knows links each person with other persons. Default values for a person's interests can be inferred by the web of people they are friends of. === Implementations === The earliest frame-based languages were custom developed for specific research projects and were not packaged as tools to be re-used by other researchers. Just as with expert system inference engines, researchers soon realized the benefits of extracting part of the core infrastructure and developing general-purpose frame languages that were not coupled to specific applications. One of the first general-purpose frame languages was KRL. One of the most influential early frame languages was KL-ONE. KL-ONE spawned several subsequent Frame languages. One of the most widely used successors to KL-ONE was the Loom language developed by Robert MacGregor at the Information Sciences Institute. In the 1980s, Artificial Intelligence generated a great deal of interest in the business world fueled by expert systems. This led to the development of many commercial products for the development of knowledge-based systems. These early products were usually developed in Lisp and integrated constructs such as IF-THEN rules for logical reasoning with Frame hierarchies for representing data. One of the most well known of these early Lisp knowledge-base tools was the Knowledge Engineering Environment (KEE) from Intellicorp. KEE provided a full Frame language with multiple inheritance, slots, triggers, default values, and a rule engine that supported backward and forward chaining. As with most early commercial versions of AI software KEE was originally deployed in Lisp on Lisp machine platforms but was eventually ported to PCs and Unix workstations. The research agenda of the Semantic Web spawned a renewed interest in automatic classification and frame languages. An example is the Web Ontology Language (OWL) standard for describing information on the Internet. OWL is a standard to provide a semantic layer on top of the Internet. The goal is that rather than searching the web using keywords as most search engines (e.g. Google) do today, the web can be organized by concepts organized in an ontology, like a directory structure. The name of the OWL language itself provides a good example of the value of a Semantic Web. If one were to search for "OWL" using the Internet today most of the pages retrieved would be on the bird Owl rather than the standard OWL. With a Semantic Web it would be possible to specify the concept "Web Ontology Language" and the user would not need to worry about the various possible acronyms or synonyms as part of the search. Likewise, the user would not need to worry about homonyms crowding the search results with irrelevant data such as information about birds of prey as in this simple example. In addition to OWL, various standards and technologies that are relevant to the Semantic Web and were influenced by Frame languages include OIL and DAML. The Protege Open Source software tool from Stanford University provides an ontology editing capability that is built on OWL and has the full capabilities of a classifier. However it ceased to explicitly support frames as of version 3.5 (which is maintained for those preferring frame orientation), with the current version being 5.6.8 as of 2025. The justification for moving from explicit frames being that OWL DL is more expressive and "industry standard". === Comparison of frames and objects === Frame languages have a significant overlap with object-oriented languages. The terminologies and goals of the two communities were different but as they moved from the academic world and labs to the commercial world developers tended to not care about philosophical issues and focused primarily on specific capabilities, taking the best from either camp regardless of where the idea began. What both paradigms have in common is a desire to reduce the distance between concepts in the real world and their implementation in software. As such both paradigms arrived at the idea of representing the primary software objects in taxonomies starting with very general types and progressing to more specific types. The following table illustrates the correlation between standard terminology from the object-oriented and frame language communities: The primary difference between the two paradigms was in the degree that encapsulation was considered a majo

    Read more →
  • Safe and Secure Innovation for Frontier Artificial Intelligence Models Act

    Safe and Secure Innovation for Frontier Artificial Intelligence Models Act

    The Safe and Secure Innovation for Frontier Artificial Intelligence Models Act, or SB 1047, was a failed 2024 California bill intended to "mitigate the risk of catastrophic harms from AI models so advanced that they are not yet known to exist". Specifically, the bill would have applied to models which cost more than $100 million to train and were trained using a quantity of computing power greater than 1026 integer or floating-point operations. SB 1047 would have applied to all AI companies doing business in California—the location of the company would not matter. The bill would have created protections for whistleblowers and required developers to perform risk assessments of their models prior to release, with guidance from the Government Operations Agency. It would also have established CalCompute, a University of California public cloud computing cluster for startups, researchers and community groups. == Background == The rapid increase in capabilities of AI systems in the 2020s, including the release of ChatGPT in November 2022, caused some researchers and members of the public to become concerned about the existential risks associated with increasingly powerful AI systems. Hundreds of tech executives and AI researchers, including two of the so-called "Godfathers of AI", Geoffrey Hinton and Yoshua Bengio, signed a statement in May 2023 calling for the mitigation of the "risk of extinction from AI" to be a global priority alongside "pandemics and nuclear war". However, the plausibility of these risks is still widely debated. Strong regulation of AI has been criticized for purportedly causing regulatory capture by large AI companies like OpenAI, a phenomenon in which regulation advances the interest of larger companies at the expense of smaller competition and the public in general, although OpenAI ended up opposing the bill. Other advocates of AI regulation aim to prevent bias and privacy violations, rather than existential risks. For example, some experts who view existential concerns as overblown and unrealistic view them as a distraction from near-term harms of AI like discriminatory automated decision making. In the face of existential concerns, technology companies have made voluntary commitments to conduct safety testing, for example at the AI Safety Summit and AI Seoul Summit. In 2023, not long before the bill was proposed, Governor Newsom of California and President Biden issued executive orders on artificial intelligence. State Senator Wiener said SB 1047 draws heavily on the Biden executive order, and is motivated by the absence of unified federal legislation on AI safety. Historically, California has passed regulation on several tech issues itself, including consumer privacy and net neutrality, in the absence of action by Congress. == History == === Proposal and voting === The bill was authored by State Senator Scott Wiener. Wiener first proposed AI legislation for California through an intent bill called SB 294, the Safety in Artificial Intelligence Act, in September 2023. On February 7, 2024, Wiener introduced SB 1047. On May 21, SB 1047 passed the Senate 32–1. The bill was significantly amended by Wiener on August 15, 2024, in response to industry advice. Amendments included adding clarifications, and removing the creation of a "Frontier Model Division" and the penalty of perjury. On August 28, the bill passed the State Assembly 48–16. Then, due to the amendments, the bill was once again voted on by the Senate, passing 30–9. === Veto by governor === On September 29, Governor Gavin Newsom vetoed the bill. The deadline for California lawmakers to overrule Newsom's veto was November 30, 2024. Newsom cited concerns over the bill's regulatory framework targeting only large AI models based on their computational size, while not taking into account whether the models are deployed in high-risk environments. Newsom emphasized that this approach could create a false sense of security, overlooking smaller models that might present equally significant risks. He acknowledged the need for AI safety protocols but stressed the importance of adaptability in regulation as AI technology continues to evolve rapidly. Governor Newsom also committed to working with technology experts, federal partners, and research institutions, including the Carnegie Endowment for International Peace, led by former California Supreme Court Justice Mariano-Florentino Cuéllar; and Stanford University's Human-Centered AI (HAI) Institute, led by Dr. Fei-Fei Li. He announced plans to collaborate with these entities to advance responsible AI development, aiming to protect the public while fostering innovation. == Provisions == SB 1047 would have covered AI models with training compute over 1026 integer or floating-point operations and a cost of over $100 million. If a covered model is fine-tuned using more than $10 million, the resulting model would also have been covered. The bill would have defined critical harms with respect to four categories: Creation or use of a chemical, biological, radiological, or nuclear weapon Cyberattacks on critical infrastructure causing mass casualties or at least $500 million of damage Autonomous crimes causing mass casualties or at least $500 million of damage Other harms of comparable severity Developers would have needed to create a "safety and security protocol" before training covered models. Before deployment, they would have submitted a statement of compliance, confirming they took reasonable care to take measures to prevent covered models that pose an unreasonable risk of critical harms. The statement would have included risk assessments and descriptions of their compliance process. These rules would have applied to both covered models and their derivatives, including post-training modifications, with annual third-party audits required starting in 2026. Safeguards to reduce risk included the ability to shut down the model, which has been variously described as a "kill switch" and "circuit breaker". Whistleblowing provisions would have protected employees who report safety problems and incidents. Additionally, SB 1047 would have created a public cloud computing cluster called CalCompute, associated with the University of California, to support startups, researchers, and community groups that lack large-scale computing resources. === Compliance and supervision === SB 1047 would have required developers, beginning January 1, 2026, to annually retain a third-party auditor to perform an independent audit of compliance with the requirements of the bill, as provided. The Government Operations Agency would have reviewed the results of safety tests and incidents, and issue guidance, standards, and best practices. The bill would have created a Board of Frontier Models to supervise the application of the bill by the Government Operations Agency. It is would be composed of 9 members. == Reception == === Subjects of debate === Proponents of the bill described its provisions as simple and narrowly focused, with Sen. Scott Weiner describing it as a "light-touch, basic safety bill". This was disputed by critics of the bill, who described the bill's language as vague and criticized it as consolidating power in the largest AI companies at the expense of smaller ones. Proponents, in turn, argued that the bill only applies to models trained using more than 1026 FLOPS and with over $100 million, or fine-tuned with more than $10 million, and that the threshold could be increased if needed. The penalty of perjury was also a subject of debate, and was eventually removed through an amendment. The scope of the "kill switch" requirement was also reduced, following concerns from open-source developers. The use of the term "reasonable assurance" in the bill was also controversial, and it was eventually amended to "reasonable care". Critics then argued that "reasonable care" imposed an excessive burden by requiring confidence that models could not be used to cause catastrophic harm; proponents claimed that the standard did not require certainty and that it already applied to AI developers under existing law. === Support and opposition === Individual supporters of the bill included Turing Award recipients Yoshua Bengio and Geoffrey Hinton, Elon Musk, Bill de Blasio, Kevin Esvelt, Dan Hendrycks, Vitalik Buterin, OpenAI whistleblowers Daniel Kokotajlo and William Saunders, Lawrence Lessig, Sneha Revanur, Stuart Russell, Jan Leike, actors Mark Ruffalo, Sean Astin, and Rosie Perez, Scott Aaronson, and Max Tegmark. Over 120 Hollywood celebrities, including Mark Hamill, Jane Fonda, and J. J. Abrams, also signed a statement in support of the bill. Max Tegmark likened the bill's focus on holding companies responsible for the harms caused by their models to the FDA requiring clinical trials before a company can release a drug to the market. Organizations sponsoring the bill included the Center for AI Safety, Economic Security California and Encode. The la

    Read more →
  • Lingua Libre

    Lingua Libre

    Lingua Libre is an online collaborative project and tool by the Wikimédia France association, which aims to build a collaborative, multilingual, audiovisual speech corpus under a free license. It mostly consists of a rapid recording online service which allows the user to chain hundreds of recordings. Contributors have produced content in 310+ languages. == Description == Lingua Libre enables the recording of words, phrases or sentences of any language, oral (audio recording) or signed (video recording). Words are presented to the speaker in the form of a list, created on the spot, in advance, or by reusing an existing Wikimedia category. The speaker simply reads the word displayed on the screen, and the software moves on to the next word when it detects a silence after the read word. This principle, borrowed from the open source software Shtooka recorder with the help of its creator, Nicolas Vion, makes it possible to record several hundreds of words per hour. The recordings are then uploaded automatically from the web client to the Wikimedia Commons media library. In spring 2021, Lingua Libre was offline due to a fire in Strasbourg, but no audio recordings were lost. === Use of the recordings === The recordings can be consulted either on Lingua Libre or on Commons. They are mainly used on other Wikimedia projects, for example to illustrate entries on Wiktionaries or proper nouns in Wikipedia articles. The re-use of the recordings in a language teaching context is envisaged. Language learners can freely download pronunciations and use them on GoldenDict, a popular dictionary software. Thus, audio recordings can be used as “Pronunciation Dictionaries” on GoldenDict without needing internet connection. The recordings are also reused in Natural Language Processing projects, for example to drive Mozilla's DeepSpeech speech recognition engines. == Versions == Lingua Libre was initiated on January 23, 2015 and has had three successive versions: === Lingua Libre v.1 (2016) === As part of the Languages of France project, which aims to document and promote the regional languages of France on Wikimedia and Internet projects in general, the conception of Lingua Libre started in November 2015, partly funded by the DGLFLF (General Delegation for the French language and the languages of France). The first version of the project was launched in August 2016. Only suitable for audio recording, Lingua Libre was shown during a workshop on Occitan language in December 2016, and then presented to the online Wikimedia community and at international events in 2017. === Lingua Libre v.2 (2018) === A complete rebuilding was launched at the end of 2017. The new version of Lingua Libre is based on MediaWiki, uses Wikibase and OAuth to better integrate into the Wikimedia environment. The interface is translated via Translatewiki.net so that the project can be used by a large number of communities. The new version of the site was ready in June 2018 and opened to the public in August 2018. === Lingua Libre v.2.2 (2020) === In 2020, important changes were made to the platform; a new look was developed especially for the site, the .org domain replaced the .fr domain used until then, and added support for sign languages through video recording. == Statistics == In the first two years of the project's launch, approximately 10,000 recordings were made. The transition to v.2 was accompanied by a sharp increase in the contributions. The number of recordings multiplied by 10 in less than a year, exceeding the 100,000 threshold in May 2019. These recordings were made by 127 speakers in almost 50 languages. By September 2020, the platform had more than 300,000 recordings in 90 languages with more than 350 speakers. The 500,000 recordings milestone was reached in June 2021, thanks to 540 speakers of 120 languages.

    Read more →
  • Artificial intelligence safety institute

    Artificial intelligence safety institute

    An artificial intelligence safety institute is a type of state-backed organization aiming to evaluate and ensure the safety of advanced artificial intelligence (AI) models, also called frontier AI models. AI safety gained prominence in 2023, notably with public declarations about potential existential risks from AI. During the AI Safety Summit in November 2023, the United Kingdom and the United States both created their own AISI. During the AI Seoul Summit in May 2024, international leaders agreed to form a network of AI Safety Institutes, comprising institutes from the UK, the US, Japan, France, Germany, Italy, Singapore, South Korea, Australia, Canada and the European Union. In 2025, the UK's AI Safety Institute was renamed the "AI Security Institute", and its US counterpart became the Center for AI Standards and Innovation (CAISI). == Timeline == In 2023, Rishi Sunak, the Prime Minister of the United Kingdom, expressed his intention to "make the UK not just the intellectual home but the geographical home of global AI safety regulation" and unveiled plans for an AI Safety Summit. He emphasized the need for independent safety evaluations, stating that AI companies cannot "mark their own homework". During the summit in November 2023, the UK AISI was officially established as an evolution of the Frontier AI Taskforce, and the US AISI as part of the National Institute of Standards and Technology. Japan followed by launching an AI safety institute in February 2024. Politico reported in April 2024 that many AI companies had not shared pre-deployment access to their most advanced AI models for evaluation. Meta's president of global affairs Nick Clegg said that many AI companies were waiting for the UK and the US AI Safety Institutes to work out common evaluation rules and procedures. An agreement was indeed concluded between the UK and the US in April 2024 to collaborate on at least one joint safety test. Initially established in London, the UK AI Safety Institute announced in May 2024 that it would open an office in San Francisco, where many AI companies are located. This is part of a plan to "set new, international standards on AI safety", according to UK's technology minister Michele Donelan. == International network == At the AI Seoul Summit in May 2024, the European Union and other countries agreed to create their own AI safety institutes, forming an international network. In July 2025, the international network held an exercise to explore issues with evaluating AI agents, especially when it came to leaking sensitive information or cybersecurity. Network members also met at NeurIPS 2025 in the city of San Diego. == Specific institutes == === Australia === The Albanese government announced the creation of the Australian AI Safety Institute on 25 November 2025. === Canada === Canada announced in April 2024 that it would create an AI safety institute, and such an institute was officially founded in November 2024. The institute is housed under Innovation, Science and Economic Development Canada, though it also partners with the Canadian Institute for Advanced Research (CIFAR). It is supported by a budget of CA$50,000,000 for a five-year timespan. === European Union === The EU AI office, founded in May 2024, is a member of the international network of AI safety institutes. === France === On 31 January 2025, the government of France created the Institut national pour l'évaluation et la sécurité de l'intelligence artificielle (INESIA), or the National Institute for AI Evaluation and Security. === India === The Ministry of Electronics and Information Technology held consultations with Meta Platforms, Google, Microsoft, IBM, OpenAI, NASSCOM, Broadband India Forum, Software Alliance, Indian Institutes of Technology (IITs), The Quantum Hub, Digital Empowerment Foundation, and Access Now on October 7, 2024, in relation to the establishment of the AI Safety Institute. The decision was made to shift focus from regulation to standards-setting, risk identification, and damage detection—all of which require interoperable technologies. The AISI may spend the ₹20 crore allotted to the Safe and Trusted Pillar of the IndiaAI Mission for the initial budget. Future funding may come from other components of the IndiaAI Mission. UNESCO and MeitY began consulting on AI Readiness Assessment Methodology under Safety and Ethics in Artificial Intelligence from 2024. It is to encourage the ethical and responsible use of AI in industries. The study will find areas where government can become involved, especially in attempts to strengthen institutional and regulatory capabilities. Minister for Electronics & Information Technology Ashwini Vaishnaw announced the creation of an IndiaAI Safety Institute on January 30, 2025, to ensure the ethical and safe application of AI models. The institute will promote domestic R&D that is grounded in India's social, economic, cultural, and linguistic diversity and is based on Indian datasets. With the help of academic and research institutions, as well as private sector partners, the institute will follow the hub-and-spoke approach to carry out projects within Safe and Trusted Pillar of the IndiaAI Mission. It operates under a "hub-and-spoke" model with collaboration from academic institutions (e.g., IITs), tech firms, and international organizations like UNESCO. === Japan === The Japan AISI (or J-AISI) was founded in February 2024. Part of the Information Technology Promotion Agency, it employs about 23 people. The institute consists of the Council of AISI, the AISI Steering Committee, and a secretariat with six teams. Akiko Murakami (previously of IBM Japan and Sompo Japan) serves as the institute's executive director, and Kenji Hiramoto and Suguru Nishimura serve as the institute's two deputy executive directors. === Kenya === Kenya agreed to join the international network of AI safety institutes, but the country has not announced any details yet. It is the only African state in the network. === Singapore === The Digital Trust Centre was initially founded in June 2022. In May 2024, it was renamed to the Singapore AISI. Part of Nanyang Technological University, the institute partners with Infocomm Media Development Authority and is supported by an investment of S$10,000,000 per year. === South Korea === South Korea announced in May 2024 that it would create an AI safety institute under the umbrella of the Electronics and Telecommunications Research Institute. It will be supported by a tentative investment of somewhere between 10 and 20 million South Korean won per year, and employ at least 30 people. The institute was founded in November 2024 and is based in Bundang District within the city of Seongnam. === United Kingdom === The United Kingdom founded in April 2023 a safety organisation called Frontier AI Taskforce, with an initial budget of £100 million. In November 2023, it evolved into the AI Safety Institute, and continued to be led by Ian Hogarth. The AISI is part of the United Kingdom's Department for Science, Innovation and Technology. The United Kingdom's AI strategy aims to balance safety and innovation. Unlike the European Union which adopted the AI Act, the UK is reluctant to legislate early, considering that it may lower the sector's growth, and that laws might be rendered obsolete by technological progress. In May 2024, the institute open-sourced an AI safety tool called "Inspect", which evaluates AI model capabilities such as reasoning and their degree of autonomy. In February 2025, the UK body was renamed the AI Security Institute. Observers saw the name change as a signal that the institute will not focus on ethical issues such as algorithmic bias or freedom of speech in AI applications. === United States === The US AISI was founded in November 2023 as part of the National Institute of Standards and Technology (NIST). This happened the day after the signature of the Executive Order 14110. In February 2024, Joe Biden's former economic policy adviser Elizabeth Kelly was appointed to lead it. In February 2024, the US government created the US AI Safety Institute Consortium (AISIC), regrouping more than 200 organizations such as Google, Anthropic or Microsoft. In March 2024, a budget of $10 million was allocated. Observers noted that this investment is relatively small, especially considering the presence of many big AI companies in the US. The NIST itself, which hosts the AISI, is also known for its chronic lack of funding. Biden administration's request for additional funding was met with further budget cuts from congressional appropriators. Under President Trump, plans for members of the agency to attend the February 2025 AI Action Summit in Paris were scrapped. The US and the UK refused to sign the summit's final communique. US Vice President JD Vance said "pro-growth AI policies" should be prioritised over safety. The name of the agency was changed in June 2025 to the Center for AI Standards and Innovation

    Read more →
  • Model

    Model

    A model is an informative representation of an object, person, or system. The term originally denoted the plans of a building in 16th-century English, and derived via French and Italian ultimately from Latin modulus, 'a measure'. Models can be divided into physical models (e.g. a ship model) and abstract models (e.g. a set of mathematical equations describing the workings of the atmosphere for the purpose of weather forecasting). Abstract or conceptual models are central to philosophy of science. In scholarly research and applied science, a model should not be confused with a theory: while a model seeks only to represent reality with the purpose of better understanding or predicting the world, a theory is more ambitious in that it claims to be an explanation of reality. == Types of model == === Model in specific contexts === As a noun, model has specific meanings in certain fields, derived from its original meaning of "structural design or layout": Model (art), a person posing for an artist, e.g. a 15th-century criminal representing the biblical Judas in Leonardo da Vinci's painting The Last Supper Model (person), a person who serves as a template for others to copy, as in a role model, often in the context of advertising commercial products; e.g. the first fashion model, Marie Vernet Worth in 1853, wife of designer Charles Frederick Worth. Model (product), a particular design of a product as displayed in a catalogue or show room (e.g. Ford Model T, an early car model) Model (organism) a non-human species that is studied to understand biological phenomena in other organisms, e.g. a guinea pig starved of vitamin C to study scurvy, an experiment that would be immoral to conduct on a person Model (mimicry), a species that is mimicked by another species Model (logic), a structure (a set of items, such as natural numbers 1, 2, 3,..., along with mathematical operations such as addition and multiplication, and relations, such as < {\displaystyle <} ) that satisfies a given system of axioms (basic truisms), i.e. that satisfies the statements of a given theory Model (CGI), a mathematical representation of any surface of an object in three dimensions via specialized software Model (MVC), the information-representing internal component of a software, as distinct from its user interface === Physical model === A physical model (most commonly referred to simply as a model but in this context distinguished from a conceptual model) is a smaller or larger physical representation of an object, person or system. The object being modelled may be small (e.g., an atom) or large (e.g., the Solar System) or life-size (e.g., a fashion model displaying clothes for similarly-built potential customers). The geometry of the model and the object it represents are often similar in the sense that one is a rescaling of the other. However, in many cases the similarity is only approximate or even intentionally distorted. Sometimes the distortion is systematic, e.g., a fixed scale horizontally and a larger fixed scale vertically when modelling topography to enhance a region's mountains. An architectural model permits visualization of internal relationships within the structure or external relationships of the structure to the environment. Another use is as a toy. Instrumented physical models are an effective way of investigating fluid flows for engineering design. Physical models are often coupled with computational fluid dynamics models to optimize the design of equipment and processes. This includes external flow such as around buildings, vehicles, people, or hydraulic structures. Wind tunnel and water tunnel testing is often used for these design efforts. Instrumented physical models can also examine internal flows, for the design of ductwork systems, pollution control equipment, food processing machines, and mixing vessels. Transparent flow models are used in this case to observe the detailed flow phenomenon. These models are scaled in terms of both geometry and important forces, for example, using Froude number or Reynolds number scaling (see Similitude). In the pre-computer era, the UK economy was modelled with the hydraulic model MONIAC, to predict for example the effect of tax rises on employment. === Conceptual model === A conceptual model is a theoretical representation of a system, e.g. a set of mathematical equations attempting to describe the workings of the atmosphere for the purpose of weather forecasting. It consists of concepts used to help understand or simulate a subject the model represents. Abstract or conceptual models are central to philosophy of science, as almost every scientific theory effectively embeds some kind of model of the physical or human sphere. In some sense, a physical model "is always the reification of some conceptual model; the conceptual model is conceived ahead as the blueprint of the physical one", which is then constructed as conceived. Thus, the term refers to models that are formed after a conceptualization or generalization process. === Examples === Conceptual model (computer science), an agreed representation of entities and their relationships, to assist in developing software Economic model, a theoretical construct representing economic processes Language model, a probabilistic model of a natural language, used for speech recognition, language generation, and information retrieval Large language models are artificial neural networks used for generative artificial intelligence (AI), e.g. ChatGPT Mathematical model, a description of a system using mathematical concepts and language Statistical model, a mathematical model that usually specifies the relationship between one or more random variables and other non-random variables Model (CGI), a mathematical representation of any surface of an object in three dimensions via specialized software Medical model, a proposed "set of procedures in which all doctors are trained" Mental model, in psychology, an internal representation of external reality Model (logic), a set along with a collection of finitary operations, and relations that are defined on it, satisfying a given collection of axioms Model (MVC), information-representing component of a software, distinct from the user interface (the "view"), both linked by the "controller" component, in the context of the model–view–controller software design Model act, a law drafted centrally to be disseminated and proposed for enactment in multiple independent legislatures Standard model (disambiguation) == Properties of models, according to general model theory == According to Herbert Stachowiak, a model is characterized by at least three properties: 1. Mapping A model always is a model of something—it is an image or representation of some natural or artificial, existing or imagined original, where this original itself could be a model. 2. Reduction In general, a model will not include all attributes that describe the original but only those that appear relevant to the model's creator or user. 3. Pragmatism A model does not relate unambiguously to its original. It is intended to work as a replacement for the original a) for certain subjects (for whom?) b) within a certain time range (when?) c) restricted to certain conceptual or physical actions (what for?). For example, a street map is a model of the actual streets in a city (mapping), showing the course of the streets while leaving out, say, traffic signs and road markings (reduction), made for pedestrians and vehicle drivers for the purpose of finding one's way in the city (pragmatism). Additional properties have been proposed, like extension and distortion as well as validity. The American philosopher Michael Weisberg differentiates between concrete and mathematical models and proposes computer simulations (computational models) as their own class of models. == Uses of models == According to Bruce Edmonds, there are at least 5 general uses for models: Prediction: reliably anticipating unknown data, including data within the domain of the training data (interpolation), and outside the domain (extrapolation) Explanation: establishing plausible chains of causality by proposing mechanisms that can explain patterns seen in data Theoretical exposition: discovering or proposing new hypotheses, or refuting existing hypotheses about the behaviour of the system being modelled Description: representing important aspects of the system being modelled Illustration: communicating an idea or explanation

    Read more →
  • Safe and Secure Innovation for Frontier Artificial Intelligence Models Act

    Safe and Secure Innovation for Frontier Artificial Intelligence Models Act

    The Safe and Secure Innovation for Frontier Artificial Intelligence Models Act, or SB 1047, was a failed 2024 California bill intended to "mitigate the risk of catastrophic harms from AI models so advanced that they are not yet known to exist". Specifically, the bill would have applied to models which cost more than $100 million to train and were trained using a quantity of computing power greater than 1026 integer or floating-point operations. SB 1047 would have applied to all AI companies doing business in California—the location of the company would not matter. The bill would have created protections for whistleblowers and required developers to perform risk assessments of their models prior to release, with guidance from the Government Operations Agency. It would also have established CalCompute, a University of California public cloud computing cluster for startups, researchers and community groups. == Background == The rapid increase in capabilities of AI systems in the 2020s, including the release of ChatGPT in November 2022, caused some researchers and members of the public to become concerned about the existential risks associated with increasingly powerful AI systems. Hundreds of tech executives and AI researchers, including two of the so-called "Godfathers of AI", Geoffrey Hinton and Yoshua Bengio, signed a statement in May 2023 calling for the mitigation of the "risk of extinction from AI" to be a global priority alongside "pandemics and nuclear war". However, the plausibility of these risks is still widely debated. Strong regulation of AI has been criticized for purportedly causing regulatory capture by large AI companies like OpenAI, a phenomenon in which regulation advances the interest of larger companies at the expense of smaller competition and the public in general, although OpenAI ended up opposing the bill. Other advocates of AI regulation aim to prevent bias and privacy violations, rather than existential risks. For example, some experts who view existential concerns as overblown and unrealistic view them as a distraction from near-term harms of AI like discriminatory automated decision making. In the face of existential concerns, technology companies have made voluntary commitments to conduct safety testing, for example at the AI Safety Summit and AI Seoul Summit. In 2023, not long before the bill was proposed, Governor Newsom of California and President Biden issued executive orders on artificial intelligence. State Senator Wiener said SB 1047 draws heavily on the Biden executive order, and is motivated by the absence of unified federal legislation on AI safety. Historically, California has passed regulation on several tech issues itself, including consumer privacy and net neutrality, in the absence of action by Congress. == History == === Proposal and voting === The bill was authored by State Senator Scott Wiener. Wiener first proposed AI legislation for California through an intent bill called SB 294, the Safety in Artificial Intelligence Act, in September 2023. On February 7, 2024, Wiener introduced SB 1047. On May 21, SB 1047 passed the Senate 32–1. The bill was significantly amended by Wiener on August 15, 2024, in response to industry advice. Amendments included adding clarifications, and removing the creation of a "Frontier Model Division" and the penalty of perjury. On August 28, the bill passed the State Assembly 48–16. Then, due to the amendments, the bill was once again voted on by the Senate, passing 30–9. === Veto by governor === On September 29, Governor Gavin Newsom vetoed the bill. The deadline for California lawmakers to overrule Newsom's veto was November 30, 2024. Newsom cited concerns over the bill's regulatory framework targeting only large AI models based on their computational size, while not taking into account whether the models are deployed in high-risk environments. Newsom emphasized that this approach could create a false sense of security, overlooking smaller models that might present equally significant risks. He acknowledged the need for AI safety protocols but stressed the importance of adaptability in regulation as AI technology continues to evolve rapidly. Governor Newsom also committed to working with technology experts, federal partners, and research institutions, including the Carnegie Endowment for International Peace, led by former California Supreme Court Justice Mariano-Florentino Cuéllar; and Stanford University's Human-Centered AI (HAI) Institute, led by Dr. Fei-Fei Li. He announced plans to collaborate with these entities to advance responsible AI development, aiming to protect the public while fostering innovation. == Provisions == SB 1047 would have covered AI models with training compute over 1026 integer or floating-point operations and a cost of over $100 million. If a covered model is fine-tuned using more than $10 million, the resulting model would also have been covered. The bill would have defined critical harms with respect to four categories: Creation or use of a chemical, biological, radiological, or nuclear weapon Cyberattacks on critical infrastructure causing mass casualties or at least $500 million of damage Autonomous crimes causing mass casualties or at least $500 million of damage Other harms of comparable severity Developers would have needed to create a "safety and security protocol" before training covered models. Before deployment, they would have submitted a statement of compliance, confirming they took reasonable care to take measures to prevent covered models that pose an unreasonable risk of critical harms. The statement would have included risk assessments and descriptions of their compliance process. These rules would have applied to both covered models and their derivatives, including post-training modifications, with annual third-party audits required starting in 2026. Safeguards to reduce risk included the ability to shut down the model, which has been variously described as a "kill switch" and "circuit breaker". Whistleblowing provisions would have protected employees who report safety problems and incidents. Additionally, SB 1047 would have created a public cloud computing cluster called CalCompute, associated with the University of California, to support startups, researchers, and community groups that lack large-scale computing resources. === Compliance and supervision === SB 1047 would have required developers, beginning January 1, 2026, to annually retain a third-party auditor to perform an independent audit of compliance with the requirements of the bill, as provided. The Government Operations Agency would have reviewed the results of safety tests and incidents, and issue guidance, standards, and best practices. The bill would have created a Board of Frontier Models to supervise the application of the bill by the Government Operations Agency. It is would be composed of 9 members. == Reception == === Subjects of debate === Proponents of the bill described its provisions as simple and narrowly focused, with Sen. Scott Weiner describing it as a "light-touch, basic safety bill". This was disputed by critics of the bill, who described the bill's language as vague and criticized it as consolidating power in the largest AI companies at the expense of smaller ones. Proponents, in turn, argued that the bill only applies to models trained using more than 1026 FLOPS and with over $100 million, or fine-tuned with more than $10 million, and that the threshold could be increased if needed. The penalty of perjury was also a subject of debate, and was eventually removed through an amendment. The scope of the "kill switch" requirement was also reduced, following concerns from open-source developers. The use of the term "reasonable assurance" in the bill was also controversial, and it was eventually amended to "reasonable care". Critics then argued that "reasonable care" imposed an excessive burden by requiring confidence that models could not be used to cause catastrophic harm; proponents claimed that the standard did not require certainty and that it already applied to AI developers under existing law. === Support and opposition === Individual supporters of the bill included Turing Award recipients Yoshua Bengio and Geoffrey Hinton, Elon Musk, Bill de Blasio, Kevin Esvelt, Dan Hendrycks, Vitalik Buterin, OpenAI whistleblowers Daniel Kokotajlo and William Saunders, Lawrence Lessig, Sneha Revanur, Stuart Russell, Jan Leike, actors Mark Ruffalo, Sean Astin, and Rosie Perez, Scott Aaronson, and Max Tegmark. Over 120 Hollywood celebrities, including Mark Hamill, Jane Fonda, and J. J. Abrams, also signed a statement in support of the bill. Max Tegmark likened the bill's focus on holding companies responsible for the harms caused by their models to the FDA requiring clinical trials before a company can release a drug to the market. Organizations sponsoring the bill included the Center for AI Safety, Economic Security California and Encode. The la

    Read more →
  • Multimodal representation learning

    Multimodal representation learning

    Multimodal representation learning is a subfield of representation learning focused on integrating and interpreting information from different modalities, such as text, images, audio, or video, by projecting them into a shared latent space. This allows for semantically similar content across modalities to be mapped to nearby points within that space, facilitating a unified understanding of diverse data types. By automatically learning meaningful features from each modality and capturing their inter-modal relationships, multimodal representation learning enables a unified representation that enhances performance in cross-media analysis tasks such as video classification, event detection, and sentiment analysis. It also supports cross-modal retrieval and translation, including image captioning, video description, and text-to-image synthesis. == Motivation == The primary motivations for multimodal representation learning arise from the inherent nature of real-world data and the limitations of unimodal approaches. Since multimodal data offers complementary and supplementary information about an object or event from different perspectives, it is more informative than relying on a single modality. A key motivation is to narrow the heterogeneity gap that exists between different modalities by projecting their features into a shared semantic subspace. This allows semantically similar content across modalities to be represented by similar vectors, facilitating the understanding of relationships and correlations between them. Multimodal representation learning aims to leverage the unique information provided by each modality to achieve a more comprehensive and accurate understanding of concepts. These unified representations are crucial for improving performance in various cross-media analysis tasks such as video classification, event detection, and sentiment analysis. They also enable cross-modal retrieval, allowing users to search and retrieve content across different modalities. Additionally, it facilitates cross-modal translation, where information can be converted from one modality to another, as seen in applications like image captioning and text-to-image synthesis. The abundance of ubiquitous multimodal data in real-world applications, including understudied areas like healthcare, finance, and human-computer interaction (HCI), further motivates the development of effective multimodal representation learning techniques. == Approaches and methods == === Canonical-correlation analysis based methods === Canonical-correlation analysis (CCA) was first introduced in 1936 by Harold Hotelling and is a fundamental approach for multimodal learning. CCA aims to find linear relationships between two sets of variables. Given two data matrices X ∈ R n × p {\displaystyle X\in \mathbb {R} ^{n\times p}} and Y ∈ R n × q {\displaystyle Y\in \mathbb {R} ^{n\times q}} representing different modalities, CCA finds projection vectors w x ∈ R p {\displaystyle w_{x}\in \mathbb {R} ^{p}} and w y ∈ R q {\displaystyle w_{y}\in \mathbb {R} ^{q}} that maximizes the correlation between the projected variables: ρ = max w x , w y w x ⊤ Σ x y w y w x ⊤ Σ x x w x w y ⊤ Σ y y w y {\displaystyle \rho =\max _{w_{x},w_{y}}{\frac {w_{x}^{\top }\Sigma _{xy}w_{y}}{{\sqrt {w_{x}^{\top }\Sigma _{xx}w_{x}}}{\sqrt {w_{y}^{\top }\Sigma _{yy}w_{y}}}}}} such that Σ x x {\displaystyle \Sigma _{xx}} and Σ y y {\displaystyle \Sigma _{yy}} are the within-modality covariance matrices, and Σ x y {\displaystyle \Sigma _{xy}} is the between-modality covariance matrix. However, standard CCA is limited by its linearity, which led to the development of nonlinear extensions, such as kernel CCA and deep CCA. ==== Kernel CCA ==== Kernel canonical correlation analysis (KCCA) extends traditional CCA to capture nonlinear relationships between modalities by implicitly mapping the data into high dimensional feature spaces using kernel functions. Given kernel functions K x {\displaystyle K_{x}} and K y {\displaystyle K_{y}} with corresponding Gram matrices K x ∈ R n × n {\displaystyle K_{x}\in \mathbb {R} ^{n\times n}} and K y ∈ R n × n {\displaystyle K_{y}\in \mathbb {R} ^{n\times n}} , KCCA seeks coefficients α {\displaystyle \alpha } and β {\displaystyle \beta } that maximize: ρ = max α , β α ⊤ K x K y β α ⊤ K x 2 α β ⊤ K y 2 β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{\top }K_{x}Ky\beta }{{\sqrt {\alpha ^{\top }K_{x}^{2}\alpha }}{\sqrt {\beta ^{\top }K_{y}^{2}\beta }}}}} To prevent overfitting, regularization terms are typically added, resulting in: ρ = max α , β α T K x K y β α T ( K x 2 + λ x K x ) α β T ( K y 2 + λ y K y ) β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{T}K_{x}K_{y}\beta }{{\sqrt {\alpha ^{T}\left(K_{x}^{2}+\lambda _{x}K_{x}\right)\alpha }}{\sqrt {\;\beta ^{T}\left(K_{y}^{2}+\lambda _{y}K_{y}\right)\beta }}}}} where λ x {\displaystyle \lambda _{x}} and λ y {\displaystyle \lambda _{y}} are regularization parameters. KCCA has proven effective for tasks such as cross-modal retrieval and semantic analysis, though it faces computational challenges with large datasets due to its O ( n 2 ) {\displaystyle O(n^{2})} memory requirement for sorting kernel matrices. KCCA was proposed independently by several researchers. ==== Deep CCA ==== Deep canonical correlation analysis (DCCA), introduced in 2013, employs neural networks to learn nonlinear transformations for maximizing the correlation between modalities. DCCA uses separate neural networks f x {\displaystyle f_{x}} and f y {\displaystyle f_{y}} for each modality to transform the original data before applying CCA: max W x , W y , θ x , θ y corr ⁡ ( f x ( X ; θ x ) , f y ( Y ; θ y ) ) {\displaystyle \max _{W_{x},W_{y},\theta _{x},\theta _{y}}\operatorname {corr} \left(f_{x}(X;\theta _{x}),f_{y}(Y;\theta _{y})\right)} where θ x {\displaystyle \theta _{x}} and θ y {\displaystyle \theta _{y}} represent the parameters of the neural networks, and W x {\displaystyle W_{x}} and W y {\displaystyle W_{y}} are the CCA projection matrices. The correlation objective is computed as: corr ⁡ ( H x , H y ) = tr ⁡ ( T − 1 / 2 H x T H y S − 1 / 2 ) {\displaystyle \operatorname {corr} (H_{x},H_{y})=\operatorname {tr} \left(T^{-1/2}H_{x}^{T}H_{y}S^{-1/2}\right)} where H x = f x ( X ) {\displaystyle H_{x}=f_{x}(X)} and H y = f y ( Y ) {\displaystyle H_{y}=f_{y}(Y)} are the network outputs, T = H x T H x + r x I {\displaystyle T=H_{x}^{T}H_{x}+r_{x}I} , S = H y T H y + r y I {\displaystyle S=H_{y}^{T}H_{y}+r_{y}I} and r x , r y {\displaystyle r_{x},r_{y}} are the regularization parameters. DCCA overcomes the limitations of linear CCA and kernel CCA by learning complex nonlinear relationships while maintaining computational efficiency for large datasets through mini-batch optimization. === Graph-based methods === Graph-based approaches for multimodal representation learning leverage graph structure to model relationships between entities across different modalities. These methods typically represent each modality as a graph and then learn embedding that preserve cross-modal similarities, enabling more effective joint representation of heterogeneous data. One such method is cross-modal graph neural networks (CMGNNs) that extend traditional graph neural networks (GNNs) to handle data from multiple modalities by constructing graphs that capture both intra-modal and inter-modal relationships. These networks model interactions across modalities by representing them as nodes and their relationships as edges. Other graph-based methods include Probabilistic Graphical Models (PGMs) such as deep belief networks (DBN) and deep Boltzmann machines (DBM). These models can learn a joint representation across modalities, for instance, a multimodal DBN achieves this by adding a shared restricted Boltzmann Machine (RBM) hidden layer on top of modality-specific DBNs. Additionally, the structure of data in some domains like Human-Computer Interaction (HCI), such as the view hierarchy of app screens, can potentially be modeled using graph-like structures. The field of graph representation learning is also relevant, with ongoing progress in developing evaluation benchmarks. === Diffusion maps === Another set of methods relevant to multimodal representation learning are based on diffusion maps and their extensions to handle multiple modalities. ==== Multi-view diffusion maps ==== Multi-view diffusion maps address the challenge of achieving multi-view dimensionality reduction by effectively utilizing the availability of multiple views to extract a coherent low-dimensional representation of the data. The core idea is to exploit both the intrinsic relations within each view and the mutual relations between the different views, defining a cross-view model where a random walk process implicitly hops between objects in different views. A multi-view kernel matrix is constructed by combining these relations, defining a cross-view diffusion process and associ

    Read more →
  • Illia Polosukhin

    Illia Polosukhin

    Illia Polosukhin is a Ukrainian-born computer scientist and entrepreneur known for his work on the transformer architecture in machine learning and for co-founding the NEAR blockchain. == Early life and education == Polosukhin studied at the Kharkiv Polytechnic Institute, later relocating to San Diego and then moving to Silicon Valley. == Career == === Google and transformer research === Polosukhin worked at Google and was part of the team associated with research on self-attention that culminated in the 2017 paper Attention Is All You Need, widely credited with introducing the transformer architecture used in modern large language models. === NEAR Protocol === After his work in machine learning, Polosukhin became a co-founder of NEAR Protocol and later associated with the NEAR Foundation ecosystem. In 2023, Polosukhin publicly argued that increasingly capable A.I. systems should be more transparent and user-controlled, and expressed skepticism that conventional regulation alone would solve problems created by closed, corporate models, warning about risks such as regulatory capture. He has promoted “user-owned AI” concepts that combine open approaches with decentralized infrastructure aligned with the blockchain technology. In 2024, Polosukhin downplayed scenarios of A.I. independently causing human extinction, arguing that conflicts are driven by people and that misuse of AI would reflect human intent and incentives. Later this year, Polosukhin said the NEAR Foundation would reduce its workforce by about 40%. == Publications == Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Lukasz Kaiser, Illia Polosukhin; et al. (2017). "Attention Is All You Need". arXiv.{{cite journal}}: CS1 maint: multiple names: authors list (link)

    Read more →
  • LightGBM

    LightGBM

    LightGBM, short for Light Gradient-Boosting Machine, is a free and open-source distributed gradient-boosting framework for machine learning, originally developed by Microsoft. It is based on decision tree algorithms and used for ranking, classification and other machine learning tasks. The development focus is on performance and scalability. == Overview == The LightGBM framework supports different algorithms including GBT, GBDT, GBRT, GBM, MART and RF. LightGBM has many of XGBoost's advantages, including sparse optimization, parallel training, multiple loss functions, regularization, bagging, and early stopping. A major difference between the two lies in the construction of trees. LightGBM does not grow a tree level-wise — row by row — as most other implementations do. Instead it grows trees leaf-wise. It will choose the leaf with max delta loss to grow. Besides, LightGBM does not use the widely used sorted-based decision tree learning algorithm, which searches the best split point on sorted feature values, as XGBoost or other implementations do. Instead, LightGBM implements a highly optimized histogram-based decision tree learning algorithm, which yields great advantages on both efficiency and memory consumption. The LightGBM algorithm utilizes two novel techniques called Gradient-Based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) which allow the algorithm to run faster while maintaining a high level of accuracy. LightGBM works on Linux, Windows, and macOS and supports C++, Python, R, and C#. The source code is licensed under MIT License and available on GitHub. == Gradient-based one-side sampling == When using gradient descent, one thinks about the space of possible configurations of the model as a valley, in which the lowest part of the valley is the model which most closely fits the data. In this metaphor, one walks in different directions to learn how much lower the valley becomes. Typically, in gradient descent, one uses the whole set of data to calculate the valley's slopes. However, this commonly used method assumes that every data point is equally informative. By contrast, Gradient-Based One-Side Sampling (GOSS), a method first developed for gradient-boosted decision trees, does not rely on the assumption that all data are equally informative. Instead, it treats data points with smaller gradients (shallower slopes) as less informative by randomly dropping them. This is intended to filter out data which may have been influenced by noise, allowing the model to more accurately model the underlying relationships in the data. == Exclusive feature bundling == Exclusive feature bundling (EFB) is a near-lossless method to reduce the number of effective features. In a sparse feature space many features are nearly exclusive, implying they rarely take nonzero values simultaneously. One-hot encoded features are a perfect example of exclusive features. EFB bundles these features, reducing dimensionality to improve efficiency while maintaining a high level of accuracy. The bundle of exclusive features into a single feature is called an exclusive feature bundle.

    Read more →
  • United States export controls on AI chips and semiconductors

    United States export controls on AI chips and semiconductors

    United States export controls on AI chips and semiconductors are a series of regulations imposed by the United States restricting the export of technology and equipment related to artificial intelligence to other countries, primarily targeting China. This has happened in the context of a broader trade war. In January 2026, BIS formalized a flexible license review policy for these transactions.

    Read more →
  • Edits (app)

    Edits (app)

    Edits is an American photo and short form video editing software service owned by Meta Platforms. It allows users to create videos and edit them by using features like green screens, and AI animation, and also provides real-time statistics to Instagram creators to track their accounts. Accounts directly from Instagram can be imported, and videos can be exported vice-versa. It is available solely on iOS and Android. On Apple, it supports over 32 different languages, including French, Spanish, and Chinese. It has been noted by critics as a direct competitor for apps like CapCut, owned by Chinese brand ByteDance. The Instagram head, Adam Mosseri, also acknowledged these similarities. Launched on April 22 for both iOS and Android. It received over 5M+ users on Apple and Android combined in its first 4 days since its launch. == History == On January 19, 2025, following the ban of all ByteDance Apps from the Google Play Store, and App Store, Instagram head Adam Mosseri announced on Threads that they would be launching the app in February for iOS, followed by an Android counterpart. He said the app is working with select people to test its features. In a separate post, he emphasized that the app is "more for creators than casual video makers". == Features == Edits contains many similar features to other competition of video editors like KineMaster, Inshot, and CapCut. When creating a video, users have the option to export in resolution of HD, 4K, and 2K, along with having HDR and SDR support. Like many traditional video editing software, it includes a timeline, and basic undo-redo buttons. On the bottom bar, 7 tabs for editing exist, namely the Split, Volume, Adjust, Speed, Delete, Filters, Green Screen, Voice FX, Extract Audio, Mirror, Slip, Replace and Duplicate bars. Basic features, like splitting, and adjusting speed and volume of clips are present, along with more advanced Green Screens, and AI features. Being a mobile video editor app, Edits also has drag-and-drop features to ease customer usage. Users have the ability to record videos directly within the app. This feature allows users to create content without needing extra software or devices. They can choose from several focal lengths, which affect how close or wide the shot appears. The app also supports different frame rates. Users have the ability to record videos directly within the app. This feature allows users to create content without needing extra software or devices. Once users are done filming your clips, they can simply transfer them into a project to start editing immediately. Upcoming features for the app include Keyframes, AI-powered modification, Collaboration, and Enhanced creativity. == Reception == Since its release, it received over 5 million downloads in 4 days. Critically, the app received great rankings from many. From users, the app received an average of 4.45 stars over Google Play Store and App Store in the first few days, with Google Play Store receiving the least stars. As in reviews, it was received mixed by the public. Many people praised the smoothness and intuivity of the app. "The app is more than just a basic editor, offering a full suite of creative tools, including a dedicated tab for inspiration and trending audio, as well as a tab for managing drafts," said a blogger. Some users were disappointed with the range of editing tools, some users have noted that it could benefit from more transition options between clips. Some even reported crashing between clips.

    Read more →
  • Tree (abstract data type)

    Tree (abstract data type)

    In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be connected to exactly one parent, except for the root node, which has no parent (i.e., the root node as the top-most node in the tree hierarchy). These constraints mean there are no cycles or "loops" (no node can be its own ancestor), and also that each child can be treated like the root node of its own subtree, making recursion a useful technique for tree traversal. In contrast to linear data structures, many trees cannot be represented by relationships between neighboring nodes (parent and children nodes of a node under consideration, if they exist) in a single straight line (called edge or link between two adjacent nodes). Binary trees are a commonly used type, which constrain the number of children for each parent to at most two. When the order of the children is specified, this data structure corresponds to an ordered tree in graph theory. A value or pointer to other data may be associated with every node in the tree, or sometimes only with the leaf nodes, which have no children nodes. The abstract data type (ADT) can be represented in a number of ways, including a list of parents with pointers to children, a list of children with pointers to parents, or a list of nodes and a separate list of parent-child relations (a specific type of adjacency list). Representations might also be more complicated, for example using indexes or ancestor lists for performance. Trees as used in computing are similar to but can be different from mathematical constructs of trees in graph theory, trees in set theory, and trees in descriptive set theory. == Terminology == A node is a structure which may contain data and connections to other nodes, sometimes called edges or links. Each node in a tree has zero or more child nodes, which are below it in the tree (by convention, trees are drawn with descendants going downwards). A node that has a child is called the child's parent node (or superior). All nodes have exactly one parent, except the topmost root node, which has none. A node might have many ancestor nodes, such as the parent's parent. Child nodes with the same parent are sibling nodes. Typically siblings have an order, with the first one conventionally drawn on the left. Some definitions allow a tree to have no nodes at all, in which case it is called empty. An internal node (also known as an inner node, inode for short, or branch node) is any node of a tree that has child nodes. Similarly, an external node (also known as an outer node, leaf node, or terminal node) is any node that does not have child nodes. The height of a node is the length of the longest downward path to a leaf from that node. The height of the root is the height of the tree. The depth of a node is the length of the path to its root (i.e., its root path). Thus the root node has depth zero, leaf nodes have height zero, and a tree with only a single node (hence both a root and leaf) has depth and height zero. Conventionally, an empty tree (tree with no nodes, if such are allowed) has height −1. Each non-root node can be treated as the root node of its own subtree, which includes that node and all its descendants. Other terms used with trees: Neighbor Parent or child. Ancestor A node reachable by repeated proceeding from child to parent. Descendant A node reachable by repeated proceeding from parent to child. Also known as subchild. Degree For a given node, its number of children. A leaf, by definition, has degree zero. Degree of tree The degree of a tree is the maximum degree of a node in the tree. Distance The number of edges along the shortest path between two nodes. Level The level of a node is the number of edges along the unique path between it and the root node. This is the same as depth. Width The number of nodes in a level. Breadth The number of leaves. Complete tree A tree with every level filled, except the last. Forest A set of one or more disjoint trees. Ordered tree A rooted tree in which an ordering is specified for the children of each vertex. Size of a tree Number of nodes in the tree. == Common operations == Enumerating all the items Enumerating a section of a tree Searching for an item Adding a new item at a certain position on the tree Deleting an item Pruning: Removing a whole section of a tree Grafting: Adding a whole section to a tree Finding the root for any node Finding the lowest common ancestor of two nodes === Traversal and search methods === Stepping through the items of a tree, by means of the connections between parents and children, is called walking the tree, and the action is a walk of the tree. Often, an operation might be performed when a pointer arrives at a particular node. A walk in which each parent node is traversed before its children is called a pre-order walk; a walk in which the children are traversed before their respective parents are traversed is called a post-order walk; a walk in which a node's left subtree, then the node itself, and finally its right subtree are traversed is called an in-order traversal. (This last scenario, referring to exactly two subtrees, a left subtree and a right subtree, assumes specifically a binary tree.) A level-order walk effectively performs a breadth-first search over the entirety of a tree; nodes are traversed level by level, where the root node is visited first, followed by its direct child nodes and their siblings, followed by its grandchild nodes and their siblings, etc., until all nodes in the tree have been traversed. == Representations == There are many different ways to represent trees. In working memory, nodes are typically dynamically allocated records with pointers to their children, their parents, or both, as well as any associated data. If of a fixed size, the nodes might be stored in a list. Nodes and relationships between nodes might be stored in a separate special type of adjacency list. In relational databases, nodes are typically represented as table rows, with indexed row IDs facilitating pointers between parents and children. Nodes can also be stored as items in an array, with relationships between them determined by their positions in the array (as in a binary heap). A binary tree can be implemented as a list of lists: the head of a list (the value of the first term) is the left child (subtree), while the tail (the list of second and subsequent terms) is the right child (subtree). This can be modified to allow values as well, as in Lisp S-expressions, where the head (value of first term) is the value of the node, the head of the tail (value of second term) is the left child, and the tail of the tail (list of third and subsequent terms) is the right child. Ordered trees can be naturally encoded by finite sequences, for example with natural numbers. == Examples of trees and non-trees == == Type theory == As an abstract data type, the abstract tree type T with values of some type E is defined, using the abstract forest type F (list of trees), by the functions: value: T → E children: T → F nil: () → F node: E × F → T with the axioms: value(node(e, f)) = e children(node(e, f)) = f In terms of type theory, a tree is an inductive type defined by the constructors nil (empty forest) and node (tree with root node with given value and children). == Mathematical terminology == Viewed as a whole, a tree data structure is an ordered tree, generally with values attached to each node. Concretely, it is (if required to be non-empty): A rooted tree with the "away from root" direction (a more narrow term is an "arborescence"), meaning: A directed graph, whose underlying undirected graph is a tree (any two vertices are connected by exactly one simple path), with a distinguished root (one vertex is designated as the root), which determines the direction on the edges (arrows point away from the root; given an edge, the node that the edge points from is called the parent and the node that the edge points to is called the child), together with: an ordering on the child nodes of a given node, and a value (of some data type) at each node. Often trees have a fixed (more properly, bounded) branching factor (outdegree), particularly always having two child nodes (possibly empty, hence at most two non-empty child nodes), hence a "binary tree". Allowing empty trees makes some definitions simpler, some more complicated: a rooted tree must be non-empty, hence if empty trees are allowed the above definition instead becomes "an empty tree or a rooted tree such that ...". On the other hand, empty trees simplify defining fixed branching factor: with empty trees allowed, a binary tree is a tree such that every node has exactly two children, each of which is a tree (possibly empty). == Applications == Trees are commonly used to represent or manipulate hierarchical data in ap

    Read more →
  • Fifth Generation Computer Systems

    Fifth Generation Computer Systems

    The Fifth Generation Computer Systems (FGCS; Japanese: 第五世代コンピュータ, romanized: daigosedai konpyūta) was a 10-year initiative launched in 1982 by Japan's Ministry of International Trade and Industry (MITI) to develop computers based on massively parallel computing and logic programming. The project aimed to create an "epoch-making computer" with supercomputer-like performance and to establish a platform for future advancements in artificial intelligence. Although FGCS was noted as ahead of its time, and its ambitious goals contributed significantly to the development of concurrent logic programming, it ultimately ended in commercial failure. The term "fifth generation" was chosen to emphasize the system's advanced nature. In the history of computing hardware, there had been four prior "generations" of computers: the first generation utilized vacuum tubes; the second, transistors and diodes; the third, integrated circuits; and the fourth, microprocessors. While earlier generations focused on increasing the number of logic elements within a single CPU, it was widely believed at the time that the fifth generation would achieve enhanced performance through the use of massive numbers of CPUs. == Background == In the late 1960s until the early 1970s, there was much talk about "generations" of computer hardware, then usually organized into three generations First generation: Thermionic vacuum tubes. Mid-1940s. IBM pioneered the arrangement of vacuum tubes in pluggable modules. The IBM 650 was a first-generation computer. Second generation: Transistors. 1956. The era of miniaturization begins. Transistors are much smaller than vacuum tubes, draw less power, and generate less heat. Discrete transistors are soldered to circuit boards, with interconnections accomplished by stencil-screened conductive patterns on the reverse side. The IBM 7090 was a second-generation computer. Third generation: Integrated circuits (silicon chips containing multiple transistors). 1964. A pioneering example is the ACPX module used in the IBM 360/91, which, by stacking layers of silicon over a ceramic substrate, accommodated over 20 transistors per chip; the chips could be packed together onto a circuit board to achieve unprecedented logic densities. The IBM 360/91 was a hybrid second and third-generation computer. Omitted from this taxonomy is the "zeroth-generation" computer based on metal gears (such as the IBM 407) or mechanical relays (such as the Mark I), and the post-third-generation computers based on Very Large Scale Integrated (VLSI) circuits. There was also a parallel set of generations for software: First generation: Machine language. Second generation: Low-level programming languages such as Assembly language. Third generation: Structured high-level programming languages such as C, COBOL and FORTRAN. Fourth generation: "Non-procedural" high-level programming languages (such as object-oriented languages). Throughout these multiple generations up to the 1970s, Japan built computers following U.S. and British leads. In the mid-1970s, the Ministry of International Trade and Industry stopped following western leads and started looking into the future of computing on a small scale. They asked the Japan Information Processing Development Center (JIPDEC) to indicate a number of future directions, and in 1979 offered a three-year contract to carry out more in-depth studies along with industry and academia. It was during this period that the term "fifth-generation computer" started to be used. Prior to the 1970s, MITI guidance had successes such as an improved steel industry, the creation of the oil supertanker, the automotive industry, consumer electronics, and computer memory. MITI decided that the future was going to be information technology. However, the Japanese language, particularly in its written form, presented and still presents obstacles for computers. As a result of these hurdles, MITI held a conference to seek assistance from experts. The primary fields for investigation from this initial project were: Inference computer technologies for knowledge processing Computer technologies to process large-scale data bases and knowledge bases High-performance workstations Distributed functional computer technologies Super-computers for scientific calculation == Project launch == The aim was to build parallel computers for artificial intelligence applications using concurrent logic programming. The project imagined an "epoch-making" computer with supercomputer-like performance running on top of large databases (as opposed to a traditional filesystem) using a logic programming language to define and access the data using massively parallel computing/processing. They envisioned building a prototype machine with performance between 100M and 1G LIPS, where a LIPS is a Logical Inference Per Second. At the time typical workstation machines were capable of about 100k LIPS. They proposed to build this machine over a ten-year period, 3 years for initial R&D, 4 years for building various subsystems, and a final 3 years to complete a working prototype system. In 1982 the government decided to go ahead with the project, and established the Institute for New Generation Computer Technology (ICOT) through joint investment with various Japanese computer companies. After the project ended, MITI would consider an investment in a new "sixth generation" project. Ehud Shapiro captured the rationale and motivations driving this project: "As part of Japan's effort to become a leader in the computer industry, the Institute for New Generation Computer Technology has launched a revolutionary ten-year plan for the development of large computer systems which will be applicable to knowledge information processing systems. These Fifth Generation computers will be built around the concepts of logic programming. In order to refute the accusation that Japan exploits knowledge from abroad without contributing any of its own, this project will stimulate original research and will make its results available to the international research community." === Logic programming === The target defined by the FGCS project was to develop "Knowledge Information Processing systems" (roughly meaning, applied Artificial Intelligence). The chosen tool to implement this goal was logic programming. Logic programming approach as was characterized by Maarten Van Emden – one of its founders – as: The use of logic to express information in a computer. The use of logic to present problems to a computer. The use of logical inference to solve these problems. More technically, it can be summed up in two equations: Program = Set of axioms. Computation = Proof of a statement from axioms. The Axioms typically used are universal axioms of a restricted form, called Horn-clauses or definite-clauses. The statement proved in a computation is an existential statement. The proof is constructive, and provides values for the existentially quantified variables: these values constitute the output of the computation. Logic programming was thought of as something that unified various gradients of computer science (software engineering, databases, computer architecture and artificial intelligence). It seemed that logic programming was a key missing connection between knowledge engineering and parallel computer architectures. == Results == After having influenced the consumer electronics field during the 1970s and the automotive world during the 1980s, the Japanese had developed a strong reputation. The launch of the FGCS project spread the belief that parallel computing was the future of all performance gains, producing a wave of apprehension in the computer field. Soon parallel projects were set up in the US as the Strategic Computing Initiative and the Microelectronics and Computer Technology Corporation (MCC), in the UK as Alvey, and in Europe as the European Strategic Program on Research in Information Technology (ESPRIT), as well as the European Computer‐Industry Research Centre (ECRC) in Munich, a collaboration between ICL in Britain, Bull in France, and Siemens in Germany. The project ran from 1982 to 1994, spending a little less than ¥57 billion (about US$320 million) total. After the FGCS Project, MITI stopped funding large-scale computer research projects, and the research momentum developed by the FGCS Project dissipated. However MITI/ICOT embarked on a neural-net project which some called the Sixth Generation Project in the 1990s, with a similar level of funding. Per-year spending was less than 1% of the entire R&D expenditure of the electronics and communications equipment industry. For example, the project's highest expenditure year was 7.2 million yen in 1991, but IBM alone spent 1.5 billion dollars (370 billion yen) in 1982, while the industry spent 2150 billion yen in 1990. === Concurrent logic programming === In 1982, during a visit to the ICOT, Ehud Shapiro invented Concurrent Prolog, a novel programming language t

    Read more →