AI Detector Zero

AI Detector Zero — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Computers & Graphics

    Computers & Graphics

    Computers & Graphics is a peer-reviewed scientific journal that covers computer graphics and related subjects such as data visualization, human-computer interaction, virtual reality, and augmented reality. It was established in 1975 and originally published by Pergamon Press. It is now published by Elsevier, which acquired Pergamon Press in 1991. From 2018 to 2022 Graphics and Visual Computing was an open access sister journal sharing the same editorial team and double-blind peer-review policies. It has since merged into GMOD, the International Journal of Graphical Models. == History == The journal was established in 1975 by founding editor-in-chief Robert Schiffman (University of Colorado, Boulder), as Computers & Graphics-UK. Schiffman, who co-organized the first SIGGRAPH conference in 1974, had the conference proceedings published as the first issue of the journal. He was succeeded in 1978 by Larry Feeser (Rensselaer Polytechnic Institute). In 1983 José Luis Encarnação (Technische Hochschule Darmstadt) took over. Joaquim Jorge (University of Lisbon) has been Editor-in-Chief since 2007. == Replicability == The journal is working with the Graphics Replicability Stamp Initiative to promote replicable results in publication. == Abstracting and indexing == The journal is abstracted and indexed in: Current Contents/Engineering, Computing & Technology EBSCO databases Ei Compendex Inspec ProQuest databases Science Citation Index Expanded Scopus Chinese Computer Federation/Recommended List of International Conferences and Journals on CAD & Graphics and Multimedia. According to the Journal Citation Reports, the journal has a 2022 impact factor of 2.5.

    Read more →
  • Lighthill report

    Lighthill report

    Artificial Intelligence: A General Survey, commonly known as the Lighthill report, is a scholarly article by James Lighthill, published in Artificial Intelligence: a paper symposium in 1973. It was compiled by Lighthill for the British Science Research Council as an evaluation of academic research in the field of artificial intelligence (AI). The report gave a very pessimistic prognosis for many core aspects of research in this field, stating that "In no part of the field have the discoveries made so far produced the major impact that was then promised". It "formed the basis for the decision by the British government to end support for AI research in most British universities", contributing to an AI winter in the United Kingdom. == Publication history == It was commissioned by the SRC in 1972 for Lighthill to "make a personal review of the subject [of AI]". Lighthill completed the report in July. The SRC discussed the report in September, and decided to publish it, together with some alternative points of view by Stuart Sutherland, Roger Needham, Christopher Longuet-Higgins, and Donald Michie. The SRC's decision to invite the report was partly a reaction to high levels of discord within the University of Edinburgh's Department of Artificial Intelligence, one of the earliest and biggest centres for AI research in the UK. On May 9, 1973, Lighthill debated several leading AI researchers (Donald Michie, John McCarthy, Richard Gregory) at the Royal Institution in London concerning the report. == Content == While the report was supportive of research into the simulation of neurophysiological and psychological processes, it was "highly critical of basic research in foundational areas such as robotics and language processing". The report stated that AI researchers had failed to address the issue of combinatorial explosion when solving problems within real-world domains. That is, the report states that whilst AI techniques may have worked within the scope of small problem domains, the techniques would not scale up well to solve more realistic problems. The report represents a pessimistic view of AI that began after early excitement in the field. The report divides AI research into three categories: Advanced Automation ("A"): applications of AI, such as optical character recognition, mechanical component design and manufacture, missile perception and guidance, etc. Computer-based Central Nervous System research ("C"): building computational models of human brains (neurobiology) and behavior (psychology). Bridge, or Building Robots ("B"): research that combines categories A and C. This category is intentionally vague. Projects in category A had had some success, but only in restricted domains where a large quantity of detailed knowledge was used in designing the program. This was disappointing to researchers who hoped for generic methods. Due to the issue of the combinatorial explosion, the amount of detailed knowledge required by the program quickly grew too large to be entered by hand, thus restricting projects to restricted domains. Projects in category C had had some measure of success. Artificial neural networks were successfully used to model neurobiological data. SHRDLU demonstrated that human use of language, even in fine details, depends on the semantics or knowledge, and is not purely syntactical. This was influential in psycholinguistics. Attempts to extend SHRDLU to larger domains of discourse was considered impractical, again due to the issue of the combinatorial explosion. Projects in category B were held to be failures. One important project, that of "programming and building a robot that would mimic human ability in a combination of eye-hand co-ordination and common-sense problem solving", was considered entirely disappointing. Similarly, chess playing programs were no better than human amateurs. Due to the combinatorial explosion, the run-time of general algorithms quickly grew impractical, requiring detailed problem-specific heuristics. The report stated that it was expected that within the next 25 years, category A would simply become applied technologies engineering, C would integrate with psychology and neurobiology, while category B would be abandoned.

    Read more →
  • Tractable (company)

    Tractable (company)

    Tractable is a technology company specializing in the development of Artificial Intelligence (AI) to assess damage to property and vehicles. The AI allows users to appraise damage digitally. == Technology == Tractable's technology uses computer vision and deep learning to automate the appraisal of visual damage in accident and disaster recovery, for example to a vehicle. Drivers can be directed to use the application by their insurer after an accident, with the aim of settling their claim more quickly. The AI evaluates the damage from images, and therefore doesn't assess what isn't visible (such as, for example, interior damage to a vehicle or property). == History == Alexandre Dalyac and Razvan Ranca founded Tractable in 2014, and Adrien Cohen joined as co-founder in 2015. The company employs more than 300 staff members, largely in the United Kingdom. Tractable was named one of the 100 leading AI companies in the world in 2020 and 2021 by CB Insights. It won the Best Technology Award in the 2020 British Insurance Awards. In June 2021, Tractable announced a venture round that valued the company at $1 billion. Tractable was the UK's 100th billion-dollar tech company, or unicorn. In July 2023, the company received a $65 million investment from SoftBank Group, through its Vision Fund 2.

    Read more →
  • Capsule neural network

    Capsule neural network

    A capsule neural network (CapsNet) is a machine learning system that is a type of artificial neural network (ANN) that can be used to better model hierarchical relationships. The approach is an attempt to more closely mimic biological neural organization. The idea is to add structures called "capsules" to a convolutional neural network (CNN), and to reuse output from several of those capsules to form more stable (with respect to various perturbations) representations for higher capsules. The output is a vector consisting of the probability of an observation, and a pose for that observation. This vector is similar to what is done for example when doing classification with localization in CNNs. Among other benefits, capsnets address the "Picasso problem" in image recognition: images that have all the right parts but that are not in the correct spatial relationship (e.g., in a "face", the positions of the mouth and one eye are switched). For image recognition, capsnets exploit the fact that while viewpoint changes have nonlinear effects at the pixel level, they have linear effects at the part/object level. This can be compared to inverting the rendering of an object of multiple parts. == History == In 2000, Geoffrey Hinton et al. described an imaging system that combined segmentation and recognition into a single inference process using parse trees. So-called credibility networks described the joint distribution over the latent variables and over the possible parse trees. That system proved useful on the MNIST handwritten digit database. A dynamic routing mechanism for capsule networks was introduced by Hinton and his team in 2017. The approach was claimed to reduce error rates on MNIST and to reduce training set sizes. Results were claimed to be considerably better than a CNN on highly overlapped digits. In Hinton's original idea one minicolumn would represent and detect one multidimensional entity. == Transformations == An invariant is an object property that does not change as a result of some transformation. For example, the area of a circle does not change if the circle is shifted to the left. Informally, an equivariant is a property that changes predictably under transformation. For example, the center of a circle moves by the same amount as the circle when shifted. A nonequivariant is a property whose value does not change predictably under a transformation. For example, transforming a circle into an ellipse means that its perimeter can no longer be computed as π times the diameter. In computer vision, the class of an object is expected to be an invariant over many transformations. I.e., a cat is still a cat if it is shifted, turned upside down or shrunken in size. However, many other properties are instead equivariant. The volume of a cat changes when it is scaled. Equivariant properties such as a spatial relationship are captured in a pose, data that describes an object's translation, rotation, scale and reflection. Translation is a change in location in one or more dimensions. Rotation is a change in orientation. Scale is a change in size. Reflection is a mirror image. Unsupervised capsnets learn a global linear manifold between an object and its pose as a matrix of weights. In other words, capsnets can identify an object independent of its pose, rather than having to learn to recognize the object while including its spatial relationships as part of the object. In capsnets, the pose can incorporate properties other than spatial relationships, e.g., color (cats can be of various colors). Multiplying the object by the manifold poses the object (for an object, in space). == Pooling == Capsnets reject the pooling layer strategy of conventional CNNs that reduces the amount of detail to be processed at the next higher layer. Pooling allows a degree of translational invariance (it can recognize the same object in a somewhat different location) and allows a larger number of feature types to be represented. Capsnet proponents argue that pooling: violates biological shape perception in that it has no intrinsic coordinate frame; provides invariance (discarding positional information) instead of equivariance (disentangling that information); ignores the linear manifold that underlies many variations among images; routes statically instead of communicating a potential "find" to the feature that can appreciate it; damages nearby feature detectors, by deleting the information they rely upon. == Capsules == A capsule is a set of neurons that individually activate for various properties of a type of object, such as position, size and hue. Formally, a capsule is a set of neurons that collectively produce an activity vector with one element for each neuron to hold that neuron's instantiation value (e.g., hue). Graphics programs use instantiation value to draw an object. Capsnets attempt to derive these from their input. The probability of the entity's presence in a specific input is the vector's length, while the vector's orientation quantifies the capsule's properties. Artificial neurons traditionally output a scalar, real-valued activation that loosely represents the probability of an observation. Capsnets replace scalar-output feature detectors with vector-output capsules and max-pooling with routing-by-agreement. Because capsules are independent, when multiple capsules agree, the probability of correct detection is much higher. A minimal cluster of two capsules considering a six-dimensional entity would agree within 10% by chance only once in a million trials. As the number of dimensions increase, the likelihood of a chance agreement across a larger cluster with higher dimensions decreases exponentially. Capsules in higher layers take outputs from capsules at lower layers, and accept those whose outputs cluster. A cluster causes the higher capsule to output a high probability of observation that an entity is present and also output a high-dimensional (20-50+) pose. Higher-level capsules ignore outliers, concentrating on clusters. This is similar to the Hough transform, the RHT and RANSAC from classic digital image processing. == Routing by agreement == The outputs from one capsule (child) are routed to capsules in the next layer (parent) according to the child's ability to predict the parents' outputs. Over the course of a few iterations, each parents' outputs may converge with the predictions of some children and diverge from those of others, meaning that that parent is present or absent from the scene. For each possible parent, each child computes a prediction vector by multiplying its output by a weight matrix (trained by backpropagation). Next the output of the parent is computed as the scalar product of a prediction with a coefficient representing the probability that this child belongs to that parent. A child whose predictions are relatively close to the resulting output successively increases the coefficient between that parent and child and decreases it for parents that it matches less well. This increases the contribution that that child makes to that parent, thus increasing the scalar product of the capsule's prediction with the parent's output. After a few iterations, the coefficients strongly connect a parent to its most likely children, indicating that the presence of the children imply the presence of the parent in the scene. The more children whose predictions are close to a parent's output, the more quickly the coefficients grow, driving convergence. The pose of the parent (reflected in its output) progressively becomes compatible with that of its children. The coefficients' initial logits are the log prior probabilities that a child belongs to a parent. The priors can be trained discriminatively along with the weights. The priors depend on the location and type of the child and parent capsules, but not on the current input. At each iteration, the coefficients are adjusted via a "routing" softmax so that they continue to sum to 1 (to express the probability that a given capsule is the parent of a given child.) Softmax amplifies larger values and diminishes smaller values beyond their proportion of the total. Similarly, the probability that a feature is present in the input is exaggerated by a nonlinear "squashing" function that reduces values (smaller ones drastically and larger ones such that they are less than 1). This dynamic routing mechanism provides the necessary deprecation of alternatives ("explaining away") that is needed for segmenting overlapped objects. This learned routing of signals has no clear biological equivalent. Some operations can be found in cortical layers, but they do not seem to relate this technique. === Math/code === The pose vector u i {\textstyle \mathbf {u} _{i}} is rotated and translated by a matrix W i j {\textstyle \mathbf {W} _{ij}} into a vector u ^ j | i {\textstyle \mathbf {\hat {u}} _{j|i}} that predicts the output of the parent capsule. u ^ j | i = W i j u i {\displaystyle \mathbf {

    Read more →
  • SemEval

    SemEval

    SemEval (Semantic Evaluation) is an ongoing series of evaluations of computational semantic analysis systems; it evolved from the Senseval word sense evaluation series. The evaluations are intended to explore the nature of meaning in language. While meaning is intuitive to humans, transferring those intuitions to computational analysis has proved elusive. This series of evaluations provides a mechanism to characterize in more precise terms exactly what is necessary to compute in meaning. As such, the evaluations provide an emergent mechanism to identify the problems and solutions for computations with meaning. These exercises have evolved to articulate more of the dimensions that are involved in our use of language. They began with apparently simple attempts to identify word senses computationally. They have evolved to investigate the interrelationships among the elements in a sentence (e.g., semantic role labeling), relations between sentences (e.g., coreference), and the nature of what we are saying (semantic relations and sentiment analysis). The purpose of the SemEval and Senseval exercises is to evaluate semantic analysis systems. "Semantic Analysis" refers to a formal analysis of meaning, and "computational" refer to approaches that in principle support effective implementation. The first three evaluations, Senseval-1 through Senseval-3, were focused on word sense disambiguation (WSD), each time growing in the number of languages offered in the tasks and in the number of participating teams. Beginning with the fourth workshop, SemEval-2007 (SemEval-1), the nature of the tasks evolved to include semantic analysis tasks outside of word sense disambiguation. Triggered by the conception of the SEM conference, the SemEval community had decided to hold the evaluation workshops yearly in association with the SEM conference. It was also the decision that not every evaluation task will be run every year, e.g. none of the WSD tasks were included in the SemEval-2012 workshop. == History == === Early evaluation of algorithms for word sense disambiguation === From the earliest days, assessing the quality of word sense disambiguation algorithms had been primarily a matter of intrinsic evaluation, and “almost no attempts had been made to evaluate embedded WSD components”. Only very recently (2006) had extrinsic evaluations begun to provide some evidence for the value of WSD in end-user applications. Until 1990 or so, discussions of the sense disambiguation task focused mainly on illustrative examples rather than comprehensive evaluation. The early 1990s saw the beginnings of more systematic and rigorous intrinsic evaluations, including more formal experimentation on small sets of ambiguous words. === Senseval to SemEval === In April 1997, Martha Palmer and Marc Light organized a workshop entitled Tagging with Lexical Semantics: Why, What, and How? in conjunction with the Conference on Applied Natural Language Processing. At the time, there was a clear recognition that manually annotated corpora had revolutionized other areas of NLP, such as part-of-speech tagging and parsing, and that corpus-driven approaches had the potential to revolutionize automatic semantic analysis as well. Kilgarriff recalled that there was "a high degree of consensus that the field needed evaluation", and several practical proposals by Resnik and Yarowsky kicked off a discussion that led to the creation of the Senseval evaluation exercises. === SemEval's 3, 2 or 1 year(s) cycle === After SemEval-2010, many participants feel that the 3-year cycle is a long wait. Many other shared tasks such as Conference on Natural Language Learning (CoNLL) and Recognizing Textual Entailments (RTE) run annually. For this reason, the SemEval coordinators gave the opportunity for task organizers to choose between a 2-year or a 3-year cycle. The SemEval community favored the 3-year cycle. Although the votes within the SemEval community favored a 3-year cycle, organizers and coordinators had settled to split the SemEval task into 2 evaluation workshops. This was triggered by the introduction of the new SEM conference. The SemEval organizers thought it would be appropriate to associate our event with the SEM conference and collocate the SemEval workshop with the SEM conference. The organizers got very positive responses (from the task coordinators/organizers and participants) about the association with the yearly SEM, and 8 tasks were willing to switch to 2012. Thus was born SemEval-2012 and SemEval-2013. The current plan is to switch to a yearly SemEval schedule to associate it with the SEM conference but not every task needs to run every year. ==== List of Senseval and SemEval Workshops ==== Senseval-1 took place in the summer of 1998 for English, French, and Italian, culminating in a workshop held at Herstmonceux Castle, Sussex, England on September 2–4. Senseval-2 took place in the summer of 2001, and was followed by a workshop held in July 2001 in Toulouse, in conjunction with ACL 2001. Senseval-2 included tasks for Basque, Chinese, Czech, Danish, Dutch, English, Estonian, Italian, Japanese, Korean, Spanish and Swedish. Senseval-3 took place in March–April 2004, followed by a workshop held in July 2004 in Barcelona, in conjunction with ACL 2004. Senseval-3 included 14 different tasks for core word sense disambiguation, as well as identification of semantic roles, multilingual annotations, logic forms, subcategorization acquisition. SemEval-2007 (Senseval-4) took place in 2007, followed by a workshop held in conjunction with ACL in Prague. SemEval-2007 included 18 different tasks targeting the evaluation of systems for the semantic analysis of text. A special issue of Language Resources and Evaluation is devoted to the result. SemEval-2010 took place in 2010, followed by a workshop held in conjunction with ACL in Uppsala. SemEval-2010 included 18 different tasks targeting the evaluation of semantic analysis systems. SemEval-2012 took place in 2012; it was associated with the new SEM, First Joint Conference on Lexical and Computational Semantics, and co-located with NAACL, Montreal, Canada. SemEval-2012 included 8 different tasks targeting at evaluating computational semantic systems. However, there was no WSD task involved in SemEval-2012, the WSD related tasks were scheduled in the upcoming SemEval-2013. SemEval-2013 was associated with NAACL 2013, North American Association of Computational Linguistics, Georgia, USA and took place in 2013. It included 13 different tasks targeting at evaluating computational semantic systems. SemEval-2014 took place in 2014. It was co-located with COLING 2014, 25th International Conference on Computational Linguistics and SEM 2014, Second Joint Conference on Lexical and Computational Semantics, Dublin, Ireland. There were 10 different tasks in SemEval-2014 evaluating various computational semantic systems. SemEval-2015 took place in 2015. It was co-located with NAACL-HLT 2015, 2015 Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies and SEM 2015, Third Joint Conference on Lexical and Computational Semantics, Denver, USA. There were 17 different tasks in SemEval-2015 evaluating various computational semantic systems. == SemEval Workshop framework == The framework of the SemEval/Senseval evaluation workshops emulates the Message Understanding Conferences (MUCs) and other evaluation workshops ran by ARPA (Advanced Research Projects Agency, renamed the Defense Advanced Research Projects Agency (DARPA)). Stages of SemEval/Senseval evaluation workshops Firstly, all likely participants were invited to express their interest and participate in the exercise design. A timetable towards a final workshop was worked out. A plan for selecting evaluation materials was agreed. 'Gold standards' for the individual tasks were acquired, often human annotators were considered as a gold standard to measure precision and recall scores of computer systems. These 'gold standards' are what the computational systems strive towards. In WSD tasks, human annotators were set on the task of generating a set of correct WSD answers (i.e. the correct sense for a given word in a given context) The gold standard materials, without answers, were released to participants, who then had a short time to run their programs over them and return their sets of answers to the organizers. The organizers then scored the answers and the scores were announced and discussed at a workshop. == Semantic evaluation tasks == Senseval-1 & Senseval-2 focused on evaluation WSD systems on major languages that were available corpus and computerized dictionary. Senseval-3 looked beyond the lexemes and started to evaluate systems that looked into wider areas of semantics, such as Semantic Roles (technically known as Theta roles in formal semantics), Logic Form Transformation (commonly semantics of phrases, clauses or sentences were represented

    Read more →
  • National Security Memorandum on Artificial Intelligence

    National Security Memorandum on Artificial Intelligence

    The Memorandum on Advancing the United States' Leadership in Artificial Intelligence; Harnessing Artificial Intelligence to Fulfill National Security Objectives; and Fostering the Safety, Security, and Trustworthiness of Artificial Intelligence is a memorandum signed by U.S. president Joe Biden. The memorandum is described as seeking to advance U.S. leadership in the development of safe, secure, and trustworthy artificial intelligence (AI); enable the U.S. government to use AI for national security; and contribute to international AI governance.

    Read more →
  • Jakub Pachocki

    Jakub Pachocki

    Jakub Pachocki (born 1991) is a Polish computer scientist and former competitive programmer. He is best known as OpenAI's chief scientist and for his role in overseeing development of GPT-4. == Background == Pachocki was born in 1991 in Gdańsk, Poland. In high school, he was a six-time finalist of the Polish Olympiad in Informatics. In 2009, he qualified for the International Olympiad in Informatics, winning a silver medal. Pachocki obtained his undergraduate degree in Computer Science from the University of Warsaw. He represented his university at the International Collegiate Programming Contest with his team winning a gold medal and coming second place overall in 2012. In the same year he was also the champion of the Google Code Jam. From 2011 to 2012, Pachocki worked at Facebook as a software engineering intern. Pachocki attended graduate school at Carnegie Mellon University, where he obtained his PhD under the supervision of Gary Miller. == Career == After graduation, Pachocki did postdoc work at Harvard University and Simons Institute for the Theory of Computing. === OpenAI === In 2017, Pachocki joined OpenAI. In 2021, he became OpenAI's research director where he led the development of GPT-4 and OpenAI Five. In May 2024, he became chief scientist after his mentor Ilya Sutskever left the company. OpenAI CEO Sam Altman has called Pachocki "easily one of the greatest minds of our generation". == Competitive programming achievements == International Olympiad in Informatics: Silver medal (2009) International Collegiate Programming Contest World Finals: Gold medal (second place overall in 2012) Google Code Jam: Champion (2012), Third place (2011) Facebook Hacker Cup: Second place (2013) TopCoder Open Algorithm: Second place (2012) A more comprehensive list of achievements can be found at the Competitive Programming Hall Of Fame website.

    Read more →
  • Stanford Research Institute Problem Solver

    Stanford Research Institute Problem Solver

    The Stanford Research Institute Problem Solver, known by its acronym STRIPS, is an automated planner developed by Richard Fikes and Nils Nilsson in 1971 at SRI International. The same name was later used to refer to the formal language of the inputs to this planner. This language is the base for most of the languages for expressing automated planning problem instances in use today; such languages are commonly known as action languages. This article only describes the language, not the planner. == Definition == A STRIPS instance is composed of: An initial state; The specification of the goal states – situations that the planner is trying to reach; A set of actions. For each action, the following are included: preconditions (what must be established before the action is performed); postconditions (what is established after the action is performed). Mathematically, a STRIPS instance is a quadruple ⟨ P , O , I , G ⟩ {\displaystyle \langle P,O,I,G\rangle } , in which each component has the following meaning: P {\displaystyle P} is a set of conditions (i.e., propositional variables); O {\displaystyle O} is a set of operators (i.e., actions); each operator is itself a quadruple ⟨ α , β , γ , δ ⟩ {\displaystyle \langle \alpha ,\beta ,\gamma ,\delta \rangle } , each element being a set of conditions. These four sets specify, in order, which conditions must be true for the action to be executable, which ones must be false, which ones are made true by the action and which ones are made false; I {\displaystyle I} is the initial state, given as the set of conditions that are initially true (all others are assumed false); G {\displaystyle G} is the specification of the goal state; this is given as a pair ⟨ N , M ⟩ {\displaystyle \langle N,M\rangle } , which specify which conditions are true and false, respectively, in order for a state to be considered a goal state. A plan for such a planning instance is a sequence of operators that can be executed from the initial state and that leads to a goal state. Formally, a state is a set of conditions: a state is represented by the set of conditions that are true in it. Transitions between states are modeled by a transition function, which is a function mapping states into new states that result from the execution of actions. Since states are represented by sets of conditions, the transition function relative to the STRIPS instance ⟨ P , O , I , G ⟩ {\displaystyle \langle P,O,I,G\rangle } is a function succ : 2 P × O → 2 P , {\displaystyle \operatorname {succ} :2^{P}\times O\rightarrow 2^{P},} where 2 P {\displaystyle 2^{P}} is the set of all subsets of P {\displaystyle P} , and is therefore the set of all possible states. The transition function succ {\displaystyle \operatorname {succ} } for a state C ⊆ P {\displaystyle C\subseteq P} , can be defined as follows, using the simplifying assumption that actions can always be executed but have no effect if their preconditions are not met: The function succ {\displaystyle \operatorname {succ} } can be extended to sequences of actions by the following recursive equations: succ ⁡ ( C , [ ] ) = C {\displaystyle \operatorname {succ} (C,[\ ])=C} succ ⁡ ( C , [ a 1 , a 2 , … , a n ] ) = succ ⁡ ( succ ⁡ ( C , a 1 ) , [ a 2 , … , a n ] ) {\displaystyle \operatorname {succ} (C,[a_{1},a_{2},\ldots ,a_{n}])=\operatorname {succ} (\operatorname {succ} (C,a_{1}),[a_{2},\ldots ,a_{n}])} A plan for a STRIPS instance is a sequence of actions such that the state that results from executing the actions in order from the initial state satisfies the goal conditions. Formally, [ a 1 , a 2 , … , a n ] {\displaystyle [a_{1},a_{2},\ldots ,a_{n}]} is a plan for G = ⟨ N , M ⟩ {\displaystyle G=\langle N,M\rangle } if F = succ ⁡ ( I , [ a 1 , a 2 , … , a n ] ) {\displaystyle F=\operatorname {succ} (I,[a_{1},a_{2},\ldots ,a_{n}])} satisfies the following two conditions: N ⊆ F {\displaystyle N\subseteq F} M ∩ F = ∅ {\displaystyle M\cap F=\varnothing } == Extensions == The above language is actually the propositional version of STRIPS; in practice, conditions are often about objects: for example, that the position of a robot can be modeled by a predicate A t {\displaystyle At} , and A t ( r o o m 1 ) {\displaystyle At(room1)} means that the robot is in Room1. In this case, actions can have free variables, which are implicitly existentially quantified. In other words, an action represents all possible propositional actions that can be obtained by replacing each free variable with a value. The initial state is considered fully known in the language described above: conditions that are not in I {\displaystyle I} are all assumed false. This is often a limiting assumption, as there are natural examples of planning problems in which the initial state is not fully known. Extensions of STRIPS have been developed to deal with partially known initial states. == A sample STRIPS problem == A monkey is at location A in a lab. There is a box in location C. The monkey wants the bananas that are hanging from the ceiling in location B, but it needs to move the box and climb onto it in order to reach them. Initial state: At(A), Level(low), BoxAt(C), BananasAt(B) Goal state: Have(bananas) Actions: // move from X to Y _Move(X, Y)_ Preconditions: At(X), Level(low) Postconditions: not At(X), At(Y) // climb up on the box _ClimbUp(Location)_ Preconditions: At(Location), BoxAt(Location), Level(low) Postconditions: Level(high), not Level(low) // climb down from the box _ClimbDown(Location)_ Preconditions: At(Location), BoxAt(Location), Level(high) Postconditions: Level(low), not Level(high) // move monkey and box from X to Y _MoveBox(X, Y)_ Preconditions: At(X), BoxAt(X), Level(low) Postconditions: BoxAt(Y), not BoxAt(X), At(Y), not At(X) // take the bananas _TakeBananas(Location)_ Preconditions: At(Location), BananasAt(Location), Level(high) Postconditions: Have(bananas) == Complexity == Deciding whether any plan exists for a propositional STRIPS instance is PSPACE-complete. Various restrictions can be enforced in order to decide if a plan exists in polynomial time or at least make it an NP-complete problem. == Macro operator == In the monkey and banana problem, the robot monkey has to execute a sequence of actions to reach the banana at the ceiling. A single action provides a small change in the game. To simplify the planning process, it make sense to invent an abstract action, which isn't available in the normal rule description. The super-action consists of low level actions and can reach high-level goals. The advantage is that the computational complexity is lower, and longer tasks can be planned by the solver. Identifying new macro operators for a domain can be realized with genetic programming. The idea is, not to plan the domain itself, but in the pre-step, a heuristics is created that allows the domain to be solved much faster. In the context of reinforcement learning, a macro-operator is called an option. Similar to the definition within AI planning, the idea is, to provide a temporal abstraction (span over a longer period) and to modify the game state directly on a higher layer.

    Read more →
  • Local-first software

    Local-first software

    Local-first software is a software engineering approach in which an application stores its data primarily on the user's own device rather than on remote servers. Users can read and write data without an Internet connection, and changes are synchronized across devices in the background when connectivity is available. The approach differs from conventional cloud-based applications, where the server holds the authoritative copy of user data and the client acts as a thin client. The term was coined in a 2019 paper published by researchers at Ink & Switch, an independent research lab, and presented at the Onward! conference at ACM SIGPLAN. The paper, sometimes referred to as a manifesto, was authored by Martin Kleppmann, Adam Wiggins, Peter van Hardenberg, and Mark McGranaghan. == Background == Before the widespread adoption of Internet-connected software in the 2000s, most desktop applications stored data as files on the user's local disk. Users had direct access to their files and could copy, back up, or delete them at will. The rise of software as a service (SaaS) and cloud-based applications like Google Docs shifted data storage to centralized servers. While cloud applications made real-time collaboration across devices straightforward, they introduced a dependency on the service provider: if the provider discontinued the service or experienced an outage, users could lose access to their data. A related concept, "offline-first," emerged in the early 2010s and focused on making web applications resilient to network interruptions. The local-first approach built on these earlier efforts while placing greater emphasis on long-term data ownership and end-to-end encryption. == Origins == === Ink & Switch manifesto === Ink & Switch is an industrial research lab co-founded by Adam Wiggins, who had earlier co-founded Heroku. Martin Kleppmann, an associate professor in the Department of Computer Science and Technology at the University of Cambridge, was a co-author of the 2019 paper. The manifesto proposed seven "ideals" for local-first software: Fast — Operations respond without network round-trips. Multi-device — Data synchronizes across a user's devices. Offline — Users can read and write data without a network connection. Collaboration — Multiple users can work on the same data concurrently. Longevity — Data remains accessible even if the software vendor ceases operation. Privacy — End-to-end encryption protects user data. User control — The vendor cannot restrict how users access or use their data. The paper surveyed existing approaches to data storage and collaboration — ranging from email attachments and Dropbox-style file synchronization to web applications and mobile backends — and argued that none of them satisfied all seven ideals simultaneously. === Role of CRDTs === The manifesto identified conflict-free replicated data types (CRDTs) as a promising technical foundation for local-first applications. CRDTs are data structures that allow multiple replicas to be edited independently and then merged without conflicts, a property first formalized in research by Marc Shapiro and colleagues around 2011. Kleppmann and collaborators at Ink & Switch developed Automerge, an open-source CRDT library for JSON documents, to make these algorithms available to application developers. == Adoption and community == Developer interest in the local-first approach grew after the 2019 paper spread on Hacker News and at developer conferences In August 2023, Wired published a feature article on the movement, describing it as an effort to reduce reliance on large cloud providers. The first Local-First Conf took place on 30 May 2024 in Berlin, with talks by Kleppmann and developers from companies including Linear and Anytype. The community has continued to expand, with regular "LoFi" meetups, a podcast (localfirst.fm), and a third edition of the conference planned for Berlin in July 2026. == Criticisms and limitations == Developers and commentators have pointed out practical difficulties with the local-first approach. Synchronizing data between multiple devices that may be offline for extended periods introduces complexity that cloud-based architectures avoid. Conflict resolution, even with CRDTs, can produce results that are technically consistent but semantically unexpected to users. Schema migrations across thousands of client devices running different application versions pose another difficulty that does not arise with server-side databases. Web browsers impose storage limits and may evict locally stored data. Safari, for instance, has been reported to clear IndexedDB data after seven days of inactivity on a given site, which undermines the assumption that local data is persistent. There is also disagreement within the local-first community about whether a fully decentralized architecture is required. The original manifesto described decentralization as the "logical end goal," but a number of products that identify as local-first still depend on centralized servers for authentication, backup, or synchronization. In a talk at Local-First Conf 2024, Kleppmann said the seven ideals are better understood as a "gradient" rather than a strict checklist.

    Read more →
  • Fifth Generation Computer Systems

    Fifth Generation Computer Systems

    The Fifth Generation Computer Systems (FGCS; Japanese: 第五世代コンピュータ, romanized: daigosedai konpyūta) was a 10-year initiative launched in 1982 by Japan's Ministry of International Trade and Industry (MITI) to develop computers based on massively parallel computing and logic programming. The project aimed to create an "epoch-making computer" with supercomputer-like performance and to establish a platform for future advancements in artificial intelligence. Although FGCS was noted as ahead of its time, and its ambitious goals contributed significantly to the development of concurrent logic programming, it ultimately ended in commercial failure. The term "fifth generation" was chosen to emphasize the system's advanced nature. In the history of computing hardware, there had been four prior "generations" of computers: the first generation utilized vacuum tubes; the second, transistors and diodes; the third, integrated circuits; and the fourth, microprocessors. While earlier generations focused on increasing the number of logic elements within a single CPU, it was widely believed at the time that the fifth generation would achieve enhanced performance through the use of massive numbers of CPUs. == Background == In the late 1960s until the early 1970s, there was much talk about "generations" of computer hardware, then usually organized into three generations First generation: Thermionic vacuum tubes. Mid-1940s. IBM pioneered the arrangement of vacuum tubes in pluggable modules. The IBM 650 was a first-generation computer. Second generation: Transistors. 1956. The era of miniaturization begins. Transistors are much smaller than vacuum tubes, draw less power, and generate less heat. Discrete transistors are soldered to circuit boards, with interconnections accomplished by stencil-screened conductive patterns on the reverse side. The IBM 7090 was a second-generation computer. Third generation: Integrated circuits (silicon chips containing multiple transistors). 1964. A pioneering example is the ACPX module used in the IBM 360/91, which, by stacking layers of silicon over a ceramic substrate, accommodated over 20 transistors per chip; the chips could be packed together onto a circuit board to achieve unprecedented logic densities. The IBM 360/91 was a hybrid second and third-generation computer. Omitted from this taxonomy is the "zeroth-generation" computer based on metal gears (such as the IBM 407) or mechanical relays (such as the Mark I), and the post-third-generation computers based on Very Large Scale Integrated (VLSI) circuits. There was also a parallel set of generations for software: First generation: Machine language. Second generation: Low-level programming languages such as Assembly language. Third generation: Structured high-level programming languages such as C, COBOL and FORTRAN. Fourth generation: "Non-procedural" high-level programming languages (such as object-oriented languages). Throughout these multiple generations up to the 1970s, Japan built computers following U.S. and British leads. In the mid-1970s, the Ministry of International Trade and Industry stopped following western leads and started looking into the future of computing on a small scale. They asked the Japan Information Processing Development Center (JIPDEC) to indicate a number of future directions, and in 1979 offered a three-year contract to carry out more in-depth studies along with industry and academia. It was during this period that the term "fifth-generation computer" started to be used. Prior to the 1970s, MITI guidance had successes such as an improved steel industry, the creation of the oil supertanker, the automotive industry, consumer electronics, and computer memory. MITI decided that the future was going to be information technology. However, the Japanese language, particularly in its written form, presented and still presents obstacles for computers. As a result of these hurdles, MITI held a conference to seek assistance from experts. The primary fields for investigation from this initial project were: Inference computer technologies for knowledge processing Computer technologies to process large-scale data bases and knowledge bases High-performance workstations Distributed functional computer technologies Super-computers for scientific calculation == Project launch == The aim was to build parallel computers for artificial intelligence applications using concurrent logic programming. The project imagined an "epoch-making" computer with supercomputer-like performance running on top of large databases (as opposed to a traditional filesystem) using a logic programming language to define and access the data using massively parallel computing/processing. They envisioned building a prototype machine with performance between 100M and 1G LIPS, where a LIPS is a Logical Inference Per Second. At the time typical workstation machines were capable of about 100k LIPS. They proposed to build this machine over a ten-year period, 3 years for initial R&D, 4 years for building various subsystems, and a final 3 years to complete a working prototype system. In 1982 the government decided to go ahead with the project, and established the Institute for New Generation Computer Technology (ICOT) through joint investment with various Japanese computer companies. After the project ended, MITI would consider an investment in a new "sixth generation" project. Ehud Shapiro captured the rationale and motivations driving this project: "As part of Japan's effort to become a leader in the computer industry, the Institute for New Generation Computer Technology has launched a revolutionary ten-year plan for the development of large computer systems which will be applicable to knowledge information processing systems. These Fifth Generation computers will be built around the concepts of logic programming. In order to refute the accusation that Japan exploits knowledge from abroad without contributing any of its own, this project will stimulate original research and will make its results available to the international research community." === Logic programming === The target defined by the FGCS project was to develop "Knowledge Information Processing systems" (roughly meaning, applied Artificial Intelligence). The chosen tool to implement this goal was logic programming. Logic programming approach as was characterized by Maarten Van Emden – one of its founders – as: The use of logic to express information in a computer. The use of logic to present problems to a computer. The use of logical inference to solve these problems. More technically, it can be summed up in two equations: Program = Set of axioms. Computation = Proof of a statement from axioms. The Axioms typically used are universal axioms of a restricted form, called Horn-clauses or definite-clauses. The statement proved in a computation is an existential statement. The proof is constructive, and provides values for the existentially quantified variables: these values constitute the output of the computation. Logic programming was thought of as something that unified various gradients of computer science (software engineering, databases, computer architecture and artificial intelligence). It seemed that logic programming was a key missing connection between knowledge engineering and parallel computer architectures. == Results == After having influenced the consumer electronics field during the 1970s and the automotive world during the 1980s, the Japanese had developed a strong reputation. The launch of the FGCS project spread the belief that parallel computing was the future of all performance gains, producing a wave of apprehension in the computer field. Soon parallel projects were set up in the US as the Strategic Computing Initiative and the Microelectronics and Computer Technology Corporation (MCC), in the UK as Alvey, and in Europe as the European Strategic Program on Research in Information Technology (ESPRIT), as well as the European Computer‐Industry Research Centre (ECRC) in Munich, a collaboration between ICL in Britain, Bull in France, and Siemens in Germany. The project ran from 1982 to 1994, spending a little less than ¥57 billion (about US$320 million) total. After the FGCS Project, MITI stopped funding large-scale computer research projects, and the research momentum developed by the FGCS Project dissipated. However MITI/ICOT embarked on a neural-net project which some called the Sixth Generation Project in the 1990s, with a similar level of funding. Per-year spending was less than 1% of the entire R&D expenditure of the electronics and communications equipment industry. For example, the project's highest expenditure year was 7.2 million yen in 1991, but IBM alone spent 1.5 billion dollars (370 billion yen) in 1982, while the industry spent 2150 billion yen in 1990. === Concurrent logic programming === In 1982, during a visit to the ICOT, Ehud Shapiro invented Concurrent Prolog, a novel programming language t

    Read more →
  • Innovation Center for Artificial Intelligence

    Innovation Center for Artificial Intelligence

    The Innovation Center for Artificial Intelligence (ICAI) is a Dutch national network focused on joint technology development between academia, industry and government in the area of artificial intelligence (AI). The initiative was launched in April 2018 and is based at Amsterdam Science Park. As of 2024, the director of the ICAI is Maarten de Rijke. In November 2018, ICAI announced its contribution to AINED, the first iteration of the Dutch National AI Strategy. In January 2023, Maastricht University announced the ROBUST program, led by the Innovation Center for Artificial Intelligence (ICAI) and supported by the University of Amsterdam and others. This initiative focuses on advancing research in trustworthy AI technology across various sectors, notably healthcare and energy, in the Netherlands. The program's plan includes the creation of 17 new labs and the appointment of PhD candidates, backed by a €25 million funding from the Dutch Research Council (NWO). == Labs == The ICAI network is linked to several collaborative labs: Thira Lab (Imaging): Thirona, Delft Imaging Systems and Radboud UMC, founded March 2019 AIMLab (AI for Medical Imaging): Uva and Inception Institute of Artificial Intelligence from the United Arab Emirates, founded March 2019 AFL (AI for Fintech): ING and Delft University of Technology, founded March 2019 Police Lab AI: Dutch National Police, founded January 2019 Elsevier AI Lab: Uva and Elsevier, founded October 2018 AIRLab Delft (AI for Retail Robotics): TU Delft Robotics and AholdDelhaize, founded November 2018 Quva Lab (Deep Vision): Uva and Qualcomm, founded 2016 (prior to ICAI) AIRLab Amsterdam (AI for Retail): Uva and AholdDelhaize, founded April 2018 DeltaLab (Deep Learning Technologies Amsterdam): Uva and Bosch, founded April 2017 (prior to ICAI) AI4SE (AI for Software Engineering Lab) Delft University of Technology and JetBrains, founded October 2023 Atlas Lab: Uva and TomTom (TOM2)

    Read more →
  • Transaction logic

    Transaction logic

    Transaction Logic is an extension of predicate logic that accounts in a clean and declarative way for the phenomenon of state changes in logic programs and databases. This extension adds connectives specifically designed for combining simple actions into complex transactions and for providing control over their execution. The logic has a natural model theory and a sound and complete proof theory. Transaction Logic has a Horn clause subset, which has a procedural as well as a declarative semantics. The important features of the logic include hypothetical and committed updates, dynamic constraints on transaction execution, non-determinism, and bulk updates. In this way, Transaction Logic is able to declaratively capture a number of non-logical phenomena, including procedural knowledge in artificial intelligence, active databases, and methods with side effects in object databases. Transaction Logic was originally proposed in 1993 by Anthony Bonner and Michael Kifer and later described in more detail in An Overview of Transaction Logic and Logic Programming for Database Transactions. The most comprehensive description appears in Bonner & Kifer's technical report from 1995. In later years, Transaction Logic was extended in various ways, including concurrency, defeasible reasoning, partially defined actions, and other features. In 2013, the original paper on Transaction Logic has won the 20-year Test of Time Award of the Association for Logic Programming as the most influential paper from the proceedings of ICLP 1993 conference in the preceding 20 years. == Examples == === Graph coloring === Here tinsert denotes the elementary update operation of transactional insert. The connective ⊗ is called serial conjunction. === Pyramid stacking === The elementary update tdelete represents the transactional delete operation. === Hypothetical execution === Here <> is the modal operator of possibility: If both action1 and action2 are possible, execute action1. Otherwise, if only action2 is possible, then execute it. === Dining philosophers === Here | is the logical connective of parallel conjunction of Concurrent Transaction Logic. == Implementations == A number of implementations of Transaction Logic exist: The original implementation. An implementation of Concurrent Transaction Logic. Transaction Logic enhanced with tabling. An implementation of Transaction Logic has also been incorporated as part of the Flora-2 knowledge representation and reasoning system. All these implementations are open source.

    Read more →
  • Text-to-video model

    Text-to-video model

    A text-to-video model is a form of generative artificial intelligence that uses a natural language description as input to produce a video relevant to the input text. Advancements during the 2020s in the generation of high-quality, text-conditioned videos have largely been driven by the development of video diffusion models. == Models == There are different models, including open source models. Chinese-language input CogVideo is the earliest text-to-video model "of 9.4 billion parameters" to be developed, with its demo version of open source codes first presented on GitHub in 2022. That year, Meta Platforms released a partial text-to-video model called "Make-A-Video", and Google's Brain (later Google DeepMind) introduced Imagen Video, a text-to-video model with 3D U-Net. === 2023 === In February 2023, Runway released Gen-1 and Gen-2, among the first commercially available text-to-video and video-to-video models accessible to the public through a web interface. Gen-1, initially released as a video-to-video model, allowed users to transform existing video footage using text or image prompts. Gen-2, introduced in March 2023 and made publicly available in June 2023, added text-to-video capabilities, enabling users to generate videos from text prompts alone. In March 2023, a research paper titled "VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation" was published, presenting a novel approach to video generation. The VideoFusion model decomposes the diffusion process into two components: base noise and residual noise, which are shared across frames to ensure temporal coherence. By utilizing a pre-trained image diffusion model as a base generator, the model efficiently generated high-quality and coherent videos. Fine-tuning the pre-trained model on video data addressed the domain gap between image and video data, enhancing the model's ability to produce realistic and consistent video sequences. In the same month, Adobe introduced Firefly AI as part of its features. === 2024 === In January 2024, Google announced development of a text-to-video model named Lumiere which is anticipated to integrate advanced video editing capabilities. Matthias Niessner and Lourdes Agapito at AI company Synthesia work on developing 3D neural rendering techniques that can synthesise realistic video by using 2D and 3D neural representations of shape, appearances, and motion for controllable video synthesis of avatars. In June 2024, Luma Labs launched its Dream Machine video tool. That same month, Kuaishou extended its Kling AI text-to-video model to international users. In July 2024, TikTok owner ByteDance released Jimeng AI in China, through its subsidiary, Faceu Technology. By September 2024, the Chinese AI company MiniMax debuted its video-01 model, joining other established AI model companies like Zhipu AI, Baichuan, and Moonshot AI, which contribute to China's involvement in AI technology. In December 2024 Lightricks launched LTX Video as an open source model. === 2025 === Alternative approaches to text-to-video models include Google's Phenaki, Hour One, Colossyan, Runway's Gen-3 Alpha, and OpenAI's Sora, Several additional text-to-video models, such as Plug-and-Play, Text2LIVE, and TuneAVideo, have emerged. FLUX.1 developer Black Forest Labs has announced its text-to-video model SOTA. Google was preparing to launch a video generation tool named Veo for YouTube Shorts in 2025. In May 2025, Google launched the Veo 3 iteration of the model. It was noted for its impressive audio generation capabilities, which were a previous limitation for text-to-video models. In July 2025 Lightricks released an update to LTX Video capable of generating clips reaching 60 seconds, and in October 2025 it released LTX-2, with audio capabilities built in. === 2026 === In February 2026, ByteDance released Seedance 2.0, it was noted for its impressive realistic generation, motion and camera control and 15 second generation, however the model faced huge critiscism from Motion Picture Association for copyright infringement. After viewing a viral clip of a fight between actors Brad Pitt and Tom Cruise, Rhett Reese, who is the co-writer of Deadpool & Wolverine and Zombieland announced that on social media "I hate to say it. It’s likely over for us," further stating that "In next to no time, one person is going to be able to sit at a computer and create a movie indistinguishable from what Hollywood now releases." == Architecture and training == There are several architectures that have been used to create text-to-video models. Similar to text-to-image models, these models can be trained using Recurrent Neural Networks (RNNs) such as long short-term memory (LSTM) networks, which has been used for Pixel Transformation Models and Stochastic Video Generation Models, which aid in consistency and realism respectively. An alternative for these include transformer models. Generative adversarial networks (GANs), Variational autoencoders (VAEs), — which can aid in the prediction of human motion — and diffusion models have also been used to develop the image generation aspects of the model. Text-video datasets used to train models include, but are not limited to, WebVid-10M, HDVILA-100M, CCV, ActivityNet, and Panda-70M. These datasets contain millions of original videos of interest, generated videos, captioned-videos, and textual information that help train models for accuracy. Text-video datasets used to train models include, but are not limited to PromptSource, DiffusionDB, and VidProM. These datasets provide the range of text inputs needed to teach models how to interpret a variety of textual prompts. The video generation process involves synchronizing the text inputs with video frames, ensuring alignment and consistency throughout the sequence. This predictive process is subject to decline in quality as the length of the video increases due to resource limitations. The Will Smith Eating Spaghetti test is a benchmark for models. == Limitations == Despite the rapid evolution of text-to-video models in their performance, a primary limitation is that they are very computationally heavy which limits its capacity to provide high quality and lengthy outputs. Additionally, these models require a large amount of specific training data to be able to generate high quality and coherent outputs, which brings about the issue of accessibility. Moreover, models may misinterpret textual prompts, resulting in video outputs that deviate from the intended meaning. This can occur due to limitations in capturing semantic context embedded in text, which affects the model's ability to align generated video with the user's intended message. Various models, including Make-A-Video, Imagen Video, Phenaki, CogVideo, GODIVA, and NUWA, are currently being tested and refined to enhance their alignment capabilities and overall performance in text-to-video generation. Another issue with the outputs is that text or fine details in AI-generated videos often appear garbled, a problem that stable diffusion models also struggle with. Examples include distorted hands and unreadable text. == Ethics == The deployment of text-to-video models raises ethical considerations related to content generation. These models have the potential to create inappropriate or unauthorized content, including explicit material, graphic violence, misinformation, and likenesses of real individuals without consent. Ensuring that AI-generated content complies with established standards for safe and ethical usage is essential, as content generated by these models may not always be easily identified as harmful or misleading. The ability of AI to recognize and filter out NSFW or copyrighted content remains an ongoing challenge, with implications for both creators and audiences. == Impacts and applications == Text-to-video models offer a broad range of applications that may benefit various fields, from educational and promotional to creative industries. These models can streamline content creation for training videos, movie previews, gaming assets, and visualizations, making it easier to generate content. During the Russo-Ukrainian war, fake videos made with artificial intelligence were created as part of a propaganda war against Ukraine and shared in social media. These included depictions of children in the Ukrainian Armed Forces, fake ads targeting children encouraging them to denounce critics of the Ukrainian government, or fictitious statements by Ukrainian President Volodymyr Zelenskyy about the country's surrender, among others. === Movies === Kaur vs Kore is the first Indian feature film made using generative AI which features dual role for the AI character of Sunny Leone, set to release in 2026. Chiranjeevi Hanuman – The Eternal is an Indian movie made entirely using Generative AI created by Vijay Subramaniam which is set for theatrical release in 2026. The movie was widely criticised by the Film makers in the Bollywood industr

    Read more →
  • Hyper basis function network

    Hyper basis function network

    In machine learning, a Hyper basis function network, or HyperBF network, is a generalization of radial basis function (RBF) networks concept, where the Mahalanobis-like distance is used instead of the Euclidean distance measure. Hyper basis function networks were first introduced by Poggio and Girosi in the 1990 paper “Networks for Approximation and Learning”. == Network Architecture == The typical HyperBF network structure consists of a real input vector x ∈ R n {\displaystyle x\in \mathbb {R} ^{n}} , a hidden layer of activation functions and a linear output layer. The output of the network is a scalar function of the input vector, ϕ : R n → R {\displaystyle \phi :\mathbb {R} ^{n}\to \mathbb {R} } , is given by where N {\displaystyle N} is a number of neurons in the hidden layer, μ j {\displaystyle \mu _{j}} and a j {\displaystyle a_{j}} are the center and weight of neuron j {\displaystyle j} . The activation function ρ j ( | | x − μ j | | ) {\displaystyle \rho _{j}(||x-\mu _{j}||)} at the HyperBF network takes the following form where R j {\displaystyle R_{j}} is a positive definite d × d {\displaystyle d\times d} matrix. Depending on the application, the following types of matrices R j {\displaystyle R_{j}} are usually considered R j = 1 2 σ 2 I d × d {\displaystyle R_{j}={\frac {1}{2\sigma ^{2}}}\mathbb {I} _{d\times d}} , where σ > 0 {\displaystyle \sigma >0} . This case corresponds to the regular RBF network. R j = 1 2 σ j 2 I d × d {\displaystyle R_{j}={\frac {1}{2\sigma _{j}^{2}}}\mathbb {I} _{d\times d}} , where σ j > 0 {\displaystyle \sigma _{j}>0} . In this case, the basis functions are radially symmetric, but are scaled with different width. R j = d i a g ( 1 2 σ j 1 2 , . . . , 1 2 σ j z 2 ) I d × d {\displaystyle R_{j}=diag\left({\frac {1}{2\sigma _{j1}^{2}}},...,{\frac {1}{2\sigma _{jz}^{2}}}\right)\mathbb {I} _{d\times d}} , where σ j i > 0 {\displaystyle \sigma _{ji}>0} . Every neuron has an elliptic shape with a varying size. Positive definite matrix, but not diagonal. == Training == Training HyperBF networks involves estimation of weights a j {\displaystyle a_{j}} , shape and centers of neurons R j {\displaystyle R_{j}} and μ j {\displaystyle \mu _{j}} . Poggio and Girosi (1990) describe the training method with moving centers and adaptable neuron shapes. The outline of the method is provided below. Consider the quadratic loss of the network H [ ϕ ∗ ] = ∑ i = 1 N ( y i − ϕ ∗ ( x i ) ) 2 {\displaystyle H[\phi ^{}]=\sum _{i=1}^{N}(y_{i}-\phi ^{}(x_{i}))^{2}} . The following conditions must be satisfied at the optimum: where R j = W T W {\displaystyle R_{j}=W^{T}W} . Then in the gradient descent method the values of a j , μ j , W {\displaystyle a_{j},\mu _{j},W} that minimize H [ ϕ ∗ ] {\displaystyle H[\phi ^{}]} can be found as a stable fixed point of the following dynamic system: where ω {\displaystyle \omega } determines the rate of convergence. Overall, training HyperBF networks can be computationally challenging. Moreover, the high degree of freedom of HyperBF leads to overfitting and poor generalization. However, HyperBF networks have an important advantage that a small number of neurons is enough for learning complex functions.

    Read more →
  • Innovation Center for Artificial Intelligence

    Innovation Center for Artificial Intelligence

    The Innovation Center for Artificial Intelligence (ICAI) is a Dutch national network focused on joint technology development between academia, industry and government in the area of artificial intelligence (AI). The initiative was launched in April 2018 and is based at Amsterdam Science Park. As of 2024, the director of the ICAI is Maarten de Rijke. In November 2018, ICAI announced its contribution to AINED, the first iteration of the Dutch National AI Strategy. In January 2023, Maastricht University announced the ROBUST program, led by the Innovation Center for Artificial Intelligence (ICAI) and supported by the University of Amsterdam and others. This initiative focuses on advancing research in trustworthy AI technology across various sectors, notably healthcare and energy, in the Netherlands. The program's plan includes the creation of 17 new labs and the appointment of PhD candidates, backed by a €25 million funding from the Dutch Research Council (NWO). == Labs == The ICAI network is linked to several collaborative labs: Thira Lab (Imaging): Thirona, Delft Imaging Systems and Radboud UMC, founded March 2019 AIMLab (AI for Medical Imaging): Uva and Inception Institute of Artificial Intelligence from the United Arab Emirates, founded March 2019 AFL (AI for Fintech): ING and Delft University of Technology, founded March 2019 Police Lab AI: Dutch National Police, founded January 2019 Elsevier AI Lab: Uva and Elsevier, founded October 2018 AIRLab Delft (AI for Retail Robotics): TU Delft Robotics and AholdDelhaize, founded November 2018 Quva Lab (Deep Vision): Uva and Qualcomm, founded 2016 (prior to ICAI) AIRLab Amsterdam (AI for Retail): Uva and AholdDelhaize, founded April 2018 DeltaLab (Deep Learning Technologies Amsterdam): Uva and Bosch, founded April 2017 (prior to ICAI) AI4SE (AI for Software Engineering Lab) Delft University of Technology and JetBrains, founded October 2023 Atlas Lab: Uva and TomTom (TOM2)

    Read more →