AI Chat Interface

AI Chat Interface — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Pixel shift

    Pixel shift

    Pixel shift is a method in digital cameras for producing a super-resolution image. The method works by taking several images, after each such capture moving ("shifting") the sensor to a new position. In digital colour cameras that employ pixel shift, this avoids a major limitation inherent in using Bayer pattern for obtaining colour, and instead produces an image with increased colour resolution and, assuming a static subject or additional computational steps, an image free of colour moiré. Taking this idea further, sub-pixel shifting may increase the resolution of the final image beyond that suggested by the specified resolution of the image sensor. Additionally, assuming that the various individual captures are taken at the same sensitivity, the final combined image will have less image noise than a single capture. This can be thought of as an averaging effect (for instance, in a pixel shift image composed of four individual frames with a classic Bayer pattern, every pixel in the final colour image is based on two measurements of the green channel). == List of cameras implementing pixel shift == All of the following cameras are fabricated with one imaging sensor, thus any kind of pixel shift requires a movement of the whole sensor. === Canon === Canon R5: Contains a 45 Mpixel sensor. The High-Resolution Mode shifts the sensor by one pixel to obtain a sequence of nine images that are merged into a 400 Mpixel image. === Fujifilm === Fujifilm GFX50S II: contains a 51 Mpixel sensor. The Pixel Shift Multi-Shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of 16 images that are subsequently merged into a 200 Mpixel image. Fujifilm GFX100, Fujifilm GFX100 II: contains a 102 Mpixel sensor. A sequence of 16 pixel shifted images are merged into a 400 Mpixel image. Fujifilm GFX100S, Fujifilm GFX100S II: contains a 102 Mpixel sensor. A sequence of 16 pixel shifted images are merged into a 400 Mpixel image Fujifilm GFX100IR: contains a 102 Mpixel sensor. A sequence of 16 pixel shifted images are merged into a 400 Mpixel image Fujifilm X-H2: contains a 40 Mpixel sensor. A sequence of 20 shifted images are merged into a 160 Mpixel image. Fujifilm X-T5: contains a 40 Mpixel sensor. A sequence of 20 shifted images are merged into a 160 Mpixel image. === Nikon === Nikon Z8: contains a 47.5 Mpixel sensor. The High Res shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of up to 32 images that can be merged in Nikon's NX studio software. Nikon Zf: contains a 24 Mpixel sensor. The High Res shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of up to 32 images that can be merged in Nikon's NX studio software. === Olympus === Olympus OM-D E-M1 Mark II: contains a 20.4 Mpixel sensor. The High Res shot mode produces a 50 Mpixel image. Olympus OM-D E-M5 Mark II: contains a 16 Mpixel sensor. The High Res shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of 8 images that are subsequently merged into a 40 Mpixel image. Olympus OM-D E-M5 Mark III: contains a 20.4 Mpixel sensor. The High Res shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of 8 images that are subsequently merged into a 50 Mpixel image. Olympus OM-D E-M1X: contains a 20.4 Mpixel sensor. The camera sports two pixel shift mode: (a) the 80Mp Tripod mode produces an 80 Mpixel image, (b) the Handheld High Res shot mode produces a 50 Mpixel image. Olympus PEN-F: contains a 20.4 Mpixel sensor. The High Res Shot mode takes multiple images, continually shifting the position of the sensor in sub-pixel increments. Combining these images results in either a 50MP JPEG or an 80MP Raw file. ==== OM System ==== OM System OM-1: contains a 20MPix sensor. The High Res Shot mode takes multiple images, and it can be used handheld or on a tripod. Handheld it will internally produce 50 Mpix files and 80 Mpix when mounted on a tripod. OM System OM-5: contains a 20MPix sensor. The High Res Shot mode takes multiple images, and it can be used handheld or on a tripod. Handheld it will internally produce 50 Mpix files and 80 Mpix when mounted on a tripod. === Panasonic === Panasonic Lumix DC-G9: contains a 20.3 Mpixel sensor. The High Resolution Mode takes a sequence of 8 shots in quick succession between which the sensor is shifted by 0.5 pixel for each image. These are subsequently merged into an 80 Mpixel image. Panasonic Lumix DC-S1: contains a 24.2 Mpixel sensor. The High Resolution Mode takes a sequence of shots in quick succession between which the sensor is shifted by a small amount. These are subsequently merged into a 96 Mpixel image. Panasonic Lumix DC-S1R: contains a 47.3 Mpixel sensor. The High Resolution Mode shifts the imaging sensor by a small increments to obtain a sequence of 8 images that are subsequently merged into a 187 Mpixel image. Panasonic Lumix DC-S1H Panasonic Lumix DC-S5 === Pentax === Pentax K-70: contains a 24.3 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'all color data in each pixel to deliver super-high-resolution images'. Pentax KP: contains a 24.3 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'high-resolution images with more accurate colours and much finer details'. Pentax K-3 II: contains a 24.3 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'super-high-resolution images with far more truthful color reproduction and much finer details'. Pentax K-3 III: contains a 25.7 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'a cancelling out of the Bayer pattern and removal of the need for sharpness-sapping demosaicing'. Pentax K-1: contains a 36.4 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'improved detail and colour resolution'. Pentax K-1 II: contains a 36.4 Mpixel sensor. The camera sports two pixel shift mode: (a) a series of 4 tripod-stabilised images shifted by 1 pixel each are subsequently combined into a 47.3 Mpixel image, (b) a series of images taken in handheld mode are combined into a 47.3 Mpixel image that is, within limits, able to cope even with moving subjects. === Sony === Sony a6600: contains a 24.3 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'all color data in each pixel to deliver super-high-resolution images'. Sony α7R III: contains a 42.4 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into a 42.4 Mpixel image with improved tonal resolution. Sony α7R IV: contains a 61 Mpixel sensor. The camera has two pixel shift modes, (a) the first takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into a 61 Mpixel image with improved tonal resolution, (b) the other takes a sequence of 16 shots between which the sensor is shifted by 0.5 pixel. These are subsequently merged into a 240 Mpixel image with both enhanced detail and improved tonal resolution. Sony α1: contains a 50 Mpixel sensor. The camera has two pixel shift modes, (a) the first takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into a 50 Mpixel image with improved tonal resolution, (b) the other takes a sequence of 16 shots between which the sensor is shifted by 0.5 pixel. These are subsequently merged into a 200 Mpixel image with both enhanced detail and improved tonal resolution. === Hasselblad === Hasselblad H3DII: the model H3DII-39 sports a 39 Mpixel sensor, the model H3DII-50 a 50 Mpixel sensor. Both enable a pixel shift mode which takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into a single image. Hasselblad H4D series: the model H4D-200MS contains a 50 Mpixel sensor. The sensor sports 3 different pixel shift modes which take (a) a sequence of 6 shots taken at slight offsets, (b) a sequence of 4 shots between which the sensor is shifted by 1 pixel, (c) a sequence of 4 shots between which the sensor is shifted by 0.5 pixels. Images obtained by all three modes are subsequently merged into 200 Mpixel images. Hasselblad H5D series: both models H5D-50c MS and H5D-200c MS contain a 50 Mpixel sensor. This sensor sports 2 different pixel shift modes which take (a) a sequence of 6 shots with full and half pixel moveme

    Read more →
  • Sparkles emoji

    Sparkles emoji

    The Sparkles emoji (U+2728 ✨ SPARKLES) is an emoji that has one large star surrounded by smaller stars. Originating from Japan to represent sparkles used in anime and manga, the sparkles are often used as emphasis in text by surrounding words or phrases with it. It is the third most-used emoji in the world on Twitter as of 2021. Since the early 2020s it has been used by major software companies to represent artificial intelligence, marketing the technology as "like magic". == Development == According to Emojipedia, the Sparkles emoji was first used by Japanese mobile operators SoftBank, Docomo and au in the late 1990s. The emoji was added to Unicode 6.0 in 2010 and Emoji 1.0 in 2015. On some platforms the Sparkles emoji has been multicoloured whilst on other platforms it has been one colour. Twitter and Microsoft's Sparkles have changed from being multicoloured to being a single colour. Samsung's version of the emoji previously had a night sky in the background. == Usage == === Interpersonal communication === The Sparkles emoji was originally meant to represent the usage of sparkles in Japanese anime and manga, where the sparkles are used to represent beauty, happiness or awe. The emoji has several meanings and depends upon context. Starting in the late 2010s, the emoji started being used to surround words or phrases to be used as emphasis, an example from the book Because Internet being "I would simply ✨pass away✨". It can also be used as sarcasm, irony or as a way to mock people. Without emoji this could be represented with tildes or asterisks, for example, "~tildes~" or "~asterisk plus tilde~" or "~~true sparkle exuberance~~". The sparkles emoji can be used to represent stars in text, be used to represent cleanliness or can be used to mean "orgasm" whilst sexting. In September 2021 the Sparkles emoji overtook the Pleading Face (🥺) emoji to become the third most-used emoji in the world according to Emojipedia, with approximately 1 per cent of all tweets containing the Sparkles emoji. === Artificial intelligence === In the early 2020s, the Sparkles emoji started being used as an icon to represent artificial intelligence (AI). Companies who use the emoji this way include Google, OpenAI, Samsung, Microsoft, Adobe, Spotify and Zoom. As of August 2024, seven of the top 10 software companies by market capitalisation use the Sparkles emojis with AI. OpenAI has different versions of the Sparkles for different versions of the models that ChatGPT uses. One explanation is that Sparkles is being used by these companies as a way to market AI as "magic". Marketing technology as "magic" has been used before AI, particularly by Apple. Another explanation given by designers and marketers choosing to use Sparkles to signify AI is simply that other platforms are doing it, making it familiar to users. Around 2024, some of these companies started removing two of the smaller stars from the emoji in their AI services and have kept the one large star, an example being Google's Gemini chatbot. In early 2024, the Nielsen Norman Group provided test subjects with the star in isolation and found that people did not associate the symbol with AI, but instead mostly with "optimisation" or "favourite or save an item".

    Read more →
  • Intelligent control

    Intelligent control

    Intelligent control is a class of control techniques that use various artificial intelligence computing approaches like neural networks, Bayesian probability, fuzzy logic, machine learning, reinforcement learning, evolutionary computation and genetic algorithms. == Overview == Intelligent control can be divided into the following major sub-domains: Neural network control Machine learning control Reinforcement learning Bayesian control Fuzzy control Neuro-fuzzy control Expert Systems Genetic control New control techniques are created continuously as new models of intelligent behavior are created and computational methods developed to support them. === Neural network controller === Neural networks have been used to solve problems in almost all spheres of science and technology. Neural network control basically involves two steps: System identification Control It has been shown that a feedforward network with nonlinear, continuous and differentiable activation functions have universal approximation capability. Recurrent networks have also been used for system identification. Given, a set of input-output data pairs, system identification aims to form a mapping among these data pairs. Such a network is supposed to capture the dynamics of a system. For the control part, deep reinforcement learning has shown its ability to control complex systems. === Bayesian controllers === Bayesian probability has produced a number of algorithms that are in common use in many advanced control systems, serving as state space estimators of some variables that are used in the controller. The Kalman filter and the Particle filter are two examples of popular Bayesian control components. The Bayesian approach to controller design often requires an important effort in deriving the so-called system model and measurement model, which are the mathematical relationships linking the state variables to the sensor measurements available in the controlled system. In this respect, it is very closely linked to the system-theoretic approach to control design.

    Read more →
  • Surrogate model

    Surrogate model

    A surrogate model is an engineering method used when an outcome of interest cannot be easily measured or computed, so an approximate mathematical model of the outcome is used instead. Most engineering design problems require experiments and/or simulations to evaluate design objective and constraint functions as a function of design variables. For example, in order to find the optimal airfoil shape for an aircraft wing, an engineer simulates the airflow around the wing for different shape variables (e.g., length, curvature, material, etc.). For many real-world problems, however, a single simulation can take many minutes, hours, or even days to complete. As a result, routine tasks such as design optimization, design space exploration, sensitivity analysis and "what-if" analysis become impossible since they require thousands or even millions of simulation evaluations. One way of alleviating this burden is by constructing approximation models, known as surrogate models, metamodels or emulators, that mimic the behavior of the simulation model as closely as possible while being computationally cheaper to evaluate. Surrogate models are constructed using a data-driven, bottom-up approach. The exact, inner working of the simulation code is not assumed to be known (or even understood), relying solely on the input-output behavior. A model is constructed based on modeling the response of the simulator to a limited number of intelligently chosen data points. This approach is also known as behavioral modeling or black-box modeling, though the terminology is not always consistent. When only a single design variable is involved, the process is known as curve fitting. Though using surrogate models in lieu of experiments and simulations in engineering design is more common, surrogate modeling may be used in many other areas of science where there are expensive experiments and/or function evaluations. == Goals == The scientific challenge of surrogate modeling is the generation of a surrogate that is as accurate as possible, using as few simulation evaluations as possible. The process comprises three major steps which may be interleaved iteratively: Sample selection (also known as sequential design, optimal experimental design (OED) or active learning) Construction of the surrogate model and optimizing the model parameters (i.e., bias-variance tradeoff) Appraisal of the accuracy of the surrogate. The accuracy of the surrogate depends on the number and location of samples (expensive experiments or simulations) in the design space. A systematic data representation during training can improve model scalability, thereby reducing the need for expensive simulations. Various design of experiments (DOE) techniques cater to different sources of errors, in particular, errors due to noise in the data or errors due to an improper surrogate model. == Types of surrogate models == Popular surrogate modeling approaches are: polynomial response surfaces; kriging; more generalized Bayesian approaches; gradient-enhanced kriging (GEK); radial basis function; support vector machines; space mapping; artificial neural networks and Bayesian networks. Other methods recently explored include Fourier surrogate modeling , random forests, convolutional neural networks, and generative adversarial networks. For some problems, the nature of the true function is not known a priori, and therefore it is not clear which surrogate model will be the most accurate one. In addition, there is no consensus on how to obtain the most reliable estimates of the accuracy of a given surrogate. Many other problems have known physics properties. In these cases, physics-based surrogates such as space-mapping based models are commonly used. == Invariance properties == Recently proposed comparison-based surrogate models (e.g., ranking support vector machines) for evolutionary algorithms, such as CMA-ES, allow preservation of some invariance properties of surrogate-assisted optimizers: Invariance with respect to monotonic transformations of the function (scaling) Invariance with respect to orthogonal transformations of the search space (rotation) == Applications == An important distinction can be made between two different applications of surrogate models: design optimization and design space approximation (also known as emulation). In surrogate model-based optimization, an initial surrogate is constructed using some of the available budgets of expensive experiments and/or simulations. The remaining experiments/simulations are run for designs which the surrogate model predicts may have promising performance. The process usually takes the form of the following search/update procedure. Initial sample selection (the experiments and/or simulations to be run) Construct surrogate model Search surrogate model (the model can be searched extensively, e.g., using a genetic algorithm, as it is cheap to evaluate) Run and update experiment/simulation at new location(s) found by search and add to sample Iterate steps 2 to 4 until out of time or design is "good enough" Depending on the type of surrogate used and the complexity of the problem, the process may converge on a local or global optimum, or perhaps none at all. In design space approximation, one is not interested in finding the optimal parameter vector, but rather in the global behavior of the system. Here the surrogate is tuned to mimic the underlying model as closely as needed over the complete design space. Such surrogates are a useful, cheap way to gain insight into the global behavior of the system. Optimization can still occur as a post-processing step, although with no update procedure (see above), the optimum found cannot be validated. == Surrogate modeling software == Surrogate Modeling Toolbox (SMT: https://github.com/SMTorg/smt) is a Python package that contains a collection of surrogate modeling methods, sampling techniques, and benchmarking functions. This package provides a library of surrogate models that is simple to use and facilitates the implementation of additional methods. SMT is different from existing surrogate modeling libraries because of its emphasis on derivatives, including training derivatives used for gradient-enhanced modeling, prediction derivatives, and derivatives with respect to the training data. It also includes new surrogate models that are not available elsewhere: kriging by partial-least squares reduction and energy-minimizing spline interpolation. Python library SAMBO Optimization supports sequential optimization with arbitrary models, with tree-based models and Gaussian process models built in. Surrogates.jl is a Julia packages which offers tools like random forests, radial basis methods and kriging. == Surrogate-Assisted Evolutionary Algorithms (SAEAs) == SAEAs are an advanced class of optimization techniques that integrate evolutionary algorithms (EAs) with surrogate models. In traditional EAs, evaluating the fitness of candidate solutions often requires computationally expensive simulations or experiments. SAEAs address this challenge by building a surrogate model, which is a computationally inexpensive approximation of the objective function or constraint functions. The surrogate model serves as a substitute for the actual evaluation process during the evolutionary search. It allows the algorithm to quickly estimate the fitness of new candidate solutions, thereby reducing the number of expensive evaluations needed. This significantly speeds up the optimization process, especially in cases where the objective function evaluations are time-consuming or resource-intensive. SAEAs typically involve three main steps: (1) building the surrogate model using a set of initial sampled data points, (2) performing the evolutionary search using the surrogate model to guide the selection, crossover, and mutation operations, and (3) periodically updating the surrogate model with new data points generated during the evolutionary process to improve its accuracy. By balancing exploration (searching new areas in the solution space) and exploitation (refining known promising areas), SAEAs can efficiently find high-quality solutions to complex optimization problems. They have been successfully applied in various fields, including engineering design, machine learning, and computational finance, where traditional optimization methods may struggle due to the high computational cost of fitness evaluations.

    Read more →
  • Himmat (app)

    Himmat (app)

    Himmat is a women's safety mobile application of Delhi Police. It was launched by Home Minister Rajnath Singh on 1 January 2015. The app is freely available for Android mobile phones and can be downloaded from Delhi Police website. Delhi Police plans to launch app for other platforms in future. Low registrations and other problems resulted in a parliamentary panel calling the app a failure in 2018. Himmat has gone on to be called as one of India's best safety apps for women.

    Read more →
  • Hierarchical control system

    Hierarchical control system

    A hierarchical control system (HCS) is a form of control system in which a set of devices and governing software is arranged in a hierarchical tree. When the links in the tree are implemented by a computer network, then that hierarchical control system is also a form of networked control system. == Overview == A human-built system with complex behavior is often organized as a hierarchy. For example, a command hierarchy has among its notable features the organizational chart of superiors, subordinates, and lines of organizational communication. Hierarchical control systems are organized similarly to divide the decision making responsibility. Each element of the hierarchy is a linked node in the tree. Commands, tasks and goals to be achieved flow down the tree from superior nodes to subordinate nodes, whereas sensations and command results flow up the tree from subordinate to superior nodes. Nodes may also exchange messages with their siblings. The two distinguishing features of a hierarchical control system are related to its layers. Each higher layer of the tree operates with a longer interval of planning and execution time than its immediately lower layer. The lower layers have local tasks, goals, and sensations, and their activities are planned and coordinated by higher layers which do not generally override their decisions. The layers form a hybrid intelligent system in which the lowest, reactive layers are sub-symbolic. The higher layers, having relaxed time constraints, are capable of reasoning from an abstract world model and performing planning. A hierarchical task network is a good fit for planning in a hierarchical control system. Besides artificial systems, an animal's control systems are proposed to be organized as a hierarchy. In perceptual control theory, which postulates that an organism's behavior is a means of controlling its perceptions, the organism's control systems are suggested to be organized in a hierarchical pattern as their perceptions are constructed so. == Control system structure == The accompanying diagram is a general hierarchical model which shows functional manufacturing levels using computerised control of an industrial control system. Referring to the diagram; Level 0 contains the field devices such as flow and temperature sensors, and final control elements, such as control valves Level 1 contains the industrialised Input/Output (I/O) modules, and their associated distributed electronic processors. Level 2 contains the supervisory computers, which collate information from processor nodes on the system, and provide the operator control screens. Level 3 is the production control level, which does not directly control the process, but is concerned with monitoring production and monitoring targets Level 4 is the production scheduling level. == Applications == === Manufacturing, robotics and vehicles === Among the robotic paradigms is the hierarchical paradigm in which a robot operates in a top-down fashion, heavy on planning, especially motion planning. Computer-aided production engineering has been a research focus at NIST since the 1980s. Its Automated Manufacturing Research Facility was used to develop a five layer production control model. In the early 1990s DARPA sponsored research to develop distributed (i.e. networked) intelligent control systems for applications such as military command and control systems. NIST built on earlier research to develop its Real-Time Control System (RCS) and Real-time Control System Software which is a generic hierarchical control system that has been used to operate a manufacturing cell, a robot crane, and an automated vehicle. In November 2007, DARPA held the Urban Challenge. The winning entry, Tartan Racing employed a hierarchical control system, with layered mission planning, motion planning, behavior generation, perception, world modelling, and mechatronics. === Artificial intelligence === Subsumption architecture is a methodology for developing artificial intelligence that is heavily associated with behavior based robotics. This architecture is a way of decomposing complicated intelligent behavior into many "simple" behavior modules, which are in turn organized into layers. Each layer implements a particular goal of the software agent (i.e. system as a whole), and higher layers are increasingly more abstract. Each layer's goal subsumes that of the underlying layers, e.g. the decision to move forward by the eat-food layer takes into account the decision of the lowest obstacle-avoidance layer. Behavior need not be planned by a superior layer, rather behaviors may be triggered by sensory inputs and so are only active under circumstances where they might be appropriate. Reinforcement learning has been used to acquire behavior in a hierarchical control system in which each node can learn to improve its behavior with experience. James Albus, while at NIST, developed a theory for intelligent system design named the Reference Model Architecture (RMA), which is a hierarchical control system inspired by RCS. Albus defines each node to contain these components. Behavior generation is responsible for executing tasks received from the superior, parent node. It also plans for, and issues tasks to, the subordinate nodes. Sensory perception is responsible for receiving sensations from the subordinate nodes, then grouping, filtering, and otherwise processing them into higher level abstractions that update the local state and which form sensations that are sent to the superior node. Value judgment is responsible for evaluating the updated situation and evaluating alternative plans. World Model is the local state that provides a model for the controlled system, controlled process, or environment at the abstraction level of the subordinate nodes. At its lowest levels, the RMA can be implemented as a subsumption architecture, in which the world model is mapped directly to the controlled process or real world, avoiding the need for a mathematical abstraction, and in which time-constrained reactive planning can be implemented as a finite-state machine. Higher levels of the RMA however, may have sophisticated mathematical world models and behavior implemented by automated planning and scheduling. Planning is required when certain behaviors cannot be triggered by current sensations, but rather by predicted or anticipated sensations, especially those that come about as result of the node's actions.

    Read more →
  • Solomonoff's theory of inductive inference

    Solomonoff's theory of inductive inference

    Solomonoff's theory of inductive inference proves that, under its common sense assumptions (axioms), the best possible scientific model is the shortest algorithm that generates the empirical data under consideration. In addition to the choice of data, other assumptions are that, to avoid the post-hoc fallacy, the programming language must be chosen prior to the data and that the environment being observed is generated by an unknown algorithm. This is also called a theory of induction. Due to its basis in the dynamical (state-space model) character of Algorithmic Information Theory, it encompasses statistical as well as dynamical information criteria for model selection. It was introduced by Ray Solomonoff, based on probability theory and theoretical computer science. In essence, Solomonoff's induction derives the posterior probability of any computable theory, given a sequence of observed data. This posterior probability is derived from Bayes' rule and some universal prior, that is, a prior that assigns a positive probability to any computable theory. Solomonoff proved that this induction is incomputable (or more precisely, lower semi-computable), but noted that "this incomputability is of a very benign kind", and that it "in no way inhibits its use for practical prediction" (as it can be approximated from below more accurately with more computational resources). It is only "incomputable" in the benign sense that no scientific consensus is able to prove that the best current scientific theory is the best of all possible theories. However, Solomonoff's theory does provide an objective criterion for deciding among the current scientific theories explaining a given set of observations. Solomonoff's induction naturally formalizes Occam's razor by assigning larger prior credences to theories that require a shorter algorithmic description. == Origin == === Philosophical === The theory is based in philosophical foundations, and was founded by Ray Solomonoff around 1960. It is a mathematically formalized combination of Occam's razor and the Principle of Multiple Explanations. All computable theories which perfectly describe previous observations are used to calculate the probability of the next observation, with more weight put on the shorter computable theories. Marcus Hutter's universal artificial intelligence builds upon this to calculate the expected value of an action. === Principle === Solomonoff's induction has been argued to be the computational formalization of pure Bayesianism. To understand, recall that Bayesianism derives the posterior probability P [ T | D ] {\displaystyle \mathbb {P} [T|D]} of a theory T {\displaystyle T} given data D {\displaystyle D} by applying Bayes rule, which yields P [ T | D ] = P [ D | T ] P [ T ] P [ D | T ] P [ T ] + ∑ A ≠ T P [ D | A ] P [ A ] {\displaystyle \mathbb {P} [T|D]={\frac {\mathbb {P} [D|T]\mathbb {P} [T]}{\mathbb {P} [D|T]\mathbb {P} [T]+\sum _{A\neq T}\mathbb {P} [D|A]\mathbb {P} [A]}}} where theories A {\displaystyle A} are alternatives to theory T {\displaystyle T} . For this equation to make sense, the quantities P [ D | T ] {\displaystyle \mathbb {P} [D|T]} and P [ D | A ] {\displaystyle \mathbb {P} [D|A]} must be well-defined for all theories T {\displaystyle T} and A {\displaystyle A} . In other words, any theory must define a probability distribution over observable data D {\displaystyle D} . Solomonoff's induction essentially boils down to demanding that all such probability distributions be computable. Interestingly, the set of computable probability distributions is a subset of the set of all programs, which is countable. Similarly, the sets of observable data considered by Solomonoff were finite. Without loss of generality, we can thus consider that any observable data is a finite bit string. As a result, Solomonoff's induction can be defined by only invoking discrete probability distributions. Solomonoff's induction then allows to make probabilistic predictions of future data F {\displaystyle F} , by simply obeying the laws of probability. Namely, we have P [ F | D ] = E T [ P [ F | T , D ] ] = ∑ T P [ F | T , D ] P [ T | D ] {\displaystyle \mathbb {P} [F|D]=\mathbb {E} _{T}[\mathbb {P} [F|T,D]]=\sum _{T}\mathbb {P} [F|T,D]\mathbb {P} [T|D]} . This quantity can be interpreted as the average predictions P [ F | T , D ] {\displaystyle \mathbb {P} [F|T,D]} of all theories T {\displaystyle T} given past data D {\displaystyle D} , weighted by their posterior credences P [ T | D ] {\displaystyle \mathbb {P} [T|D]} . === Mathematical === The proof of the "razor" is based on the known mathematical properties of a probability distribution over a countable set. These properties are relevant because the infinite set of all programs is a denumerable set. The sum S of the probabilities of all programs must be exactly equal to one (as per the definition of probability) thus the probabilities must roughly decrease as we enumerate the infinite set of all programs, otherwise S will be strictly greater than one. To be more precise, for every ϵ {\displaystyle \epsilon } > 0, there is some length l such that the probability of all programs longer than l is at most ϵ {\displaystyle \epsilon } . This does not, however, preclude very long programs from having very high probability. Fundamental ingredients of the theory are the concepts of algorithmic probability and Kolmogorov complexity. The universal prior probability of any prefix p of a computable sequence x is the sum of the probabilities of all programs (for a universal computer) that compute something starting with p. Given some p and any computable but unknown probability distribution from which x is sampled, the universal prior and Bayes' theorem can be used to predict the yet unseen parts of x in optimal fashion. == Mathematical guarantees == === Solomonoff's completeness === The remarkable property of Solomonoff's induction is its completeness. In essence, the completeness theorem guarantees that the expected cumulative errors made by the predictions based on Solomonoff's induction are upper-bounded by the Kolmogorov complexity of the (stochastic) data generating process. The errors can be measured using the Kullback–Leibler divergence or the square of the difference between the induction's prediction and the probability assigned by the (stochastic) data generating process. === Solomonoff's uncomputability === Unfortunately, Solomonoff also proved that Solomonoff's induction is uncomputable. In fact, he showed that computability and completeness are mutually exclusive: any complete theory must be uncomputable. The proof of this is derived from a game between the induction and the environment. Essentially, any computable induction can be tricked by a computable environment, by choosing the computable environment that negates the computable induction's prediction. This fact can be regarded as an instance of the no free lunch theorem. == Modern applications == === Artificial intelligence === Though Solomonoff's inductive inference is not computable, several AIXI-derived algorithms approximate it in order to make it run on a modern computer. The more computing power they are given, the closer their predictions are to the predictions of inductive inference (their mathematical limit is Solomonoff's inductive inference). Another direction of inductive inference is based on E. Mark Gold's model of learning in the limit from 1967 and has developed since then more and more models of learning. The general scenario is the following: Given a class S of computable functions, is there a learner (that is, recursive functional) which for any input of the form (f(0),f(1),...,f(n)) outputs a hypothesis (an index e with respect to a previously agreed on acceptable numbering of all computable functions; the indexed function may be required consistent with the given values of f). A learner M learns a function f if almost all its hypotheses are the same index e, which generates the function f; M learns S if M learns every f in S. Basic results are that all recursively enumerable classes of functions are learnable while the class REC of all computable functions is not learnable. Many related models have been considered and also the learning of classes of recursively enumerable sets from positive data is a topic studied from Gold's pioneering paper in 1967 onwards. A far reaching extension of the Gold’s approach is developed by Schmidhuber's theory of generalized Kolmogorov complexities, which are kinds of super-recursive algorithms.

    Read more →
  • Hardware for artificial intelligence

    Hardware for artificial intelligence

    Specialized computer hardware is often used to execute artificial intelligence (AI) programs faster, and with less energy, such as Lisp machines, neuromorphic engineering, event cameras, and physical neural networks. Since 2017, several consumer grade CPUs and SoCs have on-die NPUs. As of 2023, the market for AI hardware is dominated by GPUs. As of the 2020s, AI computation is dominated by graphics processing units (GPUs) and newer domain-specific accelerators such as Google's Tensor Processing Units (TPUs), AMD's Instinct MI300 series, and various on-device neural-processing units (NPUs) found in consumer hardware. == Scope == For the purposes of this article, AI hardware refers to computing components and systems specifically designed or optimized to accelerate artificial-intelligence workloads such as machine-learning training or inference. This includes general-purpose accelerators used for AI (for example, GPUs) and domain-specific accelerators (for example, TPUs, NPUs, and other AI ASICs). Event-based cameras are sometimes discussed in the context of neuromorphic computing, but they are input sensors rather than AI compute devices. Conversely, components such as memristors are basic circuit elements rather than specialized AI hardware when considered alone. == Lisp machines == Lisp machines were developed in the late 1970s and early 1980s to make artificial intelligence programs written in the programming language Lisp run faster. == Dataflow architecture == Dataflow architecture processors used for AI serve various purposes with varied implementations like the polymorphic dataflow Convolution Engine by Kinara (formerly Deep Vision), structure-driven dataflow by Hailo, and dataflow scheduling by Cerebras. == Component hardware == === AI accelerators === Since the 2010s, advances in computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer. By 2019, graphics processing units (GPUs), often with AI-specific enhancements, had displaced central processing units (CPUs) as the dominant means to train large-scale commercial cloud AI. OpenAI estimated the hardware compute used in the largest deep learning projects from Alex Net (2012) to Alpha Zero (2017), and found a 300,000-fold increase in the amount of compute needed, with a doubling-time trend of 3.4 months. === General-purpose GPUs for AI === Since the 2010s, graphics processing units (GPUs) have been widely used to train and deploy deep learning models because of their highly parallel architecture and high memory bandwidth. Modern data-center GPUs include dedicated tensor or matrix-math units that accelerate neural-network operations. In 2022, NVIDIA introduced the Hopper-generation H100 GPU, adding FP8 precision support and faster interconnects for large-scale model training. AMD and other vendors have also developed GPUs and accelerators aimed at AI and high-performance computing workloads. === Domain-specific accelerators (ASICs / NPUs) === Beyond general-purpose GPUs, several companies have developed application-specific integrated circuits (ASICs) and neural processing units (NPUs) tailored for AI workloads. Google introduced the Tensor Processing Unit (TPU) in 2016 for deep-learning inference, with later generations supporting large-scale training through dense systolic-array designs and optical interconnects. Other vendors have released similar devices—such as Apple's Neural Engine and various on-device NPUs—that emphasize energy-efficient inference in mobile or edge computing environments. === Memory and interconnects === AI accelerators rely on fast memory and inter-chip links to manage the large data volumes of training and inference. High-bandwidth memory (HBM) stacks, standardized as HBM3 in 2022, provide terabytes-per-second throughput on modern GPUs and ASICs. These accelerators are often connected through dedicated fabrics such as NVIDIA's NVLink and NVSwitch or optical interconnects used in TPU systems to scale performance across thousands of chips.

    Read more →
  • Motor theory of speech perception

    Motor theory of speech perception

    The motor theory of speech perception is the hypothesis that people perceive spoken words by identifying the vocal tract gestures with which they are pronounced rather than by identifying the sound patterns that speech generates. It originally claimed that speech perception is done through a specialized module that is innate and human-specific. Though the idea of a module has been qualified in more recent versions of the theory, the idea remains that the role of the speech motor system is not only to produce speech articulations but also to detect them. The hypothesis has gained more interest outside the field of speech perception than inside. This has increased particularly since the discovery of mirror neurons that link the production and perception of motor movements, including those made by the vocal tract. The theory was initially proposed in the Haskins Laboratories in the 1950s by Alvin Liberman and Franklin S. Cooper, and developed further by Donald Shankweiler, Michael Studdert-Kennedy, Ignatius Mattingly, Carol Fowler and Douglas Whalen. == Origins and development == The hypothesis has its origins in research using pattern playback to create reading machines for the blind that would substitute sounds for orthographic letters. This led to a close examination of how spoken sounds correspond to the acoustic spectrogram of them as a sequence of auditory sounds. This found that successive consonants and vowels overlap in time with one another (a phenomenon known as coarticulation). This suggested that speech is not heard like an acoustic "alphabet" or "cipher," but as a "code" of overlapping speech gestures. === Associationist approach === Initially, the theory was associationist: infants mimic the speech they hear and that this leads to behavioristic associations between articulation and its sensory consequences. Later, this overt mimicry would be short-circuited and become speech perception. This aspect of the theory was dropped, however, with the discovery that prelinguistic infants could already detect most of the phonetic contrasts used to separate different speech sounds. === Cognitivist approach === The behavioristic approach was replaced by a cognitivist one in which there was a speech module. The module detected speech in terms of hidden distal objects rather than at the proximal or immediate level of their input. The evidence for this was the research finding that speech processing was special such as duplex perception. === Changing distal objects === Initially, speech perception was assumed to link to speech objects that were both the invariant movements of speech articulators the invariant motor commands sent to muscles to move the vocal tract articulators This was later revised to include the phonetic gestures rather than motor commands, and then the gestures intended by the speaker at a prevocal, linguistic level, rather than actual movements. === Modern revision === The "speech is special" claim has been dropped, as it was found that speech perception could occur for nonspeech sounds (for example, slamming doors for duplex perception). === Mirror neurons === The discovery of mirror neurons has led to renewed interest in the motor theory of speech perception, and the theory still has its advocates, although there are also critics. == Support == === Nonauditory gesture information === If speech is identified in terms of how it is physically made, then nonauditory information should be incorporated into speech percepts even if it is still subjectively heard as "sounds". This is, in fact, the case. The McGurk effect shows that seeing the production of a spoken syllable that differs from an auditory cue synchronized with it affects the perception of the auditory one. In other words, if someone hears "ba" but sees a video of someone pronouncing "ga", what they hear is different—some people believe they hear "da". People find it easier to hear speech in noise if they can see the speaker. People can hear syllables better when their production can be felt haptically. === Categorical perception === Using a speech synthesizer, speech sounds can be varied in place of articulation along a continuum from /bɑ/ to /dɑ/ to /ɡɑ/, or in voice onset time on a continuum from /dɑ/ to /tɑ/ (for example). When listeners are asked to discriminate between two different sounds, they perceive sounds as belonging to discrete categories, even though the sounds vary continuously. In other words, 10 sounds (with the sound on one extreme being /dɑ/ and the sound on the other extreme being /tɑ/, and the ones in the middle varying on a scale) may all be acoustically different from one another, but the listener will hear all of them as either /dɑ/ or /tɑ/. Likewise, the English consonant /d/ may vary in its acoustic details across different phonetic contexts (the /d/ in /du/ does not technically sound the same as the one in /di/, for example), but all /d/'s as perceived by a listener fall within one category (voiced alveolar plosive) and that is because "linguistic representations are abstract, canonical, phonetic segments or the gestures that underlie these segments." This suggests that humans identify speech using categorical perception, and thus that a specialized module, such as that proposed by the motor theory of speech perception, may be on the right track. === Speech imitation === If people can hear the gestures in speech, then the imitation of speech should be very fast, as in when words are repeated that are heard in headphones as in speech shadowing. People can repeat heard syllables more quickly than they would be able to produce them normally. === Speech production === Hearing speech activates vocal tract muscles, and the motor cortex and premotor cortex. The integration of auditory and visual input in speech perception also involves such areas. Disrupting the premotor cortex disrupts the perception of speech units such as plosives. The activation of the motor areas occurs in terms of the phonemic features which link with the vocal track articulators that create speech gestures. The perception of a speech sound is aided by pre-emptively stimulating the motor representation of the articulators responsible for its pronunciation . Auditory and motor cortical coupling is restricted to a specific range of neuronal firing frequency. === Perception-action meshing === Evidence exists that perception and production are generally coupled in the motor system. This is supported by the existence of mirror neurons that are activated both by seeing (or hearing) an action and when that action is carried out. Another source of evidence is that for common coding theory between the representations used for perception and action. == Criticisms == The motor theory of speech perception is not widely held in the field of speech perception, though it is more popular in other fields, such as theoretical linguistics. As three of its advocates have noted, "it has few proponents within the field of speech perception, and many authors cite it primarily to offer critical commentary".p. 361 Several critiques of it exist. === Multiple sources === Speech perception is affected by nonproduction sources of information, such as context. Individual words are hard to understand in isolation but easy when heard in sentence context. It therefore seems that speech perception uses multiple sources that are integrated together in an optimal way. === Production === The motor theory of speech perception would predict that speech motor abilities in infants predict their speech perception abilities, but in actuality it is the other way around. It would also predict that defects in speech production would impair speech perception, but they do not. However, this only affects the first and already superseded behaviorist version of the theory, where infants were supposed to learn all production-perception patterns by imitation early in childhood. This is no longer the mainstream view of motor-speech theorists. === Speech module === Several sources of evidence for a specialized speech module have failed to be supported. Duplex perception can be observed with door slams. The McGurk effect can also be achieved with nonlinguistic stimuli, such as showing someone a video of a basketball bouncing but playing the sound of a ping-pong ball bouncing. As for categorical perception, listeners can be sensitive to acoustic differences within single phonetic categories. As a result, this part of the theory has been dropped by some researchers. === Sublexical tasks === The evidence provided for the motor theory of speech perception is limited to tasks such as syllable discrimination that use speech units not full spoken words or spoken sentences. As a result, "speech perception is sometimes interpreted as referring to the perception of speech at the sublexical level. However, the ultimate goal of these studies is presumably to understand the neural processes supporting the ability to process spee

    Read more →
  • Data-driven model

    Data-driven model

    Data-driven models are a class of computational models that primarily rely on historical data collected throughout a system's or process' lifetime to establish relationships between input, internal, and output variables. Commonly found in numerous articles and publications, data-driven models have evolved from earlier statistical models, overcoming limitations posed by strict assumptions about probability distributions. These models have gained prominence across various fields, particularly in the era of big data, artificial intelligence, and machine learning, where they offer valuable insights and predictions based on the available data. == Background == These models have evolved from earlier statistical models, which were based on certain assumptions about probability distributions that often proved to be overly restrictive. The emergence of data-driven models in the 1950s and 1960s coincided with the development of digital computers, advancements in artificial intelligence research, and the introduction of new approaches in non-behavioural modelling, such as pattern recognition and automatic classification. == Key Concepts == Data-driven models encompass a wide range of techniques and methodologies that aim to intelligently process and analyse large datasets. Examples include fuzzy logic, fuzzy and rough sets for handling uncertainty, neural networks for approximating functions, global optimization and evolutionary computing, statistical learning theory, and Bayesian methods. These models have found applications in various fields, including economics, customer relations management, financial services, medicine, and the military, among others. Machine learning, a subfield of artificial intelligence, is closely related to data-driven modelling as it also focuses on using historical data to create models that can make predictions and identify patterns. In fact, many data-driven models incorporate machine learning techniques, such as regression, classification, and clustering algorithms, to process and analyse data. In recent years, the concept of data-driven models has gained considerable attention in the field of water resources, with numerous applications, academic courses, and scientific publications using the term as a generalization for models that rely on data rather than physics. This classification has been featured in various publications and has even spurred the development of hybrid models in the past decade. Hybrid models attempt to quantify the degree of physically based information used in hydrological models and determine whether the process of building the model is primarily driven by physics or purely data-based. As a result, data-driven models have become an essential topic of discussion and exploration within water resources management and research. The term "data-driven modelling" (DDM) refers to the overarching paradigm of using historical data in conjunction with advanced computational techniques, including machine learning and artificial intelligence, to create models that can reveal underlying trends, patterns, and, in some cases, make predictions Data-driven models can be built with or without detailed knowledge of the underlying processes governing the system behavior, which makes them particularly useful when such knowledge is missing or fragmented.

    Read more →
  • Artificial wisdom

    Artificial wisdom

    Artificial wisdom (AW) is an artificial intelligence (AI) system which is able to display the human traits of wisdom and morals while being able to contemplate its own “endpoint”. Artificial wisdom can be described as artificial intelligence reaching the top-level of decision-making when confronted with the most complex challenging situations. The term artificial wisdom is used when the "intelligence" is based on more than by chance collecting and interpreting data, but by design enriched with smart and conscience strategies that wise people would use. == Overview == The goal of artificial wisdom is to create artificial intelligence that can successfully replicate the “uniquely human trait[s]” of having wisdom and morals as closely as possible. Thus, artificial wisdom, must “incorporate [the] ethical and moral considerations” of the data it uses. There are also many significant ethical and legal implications of AW which are compounded by the rapid advances in AI and related technologies alongside the lack of the development of ethics, guidelines, and regulations without the oversight of any kind of overarching advisory board. Additionally, there are challenges in how to develop, test, and implement AW in real world scenarios. Existing tests do not test the internal thought process by which a computer system reaches its conclusion, only the result of said process. When examining computer-aided wisdom; the partnership of artificial intelligence and contemplative neuroscience, concerns regarding the future of artificial intelligence shift to a more optimistic viewpoint. This artificial wisdom forms the basis of Louis Molnar's monographic article on artificial philosophy, where he coined the term and proposes how artificial intelligence might view its place in the grand scheme of things. == Definitions == There are no universal or standardized definitions for human intelligence, artificial intelligence, human wisdom, or artificial wisdom. However, the DIKW pyramid, describes the continuum of relationship between data, information, knowledge, and wisdom, puts wisdom at the highest level in its hierarchy. Gottfredson defines intelligence as “the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience”. Definitions for wisdom typically include requiring: The ability for emotional regulation, Pro-social behaviors (e.g., empathy, compassion, and altruism), Self-reflection, “A balance between decisiveness and acceptance of uncertainty and diversity of perspectives, and social advising.” As previously defined, Artificial Wisdom would then be an AI system which is able to solve problems via “an understanding of…context, ethics and moral principles,” rather than simple pre-defined inputs or “learned patterns.” Some scientists have also considered the field of artificial consciousness. However, Jeste states that “…it is generally agreed that only humans can have consciousness, autonomy, will, and theory of mind.” An artificially wise system must also be able to contemplate its end goal and recognize its own ignorance. Additionally, to contemplate its end goal, a wise system must have a “correct conception of worthwhile goals (broadly speaking) or well-being (narrowly speaking)”. "Stephen Grimm further suggests that the following three types of knowledge are individually necessary for wisdom: first, "knowledge of what is good or important for well-being", second, "knowledge of one’s standing, relative to what is good or important for well-being", and third, "knowledge of a strategy for obtaining what is good or important for wellbeing."" == Problems == There are notable problems with attempting to create an artificially wise system. Consciousness, autonomy, and will are considered strictly human features. === Values === There are significant ethical and philosophical issues when attempting to create an intelligent or a wise system. Notably, whose moral values will be used to train the system to be wise. Differing moral values and prejudice can already be seen from various organizations and governments in artificial intelligence. Deployment strategies and values of Artificial Wisdom will conflict between leaders, companies, and countries. Nusbaum states, “When values are in conflict, leaders often make choices that are clever or smart about their own needs, but are often not wise.” === Ethics === Science fiction author Isaac Asimov realized the need to control the technology in the 1940s when he wrote the three laws of robotics as follows: A robot may not injure a human directly or indirectly. A robot must obey human’s orders. A robot should seek to protect its own existence. Additionally, the pace at which technology is rapidly advancing artificial intelligence and thus the need for artificial wisdom may “have outpaced the development of societal guidelines have raised serious questions about the ethics and morality of AI, and called for international oversight and regulations to ensure safety.” === Principal impossibility === One argument, coined by Tsai as the “argument against AW,” or AAAW, postulates the principal impossibility of Artificial Wisdom. The argument is based on the philosophical differences between practical wisdom, also called phronesis, and practical intelligence. Said difference isn’t in “selecting the correct means, but reasoning correctly about what ends to follow”. Tsai puts the argument into a logical proposition as follows: “(P1) An agent is genuinely wise only if the agent can deliberate about the final goal of the domain in which the agent is situated.” “(P2) An intelligent agent cannot deliberate about the final goal of the domain in which the agent is situated.” “(C1) An intelligent agent cannot be genuinely wise.” “(P3) An AW is, at its core, intelligent.” “(C2) An AW cannot be genuinely wise.”

    Read more →
  • Hyperparameter optimization

    Hyperparameter optimization

    In machine learning, hyperparameter optimization or tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. A hyperparameter is a parameter whose value is used to control the learning process, which must be configured before the process starts. Hyperparameter optimization determines the set of hyperparameters that yields an optimal model which minimizes a predefined loss function on a given data set. The objective function takes a set of hyperparameters and returns the associated loss. Cross-validation is often used to estimate this generalization performance, and therefore choose the set of values for hyperparameters that maximize it. == Approaches == === Grid search === The traditional method for hyperparameter optimization has been grid search, or a parameter sweep, which is simply an exhaustive searching through a manually specified subset of the hyperparameter space of a learning algorithm. A grid search algorithm must be guided by some performance metric, typically measured by cross-validation on the training set or evaluation on a hold-out validation set. Since the parameter space of a machine learner may include real-valued or unbounded value spaces for certain parameters, manually set bounds and discretization may be necessary before applying grid search. For example, a typical soft-margin SVM classifier equipped with an RBF kernel has at least two hyperparameters that need to be tuned for good performance on unseen data: a regularization constant C and a kernel hyperparameter γ. Both parameters are continuous, so to perform grid search, one selects a finite set of "reasonable" values for each, say C ∈ { 10 , 100 , 1000 } {\displaystyle C\in \{10,100,1000\}} γ ∈ { 0.1 , 0.2 , 0.5 , 1.0 } {\displaystyle \gamma \in \{0.1,0.2,0.5,1.0\}} Grid search then trains an SVM with each pair (C, γ) in the Cartesian product of these two sets and evaluates their performance on a held-out validation set (or by internal cross-validation on the training set, in which case multiple SVMs are trained per pair). Finally, the grid search algorithm outputs the settings that achieved the highest score in the validation procedure. Grid search suffers from the curse of dimensionality, but is often embarrassingly parallel because the hyperparameter settings it evaluates are typically independent of each other. === Random search === Random Search replaces the exhaustive enumeration of all combinations by selecting them randomly. This can be simply applied to the discrete setting described above, but also generalizes to continuous and mixed spaces. A benefit over grid search is that random search can explore many more values than grid search could for continuous hyperparameters. It can outperform Grid search, especially when only a small number of hyperparameters affects the final performance of the machine learning algorithm. In this case, the optimization problem is said to have a low intrinsic dimensionality. Random Search is also embarrassingly parallel, and additionally allows the inclusion of prior knowledge by specifying the distribution from which to sample. Despite its simplicity, random search remains one of the important base-lines against which to compare the performance of new hyperparameter optimization methods. === Bayesian optimization === Bayesian optimization is a global optimization method for noisy black-box functions. Applied to hyperparameter optimization, Bayesian optimization builds a probabilistic model of the function mapping from hyperparameter values to the objective evaluated on a validation set. By iteratively evaluating a promising hyperparameter configuration based on the current model, and then updating it, Bayesian optimization aims to gather observations revealing as much information as possible about this function and, in particular, the location of the optimum. It tries to balance exploration (hyperparameters for which the outcome is most uncertain) and exploitation (hyperparameters expected close to the optimum). In practice, Bayesian optimization has been shown to obtain better results in fewer evaluations compared to grid search and random search, due to the ability to reason about the quality of experiments before they are run. === Gradient-based optimization === For specific learning algorithms, it is possible to compute the gradient with respect to hyperparameters and then optimize the hyperparameters using gradient descent. The first usage of these techniques was focused on neural networks. Since then, these methods have been extended to other models such as support vector machines or logistic regression. A different approach in order to obtain a gradient with respect to hyperparameters consists in differentiating the steps of an iterative optimization algorithm using automatic differentiation. A more recent work along this direction uses the implicit function theorem to calculate hypergradients and proposes a stable approximation of the inverse Hessian. The method scales to millions of hyperparameters and requires constant memory. In a different approach, a hypernetwork is trained to approximate the best response function. One of the advantages of this method is that it can handle discrete hyperparameters as well. Self-tuning networks offer a memory efficient version of this approach by choosing a compact representation for the hypernetwork. More recently, Δ-STN has improved this method further by a slight reparameterization of the hypernetwork which speeds up training. Δ-STN also yields a better approximation of the best-response Jacobian by linearizing the network in the weights, hence removing unnecessary nonlinear effects of large changes in the weights. Apart from hypernetwork approaches, gradient-based methods can be used to optimize discrete hyperparameters also by adopting a continuous relaxation of the parameters. Such methods have been extensively used for the optimization of architecture hyperparameters in neural architecture search. === Evolutionary optimization === Evolutionary optimization is a methodology for the global optimization of noisy black-box functions. In hyperparameter optimization, evolutionary optimization uses evolutionary algorithms to search the space of hyperparameters for a given algorithm. Evolutionary hyperparameter optimization follows a process inspired by the biological concept of evolution: Create an initial population of random solutions (i.e., randomly generate tuples of hyperparameters, typically 100+) Evaluate the hyperparameter tuples and acquire their fitness function (e.g., 10-fold cross-validation accuracy of the machine learning algorithm with those hyperparameters) Rank the hyperparameter tuples by their relative fitness Replace the worst-performing hyperparameter tuples with new ones generated via crossover and mutation Repeat steps 2-4 until satisfactory algorithm performance is reached or is no longer improving. Evolutionary optimization has been used in hyperparameter optimization for statistical machine learning algorithms, automated machine learning, typical neural network and deep neural network architecture search, as well as training of the weights in deep neural networks. === Population-based === Population Based Training (PBT) learns both hyperparameter values and network weights. Multiple learning processes operate independently, using different hyperparameters. As with evolutionary methods, poorly performing models are iteratively replaced with models that adopt modified hyperparameter values and weights based on the better performers. This replacement model warm starting is the primary differentiator between PBT and other evolutionary methods. PBT thus allows the hyperparameters to evolve and eliminates the need for manual hypertuning. The process makes no assumptions regarding model architecture, loss functions or training procedures. PBT and its variants are adaptive methods: they update hyperparameters during the training of the models. On the contrary, non-adaptive methods have the sub-optimal strategy to assign a constant set of hyperparameters for the whole training. === Early stopping-based === A class of early stopping-based hyperparameter optimization algorithms is purpose-built for large search spaces of continuous and discrete hyperparameters, particularly when the computational cost to evaluate the performance of a set of hyperparameters is high. Irace implements the iterated racing algorithm, that focuses the search around the most promising configurations, using statistical tests to discard the ones that perform poorly. Another early stopping hyperparameter optimization algorithm is successive halving (SHA), which begins as a random search but periodically prunes low-performing models, thereby focusing computational resources on more promising models. Asynchronous successive halving (ASHA) further improves upon SHA's resource utilization profile by removing the need to synchronously evaluate a

    Read more →
  • Hit-testing

    Hit-testing

    In computer graphics programming, hit-testing (hit detection, picking, or pick correlation) is the process of determining whether a user-controlled cursor (such as a mouse cursor or touch-point on a touch-screen interface) intersects a given graphical object (such as a shape, line, or curve) drawn on the screen. Hit-testing may be performed on the movement or activation of a mouse or other pointing device. Hit-testing is used by GUI environments to respond to user actions, such as selecting a menu item or a target in a game based on its visual location. In web programming languages such as HTML, SVG, and CSS, this is associated with the concept of pointer-events (e.g. user-initiated cursor movement or object selection). Collision detection is a related concept for detecting intersections of two or more different graphical objects, rather than intersection of a cursor with one or more graphical objects. == Algorithm == There are many different algorithms that may be used to perform hit-testing, with different performance or accuracy outcomes. One common hit-test algorithm for axis aligned bounding boxes. A key idea is that the box being tested must be either entirely above, entirely below, entirely to the right or left of the current box. If this is not possible, they are colliding. Example logic is presented in the pseudo-code below: In Python:

    Read more →
  • CrewAI

    CrewAI

    CrewAI is an open-source software framework and platform for building AI agents and multi-agent systems. Written primarily in Python, it is used to define artificial-intelligence agents, assign tasks to them, and coordinate their work through agent teams and workflows. The framework is associated with CrewAI Inc., a startup developing enterprise tools for automating business workflows with large language model-based agents. == History == CrewAI was first released on the Python Package Index in December 2023. The project was created by João Moura and later developed by CrewAI Inc. and open-source contributors. In October 2024, TechCrunch reported that CrewAI had raised $18 million across seed and Series A funding rounds from investors including Boldstart Ventures, Craft Ventures, Earl Grey Capital, and Insight Partners. The report also stated that Andrew Ng and HubSpot co-founder Dharmesh Shah had invested in the company. SiliconANGLE described the company as the developer of an open-source framework for building artificial-intelligence agents and reported that the funding consisted of a seed round led by Boldstart Ventures and a Series A led by Insight Partners. By late 2024, CrewAI had introduced commercial enterprise products built on top of its open-source components. TechCrunch reported that the company's enterprise offering added access controls, analytics, support, and templates for workflow automation. == Features == CrewAI is designed around groups of agents, sometimes called "crews", that can be assigned roles, goals, and tasks. The framework supports agent collaboration, task delegation, tool use, memory, and knowledge sources for retrieval-augmented generation workflows. The project describes two main building blocks: "Crews", which are used for autonomous agent collaboration, and "Flows", which are used for more controlled event-driven workflows. The framework is independent of LangChain and is released under the MIT License. It can be installed as a Python package and is commonly used with external large language model APIs or local models, depending on the developer's configuration. == Business model == CrewAI combines an open-source framework with commercial enterprise products. Its enterprise products are intended for organizations that need to build, monitor, and manage agent-based automations with additional security, observability, and administrative controls.

    Read more →
  • Pythia (machine learning)

    Pythia (machine learning)

    Pythia is an ancient text restoration model that recovers missing characters from damaged text input using deep neural networks. It was created by Yannis Assael, Thea Sommerschield, and Jonathan Prag, researchers from Google DeepMind and the University of Oxford. To study the society and the history of ancient civilisations, ancient history relies on disciplines such as epigraphy, the study of ancient inscribed texts. Hundreds of thousands of these texts, known as inscriptions, have survived to our day, but are often damaged over the centuries. Illegible parts of the text must then be restored by specialists, called epigraphists, in order to extract meaningful information from the text and use it to expand our knowledge of the context in which the text was written. Pythia takes as input the damaged text, and is trained to return hypothesised restorations of ancient Greek inscriptions, working as an assistive aid for ancient historians. Its neural network architecture works at both the character- and word-level, thereby effectively handling long-term context information, and dealing efficiently with incomplete word representations. Pythia is applicable to any discipline dealing with ancient texts (philology, papyrology, codicology) and can work in any language (ancient or modern).

    Read more →