In search of the best AI text-to-video tool? An AI text-to-video tool is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI text-to-video tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.
Human–robot interaction
Human–robot interaction (HRI) is the study of interactions between humans and robots. Human–robot interaction is a multidisciplinary field with contributions from human–computer interaction, artificial intelligence, robotics, natural language processing, design, psychology and philosophy. A subfield known as physical human–robot interaction (pHRI) has tended to focus on device design to enable people to safely interact with robotic systems. == Origins == Human–robot interaction has been a topic of both science fiction and academic speculation even before any robots existed. Because much of active HRI development depends on natural language processing, many aspects of HRI are continuations of human communications, a field of research which is much older than robotics. The origin of HRI as a discrete problem was stated by 20th-century author Isaac Asimov in 1941, in his novel I, Robot. Asimov coined Three Laws of Robotics, namely: A robot may not injure a human being or, through inaction, allow a human being to come to harm. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws. These three laws provide an overview of the goals engineers and researchers hold for safety in the HRI field, although the fields of robot ethics and machine ethics are more complex than these three principles. However, generally human–robot interaction prioritizes the safety of humans that interact with potentially dangerous robotics equipment. Solutions to this problem range from the philosophical approach of treating robots as ethical agents (individuals with moral agency), to the practical approach of creating safety zones. These safety zones use technologies such as lidar to detect human presence or physical barriers to protect humans by preventing any contact between machine and operator. Although initially robots in the human–robot interaction field required some human intervention to function, research has expanded this to the extent that fully autonomous systems are now far more common than in the early 2000s. Autonomous systems include from simultaneous localization and mapping systems which provide intelligent robot movement to natural-language processing and natural-language generation systems which allow for natural, human-esque interaction which meet well-defined psychological benchmarks. Anthropomorphic robots (machines which imitate human body structure) are better described by the biomimetics field, but overlap with HRI in many research applications. Examples of robots which demonstrate this trend include Willow Garage's PR2 robot, the NASA Robonaut, and Honda ASIMO. However, robots in the human–robot interaction field are not limited to human-like robots: Paro and Kismet are both robots designed to elicit emotional response from humans, and so fall into the category of human–robot interaction. Goals in HRI range from industrial manufacturing through Cobots, medical technology through rehabilitation, autism intervention, and elder care devices, entertainment, human augmentation, and human convenience. Future research therefore covers a wide range of fields, much of which focuses on assistive robotics, robot-assisted search-and-rescue, and space exploration. == The goal of friendly human–robot interactions == Robots are artificial agents with capacities of perception and action in the physical world often referred by researchers as workspace. Their use has been generalized in factories but nowadays they tend to be found in the most technologically advanced societies in such critical domains as search and rescue, military battle, mine and bomb detection, scientific exploration, law enforcement, entertainment and hospital care. These new domains of applications imply a closer interaction with the user, sharing the workspace but also goals in terms of task achievement. The subfield of physical human–robot interaction (pHRI) has largely focused on device design to enable people to safely interact with robotic systems but is increasingly developing algorithmic approaches in an attempt to support fluent and expressive interactions between humans and robotic systems. With the advance in AI, the research is focusing on one part towards the safest physical interaction but also on a socially correct interaction, dependent on cultural criteria. The goal is to build an intuitive, and easy communication with the robot through speech, gestures, and facial expressions. Kerstin Dautenhahn refers to friendly Human–robot interaction as "Robotiquette" defining it as the "social rules for robot behaviour (a 'robotiquette') that is comfortable and acceptable to humans" The robot has to adapt itself to our way of expressing desires and orders and not the contrary. But every day environments such as homes have much more complex social rules than those implied by factories or even military environments. Thus, the robot needs perceiving and understanding capacities to build dynamic models of its surroundings. It needs to categorize objects, recognize and locate humans and further recognize their emotions. The need for dynamic capacities pushes forward every sub-field of robotics. Furthermore, by understanding and perceiving social cues, robots can enable collaborative scenarios with humans. For example, with the rapid rise of personal fabrication machines such as desktop 3D printers, laser cutters, etc., entering our homes, scenarios may arise where robots can collaboratively share control, co-ordinate and achieve tasks together. Industrial robots have already been integrated into industrial assembly lines and are collaboratively working with humans. The social impact of such robots have been studied and has indicated that workers still treat robots and social entities, rely on social cues to understand and work together. On the other end of HRI research the cognitive modelling of the "relationship" between human and the robots benefits the psychologists and robotic researchers the user study are often of interests on both sides. This research endeavours part of human society. For effective human – humanoid robot interaction numerous communication skills and related features should be implemented in the design of such artificial agents/systems. == General HRI research == HRI research spans a wide range of fields, some general to the nature of HRI. === Methods for perceiving humans === Methods for perceiving humans in the environment are based on sensor information. Research on sensing components and software led by Microsoft provide useful results for extracting the human kinematics (see Kinect). An example of older technique is to use colour information for example the fact that for light skinned people the hands are lighter than the clothes worn. In any case a human modelled a priori can then be fitted to the sensor data. The robot builds or has (depending on the level of autonomy the robot has) a 3D mapping of its surroundings to which is assigned the humans locations. Most methods intend to build a 3D model through vision of the environment. The proprioception sensors permit the robot to have information over its own state. This information is relative to a reference. Theories of proxemics may be used to perceive and plan around a person's personal space. A speech recognition system is used to interpret human desires or commands. By combining the information inferred by proprioception, sensor and speech the human position and state (standing, seated). In this matter, natural-language processing is concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural-language data. For instance, neural-network architectures and learning algorithms that can be applied to various natural-language processing tasks including part-of-speech tagging, chunking, named-entity recognition, and semantic role labeling. === Methods for motion planning === Motion planning in dynamic environments is a challenge that can at the moment only be achieved for robots with 3 to 10 degrees of freedom. Humanoid robots or even 2 armed robots, which can have up to 40 degrees of freedom, are unsuited for dynamic environments with today's technology. However lower-dimensional robots can use the potential field method to compute trajectories which avoid collisions with humans. === Cognitive models and theory of mind === Humans exhibit negative social and emotional responses as well as decreased trust toward some robots that closely, but imperfectly, resemble humans; this phenomenon has been termed the "Uncanny Valley". However recent research in telepresence robots has established that mimicking human body postures and expressive gestures has made the robots likeable and engaging in a remote setting. Further, the presence o
Problem solving
Problem solving is the process of achieving a goal by overcoming obstacles, a frequent part of most activities. Problems in need of solutions range from simple personal tasks (e.g. how to get from point A to B) to complex issues in business and technical fields. The former is an example of simple problem solving (SPS) addressing one issue, whereas the latter is complex problem solving (CPS) with multiple interrelated obstacles. Another classification of problem-solving tasks is into well-defined problems with specific obstacles and goals, and ill-defined problems in which the current situation is troublesome but it is not clear what kind of resolution to aim for. Similarly, one may distinguish formal or fact-based problems requiring psychometric intelligence, versus socio-emotional problems which depend on the changeable emotions of individuals or groups, such as tactful behavior, fashion, or gift choices. Solutions require sufficient resources and knowledge to attain the goal. Professionals such as lawyers, doctors, programmers, and consultants are largely problem solvers for issues that require technical skills and knowledge beyond general competence. Many businesses have found profitable markets by recognizing a problem and creating a solution: the more widespread and inconvenient the problem, the greater the opportunity to develop a scalable solution. There are many specialized problem-solving techniques and methods in fields such as science, engineering, business, medicine, mathematics, computer science, philosophy, and social organization. The mental techniques to identify, analyze, and solve problems are studied in psychology and cognitive sciences. Also widely researched are the mental obstacles that prevent people from finding solutions; problem-solving impediments include confirmation bias, mental set, and functional fixedness. == Definition == The term problem solving has a slightly different meaning depending on the discipline. For instance, it is a mental process in psychology and a computerized process in computer science. There are two different types of problems: ill-defined and well-defined; different approaches are used for each. Well-defined problems have specific end goals and clearly expected solutions, while ill-defined problems do not. Well-defined problems allow for more initial planning than ill-defined problems. Solving problems sometimes involves dealing with pragmatics (the way that context contributes to meaning) and semantics (the interpretation of the problem). The ability to understand what the end goal of the problem is, and what rules could be applied, represents the key to solving the problem. Sometimes a problem requires abstract thinking or coming up with a creative solution. Problem solving has two major domains: mathematical problem solving and personal problem solving. Each concerns some difficulty or barrier that is encountered. === Psychology === Problem solving in psychology refers to the process of finding solutions to problems encountered in life. Solutions to these problems are usually situation- or context-specific. The process starts with problem finding and problem shaping, in which the problem is discovered and simplified. The next step is to generate possible solutions and evaluate them. Finally a solution is selected to be implemented and verified. Problems have an end goal to be reached; how you get there depends upon problem orientation (problem-solving coping style and skills) and systematic analysis. Mental health professionals study the human problem-solving processes using methods such as introspection, behaviorism, simulation, computer modeling, and experiment. Social psychologists look into the person-environment relationship aspect of the problem and independent and interdependent problem-solving methods. Problem solving has been defined as a higher-order cognitive process and intellectual function that requires the modulation and control of more routine or fundamental skills. Empirical research shows many different strategies and factors influence everyday problem solving. Rehabilitation psychologists studying people with frontal lobe injuries have found that deficits in emotional control and reasoning can be re-mediated with effective rehabilitation and could improve the capacity of injured persons to resolve everyday problems. Interpersonal everyday problem solving is dependent upon personal motivational and contextual components. One such component is the emotional valence of "real-world" problems, which can either impede or aid problem-solving performance. Researchers have focused on the role of emotions in problem solving, demonstrating that poor emotional control can disrupt focus on the target task, impede problem resolution, and lead to negative outcomes such as fatigue, depression, and inertia. In conceptualization,human problem solving consists of two related processes: problem orientation, and the motivational/attitudinal/affective approach to problematic situations and problem-solving skills. People's strategies cohere with their goals and stem from the process of comparing oneself with others. === Cognitive sciences === Among the first experimental psychologists to study problem solving were the Gestaltists in Germany, such as Karl Duncker in The Psychology of Productive Thinking (1935). Perhaps best known is the work of Allen Newell and Herbert A. Simon. Experiments in the 1960s and early 1970s asked participants to solve relatively simple, well-defined, but not previously seen laboratory tasks. These simple problems, such as the Tower of Hanoi, admitted optimal solutions that could be found quickly, allowing researchers to observe the full problem-solving process. Researchers assumed that these model problems would elicit the characteristic cognitive processes by which more complex "real world" problems are solved. An outstanding problem-solving technique found by this research is the principle of decomposition. === Computer science === Much of computer science and artificial intelligence involves designing automated systems to solve a specified type of problem: to accept input data and calculate a correct or adequate response, reasonably quickly. Algorithms are recipes or instructions that direct such systems, written into computer programs. Steps for designing such systems include problem determination, heuristics, root cause analysis, de-duplication, analysis, diagnosis, and repair. Analytic techniques include linear and nonlinear programming, queuing systems, and simulation. A large, perennial obstacle is to find and fix errors in computer programs: debugging. === Logic === Formal logic concerns issues like validity, truth, inference, argumentation, and proof. In a problem-solving context, it can be used to formally represent a problem as a theorem to be proved, and to represent the knowledge needed to solve the problem as the premises to be used in a proof that the problem has a solution. The use of computers to prove mathematical theorems using formal logic emerged as the field of automated theorem proving in the 1950s. It included the use of heuristic methods designed to simulate human problem solving, as in the Logic Theory Machine, developed by Allen Newell, Herbert A. Simon and J. C. Shaw, as well as algorithmic methods such as the resolution principle developed by John Alan Robinson. In addition to its use for finding proofs of mathematical theorems, automated theorem-proving has also been used for program verification in computer science. In 1958, John McCarthy proposed the advice taker, to represent information in formal logic and to derive answers to questions using automated theorem-proving. An important step in this direction was made by Cordell Green in 1969, who used a resolution theorem prover for question-answering and for such other applications in artificial intelligence as robot planning. The resolution theorem-prover used by Cordell Green bore little resemblance to human problem solving methods. In response to criticism of that approach from researchers at MIT, Robert Kowalski developed logic programming and SLD resolution, which solves problems by problem decomposition. He has advocated logic for both computer and human problem solving and computational logic to improve human thinking. === Engineering === When products or processes fail, problem solving techniques can be used to develop corrective actions that can be taken to prevent further failures. Such techniques can also be applied to a product or process prior to an actual failure event—to predict, analyze, and mitigate a potential problem in advance. Techniques such as failure mode and effects analysis can proactively reduce the likelihood of problems. In either the reactive or the proactive case, it is necessary to build a causal explanation through a process of diagnosis. In deriving an explanation of effects in terms of causes, abduction generates new ideas or hypothes
Cross-validation (statistics)
Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. Cross-validation includes resampling and sample splitting methods that use different portions of the data to test and train a model on different iterations. It is often used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice. It can also be used to assess the quality of a fitted model and the stability of its parameters. In a prediction problem, a model is usually given a dataset of known data on which training is run (training dataset), and a dataset of unknown data (or first seen data) against which the model is tested (called the validation dataset or testing set). The goal of cross-validation is to test the model's ability to predict new data that was not used in estimating it, in order to flag problems like overfitting or selection bias and to give an insight on how the model will generalize to an independent dataset (i.e., an unknown dataset, for instance from a real problem). One round of cross-validation involves partitioning a sample of data into complementary subsets, performing the analysis on one subset (called the training set), and validating the analysis on the other subset (called the validation set or testing set). To reduce variability, in most methods multiple rounds of cross-validation are performed using different partitions, and the validation results are combined (e.g. averaged) over the rounds to give an estimate of the model's predictive performance. In summary, cross-validation combines (averages) measures of fitness in prediction to derive a more accurate estimate of model prediction performance. == Motivation == Assume a model with one or more unknown parameters, and a data set to which the model can be fit (the training data set). The fitting process optimizes the model parameters to make the model fit the training data as well as possible. If an independent sample of validation data is taken from the same population as the training data, it will generally turn out that the model does not fit the validation data as well as it fits the training data. The size of this difference is likely to be large especially when the size of the training data set is small, or when the number of parameters in the model is large. Cross-validation is a way to estimate the size of this effect. === Example: linear regression === In linear regression, there exist real response values y 1 , … , y n {\textstyle y_{1},\ldots ,y_{n}} , and n p-dimensional vector covariates x1, ..., xn. The components of the vector xi are denoted xi1, ..., xip. If least squares is used to fit a function in the form of a hyperplane ŷ = a + βTx to the data (xi, yi) 1 ≤ i ≤ n, then the fit can be assessed using the mean squared error (MSE). The MSE for given estimated parameter values a and β on the training set (xi, yi) 1 ≤ i ≤ n is defined as: MSE = 1 n ∑ i = 1 n ( y i − y ^ i ) 2 = 1 n ∑ i = 1 n ( y i − a − β T x i ) 2 = 1 n ∑ i = 1 n ( y i − a − β 1 x i 1 − ⋯ − β p x i p ) 2 {\displaystyle {\begin{aligned}{\text{MSE}}&={\frac {1}{n}}\sum _{i=1}^{n}(y_{i}-{\hat {y}}_{i})^{2}={\frac {1}{n}}\sum _{i=1}^{n}(y_{i}-a-{\boldsymbol {\beta }}^{T}\mathbf {x} _{i})^{2}\\&={\frac {1}{n}}\sum _{i=1}^{n}(y_{i}-a-\beta _{1}x_{i1}-\dots -\beta _{p}x_{ip})^{2}\end{aligned}}} If the model is correctly specified, it can be shown under mild assumptions that the expected value of the MSE for the training set is (n − p − 1)/(n + p + 1) < 1 times the expected value of the MSE for the validation set (the expected value is taken over the distribution of training sets). Thus, a fitted model and computed MSE on the training set will result in an optimistically biased assessment of how well the model will fit an independent data set. This biased estimate is called the in-sample estimate of the fit, whereas the cross-validation estimate is an out-of-sample estimate. Since in linear regression it is possible to directly compute the factor (n − p − 1)/(n + p + 1) by which the training MSE underestimates the validation MSE under the assumption that the model specification is valid, cross-validation can be used for checking whether the model has been overfitted, in which case the MSE in the validation set will substantially exceed its anticipated value. (Cross-validation in the context of linear regression is also useful in that it can be used to select an optimally regularized cost function.) === General case === In most other regression procedures (e.g. logistic regression), there is no simple formula to compute the expected out-of-sample fit. Cross-validation is, thus, a generally applicable way to predict the performance of a model on unavailable data using numerical computation in place of theoretical analysis. == Types == Two types of cross-validation can be distinguished: exhaustive and non-exhaustive cross-validation. === Exhaustive cross-validation === Exhaustive cross-validation methods are cross-validation methods which learn and test on all possible ways to divide the original sample into a training and a validation set. ==== Leave-p-out cross-validation ==== Leave-p-out cross-validation (LpO CV) involves using p observations as the validation set and the remaining observations as the training set. This is repeated on all ways to cut the original sample on a validation set of p observations and a training set. LpO cross-validation require training and validating the model C p n {\displaystyle C_{p}^{n}} times, where n is the number of observations in the original sample, and where C p n {\displaystyle C_{p}^{n}} is the binomial coefficient. For p > 1 and for even moderately large n, LpO CV can become computationally infeasible. For example, with n = 100 and p = 30, C 30 100 ≈ 3 × 10 25 . {\displaystyle C_{30}^{100}\approx 3\times 10^{25}.} A variant of LpO cross-validation with p=2 known as leave-pair-out cross-validation has been recommended as a nearly unbiased method for estimating the area under ROC curve of binary classifiers. ==== Leave-one-out cross-validation ==== Leave-one-out cross-validation (LOOCV) is a particular case of leave-p-out cross-validation with p = 1. The process looks similar to jackknife; however, with cross-validation one computes a statistic on the left-out sample(s), while with jackknifing one computes a statistic from the kept samples only. LOO cross-validation requires less computation time than LpO cross-validation because there are only C 1 n = n {\displaystyle C_{1}^{n}=n} passes rather than C p n {\displaystyle C_{p}^{n}} . However, n {\displaystyle n} passes may still require quite a large computation time, in which case other approaches such as k-fold cross validation may be more appropriate. Pseudo-code algorithm: Input: x, {vector of length N with x-values of incoming points} y, {vector of length N with y-values of the expected result} interpolate( x_in, y_in, x_out ), { returns the estimation for point x_out after the model is trained with x_in-y_in pairs} Output: err, {estimate for the prediction error} Steps: err ← 0 for i ← 1, ..., N do // define the cross-validation subsets x_in ← (x[1], ..., x[i − 1], x[i + 1], ..., x[N]) y_in ← (y[1], ..., y[i − 1], y[i + 1], ..., y[N]) x_out ← x[i] y_out ← interpolate(x_in, y_in, x_out) err ← err + (y[i] − y_out)^2 end for err ← err/N === Non-exhaustive cross-validation === Non-exhaustive cross validation methods do not compute all ways of splitting the original sample. These methods are approximations of leave-p-out cross-validation. ==== k-fold cross-validation ==== In k-fold cross-validation, the original sample is randomly partitioned into k equal sized subsamples, often referred to as "folds". Of the k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times, with each of the k subsamples used exactly once as the validation data. The k results can then be averaged to produce a single estimation. The advantage of this method over repeated random sub-sampling (see below) is that all observations are used for both training and validation, and each observation is used for validation exactly once. 10-fold cross-validation is commonly used, but in general k remains an unfixed parameter. For example, setting k = 2 results in 2-fold cross-validation. In 2-fold cross-validation, the dataset is randomly shuffled into two sets d0 and d1, so that both sets are equal size (this is usually implemented by shuffling the data array and then splitting it in two). We then train on d0 and validate on d1, followed by training on d1 and validating on d0. When k = n (the number of observations), k-fold cross-validation is equivalent to leave-one-out cr
Coupled pattern learner
Coupled Pattern Learner (CPL) is a machine learning algorithm which couples the semi-supervised learning of categories and relations to forestall the problem of semantic drift associated with boot-strap learning methods. == Coupled Pattern Learner == Semi-supervised learning approaches using a small number of labeled examples with many unlabeled examples are usually unreliable as they produce an internally consistent, but incorrect set of extractions. CPL solves this problem by simultaneously learning classifiers for many different categories and relations in the presence of an ontology defining constraints that couple the training of these classifiers. It was introduced by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell in 2009. == CPL overview == CPL is an approach to semi-supervised learning that yields more accurate results by coupling the training of many information extractors. Basic idea behind CPL is that semi-supervised training of a single type of extractor such as ‘coach’ is much more difficult than simultaneously training many extractors that cover a variety of inter-related entity and relation types. Using prior knowledge about the relationships between these different entities and relations CPL makes unlabeled data as a useful constraint during training. For e.g., ‘coach(x)’ implies ‘person(x)’ and ‘not sport(x)’. == CPL description == === Coupling of predicates === CPL primarily relies on the notion of coupling the learning of multiple functions so as to constrain the semi-supervised learning problem. CPL constrains the learned function in two ways. Sharing among same-arity predicates according to logical relations Relation argument type-checking === Sharing among same-arity predicates === Each predicate P in the ontology has a list of other same-arity predicates with which P is mutually exclusive. If A is mutually exclusive with predicate B, A’s positive instances and patterns become negative instances and negative patterns for B. For example, if ‘city’, having an instance ‘Boston’ and a pattern ‘mayor of arg1’, is mutually exclusive with ‘scientist’, then ‘Boston’ and ‘mayor of arg1’ will become a negative instance and a negative pattern respectively for ‘scientist.’ Further, Some categories are declared to be a subset of another category. For e.g., ‘athlete’ is a subset of ‘person’. === Relation argument type-checking === This is a type checking information used to couple the learning of relations and categories. For example, the arguments of the ‘ceoOf’ relation are declared to be of the categories ‘person’ and ‘company’. CPL does not promote a pair of noun phrases as an instance of a relation unless the two noun phrases are classified as belonging to the correct argument types. === Algorithm description === Following is a quick summary of the CPL algorithm. Input: An ontology O, and a text corpus C Output: Trusted instances/patterns for each predicate for i=1,2,...,∞ do foreach predicate p in O do EXTRACT candidate instances/contextual patterns using recently promoted patterns/instances; FILTER candidates that violate coupling; RANK candidate instances/patterns; PROMOTE top candidates; end end ==== Inputs ==== A large corpus of Part-Of-Speech tagged sentences and an initial ontology with predefined categories, relations, mutually exclusive relationships between same-arity predicates, subset relationships between some categories, seed instances for all predicates, and seed patterns for the categories. ==== Candidate extraction ==== CPL finds new candidate instances by using newly promoted patterns to extract the noun phrases that co-occur with those patterns in the text corpus. CPL extracts, Category Instances Category Patterns Relation Instances Relation Patterns ==== Candidate filtering ==== Candidate instances and patterns are filtered to maintain high precision, and to avoid extremely specific patterns. An instance is only considered for assessment if it co-occurs with at least two promoted patterns in the text corpus, and if its co-occurrence count with all promoted patterns is at least three times greater than its co-occurrence count with negative patterns. ==== Candidate ranking ==== CPL ranks candidate instances using the number of promoted patterns that they co-occur with so that candidates that occur with more patterns are ranked higher. Patterns are ranked using an estimate of the precision of each pattern. ==== Candidate promotion ==== CPL ranks the candidates according to their assessment scores and promotes at most 100 instances and 5 patterns for each predicate. Instances and patterns are only promoted if they co-occur with at least two promoted patterns or instances, respectively. == Meta-Bootstrap Learner == Meta-Bootstrap Learner (MBL) was also proposed by the authors of CPL. Meta-Bootstrap learner couples the training of multiple extraction techniques with a multi-view constraint, which requires the extractors to agree. It makes addition of coupling constraints on top of existing extraction algorithms, while treating them as black boxes, feasible. MBL assumes that the errors made by different extraction techniques are independent. Following is a quick summary of MBL. Input: An ontology O, a set of extractors ε Output: Trusted instances for each predicate for i=1,2,...,∞ do foreach predicate p in O do foreach extractor e in ε do Extract new candidates for p using e with recently promoted instances; end FILTER candidates that violate mutual-exclusion or type-checking constraints; PROMOTE candidates that were extracted by all extractors; end end Subordinate algorithms used with MBL do not promote any instance on their own, they report the evidence about each candidate to MBL and MBL is responsible for promoting instances. == Applications == In their paper authors have presented results showing the potential of CPL to contribute new facts to existing repository of semantic knowledge, Freebase
Teknomo–Fernandez algorithm
The Teknomo–Fernandez algorithm (TF algorithm), is an efficient algorithm for generating the background image of a given video sequence. By assuming that the background image is shown in the majority of the video, the algorithm is able to generate a good background image of a video in O ( R ) {\displaystyle O(R)} -time using only a small number of binary operations and Boolean bit operations, which require a small amount of memory and has built-in operators found in many programming languages such as C, C++, and Java. == History == People tracking from videos usually involves some form of background subtraction to segment foreground from background. Once foreground images are extracted, then desired algorithms (such as those for motion tracking, object tracking, and facial recognition) may be executed using these images. However, background subtraction requires that the background image is already available and unfortunately, this is not always the case. Traditionally, the background image is searched for manually or automatically from the video images when there are no objects. More recently, automatic background generation through object detection, medial filtering, medoid filtering, approximated median filtering, linear predictive filter, non-parametric model, Kalman filter, and adaptive smoothening have been suggested; however, most of these methods have high computational complexity and are resource-intensive. The Teknomo–Fernandez algorithm is also an automatic background generation algorithm. Its advantage, however, is its computational speed of only O ( R ) {\displaystyle O(R)} -time, depending on the resolution R {\displaystyle R} of an image and its accuracy gained within a manageable number of frames. Only at least three frames from a video is needed to produce the background image assuming that for every pixel position, the background occurs in the majority of the videos. Furthermore, it can be performed for both grayscale and colored videos. == Assumptions == The camera is stationary. The light of the environment changes only slowly relative to the motions of the people in the scene. The number of people does not occupy the scene for most of the time at the same place. Generally, however, the algorithm will certainly work whenever the following single important assumption holds: For each pixel position, the majority of the pixel values in the entire video contain the pixel value of the actual background image (at that position).As long as each part of the background is shown in the majority of the video, the entire background image needs not to appear in any of its frames. The algorithm is expected to work accurately. == Background image generation == === Equations === For three frames of image sequence x 1 {\displaystyle x_{1}} , x 2 {\displaystyle x_{2}} , and x 3 {\displaystyle x_{3}} , the background image B {\displaystyle B} is obtained using B = x 3 ( x 1 ⊕ x 2 ) + x 1 x 2 {\displaystyle B=x_{3}(x_{1}\oplus x_{2})+x_{1}x_{2}} where ⊕ {\displaystyle \oplus } denotes the exclusive disjunctive bit operator. The Boolean mode function S {\displaystyle S} of the table occurs when the number of 1 entries is larger than half of the number of images such that S = { 1 , if ∑ i = 1 n x i ≥ ⌈ n 2 + 1 ⌉ , and n ≥ 3 0 , otherwise {\displaystyle S={\begin{cases}1,&{\text{if }}\sum _{i=1}^{n}x_{i}\geq \left\lceil {\frac {n}{2}}+1\right\rceil ,{\text{ and }}n\geq 3\\0,&{\text{otherwise}}\end{cases}}} For three images, the background image B {\displaystyle B} can be taken as the value x ¯ 1 x 2 x 3 + x 1 x ¯ 2 x 3 + x 1 x 2 x ¯ 3 + x 1 x 2 x 3 {\displaystyle {\bar {x}}_{1}x_{2}x_{3}+x_{1}{\bar {x}}_{2}x_{3}+x_{1}x_{2}{\bar {x}}_{3}+x_{1}x_{2}x_{3}} === Background generation algorithm === At the first level, three frames are selected at random from the image sequence to produce a background image by combining them using the first equation. This yields a better background image at the second level. The procedure is repeated until desired level L {\displaystyle L} . == Theoretical accuracy == At level ℓ {\displaystyle \ell } , the probability p ℓ {\displaystyle p_{\ell }} that the modal bit predicted is the actual modal bit is represented by the equation p ℓ = ( p ℓ − 1 ) 3 + 3 ( p ℓ − 1 ) 2 ( 1 − p ℓ − 1 ) {\displaystyle p_{\ell }=(p_{\ell -1})^{3}+3(p_{\ell -1})^{2}(1-p_{\ell -1})} . The table below gives the computed probability values across several levels using some specific initial probabilities. It can be observed that even if the modal bit at the considered position is at a low 60% of the frames, the probability of accurate modal bit determination is already more than 99% at 6 levels. == Space complexity == The space requirement of the Teknomo–Fernandez algorithm is given by the function O ( R F + R 3 L ) {\displaystyle O(RF+R3^{L})} , depending on the resolution R {\displaystyle R} of the image, the number F {\displaystyle F} of frames in the video, and the desired number L {\displaystyle L} of levels. However, the fact that L {\displaystyle L} will probably not exceed 6 reduces the space complexity to O ( R F ) {\displaystyle O(RF)} . == Time complexity == The entire algorithm runs in O ( R ) {\displaystyle O(R)} -time, only depending on the resolution of the image. Computing the modal bit for each bit can be done in O ( 1 ) {\displaystyle O(1)} -time while the computation of the resulting image from the three given images can be done in O ( R ) {\displaystyle O(R)} -time. The number of the images to be processed in L {\displaystyle L} levels is O ( 3 L ) {\displaystyle O(3^{L})} . However, since L ≤ 6 {\displaystyle L\leq 6} , then this is actually O ( 1 ) {\displaystyle O(1)} , thus the algorithm runs in O ( R ) {\displaystyle O(R)} . == Variants == A variant of the Teknomo–Fernandez algorithm that incorporates the Monte-Carlo method named CRF has been developed. Two different configurations of CRF were implemented: CRF9,2 and CRF81,1. Experiments on some colored video sequences showed that the CRF configurations outperform the TF algorithm in terms of accuracy. However, the TF algorithm remains more efficient in terms of processing time. == Applications == Object detection Face detection Face recognition Pedestrian detection Video surveillance Motion capture Human-computer interaction Content-based video coding Traffic monitoring Real-time gesture recognition
Cross-entropy method
The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance sampling estimator by repeating two phases: Draw a sample from a probability distribution. Minimize the cross-entropy between this distribution and a target distribution to produce a better sample in the next iteration. Reuven Rubinstein developed the method in the context of rare-event simulation, where tiny probabilities must be estimated, for example in network reliability analysis, queueing models, or performance analysis of telecommunication systems. The method has also been applied to the traveling salesman, quadratic assignment, DNA sequence alignment, max-cut and buffer allocation problems. == Estimation via importance sampling == Consider the general problem of estimating the quantity ℓ = E u [ H ( X ) ] = ∫ H ( x ) f ( x ; u ) d x {\displaystyle \ell =\mathbb {E} _{\mathbf {u} }[H(\mathbf {X} )]=\int H(\mathbf {x} )\,f(\mathbf {x} ;\mathbf {u} )\,{\textrm {d}}\mathbf {x} } , where H {\displaystyle H} is some performance function and f ( x ; u ) {\displaystyle f(\mathbf {x} ;\mathbf {u} )} is a member of some parametric family of distributions. Using importance sampling this quantity can be estimated as ℓ ^ = 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) g ( X i ) {\displaystyle {\hat {\ell }}={\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{g(\mathbf {X} _{i})}}} , where X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} is a random sample from g {\displaystyle g\,} . For positive H {\displaystyle H} , the theoretically optimal importance sampling density (PDF) is given by g ∗ ( x ) = H ( x ) f ( x ; u ) / ℓ {\displaystyle g^{}(\mathbf {x} )=H(\mathbf {x} )f(\mathbf {x} ;\mathbf {u} )/\ell } . This, however, depends on the unknown ℓ {\displaystyle \ell } . The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the Kullback–Leibler sense) to the optimal PDF g ∗ {\displaystyle g^{}} . == Generic CE algorithm == Choose initial parameter vector v ( 0 ) {\displaystyle \mathbf {v} ^{(0)}} ; set t = 1. Generate a random sample X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} from f ( ⋅ ; v ( t − 1 ) ) {\displaystyle f(\cdot ;\mathbf {v} ^{(t-1)})} Solve for v ( t ) {\displaystyle \mathbf {v} ^{(t)}} , where v ( t ) = argmax v 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) f ( X i ; v ( t − 1 ) ) log f ( X i ; v ) {\displaystyle \mathbf {v} ^{(t)}=\mathop {\textrm {argmax}} _{\mathbf {v} }{\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})}}\log f(\mathbf {X} _{i};\mathbf {v} )} If convergence is reached then stop; otherwise, increase t by 1 and reiterate from step 2. In several cases, the solution to step 3 can be found analytically. Situations in which this occurs are When f {\displaystyle f\,} belongs to the natural exponential family When f {\displaystyle f\,} is discrete with finite support When H ( X ) = I { x ∈ A } {\displaystyle H(\mathbf {X} )=\mathrm {I} _{\{\mathbf {x} \in A\}}} and f ( X i ; u ) = f ( X i ; v ( t − 1 ) ) {\displaystyle f(\mathbf {X} _{i};\mathbf {u} )=f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})} , then v ( t ) {\displaystyle \mathbf {v} ^{(t)}} corresponds to the maximum likelihood estimator based on those X k ∈ A {\displaystyle \mathbf {X} _{k}\in A} . == Continuous optimization—example == The same CE algorithm can be used for optimization, rather than estimation. Suppose the problem is to maximize some function S {\displaystyle S} , for example, S ( x ) = e − ( x − 2 ) 2 + 0.8 e − ( x + 2 ) 2 {\displaystyle S(x)={\textrm {e}}^{-(x-2)^{2}}+0.8\,{\textrm {e}}^{-(x+2)^{2}}} . To apply CE, one considers first the associated stochastic problem of estimating P θ ( S ( X ) ≥ γ ) {\displaystyle \mathbb {P} _{\boldsymbol {\theta }}(S(X)\geq \gamma )} for a given level γ {\displaystyle \gamma \,} , and parametric family { f ( ⋅ ; θ ) } {\displaystyle \left\{f(\cdot ;{\boldsymbol {\theta }})\right\}} , for example the 1-dimensional Gaussian distribution, parameterized by its mean μ t {\displaystyle \mu _{t}\,} and variance σ t 2 {\displaystyle \sigma _{t}^{2}} (so θ = ( μ , σ 2 ) {\displaystyle {\boldsymbol {\theta }}=(\mu ,\sigma ^{2})} here). Hence, for a given γ {\displaystyle \gamma \,} , the goal is to find θ {\displaystyle {\boldsymbol {\theta }}} so that D K L ( I { S ( x ) ≥ γ } ‖ f θ ) {\displaystyle D_{\mathrm {KL} }({\textrm {I}}_{\{S(x)\geq \gamma \}}\|f_{\boldsymbol {\theta }})} is minimized. This is done by solving the sample version (stochastic counterpart) of the KL divergence minimization problem, as in step 3 above. It turns out that parameters that minimize the stochastic counterpart for this choice of target distribution and parametric family are the sample mean and sample variance corresponding to the elite samples, which are those samples that have objective function value ≥ γ {\displaystyle \geq \gamma } . The worst of the elite samples is then used as the level parameter for the next iteration. This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an estimation of distribution algorithm. === Pseudocode === // Initialize parameters μ := −6 σ2 := 100 t := 0 maxits := 100 N := 100 Ne := 10 // While maxits not exceeded and not converged while t < maxits and σ2 > ε do // Obtain N samples from current sampling distribution X := SampleGaussian(μ, σ2, N) // Evaluate objective function at sampled points S := exp(−(X − 2) ^ 2) + 0.8 exp(−(X + 2) ^ 2) // Sort X by objective function values in descending order X := sort(X, S) // Update parameters of sampling distribution via elite samples μ := mean(X(1:Ne)) σ2 := var(X(1:Ne)) t := t + 1 // Return mean of final sampling distribution as solution return μ == Related methods == Simulated annealing Genetic algorithms Harmony search Estimation of distribution algorithm Tabu search Natural Evolution Strategy Ant colony optimization algorithms