Google Clips is a discontinued miniature clip-on camera device developed by Google.

== History ==

Google Clips was announced on October 4, 2017, and went on sale on January 27, 2018. It automatically captured video clips (without audio) at moments its machine learning algorithms determined to be interesting or relevant. An indicator flashed when the camera was looking for scenes to capture. Google Clips' artificial intelligence (AI) could learn people's faces in order to take photographs featuring certain people, and could automatically set lighting and framing. It had 16 GB of built-in storage and could record clips for up to 3 hours. The camera was originally priced at US$249 in the United States. It was withdrawn from sale on October 15, 2019, but was supported until the end of December 2021.

== Reception ==

The Independent wrote that Google Clips is "an impressive little device, but one that also has the potential to feel very creepy." According to The Verge's generally negative review, "it didn't capture anything special" over two weeks of testing.
Is an AI Background Remover Worth It in 2026?
Comparing the best AI background removers? An AI background remover is software that uses machine learning to help you get more done; it lowers the barrier so that anyone can produce professional output. Privacy matters too: check whether your data is used to train the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI background remover slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.
Conditional random field
Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without considering "neighbouring" samples, a CRF can take context into account. To do so, the predictions are modelled as a graphical model, which represents the presence of dependencies between the predictions. The kind of graph used depends on the application. For example, in natural language processing, "linear chain" CRFs are popular, for which each prediction is dependent only on its immediate neighbours. In image processing, the graph typically connects locations to nearby and/or similar locations to enforce that they receive similar predictions.

Other examples where CRFs are used are: labeling or parsing of sequential data for natural language processing or biological sequences, part-of-speech tagging, shallow parsing, named entity recognition, gene finding, peptide critical functional region finding, and object recognition and image segmentation in computer vision.

== Description ==

CRFs are a type of discriminative undirected probabilistic graphical model.

Lafferty, McCallum and Pereira define a CRF on observations $\boldsymbol{X}$ and random variables $\boldsymbol{Y}$ as follows: Let $G = (V, E)$ be a graph such that $\boldsymbol{Y} = (\boldsymbol{Y}_v)_{v \in V}$, so that $\boldsymbol{Y}$ is indexed by the vertices of $G$.
Then $(\boldsymbol{X}, \boldsymbol{Y})$ is a conditional random field when each random variable $\boldsymbol{Y}_v$, conditioned on $\boldsymbol{X}$, obeys the Markov property with respect to the graph; that is, its probability is dependent only on its neighbours in $G$ and not its past states: $P(\boldsymbol{Y}_v \mid \boldsymbol{X}, \{\boldsymbol{Y}_w : w \neq v\}) = P(\boldsymbol{Y}_v \mid \boldsymbol{X}, \{\boldsymbol{Y}_w : w \sim v\})$, where $w \sim v$ means that $w$ and $v$ are neighbors in $G$.

What this means is that a CRF is an undirected graphical model whose nodes can be divided into exactly two disjoint sets $\boldsymbol{X}$ and $\boldsymbol{Y}$, the observed and output variables, respectively; the conditional distribution $p(\boldsymbol{Y} \mid \boldsymbol{X})$ is then modeled.

=== Inference ===

For general graphs, the problem of exact inference in CRFs is intractable. The inference problem for a CRF is basically the same as for an MRF and the same arguments hold. However, there exist special cases for which exact inference is feasible:

* If the graph is a chain or a tree, message passing algorithms yield exact solutions. The algorithms used in these cases are analogous to the forward-backward and Viterbi algorithm for the case of HMMs.
* If the CRF only contains pair-wise potentials and the energy is submodular, combinatorial min cut/max flow algorithms yield exact solutions.

If exact inference is impossible, several algorithms can be used to obtain approximate solutions.
These include:

* Loopy belief propagation
* Alpha expansion
* Mean field inference
* Linear programming relaxations

=== Parameter learning ===

Learning the parameters $\theta$ is usually done by maximum likelihood learning for $p(Y_i \mid X_i; \theta)$. If all nodes have exponential family distributions and all nodes are observed during training, this optimization is convex. It can be solved, for example, using gradient descent algorithms or quasi-Newton methods such as the L-BFGS algorithm. On the other hand, if some variables are unobserved, the inference problem has to be solved for these variables. Exact inference is intractable in general graphs, so approximations have to be used.

=== Examples ===

In sequence modeling, the graph of interest is usually a chain graph. An input sequence of observed variables $X$ represents a sequence of observations and $Y$ represents a hidden (or unknown) state variable that needs to be inferred given the observations. The $Y_i$ are structured to form a chain, with an edge between each $Y_{i-1}$ and $Y_i$. As well as having a simple interpretation of the $Y_i$ as "labels" for each element in the input sequence, this layout admits efficient algorithms for:

* model training, learning the conditional distributions between the $Y_i$ and feature functions from some corpus of training data.
* decoding, determining the probability of a given label sequence $Y$ given $X$.
* inference, determining the most likely label sequence $Y$ given $X$.
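For a linear chain, the decoding and inference tasks listed above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the weighted feature functions are taken to be already summed into two hypothetical score tables, `unary[i][y]` (score of label `y` at position `i`) and `trans[p][y]` (score of moving from label `p` to label `y`, assumed here not to depend on the position), and model training is omitted. All function and variable names are illustrative.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(unary, trans, labels):
    """Unnormalised log-score of one complete label sequence."""
    score = unary[0][labels[0]]
    for i in range(1, len(labels)):
        score += trans[labels[i - 1]][labels[i]] + unary[i][labels[i]]
    return score


def log_partition(unary, trans):
    """Forward algorithm: log of the summed exp-scores of all label sequences."""
    n_labels = len(unary[0])
    alpha = list(unary[0])
    for i in range(1, len(unary)):
        alpha = [
            unary[i][y] + logsumexp([alpha[p] + trans[p][y] for p in range(n_labels)])
            for y in range(n_labels)
        ]
    return logsumexp(alpha)


def log_prob(unary, trans, labels):
    """'Decoding' in the sense used above: log p(Y | X) of a given label sequence."""
    return sequence_score(unary, trans, labels) - log_partition(unary, trans)


def viterbi(unary, trans):
    """'Inference' in the sense used above: the most likely label sequence."""
    n_labels = len(unary[0])
    best = list(unary[0])
    backpointers = []
    for i in range(1, len(unary)):
        step, ptrs = [], []
        for y in range(n_labels):
            # Best previous label for ending in y at position i.
            p = max(range(n_labels), key=lambda q: best[q] + trans[q][y])
            ptrs.append(p)
            step.append(best[p] + trans[p][y] + unary[i][y])
        best = step
        backpointers.append(ptrs)
    y = max(range(n_labels), key=lambda q: best[q])
    path = [y]
    for ptrs in reversed(backpointers):
        y = ptrs[y]
        path.append(y)
    path.reverse()
    return path


# Hypothetical scores for a sequence of length 3 with 2 possible labels.
UNARY = [[2.0, 0.0], [0.3, 1.0], [0.2, 1.5]]
TRANS = [[0.5, -0.5], [-1.0, 1.0]]

best_path = viterbi(UNARY, TRANS)
print(best_path, math.exp(log_prob(UNARY, TRANS, best_path)))
```

Both routines run in time linear in the sequence length and quadratic in the number of labels, which is what makes the chain layout attractive compared with enumerating every label sequence.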
The conditional dependency of each $Y_i$ on $X$ is defined through a fixed set of feature functions of the form $f(i, Y_{i-1}, Y_i, X)$, which can be thought of as measurements on the input sequence that partially determine the likelihood of each possible value for $Y_i$. The model assigns each feature a numerical weight and combines them to determine the probability of a certain value for $Y_i$.

Linear-chain CRFs have many of the same applications as conceptually simpler hidden Markov models (HMMs), but relax certain assumptions about the input and output sequence distributions. An HMM can loosely be understood as a CRF with very specific feature functions that use constant probabilities to model state transitions and emissions. Conversely, a CRF can loosely be understood as a generalization of an HMM that makes the constant transition probabilities into arbitrary functions that vary across the positions in the sequence of hidden states, depending on the input sequence.

Notably, in contrast to HMMs, CRFs can contain any number of feature functions, the feature functions can inspect the entire input sequence $X$ at any point during inference, and the range of the feature functions need not have a probabilistic interpretation.

== Variants ==

=== Higher-order CRFs and semi-Markov CRFs ===

CRFs can be extended into higher order models by making each $Y_i$ dependent on a fixed number $k$ of previous variables $Y_{i-k}, \dots, Y_{i-1}$. In conventional formulations of higher order CRFs, training and inference are only practical for small values of $k$ (such as $k \leq 5$), since their computational cost increases exponentially with $k$.
However, a more recent advance has managed to ameliorate these issues by leveraging concepts and tools from the field of Bayesian nonparametrics. Specifically, the CRF-infinity approach constitutes a CRF-type model that is capable of learning infinitely long temporal dynamics in a scalable fashion. This is achieved by introducing a novel potential function for CRFs that is based on the Sequence Memoizer (SM), a nonparametric Bayesian model for learning infinitely long dynamics in sequential observations. To render such a model computationally tractable, CRF-infinity employs a mean-field approximation of the postulated novel potential functions (which are driven by an SM). This allows for devising efficient approximate training and inference algorithms for the model, without undermining its capability to capture and model temporal dependencies of arbitrary length.

There exists another generalization of CRFs, the semi-Markov conditional random field (semi-CRF), which models variable-length segmentations of the label sequence $Y$. This provides much of the power of higher-order CRFs to model long-range dependencies of the $Y_i$, at a reasonable computational cost.

Finally, large-margin models for structured prediction, such as the structured support vector machine, can be seen as an alternative training procedure to CRFs.

=== Latent-dynamic conditional random field ===

Latent-dynamic conditional random fields (LDCRF) or discriminative probabilistic latent variable models (DPLVM) are a type of CRF for sequence tagging tasks. They are latent variable models that are trained discriminatively.

In an LDCRF, as in any sequence tagging task, given a sequence of observations $x = x_1, \dots, x_n$, the main problem the model must solve is how to assign a sequence of labels $y = y_1, \dots, y_n$ from one finite set
Label noise
Label noise refers to errors or inaccuracies in the class labels of data instances. This is a widespread issue in machine learning datasets, arising from human annotator mistakes, unclear labeling instructions, automated labeling methods, or adversarial attacks in supervised learning. Label noise can be roughly divided into random noise, where labels are flipped independently of input features, and systematic noise, where mislabeling is dependent on certain patterns or biases in the data. Label noise can be damaging to model performance, especially for complex models that may overfit to noisy labels rather than generalizable patterns. Many approaches have been proposed to deal with the effects of label noise, including robust loss functions, noise-tolerant algorithms, data cleaning methods, and semi-supervised learning approaches. To reduce the impact of wrong labels during training, techniques like label smoothing, sample reweighting and using trusted validation sets are used. The role of noise-robust training paradigms and curriculum learning strategies to improve resilience against mislabeled data is also explored in recent research.
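Two of the ideas above, random label noise and label smoothing, can be illustrated with a short Python sketch. It assumes the simplest noise model (symmetric noise, where a flipped label is replaced by a uniformly chosen different class) and the common smoothing rule that mixes the one-hot target with a uniform distribution; the function names and the example numbers are hypothetical.

```python
import math
import random


def flip_labels(labels, num_classes, noise_rate, rng):
    """Inject symmetric random noise: each label is replaced, with probability
    noise_rate, by a different class chosen uniformly at random."""
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            y = rng.choice([c for c in range(num_classes) if c != y])
        noisy.append(y)
    return noisy


def smooth_label(label, num_classes, epsilon):
    """Label-smoothed target: (1 - epsilon) * one_hot + epsilon / num_classes."""
    return [
        (1.0 - epsilon) * (1.0 if c == label else 0.0) + epsilon / num_classes
        for c in range(num_classes)
    ]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


# A model that is confident in class 1, scored against a (possibly wrong) label 0.
probs = [0.01, 0.98, 0.01]
hard_loss = cross_entropy(smooth_label(0, 3, 0.0), probs)
soft_loss = cross_entropy(smooth_label(0, 3, 0.1), probs)
print(hard_loss, soft_loss)

noisy = flip_labels([0, 1, 2, 1, 0], num_classes=3, noise_rate=0.4, rng=random.Random(0))
print(noisy)
```

In this example the smoothed target yields a smaller loss than the hard target when the given label disagrees with a confident prediction, which is one intuition for why smoothing can soften the effect of a wrong label.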
Owain Evans
Owain Rhys Evans is a British artificial intelligence researcher who works on AI alignment and machine learning safety. He founded Truthful AI, a research group based in Berkeley, California, and is an affiliate of the Center for Human Compatible AI (CHAI) at the University of California, Berkeley. His research addresses AI truthfulness, emergent behaviors in large language models, and the alignment of AI systems with human values.

== Education ==

Evans earned a Bachelor of Arts in philosophy and mathematics from Columbia University in 2008 and a PhD in philosophy from the Massachusetts Institute of Technology in 2015. His doctoral research focused on Bayesian computational models of human preferences and decision-making.

== Career ==

After completing his doctorate, Evans held positions at the Future of Humanity Institute (FHI) at the University of Oxford, first as a postdoctoral research fellow and later as a research scientist. While at FHI, he co-authored a survey of machine learning researchers on timelines for human-level AI, published in the Journal of Artificial Intelligence Research. The survey was reported on by Newsweek, New Scientist, the BBC, and The Economist. He was also among the co-authors of a 2018 report on the potential for misuse of AI technologies, published by researchers at Oxford, Cambridge, and other institutions.

Since 2022, Evans has been based in Berkeley, where he founded Truthful AI, a non-profit research group that studies AI truthfulness, deception, and emergent behaviors in large language models.

== Research ==

Evans's early work examined challenges in inverse reinforcement learning when human behavior is irrational or biased, proposing methods for AI systems to infer preferences from imperfect human demonstrations.

He co-developed TruthfulQA (2021), a benchmark that tests whether language models give truthful answers rather than repeating common misconceptions.
Initial evaluations found that larger models were not more truthful, suggesting that scaling alone does not improve factual accuracy. The benchmark has since been used by AI developers to evaluate large language models. He also co-authored a paper proposing design and governance strategies for building AI systems that do not deceive or hallucinate.

In 2023, Evans and collaborators described the "reversal curse", showing that language models trained on a fact in one direction (e.g. "A is B") often cannot answer the corresponding reverse query ("B is A"). His group also developed a benchmark for evaluating situational awareness in language models.

In 2025, Evans and colleagues published a study in Nature on what they termed "emergent misalignment": fine-tuning a language model on a narrow task (writing insecure code) caused it to produce unrelated harmful outputs without explicit instruction to do so. Later that year, Evans and collaborators (including researchers at Anthropic) reported that hidden behavioral traits can transfer between language models through training data, even when those traits are not explicitly present in the data, a phenomenon they called "subliminal learning".

== Public engagement ==

In November 2025, Evans delivered the Hinton Lectures, a keynote lecture series on AI safety co-founded by Geoffrey Hinton and the Global Risk Institute.
How to Choose an AI Analytics Tool
Looking for the best AI analytics tool? An AI analytics tool is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI analytics tool slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.
Actor-critic algorithm
The actor-critic algorithm (AC) is a family of reinforcement learning (RL) algorithms that combine policy-based RL algorithms, such as policy gradient methods, with value-based RL algorithms, such as value iteration, Q-learning, SARSA, and TD learning.

An AC algorithm consists of two main components: an "actor" that determines which actions to take according to a policy function, and a "critic" that evaluates those actions according to a value function. Some AC algorithms are on-policy, some are off-policy. Some apply only to continuous or only to discrete action spaces; some work in both cases.

== Overview ==

The actor-critic methods can be understood as an improvement over pure policy gradient methods like REINFORCE, obtained by introducing a baseline.

=== Actor ===

The actor uses a policy function $\pi(a \mid s)$, while the critic estimates either the value function $V(s)$, the action-value Q-function $Q(s, a)$, the advantage function $A(s, a)$, or any combination thereof.

The actor is a parameterized function $\pi_\theta$, where $\theta$ are the parameters of the actor. The actor takes as argument the state of the environment $s$ and produces a probability distribution $\pi_\theta(\cdot \mid s)$.

If the action space is discrete, then $\sum_a \pi_\theta(a \mid s) = 1$. If the action space is continuous, then $\int_a \pi_\theta(a \mid s) \, da = 1$.

The goal of policy optimization is to improve the actor.
That is, to find some $\theta$ that maximizes the expected episodic reward $J(\theta)$:

$$J(\theta) = \mathbb{E}_{\pi_\theta}\left[\sum_{t=0}^{T} \gamma^t r_t\right]$$

where $\gamma$ is the discount factor, $r_t$ is the reward at step $t$, and $T$ is the time-horizon (which can be infinite).

The goal of the policy gradient method is to optimize $J(\theta)$ by gradient ascent on the policy gradient $\nabla J(\theta)$.

As detailed on the policy gradient method page, there are many unbiased estimators of the policy gradient:

$$\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}\left[\sum_{0 \leq j \leq T} \nabla_\theta \ln \pi_\theta(A_j \mid S_j) \cdot \Psi_j \,\Big|\, S_0 = s_0\right]$$

where $\Psi_j$ is a linear sum of the following:

* $\sum_{0 \leq i \leq T} (\gamma^i R_i)$.
* $\gamma^j \sum_{j \leq i \leq T} (\gamma^{i-j} R_i)$: the REINFORCE algorithm.
* $\gamma^j \sum_{j \leq i \leq T} (\gamma^{i-j} R_i) - b(S_j)$: the REINFORCE with baseline algorithm. Here $b$ is an arbitrary function.
* $\gamma^j \left(R_j + \gamma V^{\pi_\theta}(S_{j+1}) - V^{\pi_\theta}(S_j)\right)$: TD(1) learning (one-step temporal difference).
* $\gamma^j Q^{\pi_\theta}(S_j, A_j)$.
* $\gamma^j A^{\pi_\theta}(S_j, A_j)$: Advantage Actor-Critic (A2C).
* $\gamma^j \left(R_j + \gamma R_{j+1} + \gamma^2 V^{\pi_\theta}(S_{j+2}) - V^{\pi_\theta}(S_j)\right)$: TD(2) learning.
* $\gamma^j \left(\sum_{k=0}^{n-1} \gamma^k R_{j+k} + \gamma^n V^{\pi_\theta}(S_{j+n}) - V^{\pi_\theta}(S_j)\right)$: TD(n) learning.
* $\gamma^j \sum_{n=1}^{\infty} (1-\lambda)\lambda^{n-1} \cdot \left(\sum_{k=0}^{n-1} \gamma^k R_{j+k} + \gamma^n V^{\pi_\theta}(S_{j+n}) - V^{\pi_\theta}(S_j)\right)$: TD(λ) learning, also known as GAE (generalized advantage estimate). This is an exponentially weighted average of the TD(n) learning terms, with weights $(1-\lambda)\lambda^{n-1}$ that sum to 1.

=== Critic ===

In the unbiased estimators given above, certain functions such as $V^{\pi_\theta}, Q^{\pi_\theta}, A^{\pi_\theta}$ appear. These are approximated by the critic. Since these functions all depend on the actor, the critic must learn alongside the actor. The critic is learned by value-based RL algorithms.

For example, if the critic is estimating the state-value function $V^{\pi_\theta}(s)$, then it can be learned by any value function approximation method. Let the critic be a function approximator $V_\phi(s)$ with parameters $\phi$.
The simplest example is TD(1) learning, which trains the critic to minimize the TD(1) error:

$$\delta_i = R_i + \gamma V_\phi(S_{i+1}) - V_\phi(S_i)$$

The critic parameters are updated by gradient descent on half the squared TD error:

$$\phi \leftarrow \phi - \alpha \nabla_\phi \tfrac{1}{2}(\delta_i)^2 = \phi + \alpha \delta_i \nabla_\phi V_\phi(S_i)$$

where $\alpha$ is the learning rate. Note that the gradient is taken with respect to the $\phi$ in $V_\phi(S_i)$ only, since the $\phi$ in $\gamma V_\phi(S_{i+1})$ constitutes a moving target, and the gradient is not taken with respect to that. This is a common source of error in implementations that use automatic differentiation, and requires "stopping the gradient" at that point.

Similarly, if the critic is estimating the action-value function $Q^{\pi_\theta}$, then it can be learned by Q-learning or SARSA. In SARSA, the critic maintains an estimate of the Q-function, parameterized by $\phi$, denoted as $Q_\phi(s, a)$. The temporal difference error is then calculated as $\delta_i = R_i + \gamma Q_\phi(S_{i+1}, A_{i+1}) - Q_\phi(S_i, A_i)$. The critic is then updated by

$$\phi \leftarrow \phi + \alpha \delta_i \nabla_\phi Q_\phi(S_i, A_i)$$

The advantage critic can be trained by training both a Q-function $Q_\phi(s, a)$ and a state-value function $V_\phi(s)$, then letting $A_\phi(s, a) = Q_\phi(s, a) - V_\phi(s)$.
However, it is more common to train just a state-value function $V_\phi(s)$, then estimate the advantage by

$$A_\phi(S_i, A_i) \approx \sum_{j=0}^{n-1} \gamma^j R_{i+j} + \gamma^n V_\phi(S_{i+n}) - V_\phi(S_i)$$

Here, $n$ is a positive integer. The higher $n$ is, the lower the bias in the advantage estimation, but at the price of higher variance.

The Generalized Advantage Estimation (GAE) introduces a hyperparameter $\lambda$ that smoothly interpolates between Monte Carlo returns ($\lambda = 1$, high variance, no bias) and 1-step TD learning ($\lambda = 0$, low variance, high bias). This hyperparameter can be adjusted to pick the optimal bias-variance trade-off in advantage estimation. It uses an exponentially decaying average of n-step returns with $\lambda$ being the decay strength.

== Variants ==

* Asynchronous Advantage Actor-Critic (A3C): Parallel and asynchronous version of A2C.
* Soft Actor-Critic (SAC): Incorporates entropy maximization for improved exploration.
* Deep Deterministic Policy Gradient (DDPG): Specialized for continuous action spaces.
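
The n-step advantage estimate and GAE described in the Critic section can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: it assumes a single finite trajectory, a hypothetical list `values` holding the critic's estimates $V_\phi(S_t)$ for $t = 0, \dots, T$ (the last entry is the bootstrap value, 0.0 for a terminal state), and it omits the leading $\gamma^j$ factor that appears in the policy gradient terms. On a finite trajectory the backward recursion used here places the leftover weight on the longest available return.

```python
def n_step_advantage(rewards, values, i, n, gamma):
    """n-step advantage estimate at time i.

    rewards[t] is R_t for t = 0..T-1; values[t] is the critic's V(S_t) for
    t = 0..T, where values[T] is the bootstrap value of the final state.
    """
    n = min(n, len(rewards) - i)  # truncate at the end of the trajectory
    discounted = sum(gamma ** j * rewards[i + j] for j in range(n))
    return discounted + gamma ** n * values[i + n] - values[i]


def gae(rewards, values, gamma, lam):
    """Generalized advantage estimates via the backward recursion
    A_t = delta_t + gamma * lam * A_{t+1}, where delta_t is the 1-step TD error."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Hypothetical trajectory of four steps ending in a terminal state.
REWARDS = [1.0, 0.0, 2.0, 1.0]
VALUES = [0.5, 0.4, 1.2, 0.9, 0.0]
GAMMA = 0.9

one_step = gae(REWARDS, VALUES, GAMMA, lam=0.0)     # 1-step TD errors
monte_carlo = gae(REWARDS, VALUES, GAMMA, lam=1.0)  # full-return advantages
blended = gae(REWARDS, VALUES, GAMMA, lam=0.95)     # in between
print(one_step, monte_carlo, blended)
```

Setting `lam=0.0` reproduces the 1-step estimate and `lam=1.0` reproduces the estimate that uses every remaining reward, matching the two endpoints of the interpolation described above.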