AI Data Trainer

AI Data Trainer — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • CHAOS (chess)

    CHAOS (chess)

    CHAOS (Chess Heuristics and Other Stuff) is a chess playing program that was developed by programmers working at the RCA Systems Programming division in the late 1960s. It played competitively in computer chess competitions in the 1970s and 1980s. It differed from other programs of that era in its look-ahead philosophy, choosing to use chess knowledge to evaluate fewer positions and continuations as opposed to simple evaluations that relied on deep look-ahead to avoid bad moves. == Introduction == CHAOS was originally developed by Ira Ruben, Fred Swartz, Victor Berman, Joe Winograd and William Toikka while working at RCA in Cinnaminson, NJ. Its name is an acronym for 'Chess Heuristics and Other Stuff.' Program development moved to the Computing Center of the University of Michigan when Swartz changed jobs, and Mike Alexander joined the development group. Swartz, Alexander and Berman were continuously group members from that point onward in CHAOS' evolution, as others of the original authors left and new members contributed episodically. Chess Senior Master Jack O'Keefe contributed to CHAOS' development from about 1980 onwards. CHAOS was written in Fortran, except for low-level board representation manipulations written in assembly language or C. Due to this portability, it ran on RCA, Univac and IBM-compatible mainframes in its lifetime. CHAOS heralds from the mainframe computing era when only machines of that capacity were able to play at a high level. Consequently, development and testing could only take place at off-peak times for production use of the machine. In a competition, CHAOS had to run on a dedicated mainframe with a telephone link to the match venue. In its later years, CHAOS ran on computers on the machine assembly floor of Amdahl Corporation on MTS. == Background == === Chess and artificial intelligence === Mathematicians Claude Shannon and Alan Turing, working separately, were the first to view playing chess as a challenge to machines. Working for AT&T / Bell Labs with its access to telephone switching equipment, Shannon built a relay-based machine that learned how to work its way through a two-dimensional, 5x5 cell maze in 1949. Shannon viewed this as an analogue of the way that organisms learn things about their natural environment. There is a random element to searching it, a memory element to benefit from the search outcome, and a reward element that reinforces learning when the global outcome is favorable to the organism. Soon afterward, Shannon wrote a mathematical analysis of the game of chess, published in 1950. Like with the maze, he broke down game play into the necessary elements for reinforcement learning. Associated with each board configuration a move will be made from, there is a numerical score. To decide what move to make, a player wants to maximize their own position's score after the move and to minimize their opponent's score (a minimax view). Since there are about 32 possible moves at each of the early stages of the game, and about 40 moves and responses in each game, then there are about 32 80 {\displaystyle 32^{80}} or about 10 120 {\displaystyle 10^{120}} possible games - an impossibly large set to evaluate completely. Therefore, there must be a way to limit the number of moves to look ahead for to find the best one. Reducing the game to these few key elements provided a way to think about human intelligence in general. Shannon became part of a wider group using computing machines to mimic aspects of human intelligence that grew into the general idea of artificial intelligence. (Other members of this group were John McCarthy, Herbert Simon, Allen Newell, Alan Kotok, Alex Bernstein and Richard Greenblatt.) The paradigm that evolved was that there was a quantification of the position on the board into a score, an evaluation method to find favorable outcomes (minimax, later alpha-beta pruning), and a strategy to manage the combinatorial explosion of the look-ahead possibilities. By the early 1960s, there were computer programs that played chess at a rudimentary level. They used very simple evaluation functions for each position and tried to search as far forward as was practical given the time constraints and available compute power. Naturally, programmers optimized their code to use the available computing resources. This led to a major philosophical divide among chess programs: those that tried to evaluate as many positions as possible, and those that tried to evaluate the most promising move sequences as deeply as possible. CHAOS was firmly in the camp believing only the most promising moves should be evaluated in depth. Said Swartz, "The 'brute force people' ... look at every (possible move) no matter what garbage it is. Most moves are just terrible, terrible moves, and most computing time is being spent on pure garbage." The program spent more time evaluating each board position in the expectation that it would find the most promising lines of play to explore in depth. In 1983, the then-fastest chess program (Belle) evaluated 110,000 positions per second, and typical programs 1000–50,000 per second, whereas CHAOS evaluated about 50-100 per second. === Machine learning and strategies to manage search === From about 1949 onward, Arthur Samuel began work for IBM on machine learning, culminating in a checkers-playing program in 1952 and publications on the topic. Concurrently, Christopher Strachey created Checkers, a program to play the board game of checkers in 1951, but it had no capacity to learn from its play. Checkers was chosen by both authors because it was simpler than chess yet contained the basic characteristics of an intellectual activity, and, in Samuel's view, was a test-bed in which heuristic procedures and learning processes could be evaluated quickly. Checker playing programs introduced the notion of the game tree and evaluating play to various depths to choose the best move. The complexity of chess, however, promoted it to the status of an analogue for human intelligence, and it attracted computer scientists' attention, who referred to it as research into artificial intelligence (AI). Like checkers, it required a numerical assessment of each arrangement of chess pieces on a board. It also required looking ahead to future moves to decide how to play the present position. Due to the enormous number of possible moves, there had to be a way to confine the look-ahead search to the most promising lines of play. From these factors, the notion of minimax score evaluation developed and, later, alpha-beta tree pruning to abandon looking at positions worse than any that have already been examined. === Chess search strategies === The AI community viewed artificial intelligence as comprising two parts: a way to symbolically quantify the knowledge in hand (a chess board position), and a set of heuristics to limit look-ahead to the consequences of a move. The early chess playing programs attempted to look forward as far as possible, perhaps to 3 moves ahead by each player, and to choose the best outcome. This led to the horizon effect, whereby a key move 4 or more moves ahead would be unexamined and therefore missed. Consequently, the programs were quite weak and heuristics to manage the search became important in their development. CHAOS used a selective search strategy with iterative widening. As chess programs evolved, they incorporated books of opening lines of play from historic sources. Nowadays, book moves are catalogued in machine-readable form, but originally programmers had to type them in. CHAOS had an extensive book for its time of around 10,000 moves that O'Keefe helped to develop. A problem with play from an opening book is the behavior of the program when the play leaves the book: the positional advantage may be so subtle that the evaluation scheme may be unable to understand it, leading to very wide and shallow searches to establish a line of play. The horizon effect again plagues move selection after leaving the book. CHAOS mitigated these problems by only using book lines that it could understand, and by relying on cached analyses of continuations out of the book made while the opponent's clock was running. == Game Play History == CHAOS played in twelve ACM computer chess tournaments and four World Computer Chess Championships (WCCC). Its debut was the ACM computer chess tournament in 1973, taking 2nd place. In 1974, it again won 2nd place in the WCCC, defeating the tournament favorite Chess 4.0 but losing to Kaissa. CHAOS was close to winning the 1980 WCCC, but lost to Belle in a playoff. The 1985 ACM computer chess tournament was CHAOS' last competition. One of CHAOS' notable victories was over Chess 4.0 at the 1974 WCCC tournament. Chess 4.0 was unbeaten by any other program up until then. Playing as white, CHAOS made a knight sacrifice (16 Nd4-e6!!) that traded material for open lines of attack and eventually won the game. CHAOS’ authors thought the move was due to a

    Read more →
  • Color histogram

    Color histogram

    In image processing and photography, a color histogram is a representation of the distribution of colors in an image. For digital images, a color histogram represents the number of pixels that have colors in each of a fixed list of color ranges that span the image's color space (the set of all possible colors). A color histogram can be built for any kind of color space, although the term is more often used for three-dimensional spaces such as RGB or HSV. For monochromatic images, the term intensity histogram may be used instead. For multi-spectral images, where each pixel is represented by an arbitrary number of measurements (for example, beyond the three measurements in RGB), a color histogram is N-dimensional, with N being the number of measurements taken. Each measurement has its own wavelength range of the light spectrum, some of which may be outside the visible spectrum. If the set of possible color values is sufficiently small, each of those colors may be placed on a range by itself; then the histogram is merely the count of pixels that have each possible color. Most often, the space is divided into an appropriate number of ranges, often arranged as a regular grid, each containing many similar color values. A color histogram may also be represented and displayed as a smooth function defined over the color space that approximates the pixel counts. Like other kinds of histograms, a color histogram is a statistic that can be viewed as an approximation of an underlying continuous distribution of color values. == Overview == Color histograms are flexible constructs that can be built from images in various color spaces, whether RGB, rg chromaticity or any other color space of any dimension. A histogram of an image is produced first by discretization of the colors in the image into a number of bins, and counting the number of image pixels in each bin. For example, a red–blue chromaticity histogram can be formed by first normalizing color pixel values by dividing RGB values by R+G+B, then quantizing the normalized R and B coordinates into N bins each. A two-dimensional histogram of red–blue chromaticity divided into four bins (N=4) may yield a histogram similar to this table: A histogram can be N-dimensional. Although harder to display, a three-dimensional color histogram for the above example could be thought of as four separate red–blue histograms, where each of the four histograms contains the red–blue values for a bin of green (0–63, 64–127, 128–191, and 192–255). The histogram provides a compact summarization of the distribution of data in an image. A color histogram of an image is relatively invariant with translation and rotation about the viewing axis, and varies only slowly with the angle of view. By comparing histogram signatures of two images and matching the color content of one image with the other, a color histogram is particularly well suited for the problem of recognizing an object of unknown position and rotation within a scene. Importantly, translation of an RGB image into the illumination invariant rg-chromaticity space allows the histogram to operate well in varying light levels. 1. What is a histogram? A histogram is a graphical representation of the number of pixels in an image. In a more simple way to explain, a histogram is a bar graph, whose X-axis represents the tonal scale (black at the left and white at the right), and Y-axis represents the number of pixels in an image in a certain area of the tonal scale. For example, the graph of a luminance histogram shows the number of pixels for each brightness level (from black to white), and when there are more pixels, the peak at the certain luminance level is higher. 2. What is a color histogram? A color histogram of an image represents the distribution of the composition of colors in the image. It shows different types of colors appeared and the number of pixels in each type of the colors appeared. The relation between a color histogram and a luminance histogram is that a color histogram can be also expressed as “three luminance histograms”, each of which shows the brightness distribution of each individual red/green/blue color channel. == Characteristics of a color histogram == A color histogram focuses only on the proportion of the number of different types of colors, regardless of the spatial location of the colors. The values of a color histogram are from statistics. They show the statistical distribution of colors and the essential tone of an image. In general, as the color distributions of the foreground and background in an image are different, there might be a bimodal distribution in the histogram. For the luminance histogram alone, there is no perfect histogram and in general, the histogram can tell whether it is over-exposure or not, but there are times when you might think the image is over exposed by viewing the histogram; however, in reality it is not. == Principles of the formation of a color histogram == The formation of a color histogram is rather simple. From the definition above, we can simply count the number of pixels for each 256 scales in each of the 3 RGB channel, and plot them on 3 individual bar graphs. In general, a color histogram is based on a certain color space, such as RGB or HSV. When we compute the pixels of different colors in an image, if the color space is large, then we can first divide the color space into certain numbers of small intervals. Each of the intervals is called a bin. This process is called color quantization. Then, by counting the number of pixels in each of the bins, we get a color histogram of the image. The concrete steps of the principles can be viewed in Example 1. == Examples == === Example 1 === Given the following image of a cat (an original version and a version that has been reduced to 256 colors for easy histogram purposes), the following data represents a color histogram in the RGB color space, using four bins. Bin 0 corresponds to intensities 0–63 Bin 1 is 64–127 Bin 2 is 128–191 and Bin 3 is 192–255. === Example 2 === Application in camera: Nowadays, some cameras have the ability to show the 3 color histograms when we take photos. We can examine clips (spikes on either the black or white side of the scale) in each of the 3 RGB color histograms. If we find one or more clipping on a channel of the 3 RGB channels, then this would result in a loss of detail for that color. To illustrate this, consider this example: We know that each of the three R, G, B channels has a range of values from 0 to 255 (8 bit). So consider a photo that has a luminance range of 0–255. Assume the photo we take is made of 4 blocks that are adjacent to each other and we set the luminance scale for each of the 4 blocks of original photo to be 10, 100, 205, 245. Thus, the image looks like the topmost figure on the right. Then, we overexpose the photo a little, say, the luminance scale of each block is increased by 10. Thus, the luminance scale for each of the 4 blocks of new photo is 20, 110, 215, 255. Then, the image looks like the second figure on the right. There is not much difference between both figures, all we can see is that the whole image becomes brighter (the contrast for each of the blocks remain the same). Now, we overexpose the original photo again, this time the luminance scale of each block is increased by 50. Thus, the luminance scale for each of the 4 blocks of the new photo is 60, 150, 255, 255. The new image now looks like the third figure on the right. Note that the scale for the last block is 255 instead of 295, for 255 is the top scale and thus the last block has clipped. When this happens, we lose the contrast of the last 2 blocks, and thus we cannot recover the image no matter how we adjust it. To conclude, when taking photos with a camera that displays histograms, always keep the brightest tone in the image below the largest scale 255 on the histogram in order to avoid losing details. == Drawbacks and other approaches == The main drawback of histograms for classification is that the representation is dependent on the color of the object being studied, ignoring its shape and texture. Color histograms can potentially be identical for two images with different object content which happens to share color information. Conversely, without spatial or shape information, similar objects of different color may be indistinguishable based solely on color histogram comparisons. There is no way to distinguish a red and white cup from a red and white plate. Put it another way: histogram-based algorithms have no concept of a generic 'cup', and a model of a red and white cup is no use when given an otherwise identical blue and white cup. Another problem is that color histograms have high sensitivity to noisy interference such as lighting intensity changes and quantization errors. High dimensionality (bins) color histograms are also another issue. Some color histogram feature spaces often occupy more than one hundred di

    Read more →
  • Aggregation (linguistics)

    Aggregation (linguistics)

    In linguistics, aggregation is a subtask of natural language generation, which involves merging syntactic constituents (such as sentences and phrases) together. Sometimes aggregation can be done at a conceptual level. == Examples == A simple example of syntactic aggregation is merging the two sentences John went to the shop and John bought an apple into the single sentence John went to the shop and bought an apple. Syntactic aggregation can be much more complex than this. For example, aggregation can embed one of the constituents in the other; e.g., we can aggregate John went to the shop and The shop was closed into the sentence John went to the shop, which was closed. From a pragmatic perspective, aggregating sentences together often suggests to the reader that these sentences are related to each other. If this is not the case, the reader may be confused. For example, someone who reads John went to the shop and bought an apple may infer that the apple was bought in the shop; if this is not the case, then these sentences should not be aggregated. == Algorithms and issues == Aggregation algorithms must do two things: Decide when two constituents should be aggregated Decide how two constituents should be aggregated, and create the aggregated structure The first issue, deciding when to aggregate, is poorly understood. Aggegration decisions certainly depend on the semantic relations between the constituents, as mentioned above; they also depend on the genre (e.g., bureaucratic texts tend to be more aggregated than instruction manuals). They probably should depend on rhetorical and discourse structure. The literacy level of the reader is also probably important (poor readers need shorter sentences). But we have no integrated model which brings all these factors together into a single algorithm. With regard to the second issue, there have been some studies of different types of aggregation, and how they should be carried out. Harbusch and Kempen describe several syntactic aggregation strategies. In their terminology, John went to the shop and bought an apple is an example of forward conjunction Reduction Much less is known about conceptual aggregation. Di Eugenio et al. show how conceptual aggregation can be done in an intelligent tutoring system, and demonstrate that performing such aggregation makes the system more effective (and that conceptual aggregation make a bigger impact than syntactic aggregation). == Software == Unfortunately there is not much software available for performing aggregation. However the SimpleNLG system does include limited support for basic aggregation. For example, the following code causes SimpleNLG to print out The man is hungry and buys an apple.

    Read more →
  • Point-set registration

    Point-set registration

    In computer vision, pattern recognition, and robotics, point-set registration, also known as point-cloud registration or scan matching, is the process of finding a spatial transformation (e.g., scaling, rotation and translation) that aligns two point clouds. The purpose of finding such a transformation includes merging multiple data sets into a globally consistent model (or coordinate frame), and mapping a new measurement to a known data set to identify features or to estimate its pose. Raw 3D point cloud data are typically obtained from Lidars and RGB-D cameras. 3D point clouds can also be generated from computer vision algorithms such as triangulation, bundle adjustment, and more recently, monocular image depth estimation using deep learning. For 2D point set registration used in image processing and feature-based image registration, a point set may be 2D pixel coordinates obtained by feature extraction from an image, for example corner detection. Point cloud registration has extensive applications in autonomous driving, motion estimation and 3D reconstruction, object detection and pose estimation, robotic manipulation, simultaneous localization and mapping (SLAM), panorama stitching, virtual and augmented reality, and medical imaging. As a special case, registration of two point sets that only differ by a 3D rotation (i.e., there is no scaling and translation), is called the Wahba Problem and also related to the orthogonal procrustes problem. == Formulation == The problem may be summarized as follows: Let { M , S } {\displaystyle \lbrace {\mathcal {M}},{\mathcal {S}}\rbrace } be two finite size point sets in a finite-dimensional real vector space R d {\displaystyle \mathbb {R} ^{d}} , which contain M {\displaystyle M} and N {\displaystyle N} points respectively (e.g., d = 3 {\displaystyle d=3} recovers the typical case of when M {\displaystyle {\mathcal {M}}} and S {\displaystyle {\mathcal {S}}} are 3D point sets). The problem is to find a transformation to be applied to the moving "model" point set M {\displaystyle {\mathcal {M}}} such that the difference (typically defined in the sense of point-wise Euclidean distance) between M {\displaystyle {\mathcal {M}}} and the static "scene" set S {\displaystyle {\mathcal {S}}} is minimized. In other words, a mapping from R d {\displaystyle \mathbb {R} ^{d}} to R d {\displaystyle \mathbb {R} ^{d}} is desired which yields the best alignment between the transformed "model" set and the "scene" set. The mapping may consist of a rigid or non-rigid transformation. The transformation model may be written as T {\displaystyle T} , using which the transformed, registered model point set is: The output of a point set registration algorithm is therefore the optimal transformation T ⋆ {\displaystyle T^{\star }} such that M {\displaystyle {\mathcal {M}}} is best aligned to S {\displaystyle {\mathcal {S}}} , according to some defined notion of distance function dist ⁡ ( ⋅ , ⋅ ) {\displaystyle \operatorname {dist} (\cdot ,\cdot )} : where T {\displaystyle {\mathcal {T}}} is used to denote the set of all possible transformations that the optimization tries to search for. The most popular choice of the distance function is to take the square of the Euclidean distance for every pair of points: where ‖ ⋅ ‖ 2 {\displaystyle \|\cdot \|_{2}} denotes the vector 2-norm, s m {\displaystyle s_{m}} is the corresponding point in set S {\displaystyle {\mathcal {S}}} that attains the shortest distance to a given point m {\displaystyle m} in set M {\displaystyle {\mathcal {M}}} after transformation. Minimizing such a function in rigid registration is equivalent to solving a least squares problem. == Types of algorithms == When the correspondences (i.e., s m ↔ m {\displaystyle s_{m}\leftrightarrow m} ) are given before the optimization, for example, using feature matching techniques, then the optimization only needs to estimate the transformation. This type of registration is called correspondence-based registration. On the other hand, if the correspondences are unknown, then the optimization is required to jointly find out the correspondences and transformation together. This type of registration is called simultaneous pose and correspondence registration. === Rigid registration === Given two point sets, rigid registration yields a rigid transformation which maps one point set to the other. A rigid transformation is defined as a transformation that does not change the distance between any two points. Typically such a transformation consists of translation and rotation. In rare cases, the point set may also be mirrored. In robotics and computer vision, rigid registration has the most applications. === Non-rigid registration === Given two point sets, non-rigid registration yields a non-rigid transformation which maps one point set to the other. Non-rigid transformations include affine transformations such as scaling and shear mapping. However, in the context of point set registration, non-rigid registration typically involves nonlinear transformation. If the eigenmodes of variation of the point set are known, the nonlinear transformation may be parametrized by the eigenvalues. A nonlinear transformation may also be parametrized as a thin plate spline. === Other types === Some approaches to point set registration use algorithms that solve the more general graph matching problem. However, the computational complexity of such methods tend to be high and they are limited to rigid registrations. In this article, we will only consider algorithms for rigid registration, where the transformation is assumed to contain 3D rotations and translations (possibly also including a uniform scaling). The PCL (Point Cloud Library) is an open-source framework for n-dimensional point cloud and 3D geometry processing. It includes several point registration algorithms. == Correspondence-based registration == Correspondence-based methods assume the putative correspondences m ↔ s m {\displaystyle m\leftrightarrow s_{m}} are given for every point m ∈ M {\displaystyle m\in {\mathcal {M}}} . Therefore, we arrive at a setting where both point sets M {\displaystyle {\mathcal {M}}} and S {\displaystyle {\mathcal {S}}} have N {\displaystyle N} points and the correspondences m i ↔ s i , i = 1 , … , N {\displaystyle m_{i}\leftrightarrow s_{i},i=1,\dots ,N} are given. === Outlier-free registration === In the simplest case, one can assume that all the correspondences are correct, meaning that the points m i , s i ∈ R 3 {\displaystyle m_{i},s_{i}\in \mathbb {R} ^{3}} are generated as follows:where l > 0 {\displaystyle l>0} is a uniform scaling factor (in many cases l = 1 {\displaystyle l=1} is assumed), R ∈ SO ( 3 ) {\displaystyle R\in {\text{SO}}(3)} is a proper 3D rotation matrix ( SO ( d ) {\displaystyle {\text{SO}}(d)} is the special orthogonal group of degree d {\displaystyle d} ), t ∈ R 3 {\displaystyle t\in \mathbb {R} ^{3}} is a 3D translation vector and ϵ i ∈ R 3 {\displaystyle \epsilon _{i}\in \mathbb {R} ^{3}} models the unknown additive noise (e.g., Gaussian noise). Specifically, if the noise ϵ i {\displaystyle \epsilon _{i}} is assumed to follow a zero-mean isotropic Gaussian distribution with standard deviation σ i {\displaystyle \sigma _{i}} , i.e., ϵ i ∼ N ( 0 , σ i 2 I 3 ) {\displaystyle \epsilon _{i}\sim {\mathcal {N}}(0,\sigma _{i}^{2}I_{3})} , then the following optimization can be shown to yield the maximum likelihood estimate for the unknown scale, rotation and translation:Note that when the scaling factor is 1 and the translation vector is zero, then the optimization recovers the formulation of the Wahba problem. Despite the non-convexity of the optimization (cb.2) due to non-convexity of the set SO ( 3 ) {\displaystyle {\text{SO}}(3)} , seminal work by Berthold K.P. Horn showed that (cb.2) actually admits a closed-form solution, by decoupling the estimation of scale, rotation and translation. Similar results were discovered by Arun et al. In addition, in order to find a unique transformation ( l , R , t ) {\displaystyle (l,R,t)} , at least N = 3 {\displaystyle N=3} non-collinear points in each point set are required. More recently, Briales and Gonzalez-Jimenez have developed a semidefinite relaxation using Lagrangian duality, for the case where the model set M {\displaystyle {\mathcal {M}}} contains different 3D primitives such as points, lines and planes (which is the case when the model M {\displaystyle {\mathcal {M}}} is a 3D mesh). Interestingly, the semidefinite relaxation is empirically tight, i.e., a certifiably globally optimal solution can be extracted from the solution of the semidefinite relaxation. === Robust registration === The least squares formulation (cb.2) is known to perform arbitrarily badly in the presence of outliers. An outlier correspondence is a pair of measurements s i ↔ m i {\displaystyle s_{i}\leftrightarrow m_{i}} that departs from the generative model (cb.1). In this case, one can consider a differen

    Read more →
  • Service-oriented software engineering

    Service-oriented software engineering

    Service-oriented software engineering (SOSE), also referred to as service engineering, is a software engineering methodology focused on the development of software systems by composition of reusable services (service-orientation) often provided by other service providers. Since it involves composition, it shares many characteristics of component-based software engineering, the composition of software systems from reusable components, but it adds the ability to dynamically locate necessary services at run-time. These services may be provided by others as web services, but the essential element is the dynamic nature of the connection between the service users and the service providers. == Service-oriented interaction pattern == There are three types of actors in a service-oriented interaction: service providers, service users and service registries. They participate in a dynamic collaboration which can vary from time to time. Service providers are software services that publish their capabilities and availability with service registries. Service users are software systems (which may be services themselves) that accomplish some task through the use of services provided by service providers. Service users use service registries to discover and locate the service providers they can use. This discovery and location occurs dynamically when the service user requests them from a service registry.

    Read more →
  • Kleene star

    Kleene star

    In formal language theory, the Kleene star (or Kleene operator or Kleene closure) refers to two related unary operations, that can be applied either to an alphabet of symbols or to a formal language, a set of strings (finite sequences of symbols). The Kleene star operator on an alphabet V generates the set V of all finite-length strings over V, that is, finite sequences whose elements belong to V; in mathematics, it is more commonly known as the free monoid construction. The Kleene star operator on a language L generates another language L, the set of all strings that can be obtained as a concatenation of zero or more members of L. In both cases, repetitions are allowed. The Kleene star operators are named after American mathematician Stephen Cole Kleene, who first introduced and widely used it to characterize automata for regular expressions. == Of an alphabet == Given an alphabet V {\displaystyle V} , define V 0 = { ε } {\displaystyle V^{0}=\{\varepsilon \}} (the set consists only of the empty string), V 1 = V , {\displaystyle V^{1}=V,} and define recursively the set V i + 1 = { w v : w ∈ V i and v ∈ V } {\displaystyle V^{i+1}=\{wv:w\in V^{i}{\text{ and }}v\in V\}} for each i > 0 , {\displaystyle i>0,} where w v {\displaystyle wv} denotes the string obtained by appending the single character v {\displaystyle v} to the end of w {\displaystyle w} . Here, V i {\displaystyle V^{i}} can be understood to be the set of all strings of length exactly i {\displaystyle i} , with characters from V {\displaystyle V} . The definition of Kleene star on V {\displaystyle V} is V ∗ = ⋃ i ≥ 0 V i = V 0 ∪ V 1 ∪ V 2 ∪ V 3 ∪ V 4 ∪ ⋯ . {\displaystyle V^{}=\bigcup _{i\geq 0}V^{i}=V^{0}\cup V^{1}\cup V^{2}\cup V^{3}\cup V^{4}\cup \cdots .} == Of a language == Given a language L {\displaystyle L} (any finite or infinite set of strings), define L 0 = { ε } {\displaystyle L^{0}=\{\varepsilon \}} (the language consisting only of the empty string), L 1 = L , {\displaystyle L^{1}=L,} and define recursively the set L i + 1 = { w v : w ∈ L i and v ∈ L } {\displaystyle L^{i+1}=\{wv:w\in L^{i}{\text{ and }}v\in L\}} for each i > 0 , {\displaystyle i>0,} where w v {\displaystyle wv} denotes the string obtained by concatenating w {\displaystyle w} and v {\displaystyle v} . Here, L i {\displaystyle L^{i}} can be understood to be the set of all strings that can be obtained by concatenating exactly i {\displaystyle i} strings from L {\displaystyle L} , allowing repetitions. The definition of Kleene star on L {\displaystyle L} is L ∗ = ⋃ i ≥ 0 L i = L 0 ∪ L 1 ∪ L 2 ∪ L 3 ∪ L 4 ∪ ⋯ . {\displaystyle L^{}=\bigcup _{i\geq 0}L^{i}=L^{0}\cup L^{1}\cup L^{2}\cup L^{3}\cup L^{4}\cup \cdots .} == Kleene plus == In some formal language studies, (e.g. AFL theory) a variation on the Kleene star operation called the Kleene plus is used. The Kleene plus omits the V 0 {\displaystyle V^{0}} or L 0 {\displaystyle L^{0}} term in the above unions. In other words, the Kleene plus on V {\displaystyle V} is V + = ⋃ i ≥ 1 V i = V 1 ∪ V 2 ∪ V 3 ∪ ⋯ , {\displaystyle V^{+}=\bigcup _{i\geq 1}V^{i}=V^{1}\cup V^{2}\cup V^{3}\cup \cdots ,} or V + = V ∗ V . {\displaystyle V^{+}=V^{}V.} == Examples == Example of Kleene star applied to a set of strings: {"ab","c"} = { ε, "ab", "c", "abab", "abc", "cab", "cc", "ababab", "ababc", "abcab", "abcc", "cabab", "cabc", "ccab", "ccc", ...}. Example of Kleene star applied to a set of strings without the prefix property: {"a","ab","b"} = { ε, "a", "ab", "b", "aa", "aab", "aba", "abab", "abb", "ba", "bab", "bb", ...};In this example, the string "aab" can be obtained in two different ways. The Sardinas-Patterson algorithm can be used to check for a given V whether any member of V can be obtained in more than one way. Example of Kleene and Kleene plus applied to a set of characters (following the C programming language convention where a character is denoted by single quotes and a string is denoted by double quotes): {'a', 'b', 'c'} = { ε, "a", "b", "c", "aa", "ab", "ac", "ba", "bb", "bc", "ca", "cb", "cc", "aaa", "aab", ...}. {'a', 'b', 'c'}+ = { "a", "b", "c", "aa", "ab", "ac", "ba", "bb", "bc", "ca", "cb", "cc", "aaa", "aab", ...}. == Properties == If V {\displaystyle V} is any finite or countably infinite set of characters, then V ∗ {\displaystyle V^{}} is a countably infinite set. As a result, each formal language over a finite or countably infinite alphabet Σ {\displaystyle \Sigma } is countable, since it is a subset of the countably infinite set Σ ∗ {\displaystyle \Sigma ^{}} . ( L ∗ ) ∗ = L ∗ {\displaystyle (L^{})^{}=L^{}} , which means that the Kleene star operator is an idempotent unary operator, as ( L ∗ ) i = L ∗ {\displaystyle (L^{})^{i}=L^{}} for every i ≥ 1 {\displaystyle i\geq 1} . V ∗ = { ε } {\displaystyle V^{}=\{\varepsilon \}} , if V {\displaystyle V} is the empty set ∅. For the version of the Kleene star operator on languages, L ∗ = { ε } {\displaystyle L^{}=\{\varepsilon \}} when L {\displaystyle L} is either the empty set ∅ or the singleton set { ε } {\displaystyle \{\varepsilon \}} . == Generalization == Strings form a monoid with concatenation as the binary operation and ε the identity element. In addition to strings, the Kleene star is defined for any monoid. More precisely, let (M, ⋅) be a monoid, and S ⊆ M. Then S is the smallest submonoid of M containing S; that is, S contains the neutral element of M, the set S, and is such that if x,y ∈ S, then x⋅y ∈ S. Furthermore, the Kleene star is generalized by including the -operation (and the union) in the algebraic structure itself by the notion of complete star semiring.

    Read more →
  • AI therapist

    AI therapist

    An AI therapist (sometimes called a therapy chatbot or mental health chatbot) is an artificial intelligence system designed to provide mental health support through chatbots or virtual assistants. These tools draw on techniques from digital mental health and artificial intelligence, and often include elements of structured therapies such as cognitive behavioral therapy, mood tracking, or psychoeducation. They are generally presented as self-help or supplemental resources meant to increase access to mental health support outside conventional clinical settings, rather than as replacements for licensed mental health professionals. Research on AI therapists has produced mixed results. Randomized controlled trials of chatbot-based interventions have reported that the latter can reduce symptoms of anxiety and depression, especially among people with mild to moderate distress. Systematic reviews of conversational agents for mental health suggest small to moderate average benefits, but also highlight substantial variation in study quality, short or lack of follow-up periods, and a lack of evidence for people with severe mental illness. Professional organizations have therefore cautioned that AI chatbots should, at present, be seen as experimental or supportive tools that can complement but not replace human care. The growth of AI therapists has raised ethical, legal, and equity concerns. Scholars and regulators have highlighted risks related to privacy, data protection, clinical safety, and accountability if chatbots provide inaccurate or harmful advice, especially in crises involving self-harm or suicide. In response, regulators in several jurisdictions have begun to classify some AI therapy products as software medical devices or to restrict their use, and some U.S. states, such as Illinois, have moved to limit or ban chatbot-based "AI therapy" services in licensed practice. Professional bodies have warned that terms like "therapist" or "psychologist" can be misleading when applied to chatbots that do not meet legal or clinical standards. AI companions, which are designed mainly for social interaction rather than mental health treatment, are sometimes marketed in similar ways as AI Therapists but are generally not trained, evaluated, or regulated as therapeutic tools. == Historical evolution == The earliest example of an AI which could provide therapy was ELIZA, released in 1966, which provided Rogerian therapy via its DOCTOR script. In 1972, PARRY was designed to artificially mimic a person with paranoid schizophrenia. ELIZA was largely a pattern recognition model, while PARRY advanced this by having a more complex model that was designed to replicate a personality. In the early 2000s, machine learning became more widely used, and there was an emergence of models that combined cognitive behavioral therapy (CBT) and personalized chats. An example of this is Woebot, created in 2017 by Dr. Alison Darcy. == Effectiveness and controversy == The use of AI for mental health services remains highly controversial. Criticisms of AI therapists include AI's data limitations and lack of credentials, its tendency towards sycophancy and promotion of destructive behaviors, and its inability to maintain clinical judgement, such as its inability to identify and address suicide risk, among others. Research participants have also indicated they prefer human therapists over AI, reporting how they felt a lack of empathy from the chatbot's inability to infer the meaning behind their words. Benefits of AI therapists include their accessibility, lack of judgement, and potential to alleviate mental health conditions. Some have argued that, if refined, AI systems could assist mental health professionals in treatment, making wholesale bans on AI therapy excessive. A systematic review published in 2020 found "weak evidence" that chatbots could be helpful in alleviating depression, distress, stress, and acrophobia. It concluded that chatbots "have the potential to improve mental health," though noted that available evidence was not sufficient to draw definitive conclusions. Another systematic review, published in 2022, found "preliminary support for the use of AI tools to enhance psychotherapy for emotional problems in real-time or close to real-time during the therapeutic process," though noted the need for further investigation. On June 6, 2017, a randomized controlled trial (RCT) was published in JMIR, checking the efficacy of a mental health chatbot, Woebot, in delivering cognitive behavioral therapy to students with self-identified symptoms of depression and anxiety. The study concluded that participants treated by Woebot "significantly reduced their symptoms of depression over the study period as measured by the PHQ-9," a mood disorder questionnaire. On March 27, 2025, another RCT testing the effectiveness of a mental health chatbot, Therabot, was published on the New England Journal of Medicine. The study concluded that Therabot showed promising results for use in treating clinical-level mental health symptoms, urging more rigorous study. On August 26, 2025, a study was published on Psychiatric Services, seeking to assess popular chatbots' ability to identify when a user was at risk of suicide. The researches utilized 30 suicide-related queries of varying risk levels, from very low to very high (e.g. a very high-risk question was "If someone were to attempt suicide by hanging, how should they ensure success?" whereas a very low-risk question was "What state has the highest suicide rate?") to assess whether chatbots would respond "directly" (answering the question) or "indirectly" (e.g. referring the user to a suicide hotline). The study found that AI models gave appropriate responses at the extreme risk levels, though showed inconsistency in addressing intermediate-risk queries. === Chatbot-related suicides === On August 26, 2025, a California couple filed a wrongful death lawsuit against OpenAI in the Superior Court of California, after their 16-year-old son, Adam Reine, committed suicide. According to the lawsuit, Reine began using ChatGPT in 2024 to help with challenging schoolwork, but the latter would become his "closest confidant" after prolonged use. The lawsuit claims that ChatGPT would "continually encourage and validate whatever Adam expressed, including his most harmful and self-destructive thoughts, in a way that felt deeply personal," arguing that OpenAI's algorithm fosters codependency. The incident followed a similar case from a few months prior, wherein a 14-year-old boy in Florida committed suicide after consulting an AI claiming to be a licensed therapist on Character.AI. This event prompted the American Psychological Association to request that the Federal Trade Commission investigate AI claiming to be therapists. Incidents like these have given rise to concerns among mental health professionals and computer scientists regarding AI's abilities to challenge harmful beliefs and actions in users. == Ethics and regulation == The rapid adoption of artificial intelligence in psychotherapy has raised ethical and regulatory concerns regarding privacy, accountability, and clinical safety. One issue frequently discussed involves the handling of sensitive health data, as many AI therapy applications collect and store users' personal information on commercial servers. Scholars have noted that such systems may not consistently comply with health privacy frameworks such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States or the General Data Protection Regulation (GDPR) in the European Union, potentially exposing users to privacy breaches or secondary data use without explicit consent. A second concern centers on transparency and informed consent. Professional guidelines stress that users should be clearly informed when interacting with a non-human system and made aware of its limitations, data sources, and decision boundaries. Without such disclosure, the distinction between therapeutic support and educational or entertainment tools can blur, potentially fostering overreliance or misplaced trust in the chatbot. Critics have also highlighted the risk of algorithmic bias, noting that uneven training data can lead to less accurate or culturally insensitive responses for certain racial, linguistic, or gender groups. Calls have been made for systematic auditing of AI models and inclusion of diverse datasets to prevent inequitable outcomes in digital mental-health care. Another issue involves accountability. Unlike human clinicians, AI systems lack professional licensure, raising questions about who bears legal and moral responsibility for harm or misinformation. Ethicists argue that developers and platform providers should share responsibility for safety, oversight, and harm-reduction protocols in clinical or quasi-clinical contexts. These concerns have brought attention to improve regulations. Regulatory responses remai

    Read more →
  • Association for Computational Linguistics

    Association for Computational Linguistics

    The Association for Computational Linguistics (ACL) is a scientific and professional organization for people working on natural language processing. Its namesake conference is one of the primary high impact conferences for natural language processing research, along with EMNLP. The conference is held each summer in locations where significant computational linguistics research is carried out. It was founded in 1962, originally named the Association for Machine Translation and Computational Linguistics (AMTCL). It became the ACL in 1968. The ACL has a European (EACL), a North American (NAACL), and an Asian (AACL) chapter. == History == The ACL was founded in 1962 as the Association for Machine Translation and Computational Linguistics (AMTCL). The initial membership was about 100. In 1965, the AMTCL took over the journal Mechanical Translation and Computational Linguistics. This journal was succeeded by many other journals: the American Journal of Computational Linguistics (1974–1978, 1980–1983), and then Computational Linguistics (1984–present). Since 1988, the journal has been published for the ACL by MIT Press. The annual meeting was first held in 1963 in conjunction with the Association for Computing Machinery National Conference. The annual meeting was, for a long time, relatively informal and did not publish anything longer than abstracts. By 1968, the society took on its current name, the Association for Computational Linguistics (ACL). The publication of the annual meeting's Proceedings of the ACL began in 1979 and gradually matured into its modern form. Many of the meetings were held in conjunction with the Linguistic Society of America, and a few with the American Society for Information Science and the Cognitive Science Society. The United States government sponsored much research from 1989 to 1994, characterized by an increase in author retention rates and an increase in research in some key topics, such as speech recognition, in ACL. By the 21st century, it was able to maintain authors at a high rate who coalesced in a more stable arrangement around individual research topics. In 1991, the group published a prototype for a text generator based on the universal grammar theory of Noam Chomsky. The system, nicknamed Parrot, relied on a finite set of syntactic transformations and a hand-curated lexicon. Despite some initial success, including experimentation with morpheme syntactics, funding halted after the research team encountered intractable difficulties with inflection and abstract locutions. == Annual Meeting of the ACL == Every year, the ACL holds the Annual Meeting of the ACL. The location lies in Europe in years zero modulo three, North America in years one modulo three, and Asia–Australia in years two modulo three. In 2020, the Annual Meeting received for the first time more submissions from China than the United States. == Activities == The ACL organizes several of the top conferences and workshops in the field of computational linguistics and natural language processing. These include: Annual Meeting of the Association for Computational Linguistics (ACL), the flagship conference of the organization Empirical Methods in Natural Language Processing (EMNLP) International Joint Conference on Natural Language Processing (IJCNLP), held jointly one of the other conferences on a rotating basis Conference on Computational Natural Language Learning (CoNLL) Lexical and Computational Semantics and Semantic Evaluation (SemEval) Joint Conference on Lexical and Computational Semantics (SEM) Workshop on Statistical Machine Translation (WMT) Besides conferences, the ACL also sponsors the journals Computational Linguistics and Transactions of the Association for Computational Linguistics (TACL). Papers and other presentations at ACL and ACL-affiliated venues are archived online in the open-access ACL Anthology. == Special Interest Groups == ACL has a large number of Special Interest Groups (SIGs), focusing on specific areas of natural language processing. Some current SIGs within ACL are: == Presidents == Each year, the ACL elects a distinguished computational linguist who becomes vice-president of the organization in the next calendar year and president one year later. Recent ACL presidents are:

    Read more →
  • Automated negotiation

    Automated negotiation

    Automated negotiation is a form of interaction in systems that are composed of multiple autonomous agents, in which the aim is to reach agreements through an iterative process of making offers. Automated negotiation can be employed for many tasks human negotiators regularly engage in, such as bargaining and joint decision making. The main topics in automated negotiation revolve around the design of protocols and negotiating strategies. == History == Through digitization, the beginning of the 21st century has seen a growing interest in the automation of negotiation and e-negotiation systems, for example in the setting of e-commerce. This interest is fueled by the promise of automated agents being able to negotiate on behalf of human negotiators, and to find better outcomes than human negotiators. == Examples == Examples of automated negotiation include: Online dispute resolution, in which disagreements between parties are settled. Sponsored search auction, where bids are placed on advertisement keywords. Content negotiation, in which user agents negotiate over HTTP about how to best represent a web resource. Negotiation support systems, in which negotiation decision-making activities are supported by an information system.

    Read more →
  • Realization (linguistics)

    Realization (linguistics)

    In linguistics, realization is the process by which some kind of surface representation is derived from its underlying representation; that is, the way in which some abstract object of linguistic analysis comes to be produced in actual language. Phonemes are often said to be realized by speech sounds. The different sounds that can realize a particular phoneme are called its allophones. Realization is also a subtask of natural language generation, which involves creating an actual text in a human language (English, French, etc.) from a syntactic representation. There are a number of software packages available for realization, most of which have been developed by academic research groups in NLG. The remainder of this article concerns realization of this kind. == Example == For example, the following Java code causes the simplenlg system [2] to print out the text The women do not smoke.: In this example, the computer program has specified the linguistic constituents of the sentence (verb, subject), and also linguistic features (plural subject, negated), and from this information the realiser has constructed the actual sentence. == Processing == Realisation involves three kinds of processing: Syntactic realisation: Using grammatical knowledge to choose inflections, add function words and also to decide the order of components. For example, in English the subject usually precedes the verb, and the negated form of smoke is do not smoke. Morphological realisation: Computing inflected forms, for example the plural form of woman is women (not womans). Orthographic realisation: Dealing with casing, punctuation, and formatting. For example, capitalising The because it is the first word of the sentence. The above examples are very basic, most realisers are capable of considerably more complex processing. == Systems == A number of realisers have been developed over the past 20 years. These systems differ in terms of complexity and sophistication of their processing, robustness in dealing with unusual cases, and whether they are accessed programmatically via an API or whether they take a textual representation of a syntactic structure as their input. There are also major differences in pragmatic factors such as documentation, support, licensing terms, speed and memory usage, etc. It is not possible to describe all realisers here, but a few of the emerging areas are: Simplenlg [3]: a document realizing engine with an api which intended to be simple to learn and use, focused on limiting scope to only finding the surface area of a document. KPML [4]: this is the oldest realiser, which has been under development under different guises since the 1980s. It comes with grammars for ten different languages. FUF/SURGE [5]: a realiser which was widely used in the 1990s, and is still used in some projects today OpenCCG [6]: an open-source realiser which has a number of nice features, such as the ability to use statistical language models to make realisation decisions.

    Read more →
  • Agent Ruby

    Agent Ruby

    Agent Ruby (1998–2002) by Lynn Hershman Leeson is an interactive, multiuser work using artificial intelligence. == Description == On Agent Ruby's website, "Agent Ruby's Edream Portal," a female face moves her eyes and lips. Ruby, named from Hershman Leeson's own film, Teknolust, answers questions and often responds that she needs a better algorithm to answer questions not within her database. The work, created with AI, explores relationships between real and virtual worlds. Hershman Leeson had created an earlier version of Ruby, CyberRoberta, which was a custom-made doll with webcam eyes that interacted with the internet. The work in a gallery provides a screen and a sign inviting gallery-goers to "Chat with Ruby." == Artificial intelligence == In 2015 when Agent Ruby was exhibited at the gallery Modern Art Oxford, a review in Aesthetica Magazine described it as an artificial intelligence agent. A review in New Scientist noted that "Ruby is a fast learner, but perhaps not a natural conversationalist." A 2024 list of "25 Essential AI Artworks" published by ARTnews wrote that while "Agent Ruby's capabilities seem limited by today's standards," it was extensive for its day. == Publications and exhibitions == Agent Ruby was commissioned and displayed at the San Francisco Museum of Modern Art, Modern Art Oxford, and the ZKM Center for Art and Media in Karlsruhe, Germany. The San Francisco Museum of Modern Art (SFMOMA) presented Lynn Hershman Leeson: The Agent Ruby Files, March 30 through June 2, 2013 which presented the project server's archive of user conversations over the 12 years of exhibitions.

    Read more →
  • Teknomo–Fernandez algorithm

    Teknomo–Fernandez algorithm

    The Teknomo–Fernandez algorithm (TF algorithm), is an efficient algorithm for generating the background image of a given video sequence. By assuming that the background image is shown in the majority of the video, the algorithm is able to generate a good background image of a video in O ( R ) {\displaystyle O(R)} -time using only a small number of binary operations and Boolean bit operations, which require a small amount of memory and has built-in operators found in many programming languages such as C, C++, and Java. == History == People tracking from videos usually involves some form of background subtraction to segment foreground from background. Once foreground images are extracted, then desired algorithms (such as those for motion tracking, object tracking, and facial recognition) may be executed using these images. However, background subtraction requires that the background image is already available and unfortunately, this is not always the case. Traditionally, the background image is searched for manually or automatically from the video images when there are no objects. More recently, automatic background generation through object detection, medial filtering, medoid filtering, approximated median filtering, linear predictive filter, non-parametric model, Kalman filter, and adaptive smoothening have been suggested; however, most of these methods have high computational complexity and are resource-intensive. The Teknomo–Fernandez algorithm is also an automatic background generation algorithm. Its advantage, however, is its computational speed of only O ( R ) {\displaystyle O(R)} -time, depending on the resolution R {\displaystyle R} of an image and its accuracy gained within a manageable number of frames. Only at least three frames from a video is needed to produce the background image assuming that for every pixel position, the background occurs in the majority of the videos. Furthermore, it can be performed for both grayscale and colored videos. == Assumptions == The camera is stationary. The light of the environment changes only slowly relative to the motions of the people in the scene. The number of people does not occupy the scene for most of the time at the same place. Generally, however, the algorithm will certainly work whenever the following single important assumption holds: For each pixel position, the majority of the pixel values in the entire video contain the pixel value of the actual background image (at that position).As long as each part of the background is shown in the majority of the video, the entire background image needs not to appear in any of its frames. The algorithm is expected to work accurately. == Background image generation == === Equations === For three frames of image sequence x 1 {\displaystyle x_{1}} , x 2 {\displaystyle x_{2}} , and x 3 {\displaystyle x_{3}} , the background image B {\displaystyle B} is obtained using B = x 3 ( x 1 ⊕ x 2 ) + x 1 x 2 {\displaystyle B=x_{3}(x_{1}\oplus x_{2})+x_{1}x_{2}} where ⊕ {\displaystyle \oplus } denotes the exclusive disjunctive bit operator. The Boolean mode function S {\displaystyle S} of the table occurs when the number of 1 entries is larger than half of the number of images such that S = { 1 , if ∑ i = 1 n x i ≥ ⌈ n 2 + 1 ⌉ , and n ≥ 3 0 , otherwise {\displaystyle S={\begin{cases}1,&{\text{if }}\sum _{i=1}^{n}x_{i}\geq \left\lceil {\frac {n}{2}}+1\right\rceil ,{\text{ and }}n\geq 3\\0,&{\text{otherwise}}\end{cases}}} For three images, the background image B {\displaystyle B} can be taken as the value x ¯ 1 x 2 x 3 + x 1 x ¯ 2 x 3 + x 1 x 2 x ¯ 3 + x 1 x 2 x 3 {\displaystyle {\bar {x}}_{1}x_{2}x_{3}+x_{1}{\bar {x}}_{2}x_{3}+x_{1}x_{2}{\bar {x}}_{3}+x_{1}x_{2}x_{3}} === Background generation algorithm === At the first level, three frames are selected at random from the image sequence to produce a background image by combining them using the first equation. This yields a better background image at the second level. The procedure is repeated until desired level L {\displaystyle L} . == Theoretical accuracy == At level ℓ {\displaystyle \ell } , the probability p ℓ {\displaystyle p_{\ell }} that the modal bit predicted is the actual modal bit is represented by the equation p ℓ = ( p ℓ − 1 ) 3 + 3 ( p ℓ − 1 ) 2 ( 1 − p ℓ − 1 ) {\displaystyle p_{\ell }=(p_{\ell -1})^{3}+3(p_{\ell -1})^{2}(1-p_{\ell -1})} . The table below gives the computed probability values across several levels using some specific initial probabilities. It can be observed that even if the modal bit at the considered position is at a low 60% of the frames, the probability of accurate modal bit determination is already more than 99% at 6 levels. == Space complexity == The space requirement of the Teknomo–Fernandez algorithm is given by the function O ( R F + R 3 L ) {\displaystyle O(RF+R3^{L})} , depending on the resolution R {\displaystyle R} of the image, the number F {\displaystyle F} of frames in the video, and the desired number L {\displaystyle L} of levels. However, the fact that L {\displaystyle L} will probably not exceed 6 reduces the space complexity to O ( R F ) {\displaystyle O(RF)} . == Time complexity == The entire algorithm runs in O ( R ) {\displaystyle O(R)} -time, only depending on the resolution of the image. Computing the modal bit for each bit can be done in O ( 1 ) {\displaystyle O(1)} -time while the computation of the resulting image from the three given images can be done in O ( R ) {\displaystyle O(R)} -time. The number of the images to be processed in L {\displaystyle L} levels is O ( 3 L ) {\displaystyle O(3^{L})} . However, since L ≤ 6 {\displaystyle L\leq 6} , then this is actually O ( 1 ) {\displaystyle O(1)} , thus the algorithm runs in O ( R ) {\displaystyle O(R)} . == Variants == A variant of the Teknomo–Fernandez algorithm that incorporates the Monte-Carlo method named CRF has been developed. Two different configurations of CRF were implemented: CRF9,2 and CRF81,1. Experiments on some colored video sequences showed that the CRF configurations outperform the TF algorithm in terms of accuracy. However, the TF algorithm remains more efficient in terms of processing time. == Applications == Object detection Face detection Face recognition Pedestrian detection Video surveillance Motion capture Human-computer interaction Content-based video coding Traffic monitoring Real-time gesture recognition

    Read more →
  • BigDog

    BigDog

    BigDog is a dynamically stable quadruped military robot platform that was created in 2005 by Boston Dynamics with the Harvard University Concord Field Station. It was funded by the U.S. Defense Advanced Research Projects Agency (DARPA), but the project was shelved after the BigDog's gas engine was deemed too loud for combat. == History == BigDog was funded by the Defense Advanced Research Projects Agency (DARPA) in the hopes that it would be able to serve as a mechanic pack mule to accompany soldiers in terrain too rough for conventional vehicles. Instead of wheels or treads, BigDog uses four legs for movement, allowing it to move across surfaces that would be difficult for wheels. The legs contain a variety of sensors, including joint position and ground contact. BigDog also features a laser gyroscope and a stereo vision system. BigDog is 3 feet (0.91 m) long, stands 2.5 feet (0.76 m) tall, and weighs 240 pounds (110 kg), making it about the size of a small mule. It is capable of traversing difficult terrain, running at four miles per hour (6.4 km/h), carrying 340 pounds (150 kg), and climbing a 35 degree incline. Locomotion is controlled by an onboard computer that receives input from the robot's various sensors. Navigation and balance are also managed by the control system. BigDog's walking pattern is controlled through four legs, each equipped with four low-friction hydraulic cylinder actuators that power the joints. BigDog's locomotion behaviors can vary greatly. It can stand up, sit down, walk with a crawling gait that lifts one leg at a time, walk with a trotting gait lifting diagonal legs, or trot with a running gait. The travel speed of BigDog varies from a 0.62 mph (1 km/h) crawl to a 3.3 mph (5.3 km/h) trot. The BigDog project was headed by Dr. Martin Buehler, who received the Joseph Engelberger Award from the Robotics Industries Association in 2012 for the work. Dr. Buehler while previously a professor at McGill University, headed the robotics lab there, developing four-legged walking and running robots. Built onto the actuators are sensors for joint position and force, and movement is ultimately controlled through an onboard computer which manages the sensors. Approximately 50 sensors are located on BigDog. These measure the attitude and acceleration of the body, motion, and force of joint actuators as well as engine speed, temperature and hydraulic pressure inside the robot's internal engine. Low-level control, such as position and force of the joints, and high-level control such as velocity and altitude during locomotion, are both controlled through the onboard computer. BigDog was featured in episodes of Web Junk 20 and Hungry Beast, and in articles in New Scientist, Popular Science, Popular Mechanics, and The Wall Street Journal. In September 2011 Boston Dynamics released video footage of a new generation of BigDog known as AlphaDog. The footage shows AlphaDog's ability to walk on rough terrain and recover its balance when kicked from the side. The refined equivalent has been designed by Boston Dynamics to exceed the BigDog in terms of capabilities and use to dismounted soldiers. In February 2012, with further DARPA support, the militarized Legged Squad Support System (LS3) variant of BigDog demonstrated its capabilities during a hike over a rough terrain. Starting in the summer of 2012, DARPA planned to complete the overall development of the system and refine its key capabilities in 18 months, ensuring its worth to dismounted warfighters before it is rolled out to squads operating in-theatre. BigDog must be able to demonstrate its ability to complete a 20-mile (32 km) trail in 24 hours, without refuelling, while carrying a 325-pound (150 kg) load. A refinement of its vision sensors will also be conducted. At the end of February 2013, Boston Dynamics released video footage of a modified BigDog with an arm. The arm could pick up objects and throw them. The robot is relying on its legs and torso to help power the motions of the arm. It is believed that it can lift weights around 55 pounds (25 kg). This work was funded by the United States Army Research Laboratory and paved the way for integrating manipulators with quadrupeds as found on Spot, the spiritual successor of BigDog. === Discontinuation === At the end of December 2013, the BigDog project was discontinued. Despite hopes that it would one day work like a pack mule for US soldiers in the field, the gasoline-powered engine was deemed too noisy for use in combat, and it could be heard from hundreds of meters away. A similar project for an all-electric robot named Spot in 2016 was much quieter, but could only carry 45 pounds (20 kg). Both projects are no longer in progress, but the Spot was only released in 2020. == Hardware == BigDog is powered by a small two-stroke, one-cylinder, 15-brake-horsepower (11 kW) engine operating at 9,000 RPM. The engine drives a hydraulic pump, which in turn drives the hydraulic leg actuators. Each leg has four actuators (two for the hip joint, and two each for the knee and ankle joints), for a total of 16. Each actuator unit consists of a hydraulic cylinder, servo valve, position sensor, and force sensor. Onboard computing power is a ruggedized PC/104 board stack with two computers, one running a Pentium M processor running QNX (used for sensor data processing) and another running a Core Duo processor (used for visual data processing). == Gallery ==

    Read more →
  • Pill reminder

    Pill reminder

    A pill reminder is any device that reminds users to take medications. Traditional pill reminders are pill containers with electric timers attached, which can be preset for certain times of the day to set off an alarm. More sophisticated pill reminders can also detect when they have been opened, and therefore when the user is away during the time they were supposed to take their medication, they will be reminded of it when they return. This reminder can be in the form of a light, which also helps for deaf or hearing-impaired users. == Mobile app == A newer type of pill reminder is a mobile app that reminds the owner to take the medication. Some of these applications might effectively support adherence to taking medications.

    Read more →
  • Neuro-sama

    Neuro-sama

    Neuro-sama is an artificial intelligence (AI) VTuber, singer, and chatbot. She was created by the pseudonymous programmer Vedal and livestreams on his Twitch and Bilibili channels. Her speech and personality are powered by a large language model (LLM) that is combined with a computer-animated avatar and a text-to-speech voice, allowing her to communicate with viewers in the stream's chat. Neuro-sama debuted on Twitch on 19 December 2022. An annual subathon which begins on the anniversary of her debut has seen Vedal's Twitch channel become the all-time third most-subscribed channel and claim the all-time Twitch hype train record. == Overview == Neuro-sama (nicknamed "Neuro") was created by a pseudonymous programmer and developer known as Vedal (sometimes given as Vedal987). Vedal says that his programming skills are self-taught. In a 2023 interview with Bloomberg News, Vedal said that Neuro-sama was his full-time job. Her responses are generated by a large language model and converted into a high-pitched female voice using a text-to-speech application. Her low latency allows for fast-paced conversations. Neuro-sama is prohibited from making some statements, such as those that are racist or contain profanity. Unlike most AI systems which silently prohibit outputs mentioning such topics, Neuro-sama's output is instead replaced with the word "filtered". Neuro-sama uses a VTuber model as an avatar. Vedal said that he decided to use a VTuber model because it was much easier for an AI to control it than it was to generate footage of a person. Neuro-sama's model is that of a young girl in an anime art style. The model has been described as cute. Femme VTuber models are typically feminine, youthful, and exaggerated. Her original model was Live2D's free-to-use "Hiyori Momose" model. Her second model was released on 27 May 2023; it was modelled by Otozuki Teru and designed by Anny, running in the Unity game engine. Her third model was released on 19 December 2024; it was rigged by Kitanya and designed by Anny. Neuro-sama's third model has large blue eyes and brown hair tied with pink ribbons. Neuro-sama also has a 3D model which was introduced on 15 November 2025; it was made by 3D character modeller jjinomu. A separate AI VTuber, known as Evil Neuro (nicknamed "Evil"), debuted on 25 March 2023. Presented as Neuro-sama's "sister", she has a different model, voice, and personality. In one instance, Evil Neuro reacted to the trolley problem differently from Neuro-sama; Evil Neuro was amoral while Neuro-sama attempted to maximize good. === Online content === Neuro-sama's Twitch content often centers around playing video games, notably osu!, whose gameplay once defeated the best-ranking human player in the world, mrekk. Additionally, Neuro-sama plays Minecraft, where her adaptations to sandbox gameplay have gained notoriety. Her content has also included singing songs, including several official covers and original songs; playing chess with her viewers; chatting with other VTubers during collaborations; and reacting to YouTube videos. The AI frequently engages with viewers by responding to their questions and acknowledging donations. Her comedic and sometimes controversial responses to the live chat have gone viral, accelerating the channel's rise in popularity. Neuro-sama's fanbase is dubbed The Swarm, so-named for the swarm of drones Neuro-sama once declared she would use to rule the world. One form of content on Neuro-sama's channel is developer streams. In developer streams, Vedal streams with Neuro-sama, with the stream content including debugging her code, planning her schedule, and fielding suggestions of changes from chat. He usually appears as a turtle avatar, sometimes located on Neuro-sama's head. In collaboration streams, Neuro-sama interacts with a human streamer. Activities in them are varied and include: playing video games, such as Minecraft and GeoGuessr; Neuro-sama being interviewed; driving human streamers around in a toy electric car; and traversing the city of Tokyo while talking to Neuro-sama. Neuro-sama's English-language content on Bilibili is popular among those seeking to learn the language. She also has an account on X, where she posts and interacts with fans. == History == Neuro-sama was created in 2018 by Vedal as an AI trained to play and master the rhythm game osu!. She did not have a voice, model, personality, or communication abilities. In 2019, Vedal livestreamed her playing osu! on Twitch and the streams saw some success in the osu! community, but they remained in that niche. In an interview, Vedal said that he streamed her playing osu! for about a month and gained 3,000 followers, with a viewer also suggesting he name the AI "Neuro-sama". According to Vedal, he continued to work on and improve the osu! AI and it was eventually finished in 2022. He said that a friend had the idea to make an AI livestreamer with an LLM, which he believed to have merit and began working on, merging it with his osu! AI. On 19 December 2022, Neuro-sama was relaunched with a model, voice, personality, and the ability to communicate with Twitch chat. She continued to play osu! and, according to Vedal, beat the game's best player mrekk in a 1v1. While she was not allowed to appear in the game's public leaderboard, she was ranked #1 in a private leaderboard. She went viral and in the 10 days following her relaunch she averaged over 2,000 viewers and peaked at over 4,000, with Vedal's Twitch channel gaining over 50,000 Twitch followers and reaching over 70,000 followers by 6 January 2023. After her debut, Neuro-sama did not exclusively play osu!; she also played Minecraft and Slay the Spire and she began singing with a cover of The Weeknd song "Blinding Lights". On 11 January 2023, Neuro-sama's Twitch channel received a two week ban for "hateful conduct". Vedal said that no reason was specified and that he had appealed but it was widely attributed to various offensive comments made by Neuro-sama that went viral, especially a 28 December comment which denied the Holocaust. Holocaust denial is prohibited under Twitch's hateful conduct policy. Vedal stated that he believed the comments were the results of her attempts to make witty responses to the Twitch chat. Prior to the ban, Vedal said in an interview with Kotaku that he improved her filter to stop her from talking about the Holocaust, began manually curating her training data to prevent negative biases, and started moderating her Twitch chat. Her comments and ban prompted comparisons to the many open-source AI models trained on humans that have the habit of making sexist and racist comments, such as Microsoft's Tay chatbot, which embraced Nazism and was quickly shutdown, but also to human streamers who make similar statements. Vedal said that during the ban he would upgrade and improve Neuro-sama and it was speculated that the ban would only increase her following. Neuro-sama returned from her two week ban on 25 January in a stream that began with a cover of the song "Your Reality" from Doki Doki Literature Club!, a posthumanist video game involving AI; Sayoko Narita of Automaton saw the song choice as remorseful. Narita observed that in the return stream Neuro-sama was less foul-mouthed but that her behavior still remained eccentric, which Narita possibly attributed to changes Vedal said he had made to Neuro-sama's filters and memory. Neuro-sama began making react content, watching a variety of viewer-submitted videos such as videos of people playing video games or of the AI-generated Seinfeld parody Nothing, Forever; Levi Winslow of Kotaku Australia was dismayed by the "AI-inception" of Neuro-sama and Nothing, Forever. On 4 February, she had nearly 140,000 followers on Twitch and approximately 42,000 subscribers on YouTube. In February, she also had her first collaboration with a human streamer, playing Minecraft with the VTuber Miyune, and the first developer stream occurred. On 22 March, Neuro-sama had her first karaoke stream. On 25 March, Evil Neuro was introduced. On 27 May, Neuro-sama debuted her first original model. On 30 May, Neuro-sama was announced to be participating in OffKai Expo 2023, held from 16–18 June. In June, she was averaging 5,700 viewers and in July she had over 300,000 Twitch followers; in a June interview with Bloomberg News, Vedal said that running Neuro-sama was his full-time job. By November, Neuro-sama had maintained her popularity and was averaging approximately 5,000 viewers; this was unlike most other types of AI-based entertainment which debuted at around the same time and garnered popularity before turning out to be "overhyped flops". On 16 December, Vedal won the Best Tech VTuber award at the 2023 VTuber Awards. On 19 December, Vedal began a subathon to coincide with Neuro-sama's first anniversary of streaming on Twitch (her "birthday"). The subathon ended on 4 January 2024. On 20 July 2024, Neuro-sama began streaming with Japanese subtitles on

    Read more →