The Rabbit r1 is an artificial intelligence personal assistant device developed by the American technology startup Rabbit Inc and co-designed by Teenage Engineering. It was announced at the 2024 Consumer Electronics Show as a handheld device intended to perform digital tasks through voice commands, touch interaction, and web-based AI agents. The r1 was marketed around Rabbit's concept of a "large action model" (LAM), which the company described as software able to operate websites and services on behalf of users. The device runs rabbitOS, an operating system based on the Android Open Source Project. Its services have included AI search, image recognition, voice interaction, music playback, rideshare and food-ordering integrations, and later experimental web-agent features such as LAM Playground and teach mode. Initial reviews were largely negative, with reviewers criticizing the device's limited functionality, bugs, and unclear advantages over a smartphone. Critics also questioned Rabbit's claims after the r1 software was shown to run on an Android phone. Rabbit continued to issue software updates after launch, including rabbitOS 2 in September 2025, which introduced a redesigned card-based interface, gesture navigation, and a "creations" feature for generating small software tools and experiences on the device. Rabbit Inc was founded by Jesse Lyu Cheng. == Hardware == Display: A 2.88-inch touchscreen for interactive user input. Input: push-to-talk button to activate voice commands; scroll wheel; Gyroscope; Magnetometer; Accelerometer; GPS. Camera: 8 MP single camera, with a resolution of 3264x2448, allowing for the connected external AI to use computer vision. Audio: Equipped with a speaker and dual microphones for audio interaction. Connectivity: Supports Wi-Fi and cellular connections via a SIM card slot to access internet services. Processor: Runs on a 2.3GHz MediaTek Helio P35 processor. Memory: Contains 4GB of RAM for operational tasks. Storage: Offers 128GB of internal storage for data. Ports: Utilizes a USB-C port for charging and data connections. == Software == The Rabbit r1 runs rabbitOS, which is based on the Android Open Source Project (AOSP), specifically Android 13. Rabbit founder Jesse Lyu described rabbitOS as a "very bespoke AOSP" after reports that the r1's software could be run on a conventional Android phone. Rabbit described the r1 as using a large action model (LAM), a type of AI agent intended to perform tasks across software interfaces rather than only answer questions. At launch, the device supported a limited set of services, including AI search, vision features, music playback, and some third-party integrations. Perplexity.ai was one of the AI services used to answer user queries. In 2024, Rabbit released several software updates that added features and attempted to address early criticism of the device. In July 2024, the company launched "beta rabbit", an advanced search and conversation mode for more complex queries. In October 2024, it released LAM Playground, a web-based agent feature intended to let the r1 operate websites on behalf of users. Reviewers found the feature experimental; Android Authority reported that it could perform some navigation tasks but struggled with CAPTCHAs, loops, and unintended behavior. In November 2024, Rabbit introduced a beta "teach mode", which allowed users to demonstrate web-based tasks in the Rabbithole web portal and later ask the r1 to repeat them. The company described teach mode as experimental, and The Verge noted that Rabbit warned users that results could be unpredictable and that CAPTCHA-protected sites could cause problems. Rabbit released rabbitOS 2 in September 2025. The update redesigned the interface around a card-based layout, added additional touchscreen gestures, and introduced "creations", a feature that lets users generate simple software tools, games, and interfaces through natural-language prompts. Coverage of the update described it as a major software overhaul rather than new hardware. == Reception == === Funding === Rabbit raised $20 million in funding from Khosla Ventures, Synergis Capital and Kakao Investment in October 2023. The company announced an additional $10 million in funding in December 2023. === Sales === Following its announcement at the 2024 Consumer Electronics Show, 130,000 units were sold. On August 13, 2024, Rabbit announced that sales of r1 had expanded to the entire European Union (except Malta) and United Kingdom. On August 21, 2024, sales of r1 expanded to Singapore. === Reviews === The r1 was met with strong criticism immediately after Rabbit began shipping the device. Some reviews questioned what the device was able to do that a smartphone could not, while comparing it to the similar Humane Ai Pin. YouTuber Marques Brownlee called the device "barely reviewable". Android Authority's Mishaal Rahman managed to install Rabbit r1's software on a Pixel 6a smartphone, after a tipster shared an APK file. The Verge echoed the claims made by Rahman. In response, Lyu published statements confirming its use of Android, but denying that the r1 is an Android app. Mashable called its Vision features impressive, but said that "these praise-worthy features are overshadowed by buggy performance". Ars Technica wrote a blog post claiming "the company is blocking access from bootleg APKs". TechCrunch gave a slightly more positive review, calling the device a "fun peep at a possible future", but could not "advise anyone to buy one now." Shortly after the launch of r1, Rabbit began a weekly cadence of software updates to address much of the criticism from the early reviews, including "battery and GPS performance, time zone selection, and more". Digital Trends said the Magic Camera feature "takes the most mundane, ordinary, and badly composed photos and makes something fun and eye-catching from them." Mashable said the "beta rabbit" feature "makes Rabbit R1 more conversational and intelligent". Later coverage noted that Rabbit continued to update the r1 after its poorly received launch. The Verge reported in September 2024 that about 5,000 of roughly 100,000 purchasers were using the device at any given moment, citing Lyu, and described the product as having launched before it was ready. In 2025, coverage of rabbitOS 2 described the update as an attempt to reset the device's software experience after the criticism of its original release. == Controversies == === GAMA project === Rabbit Inc has garnered attention due to allegations surrounding its funding and the company's past projects. The company came under scrutiny when Stephen Findeisen, known as Coffeezilla on YouTube, published a video in May 2024, alleging that Rabbit Incorporation was "built on a scam". Rabbit Incorporation, initially named Cyber Manufacturing Co, rebranded just two months before launching the Rabbit R1. The company, under its former name, raised $6 million in November 2021 for a project called GAMA, described as a "Next Generation NFT Project." Jesse Lyu, the CEO of Rabbit Incorporation, referred to GAMA as a "fun little project." Coffeezilla, who investigates influencer scams, highlighted old Clubhouse recordings of Jesse Lyu discussing the GAMA project. In these recordings, Lyu emphasized the substantial funding behind GAMA and its potential to be a revolutionary, carbon-negative cryptocurrency. Coffeezilla questioned the whereabouts of the funds raised for GAMA, estimating that approximately $1 million in refunds to investors remained unresolved. He suggested that the rebranding to Rabbit Incorporation and the shift to developing the Rabbit R1 were attempts to divert from the GAMA project's issues. In response to Coffeezilla's inquiries, Rabbit Incorporation stated that the $6 million raised was used for the GAMA project. The company said that NFTs cannot be refunded unless the owner agrees to "burn" them on the blockchain. Rabbit Incorporation also said that the GAMA project was open-sourced and returned to the community, aligning with community feedback. They also mentioned that efforts to buy back NFTs were made to counteract malicious trading and maintain market stability. === Security === In June 2024, Engadget reported that the Rabbitude team, a community reverse engineering project, had gained access to the r1's codebase revealing that r1's software contained several hardcoded API keys in its code for ElevenLabs, Microsoft Azure, Yelp, and Google Maps, potentially allowing unauthorized access to r1 responses, including those containing the users' personal information. For a short time, Rabbit immediately began revoking and rotating those secrets and confirmed that the code was leaked by an employee who had "been terminated and remains under investigation". In July 2024, the company revealed that all user chats and device pairing data were logged on the r1 with no ability to delete them. This meant that lost or stolen devices could be used to extract user
Medical imaging
Medical imaging is the technique and process of imaging the interior of a body for clinical analysis and medical intervention, as well as visual representation of the function of some organs or tissues (physiology). Medical imaging seeks to reveal internal structures hidden by the skin and bones, as well as to diagnose and treat disease. Medical imaging also establishes a database of normal anatomy and physiology to make it possible to identify abnormalities. Although imaging of removed organs and tissues can be performed for medical reasons, such procedures are usually considered part of pathology instead of medical imaging. Measurement and recording techniques that are not primarily designed to produce images, such as electroencephalography (EEG), magnetoencephalography (MEG), electrocardiography (ECG), and others, represent other technologies that produce data susceptible to representation as a parameter graph versus time or maps that contain data about the measurement locations. In a limited comparison, these technologies can be considered forms of medical imaging in another discipline of medical instrumentation. As of 2010, 5 billion medical imaging studies had been conducted worldwide. Radiation exposure from medical imaging in 2006 made up about 50% of total ionizing radiation exposure in the United States. Medical imaging equipment is manufactured using technology from the semiconductor industry, including CMOS integrated circuit chips, power semiconductor devices, sensors such as image sensors (particularly CMOS sensors) and biosensors, and processors such as microcontrollers, microprocessors, digital signal processors, media processors and system-on-chip devices. As of 2015, annual shipments of medical imaging chips amount to 46 million units and $1.1 billion. The term "noninvasive" is used to denote a procedure where no instrument is introduced into a patient's body, which is the case for most imaging techniques used. == History == In 1972, engineer Godfrey Hounsfield from the British company EMI invented the X-ray computed tomography device for head diagnosis, which is commonly referred to as computed tomography (CT). The CT nucleus method is based on the projecting X-rays through a section of the human head, which are then processed by computer to reconstruct the cross-sectional image, known as image reconstruction. In 1975, EMI successfully developed a CT device for the entire body, enabling the clear acquisition of tomographic images of various parts of the human body. This revolutionary diagnostic technique earned Hounsfield and physicist Allan Cormack the Nobel Prize in Physiology or Medicine in 1979. Digital image processing technology for medical applications was inducted into the Space Foundation's Space Technology Hall of Fame in 1994. By 2010, over 5 billion medical imaging studies had been conducted worldwide. Radiation exposure from medical imaging in 2006 accounted for about 50% of total ionizing radiation exposure in the United States. Medical imaging equipment is manufactured using technology from the semiconductor industry, including CMOS integrated circuit chips, power semiconductor devices, sensors such as image sensors (particularly CMOS sensors) and biosensors, as well as processors like microcontrollers, microprocessors, digital signal processors, media processors and system-on-chip devices. As of 2015, annual shipments of medical imaging chips reached 46 million units, generating a market value of $1.1 billion. == Types == In the clinical context, "invisible light" medical imaging is generally equated to radiology or "clinical imaging". "Visible light" medical imaging involves digital video or still pictures that can be seen without special equipment. Dermatology and wound care are two modalities that use visible light imagery. Interpretation of medical images is generally undertaken by a physician specialising in radiology known as a radiologist; however, this may be undertaken by any healthcare professional who is trained and certified in radiological clinical evaluation. Increasingly interpretation is being undertaken by non-physicians, for example radiographers frequently train in interpretation as part of expanded practice. Diagnostic radiography designates the technical aspects of medical imaging and in particular the acquisition of medical images. The radiographer (also known as a radiologic technologist) is usually responsible for acquiring medical images of diagnostic quality; although other professionals may train in this area, notably some radiological interventions performed by radiologists are done so without a radiographer. As a field of scientific investigation, medical imaging constitutes a sub-discipline of biomedical engineering, medical physics or medicine depending on the context: Research and development in the area of instrumentation, image acquisition (e.g., radiography), modeling and quantification are usually the preserve of biomedical engineering, medical physics, and computer science; Research into the application and interpretation of medical images is usually the preserve of radiology and the medical sub-discipline relevant to medical condition or area of medical science (neuroscience, cardiology, psychiatry, psychology, etc.) under investigation. Many of the techniques developed for medical imaging also have scientific and industrial applications. === Radiography === Two forms of radiographic images are in use in medical imaging. Projection radiography and fluoroscopy, with the latter being useful for catheter guidance. These 2D techniques are still in wide use despite the advance of 3D tomography due to the low cost, high resolution, and depending on the application, lower radiation dosages with 2D technique. This imaging modality uses a wide beam of X-rays for image acquisition and is the first imaging technique available in modern medicine. Fluoroscopy produces real-time images of internal structures of the body in a similar fashion to radiography, but employs a constant input of X-rays, at a lower dose rate. Contrast media, such as barium, iodine, and air are used to visualize internal organs as they work. Fluoroscopy is also used in image-guided procedures when constant feedback during a procedure is required. An image receptor is required to convert the radiation into an image after it has passed through the area of interest. Early on, this was a fluorescing screen, which gave way to an Image Amplifier (IA) which was a large vacuum tube that had the receiving end coated with cesium iodide, and a mirror at the opposite end. Eventually the mirror was replaced with a TV camera. Projectional radiographs, more commonly known as X-rays, are often used to determine the type and extent of a fracture as well as for detecting pathological changes in the lungs. With the use of radio-opaque contrast media, such as barium, they can also be used to visualize the structure of the stomach and intestines – this can help diagnose ulcers or certain types of colon cancer. === Magnetic resonance imaging === A magnetic resonance imaging instrument (MRI scanner), or "nuclear magnetic resonance (NMR) imaging" scanner as it was originally known, uses powerful magnets to polarize and excite hydrogen nuclei (i.e., single protons) of water molecules in human tissue, producing a detectable signal that is spatially encoded, resulting in images of the body. The MRI machine emits a radio frequency (RF) pulse at the resonant frequency of the hydrogen atoms on water molecules. Radio frequency antennas ("RF coils") send the pulse to the area of the body to be examined. The RF pulse is absorbed by protons, causing their direction with respect to the primary magnetic field to change. When the RF pulse is turned off, the protons "relax" back to alignment with the primary magnet and emit radio waves in the process. This radio-frequency emission from the hydrogen atoms on water is what is detected and reconstructed into an image. The resonant frequency of a spinning magnetic dipole (of which protons are one example) is called the Larmor frequency and is determined by the strength of the main magnetic field and the chemical environment of the nuclei of interest. MRI uses three electromagnetic fields: a very strong (typically 1.5 to 3 teslas) static magnetic field to polarize the hydrogen nuclei, called the primary field; gradient fields that can be modified to vary in space and time (on the order of 1 kHz) for spatial encoding, often simply called gradients; and a spatially homogeneous radio-frequency (RF) field for manipulation of the hydrogen nuclei to produce measurable signals, collected through an RF antenna. Like CT, MRI traditionally creates a two-dimensional image of a thin "slice" of the body and is therefore considered a tomographic imaging technique. Modern MRI instruments are capable of producing images in the form of 3D blocks, which may be considered a generalization of the single-slice
Night Sky (app)
Night Sky (app) is an application developed and published by indie studio iCandi Apps Ltd. from the UK. Night Sky is a stargazing reference app, where the user can explore a virtual representation of the night sky to identify stars, planets, constellations and satellites. The app is developed specifically for iOS, tvOS and watchOS devices. Night Sky was first released on November 1, 2011 for iOS, and has had multiple updates since launch. Night Sky was mentioned in the September 2016 Apple Keynote during the Apple Watch Series 2 announcement. In October 2016, Night Sky was featured as the Free App of The Week on the Apple App Store. == Reception == Night Sky was featured in Apple's 'Best of 2012' and has also been pre-installed onto iPads in Apple retail stores worldwide.
Grammar induction
Grammar induction (or grammatical inference) is the process in machine learning of learning a formal grammar (usually as a collection of re-write rules or productions or alternatively as a finite-state machine or automaton of some kind) from a set of observations, thus constructing a model which accounts for the characteristics of the observed objects. More generally, grammatical inference is that branch of machine learning where the instance space consists of discrete combinatorial objects such as strings, trees and graphs. == Grammar classes == Grammatical inference has often been very focused on the problem of learning finite-state machines of various types (see the article Induction of regular languages for details on these approaches), since there have been efficient algorithms for this problem since the 1980s. Since the beginning of the century, these approaches have been extended to the problem of inference of context-free grammars and richer formalisms, such as multiple context-free grammars and parallel multiple context-free grammars. Other classes of grammars for which grammatical inference has been studied are combinatory categorial grammars, stochastic context-free grammars, contextual grammars and pattern languages. == Learning models == The simplest form of learning is where the learning algorithm merely receives a set of examples drawn from the language in question: the aim is to learn the language from examples of it (and, rarely, from counter-examples, that is, example that do not belong to the language). However, other learning models have been studied. One frequently studied alternative is the case where the learner can ask membership queries as in the exact query learning model or minimally adequate teacher model introduced by Angluin. == Methodologies == There is a wide variety of methods for grammatical inference. Two of the classic sources are Fu (1977) and Fu (1982). Duda, Hart & Stork (2001) also devote a brief section to the problem, and cite a number of references. The basic trial-and-error method they present is discussed below. For approaches to infer subclasses of regular languages in particular, see Induction of regular languages. A more recent textbook is de la Higuera (2010), which covers the theory of grammatical inference of regular languages and finite state automata. D'Ulizia, Ferri and Grifoni provide a survey that explores grammatical inference methods for natural languages. === Induction of probabilistic grammars === There are several methods for induction of probabilistic context-free grammars. === Grammatical inference by trial-and-error === The method proposed in Section 8.7 of Duda, Hart & Stork (2001) suggests successively guessing grammar rules (productions) and testing them against positive and negative observations. The rule set is expanded so as to be able to generate each positive example, but if a given rule set also generates a negative example, it must be discarded. This particular approach can be characterized as "hypothesis testing" and bears some similarity to Mitchel's version space algorithm. The Duda, Hart & Stork (2001) text provide a simple example which nicely illustrates the process, but the feasibility of such an unguided trial-and-error approach for more substantial problems is dubious. === Grammatical inference by genetic algorithms === Grammatical induction using evolutionary algorithms is the process of evolving a representation of the grammar of a target language through some evolutionary process. Formal grammars can easily be represented as tree structures of production rules that can be subjected to evolutionary operators. Algorithms of this sort stem from the genetic programming paradigm pioneered by John Koza. Other early work on simple formal languages used the binary string representation of genetic algorithms, but the inherently hierarchical structure of grammars couched in the EBNF language made trees a more flexible approach. Koza represented Lisp programs as trees. He was able to find analogues to the genetic operators within the standard set of tree operators. For example, swapping sub-trees is equivalent to the corresponding process of genetic crossover, where sub-strings of a genetic code are transplanted into an individual of the next generation. Fitness is measured by scoring the output from the functions of the Lisp code. Similar analogues between the tree structured lisp representation and the representation of grammars as trees, made the application of genetic programming techniques possible for grammar induction. In the case of grammar induction, the transplantation of sub-trees corresponds to the swapping of production rules that enable the parsing of phrases from some language. The fitness operator for the grammar is based upon some measure of how well it performed in parsing some group of sentences from the target language. In a tree representation of a grammar, a terminal symbol of a production rule corresponds to a leaf node of the tree. Its parent nodes corresponds to a non-terminal symbol (e.g. a noun phrase or a verb phrase) in the rule set. Ultimately, the root node might correspond to a sentence non-terminal. === Grammatical inference by greedy algorithms === Like all greedy algorithms, greedy grammar inference algorithms make, in iterative manner, decisions that seem to be the best at that stage. The decisions made usually deal with things like the creation of new rules, the removal of existing rules, the choice of a rule to be applied or the merging of some existing rules. Because there are several ways to define 'the stage' and 'the best', there are also several greedy grammar inference algorithms. These context-free grammar generating algorithms make the decision after every read symbol: Lempel-Ziv-Welch algorithm creates a context-free grammar in a deterministic way such that it is necessary to store only the start rule of the generated grammar. Sequitur and its modifications. These context-free grammar generating algorithms first read the whole given symbol-sequence and then start to make decisions: Byte pair encoding and its optimizations. === Distributional learning === A more recent approach is based on distributional learning. Algorithms using these approaches have been applied to learning context-free grammars and mildly context-sensitive languages and have been proven to be correct and efficient for large subclasses of these grammars. === Learning of pattern languages === Angluin defines a pattern to be "a string of constant symbols from Σ and variable symbols from a disjoint set". The language of such a pattern is the set of all its nonempty ground instances i.e. all strings resulting from consistent replacement of its variable symbols by nonempty strings of constant symbols. A pattern is called descriptive for a finite input set of strings if its language is minimal (with respect to set inclusion) among all pattern languages subsuming the input set. Angluin gives a polynomial algorithm to compute, for a given input string set, all descriptive patterns in one variable x. To this end, she builds an automaton representing all possibly relevant patterns; using sophisticated arguments about word lengths, which rely on x being the only variable, the state count can be drastically reduced. Erlebach et al. give a more efficient version of Angluin's pattern learning algorithm, as well as a parallelized version. Arimura et al. show that a language class obtained from limited unions of patterns can be learned in polynomial time. === Pattern theory === Pattern theory, formulated by Ulf Grenander, is a mathematical formalism to describe knowledge of the world as patterns. It differs from other approaches to artificial intelligence in that it does not begin by prescribing algorithms and machinery to recognize and classify patterns; rather, it prescribes a vocabulary to articulate and recast the pattern concepts in precise language. In addition to the new algebraic vocabulary, its statistical approach was novel in its aim to: Identify the hidden variables of a data set using real world data rather than artificial stimuli, which was commonplace at the time. Formulate prior distributions for hidden variables and models for the observed variables that form the vertices of a Gibbs-like graph. Study the randomness and variability of these graphs. Create the basic classes of stochastic models applied by listing the deformations of the patterns. Synthesize (sample) from the models, not just analyze signals with it. Broad in its mathematical coverage, pattern theory spans algebra and statistics, as well as local topological and global entropic properties. == Applications == The principle of grammar induction has been applied to other aspects of natural language processing, and has been applied (among many other problems) to semantic parsing, natural language understanding, example-based translation, language acquisition, grammar-based compre
Glossary of machine vision
The following are common definitions related to the machine vision field. General related fields Machine vision Computer vision Image processing Signal processing == 0-9 == 1394. FireWire is Apple Inc.'s brand name for the IEEE 1394 interface. It is also known as i.Link (Sony's name) or IEEE 1394 (although the 1394 standard also defines a backplane interface). It is a personal computer (and digital audio/digital video) serial bus interface standard, offering high-speed communications and isochronous real-time data services. 1D. One-dimensional. 2D computer graphics. The computer-based generation of digital images—mostly from two-dimensional models (such as 2D geometric models, text, and digital images) and by techniques specific to them. 3D computer graphics. 3D computer graphics are different from 2D computer graphics in that a three-dimensional representation of geometric data is stored in the computer for the purposes of performing calculations and rendering 2D images. Such images may be for later display or for real-time viewing. Despite these differences, 3D computer graphics rely on many of the same algorithms as 2D computer vector graphics in the wire frame model and 2D computer raster graphics in the final rendered display. In computer graphics software, the distinction between 2D and 3D is occasionally blurred; 2D applications may use 3D techniques to achieve effects such as lighting, and primarily 3D may use 2D rendering techniques. 3D scanner. This is a device that analyzes a real-world object or environment to collect data on its shape and possibly color. The collected data can then be used to construct digital, three dimensional models useful for a wide variety of applications. == A == Aberration. Optically, defocus refers to a translation along the optical axis away from the plane or surface of best focus. In general, defocus reduces the sharpness and contrast of the image. What should be sharp, high-contrast edges in a scene become gradual transitions. Algebraic distance or algebraic error. The algebraic distance from a point xi to a curve or surface defined by f ( x , β ) = 0 {\displaystyle f(x,\beta )=0} is the value of f ( x i , β ) {\displaystyle f(x_{i},\beta )} , i.e. the residual in the least squares problem with data point (xi, 0) and model function f. This term is mainly used in computer vision.[1][2] Aperture. In context of photography or machine vision, aperture refers to the diameter of the aperture stop of a photographic lens. The aperture stop can be adjusted to control the amount of light reaching the film or image sensor. aspect ratio (image). The aspect ratio of an image is its displayed width divided by its height (usually expressed as "x:y"). Angular resolution. Describes the resolving power of any image forming device such as an optical or radio telescope, a microscope, a camera, or an eye. Automated optical inspection. == B == Barcode. A barcode (also bar code) is a machine-readable representation of information in a visual format on a surface. Blob discovery. Inspecting an image for discrete blobs of connected pixels (e.g. a black hole in a grey object) as image landmarks. These blobs frequently represent optical targets for machining, robotic capture, or manufacturing failure. Bitmap. A raster graphics image, digital image, or bitmap, is a data file or structure representing a generally rectangular grid of pixels, or points of color, on a computer monitor, paper, or other display device. == C == Camera. A camera is a device used to take pictures, either singly or in sequence. A camera that takes pictures singly is sometimes called a photo camera to distinguish it from a video camera. Camera Link. Camera Link is a serial communication protocol designed for computer vision applications based on the National Semiconductor interface Channel-link. It was designed for the purpose of standardizing scientific and industrial video products including cameras, cables and frame grabbers. The standard is maintained and administered by the Automated Imaging Association, or AIA, the global machine vision industry's trade group. Charge-coupled device. A charge-coupled device (CCD) is a sensor for recording images, consisting of an integrated circuit containing an array of linked, or coupled, capacitors. CCD sensors and cameras tend to be more sensitive, less noisy, and more expensive than CMOS sensors and cameras. CIE 1931 Color Space. In the study of the perception of color, one of the first mathematically defined color spaces was the CIE XYZ color space (also known as CIE 1931 color space), created by the International Commission on Illumination (CIE) in 1931. CMOS. CMOS ("see-moss")stands for complementary metal-oxide semiconductor, is a major class of integrated circuits. CMOS imaging sensors for machine vision are cheaper than CCD sensors but more noisy. CoaXPress. CoaXPress (CXP) is an asymmetric high speed serial communication standard over coaxial cable. CoaXPress combines high speed image data, low speed camera control and power over a single coaxial cable. The standard is maintained by JIIA, the Japan Industrial Imaging Association. Color. The perception of the frequency (or wavelength) of light, and can be compared to how pitch (or a musical note) is the perception of the frequency or wavelength of sound. Color blindness. Also known as color vision deficiency, in humans is the inability to perceive differences between some or all colors that other people can distinguish Color temperature. "White light" is commonly described by its color temperature. A traditional incandescent light source's color temperature is determined by comparing its hue with a theoretical, heated black-body radiator. The lamp's color temperature is the temperature in kelvins at which the heated black-body radiator matches the hue of the lamp. Color vision. CV is the capacity of an organism or machine to distinguish objects based on the wavelengths (or frequencies) of the light they reflect or emit. computer vision. The study and application of methods which allow computers to "understand" image content. Contrast. In visual perception, contrast is the difference in visual properties that makes an object (or its representation in an image) distinguishable from other objects and the background. C-Mount. Standardized adapter for optical lenses on CCD - cameras. C-Mount lenses have a back focal distance 17.5 mm vs. 12.5 mm for "CS-mount" lenses. A C-Mount lens can be used on a CS-Mount camera through the use of a 5 mm extension adapter. C-mount is a 1" diameter, 32 threads per inch mounting thread (1"-32UN-2A.) CS-Mount. Same as C-Mount but the focal point is 5 mm shorter. A CS-Mount lens will not work on a C-Mount camera. CS-mount is a 1" diameter, 32 threads per inch mounting thread. == D == Data matrix. A two dimensional Barcode. Depth of field. In optics, particularly photography and machine vision, the depth of field (DOF) is the distance in front of and behind the subject which appears to be in focus. Depth perception. DP is the visual ability to perceive the world in three dimensions. It is a trait common to many higher animals. Depth perception allows the beholder to accurately gauge the distance to an object. Diaphragm. In optics, a diaphragm is a thin opaque structure with an opening (aperture) at its centre. The role of the diaphragm is to stop the passage of light, except for the light passing through the aperture. == E == Edge detection. ED marks the points in a digital image at which the luminous intensity changes sharply. It also marks the points of luminous intensity changes of an object or spatial-taxon silhouette. Electromagnetic interference. Radio Frequency Interference (RFI) is electromagnetic radiation which is emitted by electrical circuits carrying rapidly changing signals, as a by-product of their normal operation, and which causes unwanted signals (interference or noise) to be induced in other circuits. == F == FireWire. FireWire (also known as i. Link or IEEE 1394) is a personal computer (and digital audio/video) serial bus interface standard, offering high-speed communications. It is often used as an interface for industrial cameras. Fixed-pattern noise. Flat-field correction. Frame grabber. An electronic device that captures individual, digital still frames from an analog video signal or a digital video stream. Fringe Projection Technique. 3D data acquisition technique employing projector displaying fringe pattern on a surface of measured piece, and one or more cameras recording image(s). Field of view. The field of view (FOV) is the part which can be seen by the machine vision system at one moment. The field of view depends from the lens of the system and from the working distance between object and camera. Focus. An image, or image point or region, is said to be in focus if light from object points is converged about as well as possible in the image; conversely, it is out of focus if light is not w
Tensor glyph
In scientific visualization a tensor glyph is an object that can visualize all or most of the nine degrees of freedom, such as acceleration, twist, or shear – of a 3 × 3 {\displaystyle 3\times 3} matrix. It is used for tensor field visualization, where a data-matrix is available at every point in the grid. "Glyphs, or icons, depict multiple data values by mapping them onto the shape, size, orientation, and surface appearance of a base geometric primitive." Tensor glyphs are a particular case of multivariate data glyphs. There are certain types of glyphs that are commonly used: Ellipsoid Cuboid Cylindrical Superquadrics According to Thomas Schultz and Gordon Kindlmann, specific types of tensor fields "play a central role in scientific and biomedical studies as well as in image analysis and feature-extraction methods."
Photometric stereo
Photometric stereo is a technique in computer vision for estimating the surface normals of objects by observing that object under different lighting conditions (photometry). It is based on the fact that the amount of light reflected by a surface is dependent on the orientation of the surface in relation to the light source and the observer. By measuring the amount of light reflected into a camera, the space of possible surface orientations is limited. Given enough light sources from different angles, the surface orientation may be constrained to a single orientation or even overconstrained. The technique was originally introduced by Woodham in 1980. The special case where the data is a single image is known as shape from shading, and was analyzed by B. K. P. Horn in 1989. Photometric stereo has since been generalized to many other situations, including extended light sources and non-Lambertian surface finishes. Current research aims to make the method work in the presence of projected shadows, highlights, and non-uniform lighting. Photometric stereo is widely used in various fields, including archaeology, cultural heritage conservation, and quality control. It is now integrated into widely used open-source software, such as Meshroom. == Basic method == Under Woodham's original assumptions — Lambertian reflectance, known point-like distant light sources, and uniform albedo — the problem can be solved by inverting the linear equation I = L ⋅ n {\displaystyle I=L\cdot n} , where I {\displaystyle I} is a (known) vector of m {\displaystyle m} observed intensities, n {\displaystyle n} is the (unknown) surface normal, and L {\displaystyle L} is a (known) 3 × m {\displaystyle 3\times m} matrix of normalized light directions. This model can easily be extended to surfaces with non-uniform albedo, while keeping the problem linear. Taking an albedo reflectivity of k {\displaystyle k} , the formula for the reflected light intensity becomes I = k ( L ⋅ n ) . {\displaystyle I=k(L\cdot n).} If L {\displaystyle L} is square (there are exactly 3 lights) and non-singular, it can be inverted, giving L − 1 I = k n . {\displaystyle L^{-1}I=kn.} Since the normal vector is known to have length 1, k {\displaystyle k} must be the length of the vector k n {\displaystyle kn} , and n {\displaystyle n} is the normalised direction of that vector. If L {\displaystyle L} is not square (there are more than 3 lights), a generalisation of the inverse can be obtained using the Moore–Penrose pseudoinverse, by simply multiplying both sides with L T {\displaystyle L^{T}} , giving L T I = L T k ( L ⋅ n ) , {\displaystyle L^{T}I=L^{T}k(L\cdot n),} ( L T L ) − 1 L T I = k n , {\displaystyle (L^{T}L)^{-1}L^{T}I=kn,} after which the normal vector and albedo can be solved as described above. == Non-Lambertian surfaces == The classical photometric stereo problem concerns itself only with Lambertian surfaces, with perfectly diffuse reflection. This is unrealistic for many types of materials, especially metals, glass and smooth plastics, and will lead to aberrations in the resulting normal vectors. Many methods have been developed to lift this assumption. In this section, a few of these are listed. === Specular reflections === Historically, in computer graphics, the commonly used model to render surfaces started with Lambertian surfaces and progressed first to include simple specular reflections. Computer vision followed a similar course with photometric stereo. Specular reflections were among the first deviations from the Lambertian model. These are a few adaptations that have been developed. Many techniques ultimately rely on modelling the reflectance function of the surface, that is, how much light is reflected in each direction. This reflectance function has to be invertible. The reflected light intensities towards the camera is measured, and the inverse reflectance function is fit onto the measured intensities, resulting in a unique solution for the normal vector. === General BRDFs and beyond === According to the Bidirectional reflectance distribution function (BRDF) model, a surface may distribute the amount of light it receives in any outward direction. This is the most general known model for opaque surfaces. Some techniques have been developed to model (almost) general BRDFs. In practice, all of these require many light sources to obtain reliable data. These are methods in which surfaces with general BRDFs can be measured. Determine the explicit BRDF prior to scanning. To do this, a different surface is required that has the same or a very similar BRDF, of which the actual geometry (or at least the normal vectors for many points on the surface) is already known. The lights are then individually shone upon the known surface, and the amount of reflection into the camera is measured. Using this information, a look-up table can be created that maps reflected intensities for each light source to a list of possible normal vectors. This puts constraints on the possible normal vectors the surface may have, and reduces the photometric stereo problem to an interpolation between measurements. Typical known surfaces to calibrate the look-up table with are spheres for their wide variety of surface orientations. Restricting the BRDF to be symmetrical. If the BRDF is symmetrical, the direction of the light can be restricted to a cone about the direction to the camera. Which cone this is depends on the BRDF itself, the normal vector of the surface, and the measured intensity. Given enough measured intensities and the resulting light directions, these cones can be approximated and therefore the normal vectors of the surface. Some progress has been made towards modelling an even more general surfaces, such as Spatially Varying Bidirectional Distribution Functions (SVBRDF), Bidirectional surface scattering reflectance distribution functions (BSSRDF), and accounting for interreflections. However, such methods are still fairly restrictive in photometric stereo. Better results have been achieved with structured light. == Uncalibrated photometric stereo == Uncalibrated Photometric Stereo is an approach in photometric stereo that aims to reconstruct the 3D shape of an object from images captured under unknown lighting conditions. Unlike classical methods, which often assume controlled or known lighting setups, this approach removes these constraints, making it adaptable to diverse and real-world environments. The advent of deep learning has revolutionized universal PS by replacing handcrafted assumptions with data-driven models. Recent approaches leverage Transformer-based architectures and multi-scale encoder–decoder networks to directly estimate surface normals from input images. Uncalibrated Photometric Stereo is inherently an ill-posed problem, as it attempts to recover 3D shape and lighting conditions simultaneously from images alone. This leads to fundamental ambiguities in the reconstruction process, which manifest as systematic errors in the recovered geometry, including global distortions in the object's overall shape, and misinterpretation of surface orientation, where concave regions may appear convex and vice versa. To address the challenges of uncalibrated photometric stereo, hybrid methods have emerged that combine multi-view stereo and photometric stereo. These approaches leverage the strengths of both techniques, including geometric reliability and resolution.