AI Email Client

AI Email Client — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Aikuma

    Aikuma

    Aikuma is an Android app for collecting speech recordings with time-aligned translations. The app includes a text-free interface for consecutive interpretation, designed for users who are not literate. Aikuma won the Grand Prize in the Open Source Software World Challenge (2013). == Name == Aikuma means "meeting place" in Usarufa, a Papuan language where this software was first used in 2012. == History == Aikuma was developed with sponsorship from the National Science Foundation, including a $101,501 (US) project, "to use mobile telephones to collect larger amounts of data on undocumented endangered languages than would never be possible through usual fieldwork." Aikuma and its modified version (Lig-Aikuma) have been used for collecting substantial quantities of audio in remote indigenous villages. Lig-Aikuma was developed at the Université Grenoble Alpes (LIG laboratory) and implements new features such as elicitation of speech from text, images and videos. == Similar Software == Lingua Libre is an online collaborative project and tool by the Wikimedia France association, which can be used for language preservation. Lingua Libre lets users record words, phrases, or sentences of any language, oral (audio recording) or signed (video recording). It is a highly efficient method to record endangered languages since up to 1000 words can be recorded per hour. All the content is under a free license, and speakers of minority languages are encouraged to record their own dialects.

  • Level set (data structures)

    Level set (data structures)

    In computer science, a level set is a data structure designed to represent discretely sampled dynamic level sets of functions. A common use of this form of data structure is in efficient image rendering. The underlying method constructs a signed distance field that extends from the boundary, and can be used to solve the motion of the boundary in this field. == Chronological developments == The powerful level-set method is due to Osher and Sethian (1988). However, the straightforward implementation via a dense d-dimensional array of values results in both time and storage complexity of $O(n^{d})$, where $n$ is the cross-sectional resolution of the spatial extents of the domain and $d$ is the number of spatial dimensions of the domain. === Narrow band === The narrow band level set method, introduced in 1995 by Adalsteinsson and Sethian, restricted most computations to a thin band of active voxels immediately surrounding the interface, thus reducing the time complexity in three dimensions to $O(n^{2})$ for most operations. Periodic updates of the narrow band structure, to rebuild the list of active voxels, were required, which entailed an $O(n^{3})$ operation in which voxels over the entire volume were accessed. The storage complexity for this narrow band scheme was still $O(n^{3})$. Differential constructions over the narrow band domain edge require careful interpolation and domain alteration schemes to stabilise the solution. === Sparse field === This $O(n^{3})$ time complexity was eliminated in the approximate "sparse field" level set method introduced by Whitaker in 1998. The sparse field level set method employs a set of linked lists to track the active voxels around the interface. This allows incremental extension of the active region as needed without incurring any significant overhead. 
While consistently $O(n^{2})$ efficient in time, $O(n^{3})$ storage space is still required by the sparse field level set method. See for implementation details. === Sparse block grid === The sparse block grid method, introduced by Bridson in 2003, divides the entire bounding volume of size $n^{3}$ into small cubic blocks of $m^{3}$ voxels each. A coarse grid of size $(n/m)^{3}$ then stores pointers only to those blocks that intersect the narrow band of the level set. Block allocation and deallocation occur as the surface propagates, to accommodate the deformations. This method has a suboptimal storage complexity of $O\left((n/m)^{3}+m^{3}n^{2}\right)$, but retains the constant time access inherent to dense grids. === Octree === The octree level set method, introduced by Strain in 1999 and refined by Losasso, Gibou and Fedkiw, and more recently by Min and Gibou, uses a tree of nested cubes of which the leaf nodes contain signed distance values. Octree level sets currently require uniform refinement along the interface (i.e. the narrow band) in order to obtain sufficient precision. This representation is efficient in terms of storage, $O(n^{2})$, and relatively efficient in terms of access queries, $O(\log n)$. An advantage of the level set method on octree data structures is that one can solve the partial differential equations associated with typical free boundary problems that use the level set method. The CASL research group has developed this line of work in computational materials, computational fluid dynamics, electrokinetics, image-guided surgery and controls. 
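The sparse block grid layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than Bridson's implementation: the class name `SparseBlockGrid`, the flat-list storage, and the `background` value returned for unallocated blocks are all assumptions made for the example.

```python
class SparseBlockGrid:
    """Coarse grid of (n/m)^3 pointers to m*m*m blocks, allocated on demand.

    Hypothetical sketch: names and the background value are illustrative.
    """

    def __init__(self, n, m, background=float("inf")):
        if n % m != 0:
            raise ValueError("n must be a multiple of m")
        self.n, self.m = n, m
        self.c = n // m                       # coarse grid resolution per axis
        self.background = background          # value reported outside the band
        self.coarse = [None] * (self.c ** 3)  # None means "block not allocated"

    def _locate(self, i, j, k):
        """Map a voxel index to (coarse block index, offset inside the block)."""
        m, c = self.m, self.c
        block = ((i // m) * c + (j // m)) * c + (k // m)
        cell = ((i % m) * m + (j % m)) * m + (k % m)
        return block, cell

    def get(self, i, j, k):
        """Constant-time read: one coarse lookup plus one block lookup."""
        block, cell = self._locate(i, j, k)
        data = self.coarse[block]
        return self.background if data is None else data[cell]

    def set(self, i, j, k, value):
        """Write a value, allocating the enclosing block the first time."""
        block, cell = self._locate(i, j, k)
        if self.coarse[block] is None:
            self.coarse[block] = [self.background] * (self.m ** 3)
        self.coarse[block][cell] = value

    def allocated_blocks(self):
        return sum(b is not None for b in self.coarse)


grid = SparseBlockGrid(n=16, m=4)
grid.set(5, 5, 5, -0.25)   # only the block containing (5, 5, 5) is allocated
```

Reads and writes stay constant time because both lookups are plain index computations, while memory grows only with the number of blocks the surface actually touches plus the fixed coarse pointer array.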
=== Run-length encoded === The run-length encoding (RLE) level set method, introduced in 2004, applies the RLE scheme to compress regions away from the narrow band to just their sign representation while storing the narrow band with full precision. The sequential traversal of the narrow band is optimal and storage efficiency is further improved over the octree level set. The addition of an acceleration lookup table allows for fast $O(\log r)$ random access, where $r$ is the number of runs per cross section. Additional efficiency is gained by applying the RLE scheme in a dimensional recursive fashion, a technique introduced by Nielsen & Museth's similar DT-Grid. === Hash Table Local Level Set === The Hash Table Local Level Set method, introduced in 2011 by Eyiyurekli and Breen and extended in 2012 by Brun, Guittet, and Gibou, only computes the level set data in a band around the interface, as in the Narrow Band Level-Set Method, but also only stores the data in that same band. A hash table data structure is used, which provides $O(1)$ access to the data. However, Brun et al. conclude that their method, while being easier to implement, performs worse than a quadtree implementation. They find that as it is, [...] a quadtree data structure seems more adapted than the hash table data structure for level-set algorithms. Three main reasons for worse efficiency are listed: to obtain accurate results, a rather large band is required close to the interface, which counterbalances the absence of grid nodes far from the interface; the performances are deteriorated by extrapolation procedures on the outer edges of the local grid; and the width of the band restricts the time step and slows down the method. === Point-based === Corbett in 2005 introduced the point-based level set method. 
Instead of using a uniform sampling of the level set, the continuous level set function is reconstructed from a set of unorganized point samples via moving least squares.
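As a rough illustration of the hash table local level set idea described above, the sketch below stores signed distance values for a sphere only at grid nodes inside a band around the interface, using a Python `dict` as the hash table. The function name `build_band`, the sphere test case, and the brute-force initialisation over all nodes are assumptions chosen for brevity; they are not taken from the cited papers.

```python
import math


def build_band(n, center, radius, half_width):
    """Return {(i, j, k): phi} for nodes with |phi| <= half_width.

    phi is the signed distance to a sphere (negative inside). The full
    n^3 scan is used here only to initialise the example.
    """
    band = {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                phi = math.dist((i, j, k), center) - radius
                if abs(phi) <= half_width:
                    band[(i, j, k)] = phi
    return band


band = build_band(n=16, center=(8, 8, 8), radius=5.0, half_width=1.5)

# Average O(1) lookups; nodes far from the interface are simply absent.
on_surface = band.get((8, 8, 13))   # node lying exactly on the sphere
far_inside = band.get((8, 8, 8))    # sphere centre, outside the band
```

Only the band is stored, so memory scales with the surface area rather than the volume, at the cost of needing an extrapolation rule whenever a stencil reaches past the edge of the band.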

  • Automation

    Automation

    Automation describes a wide range of technologies that reduce human intervention in processes, mainly by predetermining decision criteria, subprocess relationships, and related actions, as well as embodying those predeterminations in machines. Automation has been achieved by various means including mechanical, hydraulic, pneumatic, electrical, electronic devices, and computers, usually in combination. Complicated systems, such as modern factories, airplanes, and ships, typically use combinations of all of these techniques. The benefits of automation include labor savings, reduced waste, savings in electricity costs, savings in material costs, and improvements to quality, accuracy, and precision. Automation includes the use of various equipment and control systems such as machinery, processes in factories, boilers, and heat-treating ovens, switching on telephone networks, steering, stabilization of ships, aircraft and other applications and vehicles with reduced human intervention. Examples range from a household thermostat controlling a boiler to a large industrial control system with tens of thousands of input measurements and output control signals. In the simplest type of an automatic control loop, a controller compares a measured value of a process with a desired set value and processes the resulting error signal to change some input to the process, in such a way that the process stays at its set point despite disturbances. This closed-loop control is an application of negative feedback to a system. The mathematical basis of control theory began in the 18th century and advanced rapidly in the 20th. The term automation, inspired by the earlier word automatic (coming from automaton), was not widely used before 1947, when Ford established an automation department. It was during this time that the industry was rapidly adopting feedback controllers. Technological advancements introduced in the 1930s revolutionized various industries significantly. 
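The closed-loop idea described above (measure, compare with the set point, adjust an input) can be illustrated with a toy household thermostat. This is a minimal sketch under assumed numbers: the heat-loss coefficient, heater strength, switching band and time step are invented for the example and do not describe any real boiler.

```python
def simulate_thermostat(setpoint=20.0, band=0.5, ambient=10.0,
                        steps=500, dt=0.1):
    """On-off control of a room temperature with a small switching band."""
    loss_rate = 0.1     # assumed heat-loss coefficient towards ambient
    heater_gain = 2.0   # assumed heating rate while the heater is on
    temperature = ambient
    heater_on = False
    history = []
    for _ in range(steps):
        # Controller: compare the measured value with the set point.
        if temperature < setpoint - band:
            heater_on = True
        elif temperature > setpoint + band:
            heater_on = False
        # Process: heating input minus losses to the colder surroundings.
        heating = heater_gain if heater_on else 0.0
        temperature += dt * (heating - loss_rate * (temperature - ambient))
        history.append(temperature)
    return history


history = simulate_thermostat()
```

Heat loss to the surroundings acts as the disturbance here; the negative feedback through the switch keeps the temperature cycling in a narrow range around the set point instead of drifting back to ambient.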
The World Bank's World Development Report of 2019 shows evidence that the new industries and jobs in the technology sector outweigh the economic effects of workers being displaced by automation. Job losses and downward mobility blamed on automation have been cited as one of many factors in the resurgence of nationalist, protectionist and populist politics in the US, UK and France, among other countries since the 2010s. == History == === Early history === It was a preoccupation of the Greeks and Arabs (in the period between about 300 BC and about 1200 AD) to keep an accurate track of time. In Ptolemaic Egypt, about 270 BC, Ctesibius described a float regulator for a water clock, a device not unlike the ball and cock in a modern flush toilet. This was the earliest feedback-controlled mechanism. The appearance of the mechanical clock in the 14th century made the water clock and its feedback control system obsolete. The Persian Banū Mūsā brothers, in their Book of Ingenious Devices (850 AD), described a number of automatic controls. Two-step level controls for fluids, a form of discontinuous variable structure controls, were developed by the Banu Musa brothers. They also described a feedback controller. The design of feedback control systems up through the Industrial Revolution was by trial-and-error, together with a great deal of engineering intuition. It was not until the mid-19th century that the stability of feedback control systems was analyzed using mathematics, the formal language of automatic control theory. The centrifugal governor was invented by Christiaan Huygens in the seventeenth century, and used to adjust the gap between millstones. 
=== Industrial Revolution in Western Europe === The introduction of prime movers, or self-driven machines, advanced grain mills, furnaces, boilers, and the steam engine, and created a new requirement for automatic control systems including temperature regulators (invented in 1624; see Cornelius Drebbel), pressure regulators (1681), float regulators (1700) and speed control devices. Another control mechanism was used to tent the sails of windmills. It was patented by Edmund Lee in 1745. Also in 1745, Jacques de Vaucanson invented the first automated loom. Around 1800, Joseph Marie Jacquard created a punch-card system to program looms. In 1771 Richard Arkwright invented the first fully automated spinning mill driven by water power, known at the time as the water frame. An automatic flour mill was developed by Oliver Evans in 1785, making it the first completely automated industrial process. A centrifugal governor was used by Mr. Bunce of England in 1784 as part of a model steam crane. The centrifugal governor was adopted by James Watt for use on a steam engine in 1788 after Watt's partner Boulton saw one at a flour mill Boulton & Watt were building. The governor could not actually hold a set speed; the engine would assume a new constant speed in response to load changes. The governor was able to handle smaller variations such as those caused by fluctuating heat load to the boiler. Also, there was a tendency for oscillation whenever there was a speed change. As a consequence, engines equipped with this governor were not suitable for operations requiring constant speed, such as cotton spinning. Several improvements to the governor, plus improvements to valve cut-off timing on the steam engine, made the engine suitable for most industrial uses before the end of the 19th century. Advances in the steam engine stayed well ahead of science, both thermodynamics and control theory. 
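The remark that the governor could not hold a set speed, and that the engine settled at a new constant speed after each load change, matches the behaviour of a purely proportional regulator. The sketch below is an idealised numerical illustration, not a model of Watt's mechanism: the function name `settled_speed` and all coefficients are assumptions.

```python
def settled_speed(load, setpoint=100.0, gain=2.0, friction=0.5,
                  dt=0.01, steps=2000):
    """Simulate a proportional speed regulator and return the final speed.

    drive    = gain * (setpoint - speed)   (assumed proportional action)
    speed'   = drive - load - friction * speed
    """
    speed = 0.0
    for _ in range(steps):
        drive = gain * (setpoint - speed)
        speed += dt * (drive - load - friction * speed)
    return speed


no_load = settled_speed(load=0.0)     # settles near 80, below the set point
with_load = settled_speed(load=50.0)  # settles near 60, a new constant speed
```

Setting the rate of change to zero gives the steady speed `(gain * setpoint - load) / (gain + friction)`, which depends on the load. That dependence is the offset a proportional-only controller cannot remove.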
The governor received relatively little scientific attention until James Clerk Maxwell published a paper that established the beginning of a theoretical basis for understanding control theory. === 20th century === Relay logic was introduced with factory electrification, which underwent rapid adaptation from 1900 through the 1920s. Central electric power stations were also undergoing rapid growth and the operation of new high-pressure boilers, steam turbines and electrical substations created a great demand for instruments and controls. Central control rooms became common in the 1920s, but as late as the early 1930s, most process controls were on-off. Operators typically monitored charts drawn by recorders that plotted data from instruments. To make corrections, operators manually opened or closed valves or turned switches on or off. Control rooms also used color-coded lights to send signals to workers in the plant to manually make certain changes. The development of the electronic amplifier during the 1920s, which was important for long-distance telephony, required a higher signal-to-noise ratio, which was solved by negative feedback noise cancellation. This and other telephony applications contributed to the control theory. In the 1940s and 1950s, German mathematician Irmgard Flügge-Lotz developed the theory of discontinuous automatic controls, which found military applications during the Second World War to fire control systems and aircraft navigation systems. Controllers, which were able to make calculated changes in response to deviations from a set point rather than on-off control, began being introduced in the 1930s. Controllers allowed manufacturing to continue showing productivity gains to offset the declining influence of factory electrification. Factory productivity was greatly increased by electrification in the 1920s. U.S. manufacturing productivity growth fell from 5.2%/yr 1919–29 to 2.76%/yr 1929–41. 
Alexander Field notes that spending on non-medical instruments increased significantly from 1929 to 1933 and remained strong thereafter. The First and Second World Wars saw major advancements in the field of mass communication and signal processing. Other key advances in automatic controls include differential equations, stability theory and system theory (1938), frequency domain analysis (1940), ship control (1950), and stochastic analysis (1941). Starting in 1958, various systems based on solid-state digital logic modules for hard-wired programmed logic controllers (the predecessors of programmable logic controllers [PLC]) emerged to replace electro-mechanical relay logic in industrial control systems for process control and automation, including early Telefunken/AEG Logistat, Siemens Simatic, Philips/Mullard/Valvo Norbit, BBC Sigmatronic, ACEC Logacec, Akkord Estacord, Krone Mibakron, Bistat, Datapac, Norlog, SSR, or Procontic systems. In 1959 Texaco's Port Arthur Refinery became the first chemical plant to use digital control. Conversion of factories to digital control began to spread rapidly in the 1970s as the price of computer hardware fell. === Significant applications === The automatic telephone switchboard was introduced in 1892 along with dial telephones. By 1929, 31.9% of the Bell system was automatic. Automatic telephone switching originally used vacuum tube amplifiers and electro-mechanical switches, which consumed a large amount of electricity. Call volume eve

  • Canva

    Canva

    Canva Pty Ltd. is an Australian multinational proprietary software company launched in 2013 based in Sydney, Australia. The platform provides a graphic design platform to create visual content for presentations, websites, and other digital products. Its uses include templates for presentations, posters, and social media content, as well as photo and video editing functionality. The platform uses a drag-and-drop interface designed for users without professional design training or experience. Canva operates on a freemium model and has added features such as print services and video editing tools over time. == History == === 2013–2020 === Canva was founded in Perth, Australia, by Melanie Perkins, Cliff Obrecht and Cameron Adams on 1 January 2013. One of the company's early investors was Susan Wu, an American entrepreneur. In its first year, Canva had more than 750,000 users. In 2017, the company reached profitability and had 294,000 paying customers. In January 2018, Perkins announced that the company had raised A$40 million from Sequoia Capital, Blackbird Ventures, and Felicis Ventures, and the company was valued at A$1 billion. It raised A$70 million in May 2019, followed by A$85 million in October 2019 and the launch of Canva for Enterprise. In December 2019, Canva announced Canva for Education, a free product for schools and other educational institutions intended to facilitate collaboration between students and teachers. === 2021–2025 === In June 2020, Canva announced a partnership with FedEx Office and with Office Depot the following month. As of June 2020, Canva's valuation had risen to A$6 billion, rising to A$40 billion by September 2021. In September 2021, Canva raised US$200 million, with its value peaking that year at US$40 billion. By September 2022, the valuation of the company had leveled at US$26 billion. 
While Canva's value declined from its 2021 peak by mid-2022, it remained one of Australia's most prominent technology companies, alongside Atlassian. In March 2022, Canva had over 75 million monthly active users. In 2023, the pair were named in the Australian Financial Review's AFR Rich List as among the 10 most wealthy people in Australia. On 7 December 2022, Canva launched Magic Write, which is the platform's AI-powered copywriting assistant. On 22 March 2023, Canva announced its new Assistant tool, which makes recommendations on graphics and styles that match the user's existing design. On 11 January 2024, Canva launched its own GPT in OpenAI's GPT Store. The company has announced it intends to compete with Google and Microsoft in the office software category with website and whiteboard products. In May 2024, the company announced the launch of Canva Enterprise, a plan designed for large organisations, alongside new tools including Work Kits, Courses and AI capabilities. In 2024, it announced a co-funded solar energy project to enhance its sustainability efforts. On 10 April 2025, Canva released Visual Suite 2. The new interface combines Canva's design and productivity tools. New features include a spreadsheets application (Canva Sheets), a generative AI coding assistant (Canva Code), a chatbot, and an updated photo editor that can modify or remove background objects. In August 2025, Canva launched a stock sale to employees, valuing the company at US$42 billion. == Acquisitions == In 2018, the company acquired presentations startup Zeetings for an undisclosed amount, as part of its expansion into the presentations space. In May 2019, the company announced the acquisitions of Pixabay and Pexels, two free stock photography sites based in Germany, which enabled Canva users to access their photos for designs. In February 2021, Canva acquired Austrian startup Kaleido.ai and the Czech-based Smartmockups. 
In 2022, Canva acquired Flourish, a London-based data visualization startup. In March 2024, Canva acquired UK-based Serif, the developers of the Affinity suite of graphic design software, for approximately $380 million. In August 2024, Canva acquired the AI image generation platform and startup, Leonardo AI, for an undisclosed amount. In June 2025, it was announced that Canva had acquired Australian AI marketing startup MagicBrief for an undisclosed amount. In February 2026, Canva acquired two startups: Cavalry, which specializes in animation software, and MangoAI, which focuses on improving advertising performance. In April 2026, Canva acquired Simtheory, an AI Workflow Tool, and Ortto, a marketing automation tool. == Philanthropy == Canva's co-founders, Melanie Perkins and Cliff Obrecht, have publicly stated their intention to donate a significant portion of their personal wealth to charity. In 2021, Canva started a partnership with GiveDirectly, a nonprofit organization operating in low income areas that makes unconditional cash transfers to families living in extreme poverty. Since then, the company has donated $50 million to support GiveDirectly's work across Malawi. In 2025, Canva announced an additional $100 million commitment to expand its GiveDirectly partnership. == Controversies == === Data breach === In May 2019, Canva experienced a data breach in which the data of roughly 139 million users was exposed. The exposed data included real names of users, usernames, email addresses, geographical information, and password hashes for some users. In January 2020, approximately 4 million user passwords were decrypted and shared online. Canva responded by resetting the passwords of every user who had not changed their password since the initial breach. === Russian operations === In May 2022 Canva was criticized for continuing to provide free access to its services in Russia, even after suspending payment processing in the country. 
Activists from the Ukrainian diaspora in Australia and others said this could be viewed as indirectly supporting Russia’s war effort. They noted the company was the only one of several major Australian firms to receive the lowest “digging in” rating on a tracker run by the Yale School of Management for failing to pull out of Russia. Canva responded that it had suspended financial transactions in Russia from March 2022 and maintained the free version to allow the continued creation and sharing of “pro-peace and anti-war” content for its 1.4 million Russian users.

  • Artbreeder

    Artbreeder

    Artbreeder, formerly known as Ganbreeder, is a collaborative, machine learning-based art website. Using the models StyleGAN and BigGAN, the website allows users to generate and modify images of faces, landscapes, and paintings, among other categories. == Overview == On Artbreeder, users mainly interact through the remixing (referred to as 'breeding') of other users' images found in the publicly accessible database of images. The creation of new variations can be done by tweaking sliders on an image's page, known as "genes", which in the "Portraits" model can range from color balance to gender, facial hair, and glasses. Additionally, any image can be "crossbred" with other publicly viewable images from the database, using a slider to control how much of each image should influence the resulting "child". The site also allows for uploading new images, which the model will attempt to convert into the latent space of the network. == Notable usages == The similarly AI-driven text adventure game AI Dungeon uses Artbreeder to generate profile pictures for its users, and The Static Age's Andrew Paley has used Artbreeder to create the visuals for his music videos. Artbreeder has been used to create portraits of characters from popular novels such as Harry Potter and Twilight. It has also been used to add realistic features to ancient portraits. Artbreeder was used to create characters in the sequel to Ben Drowned, with the titular villain, an AI construct itself, created entirely using the website. == Changes to Artbreeder == Artbreeder underwent an overhaul, introducing several features to enhance the user experience. Among these updates is the integration of SD-XL, developed by stability.ai. Artbreeder also added a functionality known as ControlNet, which enables users to create images based on specific poses. With ControlNet, users can incorporate various poses into their AI artworks. 
Other features introduced to Artbreeder include Pattern, which creates AI pattern images, and Outpainting (or Uncropping), which allows the user to expand an image beyond its original dimensions. == Reception == The artwork generated by users of the website has been described as "beautiful" and "surreal," drawing comparisons to "weird, incomprehensible dreams" that "somehow touch the deep, unconscious parts of [the] mind". However, the generated faces were noted as "creepy and 'off'", and still nowhere near the quality attained by actual digital artists. Additionally, the site faced criticism for perceived confusing aspects of the AI's behavior. Jonathan Bartlett of Mind Matters News noted that "As is always the case with AI, sometimes the [gene] knobs don't work as expected and sometimes the results are... strange," while conceding that Artbreeder was still "probably the start of a new future of made-to-order stock images." Writers from Hyperallergic also took issue with perceived racial biases in the Portraits model, citing a comment from a user who faced difficulty from the neural network while attempting to darken the skin of a portrait to match a source image.

  • ACLU Mobile Justice

    ACLU Mobile Justice

    ACLU Mobile Justice was a video live streaming application developed for smartphones by various state chapters of the American Civil Liberties Union. It was intended to allow instant, secure video recording and transmission of interactions with, and perceived abuses by, law enforcement officers. Since its release by the ACLU of California for California residents, other versions of the app have been released for 16 other states and the District of Columbia by their ACLU chapters. It was discontinued in February 2025.

  • Optical sorting

    Optical sorting

    Optical sorting (sometimes called digital sorting) is the automated process of sorting solid products using cameras and/or lasers. Depending on the types of sensors used and the software-driven intelligence of the image processing system, optical sorters can recognize an object's color, size, shape, structural properties and chemical composition. The sorter compares objects to user-defined accept/reject criteria to identify and remove defective products and foreign material (FM) from the production line, or to separate product of different grades or types of materials. Optical sorters are in widespread use in the food industry worldwide, with the highest adoption in processing harvested foods such as potatoes, fruits, vegetables and nuts where it achieves non-destructive, 100 percent inspection in-line at full production volumes. The technology is also used in pharmaceutical manufacturing and nutraceutical manufacturing, tobacco processing, waste recycling and other industries. Compared to manual sorting, which is subjective and inconsistent, optical sorting helps improve product quality, maximize throughput and increase yields while reducing labor costs. == History == Optical sorting is an idea that first came out of the desire to automate industrial sorting of agricultural goods like fruits and vegetables. Before automated optical sorting technology was conceived in the 1930s, companies like Unitec were producing wooden machinery to assist in the mechanical sorting of fruit processing. In 1931, a company known as “the Electric Sorting Company” was incorporated and began the creation of the world’s first color sorters, which were being installed and used in Michigan’s bean industry by 1932. In 1937, optical sorting technology had advanced to allow for systems based on a two-color principle of selection. 
The next few decades saw the installation of new and improved sorting mechanisms, like gravity feed systems, and the implementation of optical sorting in more agricultural industries. In the late 1960s, optical sorting began to be implemented in new industries beyond agriculture, such as the sorting of ferrous and non-ferrous metals. By the 1990s, optical sorting was being used heavily in the sorting of solid wastes. With the large technological revolution happening in the late 1990s and early 2000s, optical sorters were being made more efficient via the implementation of new optical sensors, like CCD, UV, and IR cameras. Today, optical sorting is used in a wide variety of industries and, as such, is implemented with a varying selection of mechanisms to assist in that specific sorter’s task. == The sorting system == In general, optical sorters feature four major components: the feed system, the optical system, image processing software, and the separation system. The objective of the feed system is to spread products into a uniform monolayer so products are presented to the optical system evenly, without clumps, at a constant velocity. The optical system includes lights and sensors housed above and/or below the flow of the objects being inspected. The image processing system compares objects to user-defined accept/reject thresholds to classify objects and actuate the separation system. The separation system (usually compressed air for small products and mechanical devices for larger products, like whole potatoes) pinpoints objects while in-air and deflects the objects to be removed into a reject chute while the good product continues along its normal trajectory. The ideal sorter to use depends on the application. Therefore, the product's characteristics and the user's objectives determine the ideal sensors, software-driven capabilities and mechanical platform. 
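The comparison of objects against user-defined accept/reject thresholds can be sketched as a small decision function. This is a hypothetical illustration: the feature names (`defect_area_px`, `mean_hue`), the `Thresholds` fields and the example values are assumptions, not the interface of any commercial sorter.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Features the image processing stage might report for one object."""
    object_id: int
    area_px: int          # total object area in pixels
    defect_area_px: int   # pixels flagged as defective
    mean_hue: float       # assumed colour feature, in degrees


@dataclass
class Thresholds:
    """User-defined accept/reject criteria (illustrative)."""
    max_defect_fraction: float
    hue_range: tuple      # (low, high) of acceptable hue


def classify(d: Detection, t: Thresholds) -> str:
    """Return "accept" or "reject"; a reject would actuate the ejector."""
    defect_fraction = d.defect_area_px / d.area_px if d.area_px else 1.0
    hue_ok = t.hue_range[0] <= d.mean_hue <= t.hue_range[1]
    if hue_ok and defect_fraction <= t.max_defect_fraction:
        return "accept"
    return "reject"


limits = Thresholds(max_defect_fraction=0.05, hue_range=(80.0, 140.0))
good = classify(Detection(1, 1000, 20, 95.0), limits)
blemished = classify(Detection(2, 1000, 80, 95.0), limits)
wrong_color = classify(Detection(3, 1000, 0, 30.0), limits)
```

In a real machine the "reject" decision would also carry the object's position and timing so that the separation system can fire at the right moment; that part is omitted here.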
== Sensors == Optical sorters require a combination of lights and sensors to illuminate and capture images of the objects so the images can be processed. The processed images will determine if the material should be accepted or rejected. There are camera sorters, laser sorters and sorters that feature a combination of the two on one platform. Lights, cameras, lasers and laser sensors can be designed to function within visible light wavelengths as well as the infrared (IR) and ultraviolet (UV) spectrums. The optimal wavelengths for each application maximize the contrast between the objects to be separated. Cameras and laser sensors can differ in spatial resolution, with higher resolutions enabling the sorter to detect and remove smaller defects. === Cameras === Monochromatic cameras detect shades of gray from black to white and can be effective when sorting products with high-contrast defects. Sophisticated color cameras with high color resolution are capable of detecting millions of colors to better distinguish more subtle color defects. Trichromatic color cameras (also called three-channel cameras) divide light into three bands, which can include red, green and/or blue within the visible spectrum as well as IR and UV. The interaction of different materials with parts of the electromagnetic spectrum make these contrasts more evident than how they appear to the naked human eye. Coupled with intelligent software, sorters that feature cameras are capable of recognizing each object's color, size and shape; as well as the color, size, shape and location of a defect on a product. Some intelligent sorters even allow the user to define a defective product based on the total defective surface area of any given object. === Lasers === While cameras capture product information based primarily on material reflectance, lasers and their sensors are able to distinguish a material's structural properties along with their color. 
This structural property inspection allows lasers to detect a wide range of organic and inorganic foreign material such as insects, glass, metal, sticks, rocks and plastic, even if they are the same color as the good product. Lasers can be designed to operate within specific wavelengths of light, whether on the visible spectrum or beyond. For example, lasers can detect chlorophyll by stimulating fluorescence using specific wavelengths, a process that is very effective for removing foreign material from green vegetables. === Camera/laser combinations === Sorters equipped with cameras and lasers on one platform are generally capable of identifying the widest variety of attributes. Cameras are often better at recognizing color, size and shape, while laser sensors identify differences in structural properties to maximize foreign material detection and removal. === Hyperspectral Imaging === Driven by the need to solve previously impossible sorting challenges, a new generation of optical sorters features multispectral and hyperspectral imaging. Like trichromatic cameras, multispectral and hyperspectral cameras collect data from the electromagnetic spectrum. Unlike trichromatic cameras, which divide light into three bands, hyperspectral systems can divide light into hundreds of narrow bands over a continuous range that covers a vast portion of the electromagnetic spectrum. This opens the door for more detailed analysis that leads to a more consistent product. Using IR alone might detect some defects, but combining it with a broader range of the spectrum makes it more effective. Compared to the three data points per pixel collected by trichromatic cameras, hyperspectral cameras can collect hundreds of data points per pixel, which are combined to create a unique spectral signature (also called a fingerprint) for each object. 
When complemented by capable software intelligence, a hyperspectral sorter processes those fingerprints to enable sorting on the chemical composition of the product. This is an emerging area of chemometrics. == Software-driven intelligence == Once the sensors capture the object's response to the energy source, image processing is used to manipulate the raw data. The image processing extracts and categorizes information about specific features. The user then defines accept/reject thresholds that are used to determine what is good and bad in the raw data flow. The art and science of image processing lies in developing algorithms that maximize the effectiveness of the sorter while presenting a simple user-interface to the operator. Object-based recognition is a classic example of software-driven intelligence. It allows the user to define a defective product based on where a defect lies on the product and/or the total defective surface area of an object. It offers more control in defining a wider range of defective products. When used to control the sorter's ejection system, it can improve the accuracy of ejecting defective products. This improves product quality and increases yields. New software-driven capabilities are constantly being developed to address the specific needs of various applications. As computing hardware becomes more powerful, new software-driven advancements become possible. Some of these advancements enhance the effectivene

    Read more →
  • Alias Eclipse

    Alias Eclipse

    Eclipse was a professional 2D image editing program available on Silicon Graphics and Windows workstations. Designed to manipulate high-resolution images like digitized movie frames and photographs for print, it offered color correction tools, image processing effects, rudimentary paint features, and spline-based drawing and masking. == History == Eclipse was originally developed in the late 1980s by Full Color Computing, an early provider of photo retouch and color prepress software for Silicon Graphics workstations. Alias Research (later Alias Systems Corporation), a developer of professional 3D graphics applications for the SGI platform, purchased the rights to Eclipse in fall 1990. Alias developed Eclipse through the early to mid-1990s, releasing version 2.5 in 1995 with improvements to the speed of color correction, effects, and rendering. Xyvision's Contex Prepress division purchased exclusive rights to Eclipse from Alias in 1996, and released version 3.0 the following year. Eclipse was subsequently sold to German developer Form & Vision GmbH, which continued development and ported it to the Windows platform. In 1999, Form & Vision released a demo of Eclipse 3.1.3 on the SGI platform which was limited to 1600 x 1600 pixel images, then ceased development of Eclipse on the SGI platform. Eclipse was thereafter developed exclusively for the Windows platform, culminating with version 3.1.4 in 2001. In the same year the firm went bankrupt. == Features == Eclipse was designed to work with very large images that could not be manipulated in real time on contemporary computer systems due to memory limitations, and thus allowed the user to make modifications to a lower-resolution copy of the original image in "proxy mode." Brush strokes, color corrections, and other edits were saved in proxy mode, then applied to the full-size image in post processing. 
This method also allowed for batch processing of a high-resolution image sequence using the edits applied to the original proxy image. Other features included color correction and separation, warping, special effects, text, and shape masking. Wavelet image compression created by LuraTech was added to Eclipse 3.1.4

    Read more →
  • Hyperparameter optimization

    Hyperparameter optimization

    In machine learning, hyperparameter optimization or tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. A hyperparameter is a parameter whose value is used to control the learning process, which must be configured before the process starts. Hyperparameter optimization determines the set of hyperparameters that yields an optimal model which minimizes a predefined loss function on a given data set. The objective function takes a set of hyperparameters and returns the associated loss. Cross-validation is often used to estimate this generalization performance, and therefore choose the set of values for hyperparameters that maximize it. == Approaches == === Grid search === The traditional method for hyperparameter optimization has been grid search, or a parameter sweep, which is simply an exhaustive searching through a manually specified subset of the hyperparameter space of a learning algorithm. A grid search algorithm must be guided by some performance metric, typically measured by cross-validation on the training set or evaluation on a hold-out validation set. Since the parameter space of a machine learner may include real-valued or unbounded value spaces for certain parameters, manually set bounds and discretization may be necessary before applying grid search. For example, a typical soft-margin SVM classifier equipped with an RBF kernel has at least two hyperparameters that need to be tuned for good performance on unseen data: a regularization constant C and a kernel hyperparameter γ. 
Both parameters are continuous, so to perform grid search, one selects a finite set of "reasonable" values for each, say C ∈ {10, 100, 1000} and γ ∈ {0.1, 0.2, 0.5, 1.0}. Grid search then trains an SVM with each pair (C, γ) in the Cartesian product of these two sets and evaluates their performance on a held-out validation set (or by internal cross-validation on the training set, in which case multiple SVMs are trained per pair). Finally, the grid search algorithm outputs the settings that achieved the highest score in the validation procedure. Grid search suffers from the curse of dimensionality, but is often embarrassingly parallel because the hyperparameter settings it evaluates are typically independent of each other. === Random search === Random search replaces the exhaustive enumeration of all combinations by selecting them randomly. This can be simply applied to the discrete setting described above, but also generalizes to continuous and mixed spaces. A benefit over grid search is that random search can explore many more values than grid search could for continuous hyperparameters. It can outperform grid search, especially when only a small number of hyperparameters affects the final performance of the machine learning algorithm. In this case, the optimization problem is said to have a low intrinsic dimensionality. Random search is also embarrassingly parallel, and additionally allows the inclusion of prior knowledge by specifying the distribution from which to sample. Despite its simplicity, random search remains one of the important baselines against which to compare the performance of new hyperparameter optimization methods. === Bayesian optimization === Bayesian optimization is a global optimization method for noisy black-box functions. 
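As a brief aside on the grid search and random search procedures described above, the following Python sketch shows both on the example sets for C and γ. It is a minimal illustration using only the standard library: `validation_score` is a hypothetical toy stand-in for training an SVM and scoring it on held-out data, and the sampling ranges in `random_search` (log-uniform for C, uniform for γ) are assumptions, not part of the original description.

```python
import itertools
import random

C_VALUES = [10, 100, 1000]
GAMMA_VALUES = [0.1, 0.2, 0.5, 1.0]


def validation_score(C, gamma):
    """Toy stand-in for 'train an SVM, score it on a validation set'."""
    return -((C - 100) / 1000) ** 2 - (gamma - 0.5) ** 2


def grid_search(score_fn, c_values, gamma_values):
    """Evaluate every (C, gamma) pair in the Cartesian product; keep the best."""
    pairs = itertools.product(c_values, gamma_values)
    return max(pairs, key=lambda pair: score_fn(*pair))


def random_search(score_fn, n_trials, seed=0):
    """Sample C log-uniformly in [10, 1000] and gamma uniformly in [0.1, 1.0]."""
    rng = random.Random(seed)
    candidates = [
        (10 ** rng.uniform(1, 3), rng.uniform(0.1, 1.0))
        for _ in range(n_trials)
    ]
    return max(candidates, key=lambda pair: score_fn(*pair))
```

Each call to `score_fn` is independent of the others, which is why both procedures parallelize easily; changing the distributions in `random_search` is how prior knowledge about plausible values would be expressed.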
Applied to hyperparameter optimization, Bayesian optimization builds a probabilistic model of the function mapping from hyperparameter values to the objective evaluated on a validation set. By iteratively evaluating a promising hyperparameter configuration based on the current model, and then updating it, Bayesian optimization aims to gather observations revealing as much information as possible about this function and, in particular, the location of the optimum. It tries to balance exploration (hyperparameters for which the outcome is most uncertain) and exploitation (hyperparameters expected close to the optimum). In practice, Bayesian optimization has been shown to obtain better results in fewer evaluations compared to grid search and random search, due to the ability to reason about the quality of experiments before they are run. === Gradient-based optimization === For specific learning algorithms, it is possible to compute the gradient with respect to hyperparameters and then optimize the hyperparameters using gradient descent. The first usage of these techniques was focused on neural networks. Since then, these methods have been extended to other models such as support vector machines or logistic regression. A different approach in order to obtain a gradient with respect to hyperparameters consists in differentiating the steps of an iterative optimization algorithm using automatic differentiation. A more recent work along this direction uses the implicit function theorem to calculate hypergradients and proposes a stable approximation of the inverse Hessian. The method scales to millions of hyperparameters and requires constant memory. In a different approach, a hypernetwork is trained to approximate the best response function. One of the advantages of this method is that it can handle discrete hyperparameters as well. Self-tuning networks offer a memory efficient version of this approach by choosing a compact representation for the hypernetwork. 
More recently, Δ-STN has improved this method further by a slight reparameterization of the hypernetwork which speeds up training. Δ-STN also yields a better approximation of the best-response Jacobian by linearizing the network in the weights, hence removing unnecessary nonlinear effects of large changes in the weights. Apart from hypernetwork approaches, gradient-based methods can be used to optimize discrete hyperparameters also by adopting a continuous relaxation of the parameters. Such methods have been extensively used for the optimization of architecture hyperparameters in neural architecture search. === Evolutionary optimization === Evolutionary optimization is a methodology for the global optimization of noisy black-box functions. In hyperparameter optimization, evolutionary optimization uses evolutionary algorithms to search the space of hyperparameters for a given algorithm. Evolutionary hyperparameter optimization follows a process inspired by the biological concept of evolution: (1) Create an initial population of random solutions (i.e., randomly generate tuples of hyperparameters, typically 100+). (2) Evaluate the hyperparameter tuples and acquire their fitness function (e.g., 10-fold cross-validation accuracy of the machine learning algorithm with those hyperparameters). (3) Rank the hyperparameter tuples by their relative fitness. (4) Replace the worst-performing hyperparameter tuples with new ones generated via crossover and mutation. (5) Repeat steps 2-4 until satisfactory algorithm performance is reached or is no longer improving. Evolutionary optimization has been used in hyperparameter optimization for statistical machine learning algorithms, automated machine learning, typical neural network and deep neural network architecture search, as well as training of the weights in deep neural networks. === Population-based === Population Based Training (PBT) learns both hyperparameter values and network weights. 
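For reference, the evolutionary procedure listed in the previous subsection can be sketched in Python as follows. This is a minimal illustration under stated assumptions: hyperparameter tuples are two floats in [0, 1], `toy_fitness` is a hypothetical stand-in for a cross-validation score, the worst half of the population is replaced each generation, and a fixed number of generations replaces the "until no longer improving" stopping rule.

```python
import random


def sample(rng):
    """Step 1 helper: draw one random hyperparameter tuple."""
    return (rng.random(), rng.random())


def crossover(a, b, rng):
    """Take each hyperparameter from one of the two parents at random."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def mutate(h, rng):
    """Add small Gaussian noise and clip back into [0, 1]."""
    return tuple(min(1.0, max(0.0, x + rng.gauss(0, 0.05))) for x in h)


def toy_fitness(h):
    """Hypothetical fitness; in practice, e.g. cross-validation accuracy."""
    return -((h[0] - 0.3) ** 2 + (h[1] - 0.7) ** 2)


def evolve(fitness, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [sample(rng) for _ in range(pop_size)]          # step 1
    for _ in range(generations):                                 # step 5
        ranked = sorted(population, key=fitness, reverse=True)   # steps 2-3
        survivors = ranked[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:         # step 4
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children
    return max(population, key=fitness)
```

Because the best tuples always survive into the next generation, the best fitness in the population never decreases from one generation to the next.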
Multiple learning processes operate independently, using different hyperparameters. As with evolutionary methods, poorly performing models are iteratively replaced with models that adopt modified hyperparameter values and weights based on the better performers. This warm starting of replacement models is the primary differentiator between PBT and other evolutionary methods. PBT thus allows the hyperparameters to evolve and eliminates the need for manual hypertuning. The process makes no assumptions regarding model architecture, loss functions or training procedures. PBT and its variants are adaptive methods: they update hyperparameters during the training of the models. In contrast, non-adaptive methods follow the sub-optimal strategy of assigning a constant set of hyperparameters for the whole training run. === Early stopping-based === A class of early stopping-based hyperparameter optimization algorithms is purpose-built for large search spaces of continuous and discrete hyperparameters, particularly when the computational cost to evaluate the performance of a set of hyperparameters is high. Irace implements the iterated racing algorithm, which focuses the search around the most promising configurations, using statistical tests to discard the ones that perform poorly. Another early stopping hyperparameter optimization algorithm is successive halving (SHA), which begins as a random search but periodically prunes low-performing models, thereby focusing computational resources on more promising models. Asynchronous successive halving (ASHA) further improves upon SHA's resource utilization profile by removing the need to synchronously evaluate a

    Read more →
  • Verbal overshadowing

    Verbal overshadowing

    Verbal overshadowing is a phenomenon where giving a verbal description of sensory input impairs formation of memories of that input. This was first reported by Schooler and Engstler-Schooler (1990) where it was shown that the effects can be observed across multiple domains of cognition which are known to rely on non-verbal knowledge and perceptual expertise. One example of this is memory, which has been known to be influenced by language. Seminal work by Carmichael and collaborators (1932) demonstrated that when verbal labels are connected to non-verbal forms during an individual's encoding process, it could potentially bias the way those forms are reproduced. Because of this, memory performance relying on reportable aspects of memory that encode visual forms should be vulnerable to the effects of verbalization. == Initial findings == Schooler and Engstler-Schooler (1990) were the first to report findings of verbal overshadowing. In their study, participants watched a video of a simulated robbery and were instructed to either verbally describe the robber or engage in a control task. Those who engaged in giving a verbal description were less likely to correctly identify the robber from a test lineup, compared to those who engaged in the control task. A larger effect was detected when the verbal description was provided 20, rather than 5, minutes after the video, and immediately before the test lineup. A meta-analysis by Meissner and Brigham (2001) supported the effects of verbal overshadowing, showing a small but reliably negative effect. == General effects of verbal overshadowing == The effects of verbal overshadowing have been generalized across multiple domains of cognition that are known to rely on non-verbal knowledge and perceptual expertise, such as memory. Memory has been known to be influenced by language. 
Seminal work by Carmichael and collaborators (1932) demonstrated that labels attached to, or associated with, non-verbal forms during memory encoding can affect the way the forms were subsequently reproduced. Because of this, memory performance that relies on reportable aspects of memory that encode visual forms should be vulnerable to the effects of verbalization. Pelizzon, Brandimonte, and Luccio (2002) found that visual memory representations appear to incorporate visual, spatial, and temporal characteristics. It is explained as follows: With the temporal code (where the only information available is the sequence of the stimuli), performance levels remain high, unless participants are required to retrieve the stimuli in a different order from that used at encoding (visual cue). In this case, performance is significantly impaired, even in the presence of a visual cue. The study showed that order information acts as a link between the two separate representations of figure and background, hence preventing verbal overshadowing at encoding (temporal component) or attenuating its influence at retrieval (spatial component).(p. 960) Hatano, Ueno, Kitagami, and Kawaguchi found that verbal overshadowing is likely to occur when participants verbally described targets in detail. Detailed verbal descriptions resulted in more frequently inaccurate descriptions that in turn created inaccurate representations in the memories of participants. Inaccuracies are also likely to occur when face recognition comes immediately after verbalization. 
Other forms of non-verbal knowledge affected by verbal overshadowing include the following: [Verbal overshadowing] has also been observed when participants attempt to generate descriptions of other 'difficult-to-describe' stimuli such as colors (Schooler and Engstler-Schooler, 1990) or abstract figures (Brandimonte et al., 1997), or other non-visual tasks such as wine tasting (Melcher and Schooler, 1996), decision making (Wilson and Schooler, 1991), and insight problem-solving. (p. 871) (Schooler et al., 1993) Verbalization of stimuli leads to the disruption of non-reportable processes that are necessary for achieving insight solutions, which are distinct from language processes. Schooler, Ohlsson, and Brooks (1993) found that face recognition requires information that cannot be adequately verbalized, giving rise to difficulty in describing factors in recognition judgments. Subjects were less effective in solving insight problems when compelled to put their thoughts in words, which suggests that language may interfere with thought. The verbal overshadowing effect was not seen when participants engaged in articulatory suppression. Performance was reduced in both the verbal and non-verbal description conditions. This is evidence that verbal encoding plays a role in face recognition. By testing with distracting faces presented between study and test, Lloyd-Jones and Brown (2008) suggested a dual-process approach to recognition memory took place, that verbalization influenced familiarity-based processes at first, but its effects were later seen on recollection, when discrimination between items became more difficult. == Verbal overshadowing in facial recognition == The verbal overshadowing effect can be found for facial recognition because faces are predominately processed in a holistic or configurable manner. 
(Tanaka & Farah, 1993; Tanaka & Sengco, 1997) Verbalizing one's memory for a face is done using a featural or analytic strategy, leading to a drift from the configurable information about the face and to impaired recognition performance. However, Fallshore & Schooler (1995) found that the verbal overshadowing effect did not occur when participants described faces of races different from their own. A study by Brown and Lloyd-Jones (2003) found no verbal overshadowing effect for car descriptions; it was only seen in facial descriptions. The authors noted that descriptions were no different on any measure including accuracy. It is suggested that less expertise in verbalizing faces rather than cars invokes a stronger shift in verbal and featural processing. This supports the concept of a transfer inappropriate retrieval framework and addresses some limitations of the effect. Wickham and Swift (2006) suggested that the verbal overshadowing effect is not seen in describing all faces, and one aspect that determines this is distinctiveness. Results showed that typical faces produced verbal overshadowing, while distinctive faces did not. In studies of eyewitness reports, variation in response criteria given by participants influenced the quality of the descriptions generated and accuracy on the identification task, known as the retrieval-based effect. Face recognition was also impaired when subjects described a familiar face, such as a parent, or when describing a previously seen but novel face. Dodson, Johnson, and Schooler (1997) found that recognition was also impaired when participants were provided with a description of a previously seen face, and that they were able to ignore provided descriptions more easily than self-generated ones. This finding suggested that eyewitness recognition is affected not only by witnesses' own descriptions, but also by descriptions heard from others, such as other eyewitness testimonies. 
== Voice recognition == The verbal overshadowing effect has also been found to affect voice identification. Research shows that describing a non-verbal stimulus leads to a decrease in recognition accuracy. In an unpublished study by Schooler, Fiore, Melcher, and Ambadar (1996), participants listened to a tape-recorded voice, after which they were asked either to verbally describe it or to not do so, and then asked to distinguish the voice from 3 similar distractor voices. The results showed that verbal overshadowing impaired accuracy of recognition based on gut feeling, suggesting an overall verbal overshadowing effect for voice recognition. Due to the forensic relevance of voices heard over the telephone and harassing phone calls that are often a problem for police, Perfect, Hunt, and Harris (2002) examined the influence of three factors on accuracy and confidence in voice recognition from a line-up. They expected to find an effect, because voice represents a class of stimuli that is difficult to describe verbally. This meets Schooler et al.'s (1997) modality mismatch criterion, meaning that describing the speaker's age, gender, or accent is difficult, making voice recognition susceptible to the verbal overshadowing phenomenon. It was found that the method of memory encoding had no impact on performance, and that hearing a telephone voice reduced confidence but did not affect accuracy. They also found that providing a verbal description impaired accuracy but had no effect on confidence. The data showed an effect of verbal overshadowing in voice recognition and provided yet another dissociation between confidence and performance. Although there was a difference in confidence level, witnesses were able to identify voices over the telephone as accurately as voices heard direc

    Read more →
  • Pixel shift

    Pixel shift

    Pixel shift is a method in digital cameras for producing a super-resolution image. The method works by taking several images, after each such capture moving ("shifting") the sensor to a new position. In digital colour cameras that employ pixel shift, this avoids a major limitation inherent in using Bayer pattern for obtaining colour, and instead produces an image with increased colour resolution and, assuming a static subject or additional computational steps, an image free of colour moiré. Taking this idea further, sub-pixel shifting may increase the resolution of the final image beyond that suggested by the specified resolution of the image sensor. Additionally, assuming that the various individual captures are taken at the same sensitivity, the final combined image will have less image noise than a single capture. This can be thought of as an averaging effect (for instance, in a pixel shift image composed of four individual frames with a classic Bayer pattern, every pixel in the final colour image is based on two measurements of the green channel). == List of cameras implementing pixel shift == All of the following cameras are fabricated with one imaging sensor, thus any kind of pixel shift requires a movement of the whole sensor. === Canon === Canon R5: Contains a 45 Mpixel sensor. The High-Resolution Mode shifts the sensor by one pixel to obtain a sequence of nine images that are merged into a 400 Mpixel image. === Fujifilm === Fujifilm GFX50S II: contains a 51 Mpixel sensor. The Pixel Shift Multi-Shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of 16 images that are subsequently merged into a 200 Mpixel image. Fujifilm GFX100, Fujifilm GFX100 II: contains a 102 Mpixel sensor. A sequence of 16 pixel shifted images are merged into a 400 Mpixel image. Fujifilm GFX100S, Fujifilm GFX100S II: contains a 102 Mpixel sensor. 
A sequence of 16 pixel shifted images are merged into a 400 Mpixel image. Fujifilm GFX100IR: contains a 102 Mpixel sensor. A sequence of 16 pixel shifted images are merged into a 400 Mpixel image. Fujifilm X-H2: contains a 40 Mpixel sensor. A sequence of 20 shifted images are merged into a 160 Mpixel image. Fujifilm X-T5: contains a 40 Mpixel sensor. A sequence of 20 shifted images are merged into a 160 Mpixel image. === Nikon === Nikon Z8: contains a 47.5 Mpixel sensor. The High Res shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of up to 32 images that can be merged in Nikon's NX studio software. Nikon Zf: contains a 24 Mpixel sensor. The High Res shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of up to 32 images that can be merged in Nikon's NX studio software. === Olympus === Olympus OM-D E-M1 Mark II: contains a 20.4 Mpixel sensor. The High Res shot mode produces a 50 Mpixel image. Olympus OM-D E-M5 Mark II: contains a 16 Mpixel sensor. The High Res shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of 8 images that are subsequently merged into a 40 Mpixel image. Olympus OM-D E-M5 Mark III: contains a 20.4 Mpixel sensor. The High Res shot mode shifts the imaging sensor by 0.5-pixel movements to obtain a sequence of 8 images that are subsequently merged into a 50 Mpixel image. Olympus OM-D E-M1X: contains a 20.4 Mpixel sensor. The camera sports two pixel shift modes: (a) the 80Mp Tripod mode produces an 80 Mpixel image, (b) the Handheld High Res shot mode produces a 50 Mpixel image. Olympus PEN-F: contains a 20.4 Mpixel sensor. The High Res Shot mode takes multiple images, continually shifting the position of the sensor in sub-pixel increments. Combining these images results in either a 50MP JPEG or an 80MP Raw file. ==== OM System ==== OM System OM-1: contains a 20MPix sensor. The High Res Shot mode takes multiple images, and it can be used handheld or on a tripod. 
Handheld, it internally produces 50 Mpix files; mounted on a tripod, it produces 80 Mpix files. OM System OM-5: contains a 20MPix sensor. The High Res Shot mode takes multiple images, and it can be used handheld or on a tripod. Handheld, it internally produces 50 Mpix files; mounted on a tripod, it produces 80 Mpix files. === Panasonic === Panasonic Lumix DC-G9: contains a 20.3 Mpixel sensor. The High Resolution Mode takes a sequence of 8 shots in quick succession between which the sensor is shifted by 0.5 pixel for each image. These are subsequently merged into an 80 Mpixel image. Panasonic Lumix DC-S1: contains a 24.2 Mpixel sensor. The High Resolution Mode takes a sequence of shots in quick succession between which the sensor is shifted by a small amount. These are subsequently merged into a 96 Mpixel image. Panasonic Lumix DC-S1R: contains a 47.3 Mpixel sensor. The High Resolution Mode shifts the imaging sensor by small increments to obtain a sequence of 8 images that are subsequently merged into a 187 Mpixel image. Panasonic Lumix DC-S1H Panasonic Lumix DC-S5 === Pentax === Pentax K-70: contains a 24.3 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'all color data in each pixel to deliver super-high-resolution images'. Pentax KP: contains a 24.3 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'high-resolution images with more accurate colours and much finer details'. Pentax K-3 II: contains a 24.3 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'super-high-resolution images with far more truthful color reproduction and much finer details'. Pentax K-3 III: contains a 25.7 Mpixel sensor. 
The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'a cancelling out of the Bayer pattern and removal of the need for sharpness-sapping demosaicing'. Pentax K-1: contains a 36.4 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'improved detail and colour resolution'. Pentax K-1 II: contains a 36.4 Mpixel sensor. The camera sports two pixel shift modes: (a) a series of 4 tripod-stabilised images shifted by 1 pixel each are subsequently combined into a 47.3 Mpixel image, (b) a series of images taken in handheld mode are combined into a 47.3 Mpixel image that is, within limits, able to cope even with moving subjects. === Sony === Sony a6600: contains a 24.3 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into an image sporting 'all color data in each pixel to deliver super-high-resolution images'. Sony α7R III: contains a 42.4 Mpixel sensor. The pixel shift mode takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into a 42.4 Mpixel image with improved tonal resolution. Sony α7R IV: contains a 61 Mpixel sensor. The camera has two pixel shift modes: (a) the first takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into a 61 Mpixel image with improved tonal resolution, (b) the other takes a sequence of 16 shots between which the sensor is shifted by 0.5 pixel. These are subsequently merged into a 240 Mpixel image with both enhanced detail and improved tonal resolution. Sony α1: contains a 50 Mpixel sensor. The camera has two pixel shift modes: (a) the first takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. 
These are subsequently merged into a 50 Mpixel image with improved tonal resolution, (b) the other takes a sequence of 16 shots between which the sensor is shifted by 0.5 pixel. These are subsequently merged into a 200 Mpixel image with both enhanced detail and improved tonal resolution. === Hasselblad === Hasselblad H3DII: the model H3DII-39 sports a 39 Mpixel sensor, the model H3DII-50 a 50 Mpixel sensor. Both enable a pixel shift mode which takes a sequence of 4 shots between which the sensor is shifted by 1 pixel. These are subsequently merged into a single image. Hasselblad H4D series: the model H4D-200MS contains a 50 Mpixel sensor. The sensor sports 3 different pixel shift modes which take (a) a sequence of 6 shots taken at slight offsets, (b) a sequence of 4 shots between which the sensor is shifted by 1 pixel, (c) a sequence of 4 shots between which the sensor is shifted by 0.5 pixels. Images obtained by all three modes are subsequently merged into 200 Mpixel images. Hasselblad H5D series: both models H5D-50c MS and H5D-200c MS contain a 50 Mpixel sensor. This sensor sports 2 different pixel shift modes which take (a) a sequence of 6 shots with full and half pixel moveme

  • Aphelion (software)

    Aphelion (software)

    The Aphelion Imaging Software Suite is a software suite that includes three base products (Aphelion Lab, Aphelion Dev, and Aphelion SDK) for image processing and image analysis applications. The suite also includes a set of extension programs to implement specific vertical applications that benefit from imaging techniques. The Aphelion software products can be used to prototype and deploy applications, or can be integrated, in whole or in part, into a user's system as processing and visualization libraries whose components are available as either DLLs or .NET components. == History and evolution == The development of Aphelion started in 1995 as a joint project of a French company, ADCIS S.A., and an American company, Amerinex Applied Imaging, Inc. (AAI). Aphelion's image processing and analysis functions were built from operators available in the KBVision software developed and sold by Amerinex's predecessor, Amerinex Artificial Intelligence Inc. In the 1990s, the XLim software library was developed at the Center of Mathematical Morphology of Mines ParisTech, and both companies carried out its development tasks. The first version of Aphelion was completed and released in April 1996. Successive versions were released before the first official stable release in December 1996 at the Photonics East conference in Boston and the Solutions Vision show in Paris in January 1997, where at the latter it competed with Stemmer Imaging's CVB imaging toolbox. In 1998, version 2.3 of Aphelion for Windows 98 was released, and its user base was growing in both France and the United States. Version 3.0, totally rewritten to take advantage of Microsoft's then-recent ActiveX technology, was officially released in 2000. 
It also became available as a « Developer » version, for rapid prototyping of applications using its intuitive GUI and the macro recording capability, and a « Core » version, including the full library as a set of ActiveX components to be used by software developers, integrators and original equipment manufacturers (OEM). As AAI turned its focus to security, in 2001, ADCIS took the lead on developing Aphelion. AAI focused on millimeter wave scanners for concealed weapon detection at airports, and eventually merged with Millimetrics to become Millivision. In 2004, ADCIS specified version 4.0 of Aphelion. The set of image processing/analysis functions was rewritten one more time to be compatible with the .NET technology and the emergence of 64-bit architecture PCs. In addition, the GUI was redesigned to address two usage types: a semi-automatic use where the user is guided through the different steps of functions, and a fully automatic use where the expert user can quickly invoke imaging functions. Its first release was presented at the IPOT exhibition in Birmingham, UK the same year. During the Vision Show in Paris in October 2008, the new Aphelion Lab product was launched for users who are not specialists in image processing. It is easier to use and includes fewer image processing functions. It was then included in the Aphelion Image Processing Suite, consisting of Aphelion Dev (replacing Aphelion Developer), Aphelion Lab, Aphelion SDK (replacing Aphelion Core), and a set of extensions. ADCIS is still working on the suite, and updated versions with new extensions and functionalities continually become available from the websites of both companies. In 2015, support was added for very large images and scan microscope images (virtual slides combined into a very large JPEG 2000 image) for high throughput imaging, and new specific extensions were also added. 
In late 2015, ADCIS announced Aphelion's port for tablets and smartphones, for vertical applications. The name "Aphelion" comes from the astronomical term of the same name, meaning the point in the orbit of a planet around the Sun where it lies farthest from it, applying the term in a metaphorical sense. Unix was the operating system used on scientific workstations in the 1990s, such as on the workstations manufactured by market leader Sun Microsystems, from which the Windows-based Aphelion suite was quite removed. == Description == Aphelion is a software suite to be used for image processing and image analysis. It supports 2D and 3D, monochrome, color, and multi-band images. It is developed by ADCIS, a French software house located in Saint-Contest, Calvados, Normandy. Aphelion is widely used in the scientific/industry community to solve basic and complex imaging applications. First, the imaging application is quickly developed from the Graphical User Interface, involving a set of functions that can be automatically recorded into a macro command. The macro languages available in Aphelion (i.e. BasicScript, Python, and C#) help to process batches of images, and prompt the user if needed for specific parameters that are applied to the imaging functions. All Aphelion image processing functions are written in C++, and the Aphelion user interface is written in C#. C++ functions can be called from the C# language thanks to the use of dedicated wrappers. The main principle of image processing is to automatically process pixels of a digital image, then extract one or more objects of interest (e.g. cells in the field of biology, inclusions in the field of material science) and compute one or more measurements on those objects to quantify the image and generate a verdict (good image, image with defects, cancerous cells). 
In other words, starting from an image, pixels are processed by a set of successive functions or operators until only measurements are computed and used as the input of a 3rd party system or a classification software that will classify objects of interest that have been extracted during the imaging process. An acquisition system such as a digital camera, a video camera, an optical or electron microscope, a medical scanner, or a smartphone can be used to capture images. The set of values or pixels can be processed as a 1D image (1D signal), a 2D image (array of pixel values corresponding to a monochrome or color image), or a 3D image displayed using volume rendering (array of voxels in the 3D space) or displaying surfaces by using 3D rendering. A 2D color image is made of 3-value pixels (typically Red, Green, and Blue information or another color space), and a 3D image is made of monochrome, color (indexed colors are often used), multispectral, or hyperspectral data. When dealing with videos, an additional band is added corresponding to temporal information. The Aphelion Software Suite includes three base products, and a set of optional extensions for specific applications: Aphelion Lab: Entry-level package for non-experts in image processing. It helps to quickly segment an image in a semi-automatic or manual way, and compute a set of measurements on objects of interest that have been extracted during the segmentation process. A set of wizards guides the user from image acquisition to report generation. Aphelion Dev: Full imaging environment including over 450 functions to develop and deploy an application that involves image processing and analysis. It also includes a set of macro-command languages to automate any application to be invoked from the user interface. It also helps to run the imaging algorithm on multiple images that are stored on disk, available on the network, or captured by an acquisition device. 
Aphelion libraries for image processing and visualization are provided in Aphelion Dev as DLLs and .NET components. Aphelion SDK: A set of libraries to develop a stand-alone application with a custom interface based on the Aphelion libraries. This software development kit includes display, processing and analysis functions that can be used by software developers and OEMs. It is provided as DLLs and .NET components. The stand-alone application is typically developed in C# on one computer, and then deployed on multiple PCs and systems. A set of optional extensions can be added to the « Aphelion Dev » product, depending on the application. An evaluation version of Aphelion can be run on a PC for 30 days. A permanent version of Aphelion is available based on a perpetual license. Upgrades are available through a maintenance agreement based on a yearly fee. Technical support is provided by the engineers who are developing the product. The goal of image processing is usually to extract object(s) of interest in an image, and then to classify them based on some characteristics such as shape, density, position, etc. Using Aphelion, this goal is achieved by performing the following tasks: Load an image from disk or acquire an image using an acquisition device. Enhance the image by removing noise or modifying its contrast. Segment the image, extracting objects of interest to be measured and analyzed. Typically, for simple applications, a threshold is performed to generate a binary image. Then, morphological operators are applied to clean the image and only keep obj
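The threshold-then-clean workflow described above can be sketched in a few lines of plain Python. This is only a generic illustration of the technique; it does not use or imitate the Aphelion API, and the function names, the 3×3 structuring element and the test image are all assumptions chosen for the example.

```python
# Generic segmentation sketch: threshold, morphological opening, labelling.
# Not Aphelion code; every name here is hypothetical.


def threshold(image, level):
    """Binary image: 1 where the pixel is at least `level`, else 0."""
    return [[1 if value >= level else 0 for value in row] for row in image]


def _window(binary, y, x):
    """Values in the 3x3 neighbourhood of (y, x); outside counts as 0."""
    height, width = len(binary), len(binary[0])
    return [binary[j][i] if 0 <= j < height and 0 <= i < width else 0
            for j in range(y - 1, y + 2) for i in range(x - 1, x + 2)]


def erode(binary):
    return [[min(_window(binary, y, x)) for x in range(len(binary[0]))]
            for y in range(len(binary))]


def dilate(binary):
    return [[max(_window(binary, y, x)) for x in range(len(binary[0]))]
            for y in range(len(binary))]


def opening(binary):
    """Erosion followed by dilation: removes objects smaller than 3x3."""
    return dilate(erode(binary))


def object_areas(binary):
    """Areas (pixel counts) of 4-connected objects, found by flood fill."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    areas = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] and not seen[y][x]:
                stack, area = [(y, x)], 0
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    area += 1
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                areas.append(area)
    return areas


# Dark 8x8 image with one bright 3x3 object and one bright noise pixel.
image = [[10] * 8 for _ in range(8)]
for y in range(2, 5):
    for x in range(2, 5):
        image[y][x] = 200
image[6][6] = 255

binary = threshold(image, 128)
cleaned = opening(binary)
areas_before = object_areas(binary)
areas_after = object_areas(cleaned)
```

Here the opening discards the isolated noise pixel while leaving the 3×3 object intact, so only one object remains to be measured.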

  • Application-release automation

    Application-release automation

    Application-release automation (ARA) refers to the process of packaging and deploying an application or update of an application from development, across various environments, and ultimately to production. ARA solutions must combine the capabilities of deployment automation, environment management and modeling, and release coordination. == Relationship with DevOps == ARA tools help cultivate DevOps best practices by providing a combination of automation, environment modeling and workflow-management capabilities. These practices help teams deliver software rapidly, reliably and responsibly. ARA tools achieve a key DevOps goal of implementing continuous delivery with a large quantity of releases quickly. == Relationship with deployment == ARA is more than just software-deployment automation – it deploys applications using structured release-automation techniques that allow for an increase in visibility for the whole team. It combines workload automation and release-management tools as they relate to release packages, as well as movement through different environments within the DevOps pipeline. ARA tools help regulate deployments, how environments are created and deployed, and how and when releases are deployed. == ARA Solutions == All ARA solutions must include capabilities in automation, environment modeling, and release coordination. Additionally, the solution must provide this functionality without reliance on other tools.

  • Image stitching

    Image stitching

    Image stitching or photo stitching is the process of combining multiple photographic images with overlapping fields of view to produce a segmented panorama or high-resolution image. Commonly performed through the use of computer software, most approaches to image stitching require nearly exact overlaps between images and identical exposures to produce seamless results, although some stitching algorithms actually benefit from differently exposed images by doing high-dynamic-range imaging in regions of overlap. Some digital cameras can stitch their photos internally. == Applications == Image stitching is widely used in modern applications, such as the following: Document mosaicing Image stabilization feature in camcorders that use frame-rate image alignment High-resolution image mosaics in digital maps and satellite imagery Medical imaging Multiple-image super-resolution imaging Video stitching Object insertion == Process == The image stitching process can be divided into three main components: image registration, calibration, and blending. === Image stitching algorithms === In order to estimate image alignment, algorithms are needed to determine the appropriate mathematical model relating pixel coordinates in one image to pixel coordinates in another. Algorithms that combine direct pixel-to-pixel comparisons with gradient descent (and other optimization techniques) can be used to estimate these parameters. Distinctive features can be found in each image and then efficiently matched to rapidly establish correspondences between pairs of images. When multiple images exist in a panorama, techniques have been developed to compute a globally consistent set of alignments and to efficiently discover which images overlap one another. 
A final compositing surface onto which to warp or projectively transform and place all of the aligned images is needed, as are algorithms to seamlessly blend the overlapping images, even in the presence of parallax, lens distortion, scene motion, and exposure differences. === Image stitching issues === Since the illumination in two views cannot be guaranteed to be identical, stitching two images could create a visible seam. Other reasons for seams could be the background changing between two images for the same continuous foreground. Other major issues to deal with are the presence of parallax, lens distortion, scene motion, and exposure differences. In a non-ideal real-life case, the intensity varies across the whole scene, and so does the contrast and intensity across frames. Additionally, the aspect ratio of a panorama image needs to be taken into account to create a visually pleasing composite. For panoramic stitching, the ideal set of images will have a reasonable amount of overlap (at least 15–30%) to overcome lens distortion and have enough detectable features. The set of images will have consistent exposure between frames to minimize the probability of seams occurring. === Keypoint detection === Feature detection is necessary to automatically find correspondences between images. Robust correspondences are required in order to estimate the necessary transformation to align an image with the image it is being composited on. Corners, blobs, Harris corners, and differences of Gaussians of Harris corners are good features since they are repeatable and distinct. One of the first operators for interest point detection was developed by Hans Moravec in 1977 for his research involving the automatic navigation of a robot through a clustered environment. Moravec also defined the concept of "points of interest" in an image and concluded these interest points could be used to find matching regions in different images. 
The Moravec operator is considered to be a corner detector because it defines interest points as points where there are large intensity variations in all directions. This often is the case at corners. However, Moravec was not specifically interested in finding corners, just distinct regions in an image that could be used to register consecutive image frames. Harris and Stephens improved upon Moravec's corner detector by considering the differential of the corner score with respect to direction directly. They needed it as a processing step to build interpretations of a robot's environment based on image sequences. Like Moravec, they needed a method to match corresponding points in consecutive image frames, but were interested in tracking both corners and edges between frames. SIFT and SURF are recent key-point or interest point detector algorithms, but note that SURF is patented and its commercial usage restricted. Once a feature has been detected, a descriptor method such as the SIFT descriptor can be applied so that features can later be matched. === Registration === Image registration involves matching features in a set of images or using direct alignment methods to search for image alignments that minimize the sum of absolute differences between overlapping pixels. When using direct alignment methods one might first calibrate one's images to get better results. Additionally, users may input a rough model of the panorama to help the feature matching stage, so that e.g. only neighboring images are searched for matching features. Since there is a smaller group of features to match, the result of the search is more accurate and execution of the comparison is faster. To estimate a robust model from the data, a common method used is known as RANSAC. The name RANSAC is an abbreviation for "RANdom SAmple Consensus". It is an iterative method for robust parameter estimation to fit mathematical models from sets of observed data points which may contain outliers. 
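As a rough illustration of RANSAC, the sketch below estimates the offset between two images from matched point pairs, some of which are wrong. To keep it short, the model is deliberately simplified to a pure translation rather than a full homography, so the minimal sample is a single pair; the function name, tolerance, iteration count and data are assumptions made for this example.

```python
import random


def ransac_translation(pairs, iterations=100, tolerance=1.0, seed=0):
    """Estimate a 2D translation from point pairs that may contain outliers.

    Simplified stand-in for homography estimation: each hypothesis is the
    offset implied by one randomly chosen pair.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x, y), (u, v) = rng.choice(pairs)       # minimal random sample
        dx, dy = u - x, v - y                    # hypothesised model
        inliers = [
            (p, q) for p, q in pairs
            if abs(q[0] - p[0] - dx) <= tolerance
            and abs(q[1] - p[1] - dy) <= tolerance
        ]
        if len(inliers) > len(best_inliers):     # keep the largest consensus set
            best_inliers = inliers
    # Refine the model by averaging over the consensus set.
    count = len(best_inliers)
    dx = sum(q[0] - p[0] for p, q in best_inliers) / count
    dy = sum(q[1] - p[1] for p, q in best_inliers) / count
    return (dx, dy), best_inliers


# Eight correct matches shifted by (5, -3) plus three gross mismatches.
points = [(0, 0), (1, 2), (3, 1), (4, 4), (6, 2), (7, 7), (2, 5), (9, 3)]
pairs = [((x, y), (x + 5, y - 3)) for x, y in points]
pairs += [((1, 1), (20, 20)), ((5, 5), (0, 12)), ((8, 0), (8, 30))]

model, inliers = ransac_translation(pairs)
```

A plain average over all eleven pairs would be pulled far off by the three mismatches, whereas the consensus step ignores them.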
The algorithm is non-deterministic in the sense that it produces a reasonable result only with a certain probability, with this probability increasing as more iterations are performed. Because it is a probabilistic method, different results may be obtained each time the algorithm is run. The RANSAC algorithm has found many applications in computer vision, including the simultaneous solving of the correspondence problem and the estimation of the fundamental matrix related to a pair of stereo cameras. The basic assumption of the method is that the data consists of "inliers", i.e., data whose distribution can be explained by some mathematical model, and "outliers" which are data that do not fit the model. Outliers are considered points which come from noise, erroneous measurements, or simply incorrect data. For the problem of homography estimation, RANSAC works by trying to fit several models using some of the point pairs and then checking if the models were able to relate most of the points. The best model – the homography, which produces the highest number of correct matches – is then chosen as the answer for the problem; thus, if the ratio of number of outliers to data points is very low, RANSAC outputs a decent model fitting the data. === Calibration === Image calibration aims to minimize differences between an ideal lens model and the camera-lens combination that was used, optical defects such as distortions, exposure differences between images, vignetting, camera response and chromatic aberrations. If feature detection methods were used to register images and absolute positions of the features were recorded and saved, stitching software may use the data for geometric optimization of the images in addition to placing the images on the panosphere. Panotools and its various derivative programs use this method. ==== Alignment ==== Alignment may be necessary to transform an image to match the view point of the image it is being composited with. 
Alignment, in simple terms, is a change in the coordinate system so that it adopts a new coordinate system which outputs an image matching the required viewpoint. The types of transformations an image may go through are pure translation, pure rotation, a similarity transform (which includes translation, rotation and scaling of the image to be transformed), and an affine or projective transform. Projective transformation is the farthest an image can transform (in the set of two-dimensional planar transformations): the only visible features preserved in the transformed image are straight lines, whereas parallelism is also maintained in an affine transform. Projective transformation can be mathematically described as $x' = H \cdot x$, where $x$ denotes points in the old coordinate system, $x'$ denotes the corresponding points in the transformed image and $H$ is the homography matrix. Expressing the points $x$ and $x'$ using the camera intrinsics ($K$ and $K'$) and its rotation and translation $[R\,t]$ to the real-world coordinates $X$ and $X'$, we get Using the abo
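The projective mapping x' = H·x described above can be sketched as follows: a point is lifted to homogeneous coordinates, multiplied by the 3×3 matrix H, and divided by the resulting third component. The function name and the example matrices are assumptions made for illustration only.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]   # homogeneous scale factor
    return (xh / w, yh / w)


# A pure translation is the special case whose last row is [0, 0, 1].
translation = [[1, 0, 5],
               [0, 1, -3],
               [0, 0, 1]]

# A non-trivial last row gives a genuinely projective warp.
projective = [[1, 0, 0],
              [0, 1, 0],
              [0.5, 0, 1]]

moved = apply_homography(translation, (2, 4))

# Three collinear points (on y = 2x) stay collinear after the warp.
a, b, c = (apply_homography(projective, p) for p in [(0, 0), (2, 4), (4, 8)])
cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
```

The spacing between the mapped points changes under the projective matrix, but the zero cross product shows that they still lie on one straight line, which is the property the text attributes to projective transforms.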

  • Automation

    Automation

    Automation describes a wide range of technologies that reduce human intervention in processes, mainly by predetermining decision criteria, subprocess relationships, and related actions, as well as embodying those predeterminations in machines. Automation has been achieved by various means including mechanical, hydraulic, pneumatic, electrical, electronic devices, and computers, usually in combination. Complicated systems, such as modern factories, airplanes, and ships typically use combinations of all of these techniques. The benefits of automation include labor savings, reduced waste, savings in electricity costs, savings in material costs, and improvements to quality, accuracy, and precision. Automation includes the use of various equipment and control systems such as machinery, processes in factories, boilers, and heat-treating ovens, switching on telephone networks, steering, stabilization of ships, aircraft and other applications and vehicles with reduced human intervention. Examples range from a household thermostat controlling a boiler to a large industrial control system with tens of thousands of input measurements and output control signals. In the simplest type of an automatic control loop, a controller compares a measured value of a process with a desired set value and processes the resulting error signal to change some input to the process, in such a way that the process stays at its set point despite disturbances. This closed-loop control is an application of negative feedback to a system. The mathematical basis of control theory began in the 18th century and advanced rapidly in the 20th. The term automation, inspired by the earlier word automatic (coming from automaton), was not widely used before 1947, when Ford established an automation department. It was during this time that the industry was rapidly adopting feedback controllers. Technological advancements introduced in the 1930s revolutionized various industries significantly. 
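The closed control loop described above can be illustrated with a small simulation. The sketch below is a minimal example under stated assumptions: the "process" is a made-up first-order heating model, the controller is a proportional-integral (PI) law chosen for this example, and all names, gains and constants are hypothetical.

```python
def simulate(setpoint=50.0, steps=600, dt=0.1, kp=2.0, ki=0.5):
    """Simulate a PI controller holding a temperature at its set point.

    Hypothetical process model: heat leaks to the surroundings at a rate
    proportional to the temperature difference, and the controller output
    adds (or removes) heat directly.
    """
    temperature = 20.0
    ambient = 20.0
    integral = 0.0
    history = []
    for step in range(steps):
        if step == steps // 2:
            ambient = 10.0                       # disturbance: surroundings cool down
        error = setpoint - temperature           # compare measurement with set value
        integral += error * dt                   # accumulated error
        control = kp * error + ki * integral     # controller output (process input)
        temperature += dt * (-0.1 * (temperature - ambient) + control)
        history.append(temperature)
    return history


history = simulate()
final_temperature = history[-1]
lowest_after_disturbance = min(history[300:])
```

The temperature dips briefly when the surroundings cool, then the negative feedback brings it back to the set point; the integral term is what removes the remaining offset in this model.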
The World Bank's World Development Report of 2019 shows evidence that the new industries and jobs in the technology sector outweigh the economic effects of workers being displaced by automation. Job losses and downward mobility blamed on automation have been cited as one of many factors in the resurgence of nationalist, protectionist and populist politics in the US, UK and France, among other countries since the 2010s. == History == === Early history === It was a preoccupation of the Greeks and Arabs (in the period between about 300 BC and about 1200 AD) to keep an accurate track of time. In Ptolemaic Egypt, about 270 BC, Ctesibius described a float regulator for a water clock, a device not unlike the ball and cock in a modern flush toilet. This was the earliest feedback-controlled mechanism. The appearance of the mechanical clock in the 14th century made the water clock and its feedback control system obsolete. The Persian Banū Mūsā brothers, in their Book of Ingenious Devices (850 AD), described a number of automatic controls. Two-step level controls for fluids, a form of discontinuous variable structure controls, were developed by the Banu Musa brothers. They also described a feedback controller. The design of feedback control systems up through the Industrial Revolution was by trial-and-error, together with a great deal of engineering intuition. It was not until the mid-19th century that the stability of feedback control systems was analyzed using mathematics, the formal language of automatic control theory. The centrifugal governor was invented by Christiaan Huygens in the seventeenth century, and used to adjust the gap between millstones. 
=== Industrial Revolution in Western Europe === The introduction of prime movers, or self-driven machines, advanced grain mills, furnaces, boilers, and the steam engine, and created a new requirement for automatic control systems including temperature regulators (invented in 1624; see Cornelius Drebbel), pressure regulators (1681), float regulators (1700) and speed control devices. Another control mechanism was used to tent the sails of windmills. It was patented by Edmund Lee in 1745. Also in 1745, Jacques de Vaucanson invented the first automated loom. Around 1800, Joseph Marie Jacquard created a punch-card system to program looms. In 1771 Richard Arkwright invented the first fully automated spinning mill driven by water power, known at the time as the water frame. An automatic flour mill was developed by Oliver Evans in 1785, making it the first completely automated industrial process. A centrifugal governor was used by Mr. Bunce of England in 1784 as part of a model steam crane. The centrifugal governor was adopted by James Watt for use on a steam engine in 1788 after Watt's partner Boulton saw one at a flour mill Boulton & Watt were building. The governor could not actually hold a set speed; the engine would assume a new constant speed in response to load changes. The governor was able to handle smaller variations such as those caused by fluctuating heat load to the boiler. Also, there was a tendency for oscillation whenever there was a speed change. As a consequence, engines equipped with this governor were not suitable for operations requiring constant speed, such as cotton spinning. Several improvements to the governor, plus improvements to valve cut-off timing on the steam engine, made the engine suitable for most industrial uses before the end of the 19th century. Advances in the steam engine stayed well ahead of science, both thermodynamics and control theory. 
The governor received relatively little scientific attention until James Clerk Maxwell published a paper that established the beginning of a theoretical basis for understanding control theory. === 20th century === Relay logic was introduced with factory electrification, which underwent rapid adoption from 1900 through the 1920s. Central electric power stations were also undergoing rapid growth and the operation of new high-pressure boilers, steam turbines and electrical substations created a great demand for instruments and controls. Central control rooms became common in the 1920s, but as late as the early 1930s, most process controls were on-off. Operators typically monitored charts drawn by recorders that plotted data from instruments. To make corrections, operators manually opened or closed valves or turned switches on or off. Control rooms also used color-coded lights to send signals to workers in the plant to manually make certain changes. The development of the electronic amplifier during the 1920s, which was important for long-distance telephony, required a higher signal-to-noise ratio, which was solved by negative feedback noise cancellation. This and other telephony applications contributed to control theory. In the 1940s and 1950s, German mathematician Irmgard Flügge-Lotz developed the theory of discontinuous automatic controls, which found military applications during the Second World War in fire control systems and aircraft navigation systems. Controllers, which were able to make calculated changes in response to deviations from a set point rather than on-off control, began being introduced in the 1930s. Controllers allowed manufacturing to continue showing productivity gains to offset the declining influence of factory electrification. Factory productivity was greatly increased by electrification in the 1920s. U.S. manufacturing productivity growth fell from 5.2%/yr 1919–29 to 2.76%/yr 1929–41. 
Alexander Field notes that spending on non-medical instruments increased significantly from 1929 to 1933 and remained strong thereafter. The First and Second World Wars saw major advancements in the field of mass communication and signal processing. Other key advances in automatic controls include differential equations, stability theory and system theory (1938), frequency domain analysis (1940), ship control (1950), and stochastic analysis (1941). Starting in 1958, various systems based on solid-state digital logic modules for hard-wired programmed logic controllers (the predecessors of programmable logic controllers [PLC]) emerged to replace electro-mechanical relay logic in industrial control systems for process control and automation, including early Telefunken/AEG Logistat, Siemens Simatic, Philips/Mullard/Valvo Norbit, BBC Sigmatronic, ACEC Logacec, Akkord Estacord, Krone Mibakron, Bistat, Datapac, Norlog, SSR, or Procontic systems. In 1959 Texaco's Port Arthur Refinery became the first chemical plant to use digital control. Conversion of factories to digital control began to spread rapidly in the 1970s as the price of computer hardware fell. === Significant applications === The automatic telephone switchboard was introduced in 1892 along with dial telephones. By 1929, 31.9% of the Bell system was automatic. Automatic telephone switching originally used vacuum tube amplifiers and electro-mechanical switches, which consumed a large amount of electricity. Call volume eve
