AI For Kids Course

AI For Kids Course — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Computer security compromised by hardware failure

    Computer security compromised by hardware failure

    Computer security compromised by hardware failure is a branch of computer security applied to hardware. The objective of computer security includes protection of information and property from theft, corruption, or natural disaster, while allowing the information and property to remain accessible and productive to its intended users. Such secret information could be retrieved by different ways. This article focus on the retrieval of data thanks to misused hardware or hardware failure. Hardware could be misused or exploited to get secret data. This article collects main types of attack that can lead to data theft. Computer security can be compromised by devices, such as keyboards, monitors or printers (thanks to electromagnetic or acoustic emanation for example) or by components of the computer, such as the memory, the network card or the processor (thanks to time or temperature analysis for example). == Devices == === Monitor === The monitor is the main device used to access data on a computer. It has been shown that monitors radiate or reflect data on their environment, potentially giving attackers access to information displayed on the monitor. ==== Electromagnetic emanations ==== Video display units radiate: narrowband harmonics of the digital clock signals; broadband harmonics of the various 'random' digital signals such as the video signal. Known as compromising emanations or TEMPEST radiation, a code word for a U.S. government programme aimed at attacking the problem, the electromagnetic broadcast of data has been a significant concern in sensitive computer applications. Eavesdroppers can reconstruct video screen content from radio frequency emanations. Each (radiated) harmonic of the video signal shows a remarkable resemblance to a broadcast TV signal. It is therefore possible to reconstruct the picture displayed on the video display unit from the radiated emission by means of a normal television receiver. If no preventive measures are taken, eavesdropping on a video display unit is possible at distances up to several hundreds of meters, using only a normal black-and-white TV receiver, a directional antenna and an antenna amplifier. It is even possible to pick up information from some types of video display units at a distance of over 1 kilometer. If more sophisticated receiving and decoding equipment is used, the maximum distance can be much greater. ==== Compromising reflections ==== What is displayed by the monitor is reflected on the environment. The time-varying diffuse reflections of the light emitted by a CRT monitor can be exploited to recover the original monitor image. This is an eavesdropping technique for spying at a distance on data that is displayed on an arbitrary computer screen, including the currently prevalent LCD monitors. The technique exploits reflections of the screen's optical emanations in various objects that one commonly finds close to the screen and uses those reflections to recover the original screen content. Such objects include eyeglasses, tea pots, spoons, plastic bottles, and even the eye of the user. This attack can be successfully mounted to spy on even small fonts using inexpensive, off-the-shelf equipment (less than 1500 dollars) from a distance of up to 10 meters. Relying on more expensive equipment allowed to conduct this attack from over 30 meters away, demonstrating that similar attacks are feasible from the other side of the street or from a close by building. Many objects that may be found at a usual workplace can be exploited to retrieve information on a computer's display by an outsider. Particularly good results were obtained from reflections in a user's eyeglasses or a tea pot located on the desk next to the screen. Reflections that stem from the eye of the user also provide good results. However, eyes are harder to spy on at a distance because they are fast-moving objects and require high exposure times. Using more expensive equipment with lower exposure times helps to remedy this problem. The reflections gathered from curved surfaces on close by objects indeed pose a substantial threat to the confidentiality of data displayed on the screen. Fully invalidating this threat without at the same time hiding the screen from the legitimate user seems difficult, without using curtains on the windows or similar forms of strong optical shielding. Most users, however, will not be aware of this risk and may not be willing to close the curtains on a nice day. The reflection of an object, a computer display, in a curved mirror creates a virtual image that is located behind the reflecting surface. For a flat mirror this virtual image has the same size and is located behind the mirror at the same distance as the original object. For curved mirrors, however, the situation is more complex. === Keyboard === ==== Electromagnetic emanations ==== Computer keyboards are often used to transmit confidential data such as passwords. Since they contain electronic components, keyboards emit electromagnetic waves. These emanations could reveal sensitive information such as keystrokes. Electromagnetic emanations have turned out to constitute a security threat to computer equipment. The figure below presents how a keystroke is retrieved and what material is necessary. The approach is to acquire the raw signal directly from the antenna and to process the entire captured electromagnetic spectrum. Thanks to this method, four different kinds of compromising electromagnetic emanations have been detected, generated by wired and wireless keyboards. These emissions lead to a full or a partial recovery of the keystrokes. The best practical attack fully recovered 95% of the keystrokes of a PS/2 keyboard at a distance up to 20 meters, even through walls. Because each keyboard has a specific fingerprint based on the clock frequency inconsistencies, it can determine the source keyboard of a compromising emanation, even if multiple keyboards from the same model are used at the same time. The four different kinds way of compromising electromagnetic emanations are described below. ===== The Falling Edge Transition Technique ===== When a key is pressed, released or held down, the keyboard sends a packet of information known as a scan code to the computer. The protocol used to transmit these scan codes is a bidirectional serial communication, based on four wires: Vcc (5 volts), ground, data and clock. Clock and data signals are identically generated. Hence, the compromising emanation detected is the combination of both signals. However, the edges of the data and the clock lines are not superposed. Thus, they can be easily separated to obtain independent signals. ===== The Generalized Transition Technique ===== The Falling Edge Transition attack is limited to a partial recovery of the keystrokes. This is a significant limitation. The GTT is a falling edge transition attack improved, which recover almost all keystrokes. Indeed, between two traces, there is exactly one data rising edge. If attackers are able to detect this transition, they can fully recover the keystrokes. ===== The Modulation Technique ===== Harmonics compromising electromagnetic emissions come from unintentional emanations such as radiations emitted by the clock, non-linear elements, crosstalk, ground pollution, etc. Determining theoretically the reasons of these compromising radiations is a very complex task. These harmonics correspond to a carrier of approximately 4 MHz which is very likely the internal clock of the micro-controller inside the keyboard. These harmonics are correlated with both clock and data signals, which describe modulated signals (in amplitude and frequency) and the full state of both clock and data signals. This means that the scan code can be completely recovered from these harmonics. ===== The Matrix Scan Technique ===== Keyboard manufacturers arrange the keys in a matrix. The keyboard controller, often an 8-bit processor, parses columns one-by-one and recovers the state of 8 keys at once. This matrix scan process can be described as 192 keys (some keys may not be used, for instance modern keyboards use 104/105 keys) arranged in 24 columns and 8 rows. These columns are continuously pulsed one-by-one for at least 3μs. Thus, these leads may act as an antenna and generate electromagnetic emanations. If an attacker is able to capture these emanations, he can easily recover the column of the pressed key. Even if this signal does not fully describe the pressed key, it still gives partial information on the transmitted scan code, i.e. the column number. Note that the matrix scan routine loops continuously. When no key is pressed, we still have a signal composed of multiple equidistant peaks. These emanations may be used to remotely detect the presence of powered computers. Concerning wireless keyboards, the wireless data burst transmission can be used as an electromagnetic trigger to detect exactly when a key is pressed, while the matrix s

    Read more →
  • Text simplification

    Text simplification

    Text simplification is an aspect of natural language processing that involves modifying, organizing, or categorizing existing text to make it easier to understand while retaining its original meaning. This process is essential in today's world, where communication is increasingly complex due to advancements in science, technology, and media. Human languages are inherently intricate, with extensive vocabularies and complex structures that can be challenging for machines to handle efficiently. Researchers have found that semantic compression techniques can help streamline and simplify text by reducing linguistic diversity and simplifying the vocabulary used in a given context. == Example == Text simplification involves modifying complex sentences into simpler ones to enhance readability and comprehension. Siddharthan (2006) provides an example to illustrate this process. The original sentence contains multiple clauses and phrases, which can be broken down into simpler sentences for better understanding. Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents, which precedes the full purchasing agents report that is due out today and gives an indication of what the full report might hold. Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents. The Chicago report precedes the full purchasing agents report. The Chicago report gives an indication of what the full report might hold. The full report is due out today. An approach to text simplification involves lexical simplification via lexical substitution, a process that replaces complex words with simpler synonyms. Identifying complex words is a challenge addressed by machine learning classifiers trained on labeled data. Researchers have found that asking labelers to sort words by complexity levels yields more consistent results than the traditional method of categorizing words as simple or complex.

    Read more →
  • SimSimi

    SimSimi

    SimSimi is an artificial intelligence conversation program created in 2002 by ISMaker. It grows its artificial intelligence day by day assisted by a feature that allows users to teach it to respond correctly. SimSimi, pronounced as "shim-shimi", is from a Korean word simsim (심심) which means "bored". It has an application designed for Android, Windows Phone and iOS. The application was banned in Thailand in 2012 after users taught it to make responses containing profanity, and to criticise leading politicians. In April 2018, SimSimi was suspended in Brazil due to accusations of sending inappropriate messages, such as sexual language, bullying and even death threats, being labeled as "dangerous" mainly due to its popularity among children, and according to its developer, the suspension of the app in the country "was inevitable because the SimSimi app, at least in the last few days, had a significant negative social impact in Brazil.”

    Read more →
  • Stixel

    Stixel

    In computer vision, a stixel (portmanteau of "stick" and "pixel") is a superpixel representation of depth information in an image, in the form of a vertical stick that approximates the closest obstacles within a certain vertical slice of the scene. Introduced in 2009, stixels have applications in robotic navigation and advanced driver-assistance systems, where they can be used to define a representation of robotic environments and traffic scenes with a medium level of abstraction. == Definition == One of the problems of scene understanding in computer vision is to determine horizontal freespace around the camera, where the agent can move, and the vertical obstacles delimiting it. An image can be paired with depth information (produced e.g. from stereo disparity, lidar, or monocular depth estimation), allowing a dense tridimensional reconstruction of the observed scene. One drawback of dense reconstruction is the large amount of data involved, since each pixel in the image is mapped to an element of a point cloud. Vision problems characterised by planar freespace delimited by mostly vertical obstacles, such as traffic scenes or robotic navigation, can benefit from a condensed representation that allows to save memory and processing time. Stixels are thin vertical rectangles representing a slice of a vertical surface belonging to the closest obstacle in the observed scene. They allow to dramatically reduce the amount of information needed to represent a scene in such problems. A stixel is characterised by three parameters: vertical coordinate of the bottom, height of the stick, and depth. Stixels have fixed width, with each stixel spanning over a certain number of image columns, allowing downsampling of the horizontal image resolution. In the original formulation, each column of the image would contain at most one stixel, and later extensions were developed to allow multiple stixels on each column, allowing to represent multiple objects at different distances. == Stixel estimation == The input to stixel estimation is a dense depth map, that can be computed from stereo disparity or other means. The original approach computes an occupancy grid that can be segmented to estimate the freespace, with dynamic programming providing an efficient method to find an optimal segmentation. Alternative approaches can be used instead of occupancy grid mapping, such as manifold-based methods. The freespace boundary provides the base points of the obstacles at closest longitudinal distance, however multiple objects at different distances might appear in each column of the image. To fully define the obstacles, their height should be estimated, and this is accomplished by segmenting the depth of the object from the depth of the background. A membership function over the pixels can be defined based on the depth value, where the membership represents the confidence of a pixel belonging to the closest vertical obstacle or to the background, and a cut separating the obstacles from the background can again be computed effectively with dynamic programming. Once both the freespace and the obstacle height are known, the stixels can be estimated by fusing the information over the columns spanned by each stixel, and finally a refined depth of the stixel can be estimated via model fitting over the depth of the pixels covered by the stixel, possibly paired with confidence information (e.g. disparity confidence produced by methods such as semi-global matching).

    Read more →
  • Advanced automation functions

    Advanced automation functions

    In automation production technology the actions performed by an automated process are executed by a program of instructions which is run during a work cycle. To execute work cycle programs, an automated system should be available to execute these advanced functions. == Safety monitoring == If there is a need for workers in an automated system, a safety monitoring is required for the occupational safety and health of the workers. In a safety monitoring various steps can take place including a complete stop of the system, sounding an alarm or reducing the operating speed. Usually, limiting switches are sensors like temperature probes, heat and smoke detectors or pressure sensitive floor pads. == Maintenance and repair diagnostics == There are three modes of operations which are used in a cycle of maintenance and repair diagnostics: status monitoring, failure diagnostics and recommendation of the repair procedure. In the status monitoring mode, the current system status is displayed. The failure diagnostics mode takes place when a failure occurs. The system will then suggest an adequate repair procedure to a team of experts. == Error detection and recovery == The error detection mode is a step to determine if and when a failure occurs in automated system. The possible errors can be divided into three categories. random errors, systematic errors and aberrations. While in the error recovery mode, remedy actions take place for all detected errors.

    Read more →
  • GeneRIF

    GeneRIF

    A GeneRIF or Gene Reference Into Function is a short (255 characters or fewer) statement about the function of a gene. GeneRIFs provide a simple mechanism for allowing scientists to add to the functional annotation of genes described in the Entrez Gene database. In practice, function is constructed quite broadly. For example, there are GeneRIFs that discuss the role of a gene in a disease, GeneRIFs that point the viewer towards a review article about the gene, and GeneRIFs that discuss the structure of a gene. However, the stated intent is for GeneRIFs to be about gene function. Currently over half a million geneRIFs have been created for genes from almost 1000 different species. GeneRIFs are always associated with specific entries in the Entrez Gene database. Each GeneRIF has a pointer to the PubMed ID (a type of document identifier) of a scientific publication that provides evidence for the statement made by the GeneRIF. GeneRIFs are often extracted directly from the document that is identified by the PubMed ID, very frequently from its title or from its final sentence. GeneRIFs are usually produced by NCBI indexers, but anyone may submit a GeneRIF. To be processed, a valid Gene ID must exist for the specific gene, or the Gene staff must have assigned an overall Gene ID to the species. The latter case is implemented via records in Gene with the symbol NEWENTRY. Once the Gene ID is identified, only three types of information are required to complete a submission: a concise phrase describing a function or functions (less than 255 characters in length, preferably more than a restatement of the title of the paper); a published paper describing that function, implemented by supplying the PubMed ID of a citation in PubMed; a valid e-mail address (which will remain confidential). == Example == Here are some GeneRIFs taken from Entrez Gene for GeneID 7157, the human gene TP53. The PubMed document identifiers have been omitted from the examples. Note the wide variability with respect to the presence or absence of punctuation and of sentence-initial capital letters. p53 and c-erbB-2 may have independent role in carcinogenesis of gall bladder cancer Degradation of endogenous HIPK2 depends on the presence of a functional p53 protein. p53 codon 72 alleles influence the response to anticancer drugs in cells from aged people by regulating the cell cycle inhibitor p21WAF1 Logistic regression analysis showed p53 and COX-2 as dependent predictors in pancreatic carcinogenesis, and a reciprocal relationship to neoplastic progression between p53 and COX-2. GeneRIFs are an unusual type of textual genre, and they have recently been the subject of a number of articles from the natural language processing community.

    Read more →
  • Nanosemantics

    Nanosemantics

    Nanosemantics Lab is a Russian IT company specializing in natural language processing (NLP), computer vision (CV), speech technologies (ASR/TTS) and creation of interactive dialog interfaces, particularly chatbots and virtual assistants, based on artificial intelligence (AI). The company uses neural network platforms, including its own-made platform PuzzleLib which works on Russian-made microprocessor architecture Elbrus and Russia-based Astra Linux operating system. The company was founded in 2005 by Igor Ashmanov and Natalya Kaspersky. == Profile == The company was one of the first on Russian market to develop dialog interfaces for different branches of businesses, as well as to support community of AI developers. The company's most demanded product, as for beginning of the 2020s, is the automated "online advisers", functioning as chat bots, made for helping customers with usage of commercial products. In 2009 the company released an online service called iii.ru, where visitors were able to create their own AI-based virtual personalities entitles "infs" (for free). A visitor was able to train its own "inf" and let them chat to other "live" visitors as well with other "infs". More than 2.3 million of "infs" were created and trained by visitors over several years. Nanosemantics Lab maintains its own linguistic programming language for AI development called Dialog Language (DL). Popular social networks and instant messaging services may be used as base platforms. Nanosemantics' AI bots support different types of businesses: banks and financial services, telecommunications, retail, travel and automobile industry, home appliances production, etc. Among its solutions, Nanosemantics lists projects for various companies and institutions, among them VTB, Beeline, MTS, Sberbank, Higher School of Economics, Webmoney, Gazpromneft, Rostelecom, Ford Motors, Ministry of Health of the Russian Federation and others. The company uses the term "inf" for naming its numerous types of chat bots. The term was coined by co-founder Igor Ashmanov, head of Ashmanov & Partners. A 2014 scholarly research at Higher School of Economics, called "Basics of Business Informatics", states that such "infs", when used at business, may lower load on employees, collect statistics useful for understanding market demand and also may increase customer loyalty by providing fast and informative answers due to usage of large databases. The same research describes Nanosemantics' project for Russian branch of Ford Motors company, when AI capabilities were used for promoting the car model Ford Kuga. The research pointed out that within 2 months since beginning, the promo-website conducted 47774 talks of visitors with the specialized "inf", which indicated several hundred thousand of questions and the longest chat lasted for 3 hours 10 minutes. One-year promo campaign showed that 28.6% of people who made pre-orders talked to an "inf". In 2016 Nanosemantics launched a SaaS platform aimed at creating customized virtual assistants by users. The company's flagship product is considered to be Dialog Operating System (DialogOS), a professional corporate platform for creating intellectual voice and textual bots. It has its own linguistic programming language for creation of flexible scenarios and ready-studied neural natural language processing modules that are able to understand human interlocutors. In 2021 the company presented technology called NLab Speech ASR which contains a set of neural-networking algorithms for processing audio signals and analysis of texts that were trained and calibrated using speech-based big data marked up manually. The technology allows speed of processing of data up to "6 real-time factor" and precision values in noisy audio data may exceed 82%. In March 2022 the technology was included in Russia's Joint Registry for Russian Programs for Computers and Databases. As well, another technology was included: NLab Speech TTS, which is text-to-speech system that produces synthesized speech from printed text. == Joint projects == Nanosemantics participates in Ashmanov & Partners' projects related to AI. Since 2014, it helps in development of hardware "personal assistant" called Lexy, a solution similar to Amazon Alexa and the analogues. In August 2019 it was announced that Nanosemantics is going to participate in creation of open operating system for creating automated voice assistants. The project was called SOVA (Smart Open Virtual Assistant) and received investment of 300 million roubles (~$4,6 million) from Russian state-maintained National Technological Initiative. The company maintains long-term partnerships with Skolkovo Innovation Center (resident of IT cluster), branch association "Neuronet" and Yandex. Together with USA-based startup Remedy Logic, Nanosemantics has developed a medical diagnostic system for finding, using AI, spinal pathologies in tomography images of human bodies. Among them: central, foraminal and lateral lumbar stenosis, hernias, arthrosis. The system offers options of treatment. Since August 2021 the company is the resident of Technology Valley of Moscow State University. Also in 2021, Nanosemantics became a member of Committee on Artificial Intelligence within the Russian Association of Software Developers "Native Soft". The company states as one of its missions support of initiatives aimed at preservation and development of the Russian language. In May 2021, together with Pushkin Institute, the company created a chat bot called Phil, that explains to Russian people meaning of different Russian neologisms, and offers synonyms for them. Bot's vocabulary contains more than 500 neologisms, as well the bot can give advice on jargonisms and other types of specific words. Also in 2021, Nanosemanics Lab has signed the first-ever Russian "Codex of ethics of artificial intelligence". It establishes guidelines for ethical behavior of businesses that implement AI-based solutions. === IT contests === The company regularly organizes All-Russian Turing Test competitions for IT developers. Some of these events are co-organized with Microsoft. During the competitions, judges randomly choose virtual interlocutor and have a short conversation with them. They have to determine if a human or a machine is talking to them. An interlocutor may be either a bot or its human creator or operator. The results are measured in per cent of judges that were successfully convinced by a machine that it was a human. In 2021 Nanosemantics took part in federal project "Artificial Intelligence" by National Technological Initiative. In December 2021 the company together with state enterprise "Resource Center of Universal Design and Rehabilitation Technologies" (RCUD-RT) held an all-Russian hackathon aimed at development of AI solutions for medicine. During 3 days, participants created several training programs for patients with speech disorders. In April 2022, another hackathon by Nanosemantics was held together with MIREA – Russian Technological University. Students were participating and trying to generate algorithms for voice deepfakes. 17 teams contested in creation of software that generated artificial voice of a certain person. == Recognition == Since its foundation, Nanosemantics Lab has received a number of recognitions and awards. Among them are several professional ROTOR awards for the website iii.ru (created in 2009). The website gives the general public the means to create and train virtual assistants, which can then be used on a website or integrated into social networks. In 2013, a virtual assistant called Dana, created for Beeline Kazakhstan, was awarded with professional prize "Crystal Headset" in nomination "the best applying of technology". In 2015, the RBTH international media service included Nanosemantics in its list of "Top 50 Startups" in Russia. In 2016, the company received Russian state-maintained award called Runet Prize in two nominations: "State and Society" and "Technology and Innovation". In 2021, in Velikiy Novgorod, Nanosemantics team has won a hackathon aimed at finding means of discovering corruption schemes in Russian laws. In February 2022 the company won another contest by National Technological Initiative, called "Prochtenie", aimed at creation of AI systems for checking schoolchildren's school essays. The Nanosemantics team was awarded 20 million rubles for "overcoming technological barrier" in contest dedicated to English language, and 12 million for 1st place in special nomination "Structure" in Russian-language essay contest.

    Read more →
  • Scale space

    Scale space

    Scale-space theory is a framework for multi-scale signal representation developed by the computer vision, image processing and signal processing communities with complementary motivations from physics and biological vision. It is a formal theory for handling image structures at different scales, by representing an image as a one-parameter family of smoothed images, the scale-space representation, parametrized by the size of the smoothing kernel used for suppressing fine-scale structures. The parameter t {\displaystyle t} in this family is referred to as the scale parameter, with the interpretation that image structures of spatial size smaller than about t {\displaystyle {\sqrt {t}}} have largely been smoothed away in the scale-space level at scale t {\displaystyle t} . The main type of scale space is the linear (Gaussian) scale space, which has wide applicability as well as the attractive property of being possible to derive from a small set of scale-space axioms. The corresponding scale-space framework encompasses a theory for Gaussian derivative operators, which can be used as a basis for expressing a large class of visual operations for computerized systems that process visual information. This framework also allows visual operations to be made scale invariant, which is necessary for dealing with the size variations that may occur in image data, because real-world objects may be of different sizes and in addition the distance between the object and the camera may be unknown and may vary depending on the circumstances. == Definition == The notion of scale space applies to signals of arbitrary numbers of variables. The most common case in the literature applies to two-dimensional images, which is what is presented here. Consider a given image f {\displaystyle f} where f ( x , y ) {\displaystyle f(x,y)} is the greyscale value of the pixel at position ( x , y ) {\displaystyle (x,y)} . The linear (Gaussian) scale-space representation of f {\displaystyle f} is a family of derived signals L ( x , y ; t ) {\displaystyle L(x,y;t)} defined by the convolution of f ( x , y ) {\displaystyle f(x,y)} with the two-dimensional Gaussian kernel g ( x , y ; t ) = 1 2 π t e − ( x 2 + y 2 ) / 2 t {\displaystyle g(x,y;t)={\frac {1}{2\pi t}}e^{-(x^{2}+y^{2})/2t}\,} such that L ( ⋅ , ⋅ ; t ) = g ( ⋅ , ⋅ ; t ) ∗ f ( ⋅ , ⋅ ) , {\displaystyle L(\cdot ,\cdot ;t)\ =g(\cdot ,\cdot ;t)f(\cdot ,\cdot ),} where the semicolon in the argument of L {\displaystyle L} implies that the convolution is performed only over the variables x , y {\displaystyle x,y} , while the scale parameter t {\displaystyle t} after the semicolon just indicates which scale level is being defined. This definition of L {\displaystyle L} works for a continuum of scales t ≥ 0 {\displaystyle t\geq 0} , but typically only a finite discrete set of levels in the scale-space representation would be actually considered. The scale parameter t = σ 2 {\displaystyle t=\sigma ^{2}} is the variance of the Gaussian filter and as a limit for t = 0 {\displaystyle t=0} the filter g {\displaystyle g} becomes an impulse function such that L ( x , y ; 0 ) = f ( x , y ) , {\displaystyle L(x,y;0)=f(x,y),} that is, the scale-space representation at scale level t = 0 {\displaystyle t=0} is the image f {\displaystyle f} itself. As t {\displaystyle t} increases, L {\displaystyle L} is the result of smoothing f {\displaystyle f} with a larger and larger filter, thereby removing more and more of the details that the image contains. Since the standard deviation of the filter is σ = t {\displaystyle \sigma ={\sqrt {t}}} , details that are significantly smaller than this value are to a large extent removed from the image at scale parameter t {\displaystyle t} , see the following figures and for graphical illustrations. === Why a Gaussian filter? === When faced with the task of generating a multi-scale representation one may ask: could any filter g of low-pass type and with a parameter t which determines its width be used to generate a scale space? The answer is no, as it is of crucial importance that the smoothing filter does not introduce new spurious structures at coarse scales that do not correspond to simplifications of corresponding structures at finer scales. In the scale-space literature, a number of different ways have been expressed to formulate this criterion in precise mathematical terms. The conclusion from several different axiomatic derivations that have been presented is that the Gaussian scale space constitutes the canonical way to generate a linear scale space, based on the essential requirement that new structures must not be created when going from a fine scale to any coarser scale. Conditions, referred to as scale-space axioms, that have been used for deriving the uniqueness of the Gaussian kernel include linearity, shift invariance, semi-group structure, non-enhancement of local extrema, scale invariance and rotational invariance. In the works, the uniqueness claimed in the arguments based on scale invariance has been criticized, and alternative self-similar scale-space kernels have been proposed. The Gaussian kernel is, however, a unique choice according to the scale-space axiomatics based on causality or non-enhancement of local extrema. === Alternative definition === Equivalently, the scale-space family can be defined as the solution of the diffusion equation (for example in terms of the heat equation), ∂ t L = 1 2 ∇ 2 L , {\displaystyle \partial _{t}L={\frac {1}{2}}\nabla ^{2}L,} with initial condition L ( x , y ; 0 ) = f ( x , y ) {\displaystyle L(x,y;0)=f(x,y)} . This formulation of the scale-space representation L means that it is possible to interpret the intensity values of the image f as a "temperature distribution" in the image plane and that the process that generates the scale-space representation as a function of t corresponds to heat diffusion in the image plane over time t (assuming the thermal conductivity of the material equal to the arbitrarily chosen constant ⁠1/2⁠). Although this connection may appear superficial for a reader not familiar with differential equations, it is indeed the case that the main scale-space formulation in terms of non-enhancement of local extrema is expressed in terms of a sign condition on partial derivatives in the 2+1-D volume generated by the scale space, thus within the framework of partial differential equations. Furthermore, a detailed analysis of the discrete case shows that the diffusion equation provides a unifying link between continuous and discrete scale spaces, which also generalizes to nonlinear scale spaces, for example, using anisotropic diffusion. Hence, one may say that the primary way to generate a scale space is by the diffusion equation, and that the Gaussian kernel arises as the Green's function of this specific partial differential equation. == Motivations == The motivation for generating a scale-space representation of a given data set originates from the basic observation that real-world objects are composed of different structures at different scales. This implies that real-world objects, in contrast to idealized mathematical entities such as points or lines, may appear in different ways depending on the scale of observation. For example, the concept of a "tree" is appropriate at the scale of meters, while concepts such as leaves and molecules are more appropriate at finer scales. For a computer vision system analysing an unknown scene, there is no way to know a priori what scales are appropriate for describing the interesting structures in the image data. Hence, the only reasonable approach is to consider descriptions at multiple scales in order to be able to capture the unknown scale variations that may occur. Taken to the limit, a scale-space representation considers representations at all scales. Another motivation to the scale-space concept originates from the process of performing a physical measurement on real-world data. In order to extract any information from a measurement process, one has to apply operators of non-infinitesimal size to the data. In many branches of computer science and applied mathematics, the size of the measurement operator is disregarded in the theoretical modelling of a problem. The scale-space theory on the other hand explicitly incorporates the need for a non-infinitesimal size of the image operators as an integral part of any measurement as well as any other operation that depends on a real-world measurement. There is a close link between scale-space theory and biological vision. Many scale-space operations show a high degree of similarity with receptive field profiles recorded from the mammalian retina and the first stages in the visual cortex. In these respects, the scale-space framework can be seen as a theoretically well-founded paradigm for early vision, which in addition has been thoroughly tested by algorithms and experiments. == Gaussian derivatives == At any scale in scale space, we c

    Read more →
  • Gallery software

    Gallery software

    Gallery software is software that helps the user publish or share photos, pictures, videos or other digital media. Most galleries are located on Web servers, where users are allowed to register and publish their pictures. Gallery software usually features automatic image resizing, allows digital media be categorized into sets, and allows comments. == Types == Early digital media publishing and sharing was done with imageboards. The boards are by topics, sometimes called "chan". Each discussion in a "chan" are started with a piece of digital media, and follow-up discussions can contain another piece too. Software works in this way: Futallaby, Danbooru. Traditionally, galleries are managed. An administrator maintains a set of or hierarchy of albums. The users can upload their digital media in one of the existing albums defined by an administrator, or create their own albums. The users with sufficient permission can re-categorise the digital media others uploaded. Often, the site's administrator can define which album the users are allowed to categorise their media into, or delete other user's content. Examples are open source galleries Coppermine, Gallery Project. There are decentralised gallery software that does not have an administrator for managing contents. Pinterest, Flickr and DeviantArt has been successful with this model. Open source gallery software MediaGoblin works in this way. Each user can create their own "collections", to categorise theirs or other users' media. However users cannot put media into other user's collections. Each user's category is separate. There is no centralised theme or hierarchy for the media.

    Read more →
  • GPTs

    GPTs

    GPTs are custom versions of ChatGPT with added instructions and extra knowledge. GPTs can be used and created from the GPT Store. Any user can easily create them without any programming knowledge. GPTs can be tailored for specific writing styles, topics, or tasks. The ability to create GPTs was introduced in November 2023, and by January 2024, more than 3 million GPTs had been published. == Features and uses == GPTs can be configured to answer complex questions in specific fields, solve problems, provide image-based information, or create digital content. They can be programmed as educational tools, purchasing guides, or technical advisors, as well as for many others applications. GPTs are accessed from the GPT Store section of the ChatGPT web page. The “Explore GPT” link opens the store where the most popular GPTs in each section are highlighted. The GPTs are organized by categories. The store also uses a rating system based on user experiences similar to that used by other app stores such as Apple's App Store or Google Play. Those with the best ratings appear at the top of each category. According to La Vanguardia, the most popular categories are: Personal assistants Learning to program Image generation Creative writing Gaming Entertainment It is expected that in the future the creators of GPTs will be able to monetize them. Companies like Moderna are using GPTs to assist in various specific business tasks. The company has created 750 GPTs for its own internal use. == Configuration == Creating GPTs does not require prior programming knowledge. Free users can use existing GPTs but cannot create their own. Paying subscribers can use the editor on the ChatGPT site to configure the GPT's name, image and description, instructions and access to APIs, along with visibility options. == Criticism == The implementation and use of GPTs has not been without criticism. The GPT Store has been criticized for the proliferation of low-quality GPTs and spam due to a lack of effective moderation. There are also concerns about data privacy and security, as GPTs may collect and use personal information in ways that are not always transparent to users.

    Read more →
  • Stixel

    Stixel

    In computer vision, a stixel (portmanteau of "stick" and "pixel") is a superpixel representation of depth information in an image, in the form of a vertical stick that approximates the closest obstacles within a certain vertical slice of the scene. Introduced in 2009, stixels have applications in robotic navigation and advanced driver-assistance systems, where they can be used to define a representation of robotic environments and traffic scenes with a medium level of abstraction. == Definition == One of the problems of scene understanding in computer vision is to determine horizontal freespace around the camera, where the agent can move, and the vertical obstacles delimiting it. An image can be paired with depth information (produced e.g. from stereo disparity, lidar, or monocular depth estimation), allowing a dense tridimensional reconstruction of the observed scene. One drawback of dense reconstruction is the large amount of data involved, since each pixel in the image is mapped to an element of a point cloud. Vision problems characterised by planar freespace delimited by mostly vertical obstacles, such as traffic scenes or robotic navigation, can benefit from a condensed representation that allows to save memory and processing time. Stixels are thin vertical rectangles representing a slice of a vertical surface belonging to the closest obstacle in the observed scene. They allow to dramatically reduce the amount of information needed to represent a scene in such problems. A stixel is characterised by three parameters: vertical coordinate of the bottom, height of the stick, and depth. Stixels have fixed width, with each stixel spanning over a certain number of image columns, allowing downsampling of the horizontal image resolution. In the original formulation, each column of the image would contain at most one stixel, and later extensions were developed to allow multiple stixels on each column, allowing to represent multiple objects at different distances. == Stixel estimation == The input to stixel estimation is a dense depth map, that can be computed from stereo disparity or other means. The original approach computes an occupancy grid that can be segmented to estimate the freespace, with dynamic programming providing an efficient method to find an optimal segmentation. Alternative approaches can be used instead of occupancy grid mapping, such as manifold-based methods. The freespace boundary provides the base points of the obstacles at closest longitudinal distance, however multiple objects at different distances might appear in each column of the image. To fully define the obstacles, their height should be estimated, and this is accomplished by segmenting the depth of the object from the depth of the background. A membership function over the pixels can be defined based on the depth value, where the membership represents the confidence of a pixel belonging to the closest vertical obstacle or to the background, and a cut separating the obstacles from the background can again be computed effectively with dynamic programming. Once both the freespace and the obstacle height are known, the stixels can be estimated by fusing the information over the columns spanned by each stixel, and finally a refined depth of the stixel can be estimated via model fitting over the depth of the pixels covered by the stixel, possibly paired with confidence information (e.g. disparity confidence produced by methods such as semi-global matching).

    Read more →
  • Pandorabots

    Pandorabots

    Pandorabots, Inc. is an artificial intelligence company that runs a web service for building and deploying chatbots. Pandorabots implements and supports development of the Artificial Intelligence Markup Language and makes portions of its code accessible for free. The Pandorabots Platform is "one of the oldest and largest chatbot hosting services in the world", allowing creation of virtual agents to hold human-like text or voice chats with consumers. The platform is written in Allegro Common LISP. == Use Cases == Common use cases include advertising, virtual assistance, e-learning, entertainment and education. The platform has also been used by academics and universities use the platform for teaching and research.

    Read more →
  • List of assembly software and tools

    List of assembly software and tools

    This is a list of assembly software and tools, including software used for assembly language programming, machine code generation, disassembly, debugging, binary analysis, reverse engineering, and instruction-set simulation. == Assemblers and machine-code generators == == Disassemblers and binary-analysis tools == == Debuggers with assembly-level features == == Educational IDEs, simulators and emulators == == Portable and intermediate assembly-like languages == == Assembly language families == Assembly language is not a single programming language, but a family of low-level languages associated with particular instruction set architectures and processor families. Examples include:

    Read more →
  • GPTs

    GPTs

    GPTs are custom versions of ChatGPT with added instructions and extra knowledge. GPTs can be used and created from the GPT Store. Any user can easily create them without any programming knowledge. GPTs can be tailored for specific writing styles, topics, or tasks. The ability to create GPTs was introduced in November 2023, and by January 2024, more than 3 million GPTs had been published. == Features and uses == GPTs can be configured to answer complex questions in specific fields, solve problems, provide image-based information, or create digital content. They can be programmed as educational tools, purchasing guides, or technical advisors, as well as for many others applications. GPTs are accessed from the GPT Store section of the ChatGPT web page. The “Explore GPT” link opens the store where the most popular GPTs in each section are highlighted. The GPTs are organized by categories. The store also uses a rating system based on user experiences similar to that used by other app stores such as Apple's App Store or Google Play. Those with the best ratings appear at the top of each category. According to La Vanguardia, the most popular categories are: Personal assistants Learning to program Image generation Creative writing Gaming Entertainment It is expected that in the future the creators of GPTs will be able to monetize them. Companies like Moderna are using GPTs to assist in various specific business tasks. The company has created 750 GPTs for its own internal use. == Configuration == Creating GPTs does not require prior programming knowledge. Free users can use existing GPTs but cannot create their own. Paying subscribers can use the editor on the ChatGPT site to configure the GPT's name, image and description, instructions and access to APIs, along with visibility options. == Criticism == The implementation and use of GPTs has not been without criticism. The GPT Store has been criticized for the proliferation of low-quality GPTs and spam due to a lack of effective moderation. There are also concerns about data privacy and security, as GPTs may collect and use personal information in ways that are not always transparent to users.

    Read more →
  • Deep image prior

    Deep image prior

    Deep image prior is a type of convolutional neural network used to enhance a given image with no prior training data other than the image itself. A neural network is randomly initialized and used as prior to solve inverse problems such as noise reduction, super-resolution, and inpainting. Image statistics are captured by the structure of a convolutional image generator rather than by any previously learned capabilities. == Method == === Background === Inverse problems such as noise reduction, super-resolution, and inpainting can be formulated as the optimization task x ∗ = m i n x E ( x ; x 0 ) + R ( x ) {\displaystyle x^{}=min_{x}E(x;x_{0})+R(x)} , where x {\displaystyle x} is an image, x 0 {\displaystyle x_{0}} a corrupted representation of that image, E ( x ; x 0 ) {\displaystyle E(x;x_{0})} is a task-dependent data term, and R(x) is the regularizer. Deep neural networks learn a generator/decoder x = f θ ( z ) {\displaystyle x=f_{\theta }(z)} which maps a random code vector z {\displaystyle z} to an image x {\displaystyle x} . The image corruption method used to generate x 0 {\displaystyle x_{0}} is selected for the specific application. === Specifics === In this approach, the R ( x ) {\displaystyle R(x)} prior is replaced with the implicit prior captured by the neural network (where R ( x ) = 0 {\displaystyle R(x)=0} for images that can be produced by a deep neural networks and R ( x ) = + ∞ {\displaystyle R(x)=+\infty } otherwise). This yields the equation for the minimizer θ ∗ = a r g m i n θ E ( f θ ( z ) ; x 0 ) {\displaystyle \theta ^{}=argmin_{\theta }E(f_{\theta }(z);x_{0})} and the result of the optimization process x ∗ = f θ ∗ ( z ) {\displaystyle x^{}=f_{\theta ^{}}(z)} . The minimizer θ ∗ {\displaystyle \theta ^{}} (typically a gradient descent) starts from a randomly initialized parameters and descends into a local best result to yield the x ∗ {\displaystyle x^{}} restoration function. ==== Overfitting ==== A parameter θ may be used to recover any image, including its noise. However, the network is reluctant to pick up noise because it contains high impedance while useful signal offers low impedance. This results in the θ parameter approaching a good-looking local optimum so long as the number of iterations in the optimization process remains low enough not to overfit data. === Deep Neural Network Model === Typically, the deep neural network model for deep image prior uses a U-Net like model without the skip connections that connect the encoder blocks with the decoder blocks. The authors in their paper mention that "Our findings here (and in other similar comparisons) seem to suggest that having deeper architecture is beneficial, and that having skip-connections that work so well for recognition tasks (such as semantic segmentation) is highly detrimental." == Applications == === Denoising === The principle of denoising is to recover an image x {\displaystyle x} from a noisy observation x 0 {\displaystyle x_{0}} , where x 0 = x + ϵ {\displaystyle x_{0}=x+\epsilon } . The distribution ϵ {\displaystyle \epsilon } is sometimes known (e.g.: profiling sensor and photon noise) and may optionally be incorporated into the model, though this process works well in blind denoising. The quadratic energy function E ( x , x 0 ) = | | x − x 0 | | 2 {\displaystyle E(x,x_{0})=||x-x_{0}||^{2}} is used as the data term, plugging it into the equation for θ ∗ {\displaystyle \theta ^{}} yields the optimization problem m i n θ | | f θ ( z ) − x 0 | | 2 {\displaystyle min_{\theta }||f_{\theta }(z)-x_{0}||^{2}} . === Super-resolution === Super-resolution is used to generate a higher resolution version of image x. The data term is set to E ( x ; x 0 ) = | | d ( x ) − x 0 | | 2 {\displaystyle E(x;x_{0})=||d(x)-x_{0}||^{2}} where d(·) is a downsampling operator such as Lanczos that decimates the image by a factor t. === Inpainting === Inpainting is used to reconstruct a missing area in an image x 0 {\displaystyle x_{0}} . These missing pixels are defined as the binary mask m ∈ { 0 , 1 } H × V {\displaystyle m\in \{0,1\}^{H\times V}} . The data term is defined as E ( x ; x 0 ) = | | ( x − x 0 ) ⊙ m | | 2 {\displaystyle E(x;x_{0})=||(x-x_{0})\odot m||^{2}} (where ⊙ {\displaystyle \odot } is the Hadamard product). The intuition behind this is that the loss is computed only on the known pixels in the image, and the network is going to learn enough about the image to fill in unknown parts of the image even though the computed loss doesn't include those pixels. This strategy is used to remove image watermarks by treating the watermark as missing pixels in the image. === Flash–no-flash reconstruction === This approach may be extended to multiple images. A straightforward example mentioned by the author is the reconstruction of an image to obtain natural light and clarity from a flash–no-flash pair. Video reconstruction is possible but it requires optimizations to take into account the spatial differences. == Implementations == A reference implementation rewritten in Python 3.6 with the PyTorch 0.4.0 library was released by the author under the Apache 2.0 license: deep-image-prior A TensorFlow-based implementation written in Python 2 and released under the CC-SA 3.0 license: deep-image-prior-tensorflow A Keras-based implementation written in Python 2 and released under the GPLv3: machine_learning_denoising == Example == See Astronomy Picture of the Day (APOD) of 2024-02-18

    Read more →