AI Generator Text Free

AI Generator Text Free — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Optical sorting

    Optical sorting

    Optical sorting (sometimes called digital sorting) is the automated process of sorting solid products using cameras and/or lasers. Depending on the types of sensors used and the software-driven intelligence of the image processing system, optical sorters can recognize an object's color, size, shape, structural properties and chemical composition. The sorter compares objects to user-defined accept/reject criteria to identify and remove defective products and foreign material (FM) from the production line, or to separate product of different grades or types of materials. Optical sorters are in widespread use in the food industry worldwide, with the highest adoption in processing harvested foods such as potatoes, fruits, vegetables and nuts where it achieves non-destructive, 100 percent inspection in-line at full production volumes. The technology is also used in pharmaceutical manufacturing and nutraceutical manufacturing, tobacco processing, waste recycling and other industries. Compared to manual sorting, which is subjective and inconsistent, optical sorting helps improve product quality, maximize throughput and increase yields while reducing labor costs. == History == Optical sorting is an idea that first came out of the desire to automate industrial sorting of agricultural goods like fruits and vegetables. Before automated optical sorting technology was conceived in the 1930s, companies like Unitec were producing wooden machinery to assist in the mechanical sorting of fruit processing. In 1931, a company known as “the Electric Sorting Company” was incorporated and began the creation of the world’s first color sorters, which were being installed and used in Michigan’s bean industry by 1932. In 1937, optical sorting technology had advanced to allow for systems based on a two-color principle of selection. The next few decades saw the installation of new and improved sorting mechanisms, like gravity feed systems and the implementation of optical sorting in more agricultural industries. In the late 1960s, optical sorting began to be implemented to new industries beyond agriculture, like the sorting of ferrous and non-ferrous metals. By the 1990s, optical sorting was being used heavily in the sorting of solid wastes. With the large technological revolution happening in the late 1990s and early 2000s, optical sorters were being made more efficient via the implementation of new optical sensors, like CCD, UV, and IR cameras. Today, optical sorting is used in a wide variety of industries and, as such, is implemented with a varying selection of mechanisms to assist in that specific sorter’s task. == The sorting system == In general, optical sorters feature four major components: the feed system, the optical system, image processing software, and the separation system. The objective of the feed system is to spread products into a uniform monolayer so products are presented to the optical system evenly, without clumps, at a constant velocity. The optical system includes lights and sensors housed above and/or below the flow of the objects being inspected. The image processing system compares objects to user-defined accept/reject thresholds to classify objects and actuate the separation system. The separation system — usually compressed air for small products and mechanical devices for larger products, like whole potatoes — pinpoints objects while in-air and deflects the objects to remove into a reject chute while the good product continues along its normal trajectory. The ideal sorter to use depends on the application. Therefore, the product's characteristics and the user's objectives determine the ideal sensors, software-driven capabilities and mechanical platform. == Sensors == Optical sorters require a combination of lights and sensors to illuminate and capture images of the objects so the images can be processed. The processed images will determine if the material should be accepted or rejected. There are camera sorters, laser sorters and sorters that feature a combination of the two on one platform. Lights, cameras, lasers and laser sensors can be designed to function within visible light wavelengths as well as the infrared (IR) and ultraviolet (UV) spectrums. The optimal wavelengths for each application maximize the contrast between the objects to be separated. Cameras and laser sensors can differ in spatial resolution, with higher resolutions enabling the sorter to detect and remove smaller defects. === Cameras === Monochromatic cameras detect shades of gray from black to white and can be effective when sorting products with high-contrast defects. Sophisticated color cameras with high color resolution are capable of detecting millions of colors to better distinguish more subtle color defects. Trichromatic color cameras (also called three-channel cameras) divide light into three bands, which can include red, green and/or blue within the visible spectrum as well as IR and UV. The interaction of different materials with parts of the electromagnetic spectrum make these contrasts more evident than how they appear to the naked human eye. Coupled with intelligent software, sorters that feature cameras are capable of recognizing each object's color, size and shape; as well as the color, size, shape and location of a defect on a product. Some intelligent sorters even allow the user to define a defective product based on the total defective surface area of any given object. === Lasers === While cameras capture product information based primarily on material reflectance, lasers and their sensors are able to distinguish a material's structural properties along with their color. This structural property inspection allows lasers to detect a wide range of organic and inorganic foreign material such as insects, glass, metal, sticks, rocks and plastic; even if they are the same color as the good product. Lasers can be designed to operate within specific wavelengths of light; whether on the visible spectrum or beyond. For example, lasers can detect chlorophyll by stimulating fluorescence using specific wavelengths; which is a process that is very effective for removing foreign material from green vegetables. === Camera/laser combinations === Sorters equipped with cameras and lasers on one platform are generally capable of identifying the widest variety of attributes. Cameras are often better at recognizing color, size and shape while laser sensors identify differences in structural properties to maximize foreign material detection and removal. === Hyperspectral Imaging === Driven by the need to solve previously impossible sorting challenges, a new generation of sorters that feature multispectral and hyperspectral imaging Optical Sorters. Like trichromatic cameras, multispectral and hyperspectral cameras collect data from the electromagnetic spectrum. Unlike trichromatic cameras, which divide light into three bands, hyperspectral systems can divide light into hundreds of narrow bands over a continuous range that covers a vast portion of the electromagnetic spectrum. This opens the door for more detailed analysis that leads to a more consistent product. Using IR alone might detect some defects, but combining it with a broader range of the spectrum makes it more effective. Compared to the three data points per pixel collected by trichromatic cameras, hyperspectral cameras can collect hundreds of data points per pixel, which are combined to create a unique spectral signature (also called a fingerprint) for each object. When complemented by capable software intelligence, a hyperspectral sorter processes those fingerprints to enable sorting on the chemical composition of the product. This is an emerging area of chemometrics. == Software-driven intelligence == Once the sensors capture the object's response to the energy source, image processing is used to manipulate the raw data. The image processing extracts and categorizes information about specific features. The user then defines accept/reject thresholds that are used to determine what is good and bad in the raw data flow. The art and science of image processing lies in developing algorithms that maximize the effectiveness of the sorter while presenting a simple user-interface to the operator. Object-based recognition is a classic example of software-driven intelligence. It allows the user to define a defective product based on where a defect lies on the product and/or the total defective surface area of an object. It offers more control in defining a wider range of defective products. When used to control the sorter's ejection system, it can improve the accuracy of ejecting defective products. This improves product quality and increases yields. New software-driven capabilities are constantly being developed to address the specific needs of various applications. As computing hardware becomes more powerful, new software-driven advancements become possible. Some of these advancements enhance the effectivene

    Read more →
  • Mathematical knowledge management

    Mathematical knowledge management

    Mathematical knowledge management (MKM) is the study of how society can effectively make use of the vast and growing literature on mathematics. It studies approaches such as databases of mathematical knowledge, automated processing of formulae and the use of semantic information, and artificial intelligence. Mathematics is particularly suited to a systematic study of automated knowledge processing due to the high degree of interconnectedness between different areas of mathematics.

    Read more →
  • Communication-avoiding algorithm

    Communication-avoiding algorithm

    Communication-avoiding algorithms minimize movement of data within a memory hierarchy for improving its running-time and energy consumption. These minimize the total of two costs (in terms of time and energy): arithmetic and communication. Communication, in this context refers to moving data, either between levels of memory or between multiple processors over a network. It is much more expensive than arithmetic. == Formal theory == === Two-level memory model === A common computational model in analyzing communication-avoiding algorithms is the two-level memory model: There is one processor and two levels of memory. Level 1 memory is infinitely large. Level 0 memory ("cache") has size M {\displaystyle M} . In the beginning, input resides in level 1. In the end, the output resides in level 1. Processor can only operate on data in cache. The goal is to minimize data transfers between the two levels of memory. === Matrix multiplication === Corollary 6.2: More general results for other numerical linear algebra operations can be found in. The following proof is from. == Motivation == Consider the following running-time model: Measure of computation = Time per FLOP = γ Measure of communication = No. of words of data moved = β ⇒ Total running time = γ·(no. of FLOPs) + β·(no. of words) From the fact that β >> γ as measured in time and energy, communication cost dominates computation cost. Technological trends indicate that the relative cost of communication is increasing on a variety of platforms, from cloud computing to supercomputers to mobile devices. The report also predicts that gap between DRAM access time and FLOPs will increase 100× over coming decade to balance power usage between processors and DRAM. Energy consumption increases by orders of magnitude as we go higher in the memory hierarchy. United States president Barack Obama cited communication-avoiding algorithms in the FY 2012 Department of Energy budget request to Congress: New Algorithm Improves Performance and Accuracy on Extreme-Scale Computing Systems. On modern computer architectures, communication between processors takes longer than the performance of a floating-point arithmetic operation by a given processor. ASCR researchers have developed a new method, derived from commonly used linear algebra methods, to minimize communications between processors and the memory hierarchy, by reformulating the communication patterns specified within the algorithm. This method has been implemented in the TRILINOS framework, a highly-regarded suite of software, which provides functionality for researchers around the world to solve large scale, complex multi-physics problems. == Objectives == Communication-avoiding algorithms are designed with the following objectives: Reorganize algorithms to reduce communication across all memory hierarchies. Attain the lower-bound on communication when possible. The following simple example demonstrates how these are achieved. === Matrix multiplication example === Let A, B and C be square matrices of order n × n. The following naive algorithm implements C = C + A B: for i = 1 to n for j = 1 to n for k = 1 to n C(i,j) = C(i,j) + A(i,k) B(k,j) Arithmetic cost (time-complexity): n2(2n − 1) for sufficiently large n or O(n3). Rewriting this algorithm with communication cost labelled at each step for i = 1 to n {read row i of A into fast memory} - n2 reads for j = 1 to n {read C(i,j) into fast memory} - n2 reads {read column j of B into fast memory} - n3 reads for k = 1 to n C(i,j) = C(i,j) + A(i,k) B(k,j) {write C(i,j) back to slow memory} - n2 writes Fast memory may be defined as the local processor memory (CPU cache) of size M and slow memory may be defined as the DRAM. Communication cost (reads/writes): n3 + 3n2 or O(n3) Since total running time = γ·O(n3) + β·O(n3) and β >> γ the communication cost is dominant. The blocked (tiled) matrix multiplication algorithm reduces this dominant term: ==== Blocked (tiled) matrix multiplication ==== Consider A, B and C to be n/b-by-n/b matrices of b-by-b sub-blocks where b is called the block size; assume three b-by-b blocks fit in fast memory. for i = 1 to n/b for j = 1 to n/b {read block C(i,j) into fast memory} - b2 × (n/b)2 = n2 reads for k = 1 to n/b {read block A(i,k) into fast memory} - b2 × (n/b)3 = n3/b reads {read block B(k,j) into fast memory} - b2 × (n/b)3 = n3/b reads C(i,j) = C(i,j) + A(i,k) B(k,j) - {do a matrix multiply on blocks} {write block C(i,j) back to slow memory} - b2 × (n/b)2 = n2 writes Communication cost: 2n3/b + 2n2 reads/writes << 2n3 arithmetic cost Making b as large possible: 3b2 ≤ M we achieve the following communication lower bound: 31/2n3/M1/2 + 2n2 or Ω (no. of FLOPs / M1/2) == Previous approaches for reducing communication == Most of the approaches investigated in the past to address this problem rely on scheduling or tuning techniques that aim at overlapping communication with computation. However, this approach can lead to an improvement of at most a factor of two. Ghosting is a different technique for reducing communication, in which a processor stores and computes redundantly data from neighboring processors for future computations. Cache-oblivious algorithms represent a different approach introduced in 1999 for fast Fourier transforms, and then extended to graph algorithms, dynamic programming, etc. They were also applied to several operations in linear algebra as dense LU and QR factorizations. The design of architecture specific algorithms is another approach that can be used for reducing the communication in parallel algorithms, and there are many examples in the literature of algorithms that are adapted to a given communication topology.

    Read more →
  • Voiceverse NFT plagiarism scandal

    Voiceverse NFT plagiarism scandal

    In January 2022, 15—the pseudonymous Massachusetts Institute of Technology (MIT) artificial intelligence researcher and creator of the non-commercial generative artificial intelligence voice synthesis research project 15.ai—discovered that the blockchain-based technology company Voiceverse had plagiarized from their platform. Voiceverse marketed itself as a service that offered AI voice cloning technology that could be purchased and traded as non-fungible tokens (NFTs). Amid heightened controversy over NFTs in the gaming industry, voice actor Troy Baker (who has been described as one of the most famous voice actors in video games) announced his partnership with Voiceverse on January 14, 2022, triggering immediate backlash over concerns about the environmental impact of NFTs, potential for fraud, predatory monetization in video games, and the potential of AI displacing jobs for human voice actors. Later that same day, 15 revealed through server logs that Voiceverse had generated voice lines using 15's free text-to-speech platform, pitch-shifted the audio to make them unrecognizable, and falsely marketed the samples as their own technology before selling them as NFTs. Within an hour of being confronted with evidence, Voiceverse confessed and stated that their marketing team had used 15.ai without proper attribution while rushing to create a technology demo to coincide with Baker's partnership announcement, further exacerbating the already negative reception to the original announcement. In response, 15 replied "Go fuck yourself"; the interaction went viral and garnered a large amount of support for the developer. News publications universally characterized this incident as Voiceverse having "stolen" from 15.ai. The next day, Baker appeared on a podcast and stated that his motivation had been to help independent creators who were unable to afford professional voice actors. Following continued backlash and the plagiarism revelation, Baker ended his partnership with Voiceverse on January 31, 2022. Subsequently, the incident was documented in multiple AI ethics databases, criticisms of predatory monetization in video games, and retrospectives as one of the earliest instances of plagiarism and theft stemming from artificial intelligence during the AI boom. == Background == === Troy Baker === Troy Baker is a prominent voice actor in the video game industry best known for his performances as Joel Miller in The Last of Us franchise. Baker has been described as "ubiquitous" by Polygon, "one of the most high-profile and prolific voice actors in video games" by Eurogamer, and "arguably the most famous voice actor in the gaming industry" by GameGuru. His other prominent roles include voicing Agent John "Jonesy" Jones in Fortnite, Booker DeWitt in BioShock Infinite, and both Batman and Joker in multiple Batman video games. As of October 2025, Baker holds the record for the most acting nominations at the BAFTA Games Awards, with five between 2013 and 2021. === Voiceverse === Voiceverse is a blockchain-based startup founded by the Bored Ape Yacht Club that marketed itself as offering AI voice cloning technology in the form of NFTs. Prior to the announcement of their partnership with Baker, Voiceverse had partnered with LOVO, Inc., an AI voice platform that, according to LOVO, could generate human-like voices. Voiceverse stated that any user who purchases a voice NFT would have unlimited and perpetual access to the voice model, which could be used to create content such as audiobooks, YouTube videos, podcasts, e-learning materials, in-game voice chat, and Zoom calls. Voiceverse promised that buyers would "OWN [sic] all of the IP" of content they created using these voices. Voiceverse's roadmap included plans to release 8,888 initial voice NFTs, a feature to add emotions to existing voices, and the ability for users to mint their own voices as NFTs. Prior to Baker's partnership, Voiceverse had also partnered with voice actors Charlet Chung, who voices D.Va in Overwatch, and Andy Milonakis of The Andy Milonakis Show. === 15.ai === 15.ai is a free web application launched in 2020 that uses artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Created by a pseudonymous artificial intelligence researcher known as 15, who began developing the technology as a freshman during their undergraduate research at MIT, it was an early example of an application of generative artificial intelligence during the initial stages of the AI boom. The platform showed that deep neural networks could generate emotionally expressive speech with only 15 seconds of speech; the name "15.ai" references the creator's statement that a voice can be convincingly cloned with just 15 seconds of audio, as opposed to the tens of hours of data previously required. 15.ai became an Internet phenomenon in early 2021 when content utilizing it went viral on social media and quickly gained widespread use among various Internet fandoms. 15 has emphasized that it remain free and non-commercial; it only requires users to give proper credit when using the service for content creation. === NFTs in the video game industry === By early 2022, NFTs had become highly controversial within the gaming industry. Critics raised concerns about their environmental impact due to the significant energy consumption of blockchain technology. In addition, the prevalence of scams, fraud, and potential money laundering associated with NFT sales, as well as fears that NFTs were a new form of predatory monetization following the increasing frequency of loot boxes, caused vocal pushback from the gaming community. Several major gaming companies had begun exploring NFT integration into their products, though fan backlash had already forced some projects to be cancelled. On December 16, 2021, the developers of S.T.A.L.K.E.R. 2: Heart of Chernobyl announced that they would be including NFTs in the game, but cancelled within an hour of the announcement due to immediate universal backlash. Simultaneously, the rise of AI voice technology raised concerns among voice actors about potential job displacement and the devaluation of their work amidst the voice acting industry's ongoing struggles for better compensation and working conditions. == Partnership announcement and backlash == On January 14, 2022, 1:02 a.m. EST, Baker announced on Twitter that he was partnering with Voiceverse "to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP's they create." The announcement concluded with the statement "You can hate. Or you can create." Baker's specific role with Voiceverse remained unclear at the time of the announcement. Along with Baker's announcement, Voiceverse promoted their supposed voice AI technology on Twitter by posting animated videos that featured a cat character created by NFT firm Chubbiverse. The videos concluded with text that read "The Voice Powered By Voiceverse"; Voiceverse stated on Twitter that the voices in the animations had been generated using their own AI voice synthesis technology and presented the videos as a technology demonstration of their voice NFT capabilities. The announcement provoked immediate and widespread backlash from the gaming community. Baker's tweet received thousands of replies and quote retweets (the vast majority of which were negative), far more than the number of likes; Michael McWhertor of Polygon described it as a "textbook example of being ratioed" and commented that reactions had been amplified by the final part of Baker's announcement. Michael Beckwith of Metro called Baker's approach "bizarrely aggressive". Later that day, Baker responded to the backlash by apologizing for his choice of words. He said he appreciated people's thoughts and acknowledged that the "hate/create part might have been a bit antagonistic," calling it a "bad attempt to bring levity". Despite the apology, Baker and his fellow voice actors did not distance themselves from Voiceverse at this point. At the same time, Voiceverse attempted to address the criticisms, stating that they were working to move to more environmentally friendly blockchain technology and that voice actors would receive royalties from NFT sales, with actors benefiting from any increase in NFT value. == Plagiarism revelation == On December 13, 2021, amidst the increasingly negative reactions toward NFTs among the general public, the creator of 15.ai (known pseudonymously as 15) announced that they had "no interest in incorporating NFTs into any aspect of [their] work." On January 14, 2022, 11:17 a.m. EST (10 hours after Baker's initial announcement), 15 commented on the Voiceverse venture, stating that it "sounds like a scam". Two hours later, at 1:20 p.m., 15 explicitly accused Voiceverse of "actively attempting to appropriate [15's] work for [Voiceverse's] own benefit." 15 provided evidence through

    Read more →
  • Apache OpenNLP

    Apache OpenNLP

    The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as language detection, tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing and coreference resolution. These tasks are usually required to build more advanced text processing services.

    Read more →
  • Taxonomic database

    Taxonomic database

    A taxonomic database is a database created to hold information on biological taxa – for example groups of organisms organized by species name or other taxonomic identifier – for efficient data management and information retrieval. Taxonomic databases are routinely used for the automated construction of biological checklists such as floras and faunas, both for print publication and online; to underpin the operation of web-based species information systems; as a part of biological collection management (for example in museums and herbaria); as well as providing, in some cases, the taxon management component of broader science or biology information systems. They are also a fundamental contribution to the discipline of biodiversity informatics. == Goals == Taxonomic databases digitize scientific biodiversity data and provide access to taxonomic data for research. Taxonomic databases vary in breadth of the groups of taxa and geographical space they seek to include, for example: beetles in a defined region, mammals globally, or all described taxa in the tree of life. A taxonomic database may incorporate organism identifiers (scientific name, author, and – for zoological taxa – year of original publication), synonyms, taxonomic opinions, literature sources or citations, illustrations or photographs, and biological attributes for each taxon (such as geographic distribution, ecology, descriptive information, threatened or vulnerable status, etc.). Some databases, such as the Global Biodiversity Information Facility(GBIF) database and the Barcode of Life Data System, store the DNA barcode of a taxon if one exists (also called the Barcode Index Number (BIN) which may be assigned, for example, by the International Barcode of Life project (iBOL) or UNITE, a database for fungal DNA barcoding). A taxonomic database aims to accurately model the characteristics of interest that are relevant to the organisms which are in scope for the intended coverage and usage of the system. For example, databases of fungi, algae, bryophytes and vascular plants ("higher plants") encode conventions from the International Code of Botanical Nomenclature while their counterparts for animals and most protists encode equivalent rules from the International Code of Zoological Nomenclature. Modelling the relevant taxonomic hierarchy for any taxon is a natural fit with the relational model employed in almost all database systems. Scientific consensus is not reached for all taxon groups, and new species continue to be described; therefore, another goal of taxonomic databases is to aid in resolving conflicts of scientific opinion and unify taxonomy. == History == Possibly the earliest documented management of taxonomic information in computerised form comprised the taxonomic coding system developed by Richard Swartz et al. at the Virginia Institute of Marine Science for the Biota of Chesapeake Bay and described in a published report in 1972. This work led directly or indirectly to other projects with greater profile including the NODC Taxonomic Code system which went through 8 versions before being discontinued in 1996, to be subsumed and transformed into the still current Integrated Taxonomic Information System (ITIS). A number of other taxonomic databases specializing in particular groups of organisms that appeared in the 1970s through to the present jointly contribute to the Species 2000 project, which since 2001 has been partnering with ITIS to produce a combined product, the Catalogue of Life. While the Catalogue of Life currently concentrates on assembling basic name information as a global species checklist, numerous other taxonomic database projects such as Fauna Europaea, the Australian Faunal Directory, and more supply rich ancillary information including descriptions, illustrations, maps, and more. Many taxonomic database projects are currently listed at the TDWG "Biodiversity Information Projects of the World" site. == Issues == The representation of taxonomic information in machine-encodable form raises a number of issues not encountered in other domains, such as variant ways to cite the same species or other taxon name, the same name used for multiple taxa (homonyms), multiple non-current names for the same taxon (synonyms), changes in name and taxon concept definition through time, and more. Non-standardized categories and metadata in taxonomic databases hampers the ability for researchers to analyze the data. One forum that has promoted discussion and possible solutions to these and related problems since 1985 is the Biodiversity Information Standards (TDWG), originally called the Taxonomic Database Working Group. While online databases have great benefits (for example, increased access to taxonomic information), they also have issues such as data integrity risks due to on- and off-line versions and continuous updates, technical access issues due to server or internet outage, and differing capacities for complex queries to extract taxonomic data into lists. As the quantity of information in online taxonomic databases rapidly expands, data aggregation, and the integration and alignment of non-standardized data across databases, is a big challenge in taxonomy and biodiversity informatics.

    Read more →
  • HAKMEM

    HAKMEM

    HAKMEM, alternatively known as AI Memo 239, is a February 1972 "memo" (technical report) of the MIT AI Lab containing a wide variety of hacks, including useful and clever algorithms for mathematical computation, some number theory and schematic diagrams for hardware – in Guy L. Steele's words, "a bizarre and eclectic potpourri of technical trivia". Contributors included about two dozen members and associates of the AI Lab. The title of the report is short for "hacks memo", abbreviated to six upper case characters that would fit in a single PDP-10 machine word (using a six-bit character set). == History == HAKMEM is notable as an early compendium of algorithmic technique, particularly for its practical bent, and as an illustration of the wide-ranging interests of AI Lab people of the time, which included almost anything other than AI research. HAKMEM contains original work in some fields, notably continued fractions. == Introduction == Compiled with the hope that a record of the random things people do around here can save some duplication of effort -- except for fun. Here is some little known data which may be of interest to computer hackers. The items and examples are so sketchy that to decipher them may require more sincerity and curiosity than a non-hacker can muster. Doubtless, little of this is new, but nowadays it's hard to tell. So we must be content to give you an insight, or save you some cycles, and to welcome further contributions of items, new or used.

    Read more →
  • KiSAO

    KiSAO

    The Kinetic Simulation Algorithm Ontology (KiSAO) supplies information about existing algorithms available for the simulation of systems biology models, their characterization and interrelationships. KiSAO is part of the BioModels.net project and of the COMBINE initiative. == Structure == KiSAO consists of three main branches: simulation algorithm simulation algorithm characteristic simulation algorithm parameter The elements of each algorithm branch are linked to characteristic and parameter branches using has characteristic and has parameter relationships accordingly. The algorithm branch itself is hierarchically structured using relationships which denote that the descendant algorithms were derived from, or specify, more general ancestors.

    Read more →
  • Developmental robotics

    Developmental robotics

    Developmental robotics (DevRob), sometimes called epigenetic robotics, is a scientific field which aims at studying the developmental mechanisms, architectures and constraints that allow lifelong and open-ended learning of new skills and new knowledge in embodied machines. As in human children, learning is expected to be cumulative and of progressively increasing complexity, and to result from self-exploration of the world in combination with social interaction. The typical methodological approach consists in starting from theories of human and animal development elaborated in fields such as developmental psychology, neuroscience, developmental and evolutionary biology, and linguistics, then to formalize and implement them in robots, sometimes exploring extensions or variants of them. The experimentation of those models in robots allows researchers to confront them with reality, and as a consequence, developmental robotics also provides feedback and novel hypotheses on theories of human and animal development. Developmental robotics is related to but differs from evolutionary robotics (ER). ER uses populations of robots that evolve over time, whereas DevRob is interested in how the organization of a single robot's control system develops through experience, over time. DevRob is also related to work done in the domains of robotics and artificial life. == Background == Can a robot learn like a child? Can it learn a variety of new skills and new knowledge unspecified at design time and in a partially unknown and changing environment? How can it discover its body and its relationships with the physical and social environment? How can its cognitive capacities continuously develop without the intervention of an engineer once it is "out of the factory"? What can it learn through natural social interactions with humans? These are the questions at the center of developmental robotics. Alan Turing, as well as a number of other pioneers of cybernetics, already formulated those questions and the general approach in 1950, but it is only since the end of the 20th century that they began to be investigated systematically. Because the concept of adaptive intelligent machines is central to developmental robotics, it has relationships with fields such as artificial intelligence, machine learning, cognitive robotics or computational neuroscience. Yet, while it may reuse some of the techniques elaborated in these fields, it differs from them from many perspectives. It differs from classical artificial intelligence because it does not assume the capability of advanced symbolic reasoning and focuses on embodied and situated sensorimotor and social skills rather than on abstract symbolic problems. It differs from cognitive robotics because it focuses on the processes that allow the formation of cognitive capabilities rather than these capabilities themselves. It differs from computational neuroscience because it focuses on functional modeling of integrated architectures of development and learning. More generally, developmental robotics is uniquely characterized by the following three features: It targets task-independent architectures and learning mechanisms, i.e. the machine/robot has to be able to learn new tasks that are unknown by the engineer; It emphasizes open-ended development and lifelong learning, i.e. the capacity of an organism to acquire continuously novel skills. This should not be understood as a capacity for learning "anything" or even “everything”, but just that the set of skills that is acquired can be infinitely extended at least in some (not all) directions; The complexity of acquired knowledge and skills shall increase (and the increase be controlled) progressively. Developmental robotics emerged at the crossroads of several research communities including embodied artificial intelligence, enactive and dynamical systems cognitive science, connectionism. Starting from the essential idea that learning and development happen as the self-organized result of the dynamical interactions among brains, bodies and their physical and social environment, and trying to understand how this self-organization can be harnessed to provide task-independent lifelong learning of skills of increasing complexity, developmental robotics strongly interacts with fields such as developmental psychology, developmental and cognitive neuroscience, developmental biology (embryology), evolutionary biology, and cognitive linguistics. As many of the theories coming from these sciences are verbal and/or descriptive, this implies a crucial formalization and computational modeling activity in developmental robotics. These computational models are then not only used as ways to explore how to build more versatile and adaptive machines but also as a way to evaluate their coherence and possibly explore alternative explanations for understanding biological development. == Research directions == === Skill domains === Due to the general approach and methodology, developmental robotics projects typically focus on having robots develop the same types of skills as human infants. A first category that is important being investigated is the acquisition of sensorimotor skills. These include the discovery of one's own body, including its structure and dynamics such as hand-eye coordination, locomotion, and interaction with objects as well as tool use, with a particular focus on the discovery and learning of affordances. A second category of skills targeted by developmental robots are social and linguistic skills: the acquisition of simple social behavioural games such as turn-taking, coordinated interaction, lexicons, syntax and grammar, and the grounding of these linguistic skills into sensorimotor skills (sometimes referred as symbol grounding). In parallel, the acquisition of associated cognitive skills are being investigated such as the emergence of the self/non-self distinction, the development of attentional capabilities, of categorization systems and higher-level representations of affordances or social constructs, of the emergence of values, empathy, or theories of mind. === Mechanisms and constraints === The sensorimotor and social spaces in which humans and robot live are so large and complex that only a small part of potentially learnable skills can actually be explored and learnt within a life-time. Thus, mechanisms and constraints are necessary to guide developmental organisms in their development and control of the growth of complexity. There are several important families of these guiding mechanisms and constraints which are studied in developmental robotics, all inspired by human development: Motivational systems, generating internal reward signals that drive exploration and learning, which can be of two main types: extrinsic motivations push robots/organisms to maintain basic specific internal properties such as food and water level, physical integrity, or light (e.g. in phototropic systems); intrinsic motivations push robot to search for novelty, challenge, compression or learning progress per se, thus generating what is sometimes called curiosity-driven learning and exploration, or alternatively active learning and exploration; Social guidance: as humans learn a lot by interacting with their peers, developmental robotics investigates mechanisms that can allow robots to participate to human-like social interaction. By perceiving and interpreting social cues, this may allow robots both to learn from humans (through diverse means such as imitation, emulation, stimulus enhancement, demonstration, etc. ...) and to trigger natural human pedagogy. Thus, social acceptance of developmental robots is also investigated; Statistical inference biases and cumulative knowledge/skill reuse: biases characterizing both representations/encodings and inference mechanisms can typically allow considerable improvement of the efficiency of learning and are thus studied. Related to this, mechanisms allowing to infer new knowledge and acquire new skills by reusing previously learnt structures is also an essential field of study; The properties of embodiment, including geometry, materials, or innate motor primitives/synergies often encoded as dynamical systems, can considerably simplify the acquisition of sensorimotor or social skills, and is sometimes referred as morphological computation. The interaction of these constraints with other constraints is an important axis of investigation; Maturational constraints: In human infants, both the body and the neural system grow progressively, rather than being full-fledged already at birth. This implies, for example, that new degrees of freedom, as well as increases of the volume and resolution of available sensorimotor signals, may appear as learning and development unfold. Transposing these mechanisms in developmental robots, and understanding how it may hinder or on the contrary ease the acquisition of novel complex skills is a central questi

    Read more →
  • Emotion recognition

    Emotion recognition

    Emotion recognition is the process of identifying human emotion. People vary widely in their accuracy at recognizing the emotions of others. Use of technology to help people with emotion recognition is a relatively nascent research area. Generally, the technology works best if it uses multiple modalities in context. To date, the most work has been conducted on automating the recognition of facial expressions from video, spoken expressions from audio, written expressions from text, and physiology as measured by wearables. == Human == Humans show a great deal of variability in their abilities to recognize emotion. A key point to keep in mind when learning about automated emotion recognition is that there are several sources of "ground truth", or truth about what the real emotion is. Suppose we are trying to recognize the emotions of Alex. One source is "what would most people say that Alex is feeling?" In this case, the 'truth' may not correspond to what Alex feels, but may correspond to what most people would say it looks like Alex feels. For example, Alex may actually feel sad, but he puts on a big smile and then most people say he looks happy. If an automated method achieves the same results as a group of observers it may be considered accurate, even if it does not actually measure what Alex truly feels. Another source of 'truth' is to ask Alex what he truly feels. This works if Alex has a good sense of his internal state, and wants to tell you what it is, and is capable of putting it accurately into words or a number. However, some people are alexithymic and do not have a good sense of their internal feelings, or they are not able to communicate them accurately with words and numbers. In general, getting to the truth of what emotion is actually present can take some work, can vary depending on the criteria that are selected, and will usually involve maintaining some level of uncertainty. == Automatic == Decades of scientific research have been conducted developing and evaluating methods for automated emotion recognition. There is now an extensive literature proposing and evaluating hundreds of different kinds of methods, leveraging techniques from multiple areas, such as signal processing, machine learning, computer vision, and speech processing. Different methodologies and techniques may be employed to interpret emotion such as Bayesian networks. , Gaussian Mixture models and Hidden Markov Models and deep neural networks. === Approaches === The accuracy of emotion recognition is usually improved when it combines the analysis of human expressions from multimodal forms such as texts, physiology, audio, or video. Different emotion types are detected through the integration of information from facial expressions, body movement and gestures, and speech. The technology is said to contribute in the emergence of the so-called emotional or emotive Internet. The existing approaches in emotion recognition to classify certain emotion types can be generally classified into three main categories: knowledge-based techniques, statistical methods, and hybrid approaches. ==== Knowledge-based techniques ==== Knowledge-based techniques (sometimes referred to as lexicon-based techniques), utilize domain knowledge and the semantic and syntactic characteristics of text and potentially spoken language in order to detect certain emotion types. In this approach, it is common to use knowledge-based resources during the emotion classification process such as WordNet, SenticNet, ConceptNet, and EmotiNet, to name a few. One of the advantages of this approach is the accessibility and economy brought about by the large availability of such knowledge-based resources. A limitation of this technique on the other hand, is its inability to handle concept nuances and complex linguistic rules. Knowledge-based techniques can be mainly classified into two categories: dictionary-based and corpus-based approaches. Dictionary-based approaches find opinion or emotion seed words in a dictionary and search for their synonyms and antonyms to expand the initial list of opinions or emotions. Corpus-based approaches on the other hand, start with a seed list of opinion or emotion words, and expand the database by finding other words with context-specific characteristics in a large corpus. While corpus-based approaches take into account context, their performance still vary in different domains since a word in one domain can have a different orientation in another domain. ==== Statistical methods ==== Statistical methods commonly involve the use of different supervised machine learning algorithms in which a large set of annotated data is fed into the algorithms for the system to learn and predict the appropriate emotion types. Machine learning algorithms generally provide more reasonable classification accuracy compared to other approaches, but one of the challenges in achieving good results in the classification process, is the need to have a sufficiently large training set. Some of the most commonly used machine learning algorithms include Support Vector Machines (SVM), Naive Bayes, and Maximum Entropy. Deep learning, which is under the unsupervised family of machine learning, is also widely employed in emotion recognition. Well-known deep learning algorithms include different architectures of Artificial Neural Network (ANN) such as Convolutional Neural Network (CNN), Long Short-term Memory (LSTM), and Extreme Learning Machine (ELM). The popularity of deep learning approaches in the domain of emotion recognition may be mainly attributed to its success in related applications such as in computer vision, speech recognition, and Natural Language Processing (NLP). ==== Hybrid approaches ==== Hybrid approaches in emotion recognition are essentially a combination of knowledge-based techniques and statistical methods, which exploit complementary characteristics from both techniques. Some of the works that have applied an ensemble of knowledge-driven linguistic elements and statistical methods include sentic computing and iFeel, both of which have adopted the concept-level knowledge-based resource SenticNet. The role of such knowledge-based resources in the implementation of hybrid approaches is highly important in the emotion classification process. Since hybrid techniques gain from the benefits offered by both knowledge-based and statistical approaches, they tend to have better classification performance as opposed to employing knowledge-based or statistical methods independently. A downside of using hybrid techniques however, is the computational complexity during the classification process. === Datasets === Data is an integral part of the existing approaches in emotion recognition and in most cases it is a challenge to obtain annotated data that is necessary to train machine learning algorithms. For the task of classifying different emotion types from multimodal sources in the form of texts, audio, videos or physiological signals, the following datasets are available: HUMAINE: provides natural clips with emotion words and context labels in multiple modalities Belfast database: provides clips with a wide range of emotions from TV programs and interview recordings SEMAINE: provides audiovisual recordings between a person and a virtual agent and contains emotion annotations such as angry, happy, fear, disgust, sadness, contempt, and amusement IEMOCAP: provides recordings of dyadic sessions between actors and contains emotion annotations such as happiness, anger, sadness, frustration, and neutral state eNTERFACE: provides audiovisual recordings of subjects from seven nationalities and contains emotion annotations such as happiness, anger, sadness, surprise, disgust, and fear DEAP: provides electroencephalography (EEG), electrocardiography (ECG), and face video recordings, as well as emotion annotations in terms of valence, arousal, and dominance of people watching film clips DREAMER: provides electroencephalography (EEG) and electrocardiography (ECG) recordings, as well as emotion annotations in terms of valence, dominance of people watching film clips MELD: is a multiparty conversational dataset where each utterance is labeled with emotion and sentiment. MELD provides conversations in video format and hence suitable for multimodal emotion recognition and sentiment analysis. MELD is useful for multimodal sentiment analysis and emotion recognition, dialogue systems and emotion recognition in conversations. MuSe: provides audiovisual recordings of natural interactions between a person and an object. It has discrete and continuous emotion annotations in terms of valence, arousal and trustworthiness as well as speech topics useful for multimodal sentiment analysis and emotion recognition. UIT-VSMEC: is a standard Vietnamese Social Media Emotion Corpus (UIT-VSMEC) with about 6,927 human-annotated sentences with six emotion labels, contributing to emotion recognition research in Vietnamese

    Read more →
  • Least-squares spectral analysis

    Least-squares spectral analysis

    Least-squares spectral analysis (LSSA) is a class of methods for estimating a frequency spectrum by fitting sinusoids to data using a least-squares fit. Unlike Fourier analysis, the most widely used spectral method in science, data need not be equally spaced to use LSSA. Furthermore, while Fourier analysis generally amplifies long-period noise in long or gapped records, LSSA mitigates such problems. The first strictly least-squares LSSA method was developed in 1969 and 1971, and is known as the Vaníček method or the Gauss–Vaniček method, after its inventor Petr Vaníček and Carl Friedrich Gauss, the inventor of the least-squares method for error minimization. A widely known LSSA variant is the Lomb method or the Lomb–Scargle periodogram, based on dated computational simplifications of the Vaníček method introduced in the 1970s and 1980s, first by Nicholas R. Lomb and later by Jeffrey D. Scargle. Other LSSA variants have been subsequently developed. == Historical background == The close connections between Fourier analysis, the periodogram, and the least-squares fitting of sinusoids have been known for a long time. However, most developments are restricted to complete data sets of equally spaced samples. In 1963, Freek J. M. Barning of Mathematisch Centrum, Amsterdam, handled unequally spaced data by similar techniques, including both a periodogram analysis equivalent to what nowadays is called the Lomb method and least-squares fitting of selected frequencies of sinusoids determined from such periodograms — and connected by a procedure known today as the matching pursuit with post-back fitting or the orthogonal matching pursuit. Petr Vaníček, a Canadian geophysicist and geodesist of the University of New Brunswick, proposed in 1969 also the matching-pursuit approach for equally and unequally spaced data, which he called "successive spectral analysis" and the result a "least-squares periodogram". He generalized this method to account for any systematic components beyond a simple mean, such as a "predicted linear (quadratic, exponential, ...) secular trend of unknown magnitude", and applied it to a variety of samples, in 1971. Vaníček's strictly least-squares method was then simplified in 1976 by Nicholas R. Lomb of the University of Sydney, who pointed out its close connection to periodogram analysis. Subsequently, the definition of a periodogram of unequally spaced data was modified and analyzed by Jeffrey D. Scargle of NASA Ames Research Center, who showed that, with minor changes, it becomes identical to Lomb's least-squares formula for fitting individual sinusoid frequencies. Scargle states that his paper "does not introduce a new detection technique, but instead studies the reliability and efficiency of detection with the most commonly used technique, the periodogram, in the case where the observation times are unevenly spaced," and further points out regarding least-squares fitting of sinusoids compared to periodogram analysis, that his paper "establishes, apparently for the first time, that (with the proposed modifications) these two methods are exactly equivalent." Press summarizes the development this way: A completely different method of spectral analysis for unevenly sampled data, one that mitigates these difficulties and has some other very desirable properties, was developed by Lomb, based in part on earlier work by Barning and Vanicek, and additionally elaborated by Scargle. In 1989, Michael J. Korenberg of Queen's University in Kingston, Ontario, developed the "fast orthogonal search" method of more quickly finding a near-optimal decomposition of spectra or other problems, similar to the technique that later became known as the orthogonal matching pursuit. == Development of LSSA and variants == === The Vaníček method === In the Vaníček method, a discrete data set is approximated by a weighted sum of sinusoids of progressively determined frequencies using a standard linear regression or least-squares fit. The frequencies are chosen using a method similar to Barning's, but going further in optimizing the choice of each successive new frequency by picking the frequency that minimizes the residual after least-squares fitting (equivalent to the fitting technique now known as matching pursuit with pre-backfitting). The number of sinusoids must be less than or equal to the number of data samples (counting sines and cosines of the same frequency as separate sinusoids). The relationship between the DFT and the approximation of trigonometric functions using the least-squares method is well explained in (Strutz, 2017). A data vector Φ is represented as a weighted sum of sinusoidal basis functions, tabulated in a matrix A by evaluating each function at the sample times, with weight vector x: ϕ ≈ A x , {\displaystyle \phi \approx {\textbf {A}}x,} where the weights vector x is chosen to minimize the sum of squared errors in approximating Φ. The solution for x is closed-form, using standard linear regression: x = ( A T A ) − 1 A T ϕ . {\displaystyle x=({\textbf {A}}^{\mathrm {T} }{\textbf {A}})^{-1}{\textbf {A}}^{\mathrm {T} }\phi .} Here the matrix A can be based on any set of functions mutually independent (not necessarily orthogonal) when evaluated at the sample times; functions used for spectral analysis are typically sines and cosines evenly distributed over the frequency range of interest. If we choose too many frequencies in a too-narrow frequency range, the functions will be insufficiently independent, the matrix ill-conditioned, and the resulting spectrum meaningless. When the basis functions in A are orthogonal (that is, not correlated, meaning the columns have zero pair-wise dot products), the matrix ATA is diagonal; when the columns all have the same power (sum of squares of elements), then that matrix is an identity matrix times a constant, so the inversion is trivial. The latter is the case when the sample times are equally spaced and sinusoids chosen as sines and cosines equally spaced in pairs on the frequency interval 0 to a half cycle per sample (spaced by 1/N cycles per sample, omitting the sine phases at 0 and maximum frequency where they are identically zero). This case is known as the discrete Fourier transform, slightly rewritten in terms of measurements and coefficients. x = A T ϕ {\displaystyle x={\textbf {A}}^{\mathrm {T} }\phi } — DFT case for N equally spaced samples and frequencies, within a scalar factor. === The Lomb method === Trying to lower the computational burden of the Vaníček method in 1976 (no longer an issue), Lomb proposed using the above simplification in general, except for pair-wise correlations between sine and cosine bases of the same frequency, since the correlations between pairs of sinusoids are often small, at least when they are not tightly spaced. This formulation is essentially that of the traditional periodogram but adapted for use with unevenly spaced samples. The vector x is a reasonably good estimate of an underlying spectrum, but since we ignore any correlations, Ax is no longer a good approximation to the signal, and the method is no longer a least-squares method — yet in the literature continues to be referred to as such. Rather than just taking dot products of the data with sine and cosine waveforms directly, Scargle modified the standard periodogram formula so to find a time delay τ {\displaystyle \tau } first, such that this pair of sinusoids would be mutually orthogonal at sample times t j {\displaystyle t_{j}} and also adjusted for the potentially unequal powers of these two basis functions, to obtain a better estimate of the power at a frequency. This procedure made his modified periodogram method exactly equivalent to Lomb's method. Time delay τ {\displaystyle \tau } by definition equals to tan ⁡ 2 ω τ = ∑ j sin ⁡ 2 ω t j ∑ j cos ⁡ 2 ω t j . {\displaystyle \tan {2\omega \tau }={\frac {\sum _{j}\sin 2\omega t_{j}}{\sum _{j}\cos 2\omega t_{j}}}.} Then the periodogram at frequency ω {\displaystyle \omega } is estimated as: P x ( ω ) = 1 2 [ [ ∑ j X j cos ⁡ ω ( t j − τ ) ] 2 ∑ j cos 2 ⁡ ω ( t j − τ ) + [ ∑ j X j sin ⁡ ω ( t j − τ ) ] 2 ∑ j sin 2 ⁡ ω ( t j − τ ) ] , {\displaystyle P_{x}(\omega )={\frac {1}{2}}\left[{\frac {\left[\sum _{j}X_{j}\cos \omega (t_{j}-\tau )\right]^{2}}{\sum _{j}\cos ^{2}\omega (t_{j}-\tau )}}+{\frac {\left[\sum _{j}X_{j}\sin \omega (t_{j}-\tau )\right]^{2}}{\sum _{j}\sin ^{2}\omega (t_{j}-\tau )}}\right],} which, as Scargle reports, has the same statistical distribution as the periodogram in the evenly sampled case. At any individual frequency ω {\displaystyle \omega } , this method gives the same power as does a least-squares fit to sinusoids of that frequency and of the form: ϕ ( t ) = A sin ⁡ ω t + B cos ⁡ ω t . {\displaystyle \phi (t)=A\sin \omega t+B\cos \omega t.} In practice, it is always difficult to judge if a given Lomb peak is significant or not, especially when the nature of the noise is unknown, so for example a false-alarm spectr

    Read more →
  • Metadirectory

    Metadirectory

    A metadirectory system provides for the flow of data between one or more directory services and databases in order to maintain synchronization of that data. It is an important part of identity management systems. The data being synchronized typically are collections of entries that contain user profiles and possibly authentication or policy information. Most metadirectory deployments synchronize data into at least one LDAP-based directory server, to ensure that LDAP-based applications such as single sign-on and portal servers have access to recent data, even if the data is mastered in a non-LDAP data source. Metadirectory products support filtering and transformation of data in transit. Most identity management suites from commercial vendors include a metadirectory product, or a user provisioning product.

    Read more →
  • Uniform convergence in probability

    Uniform convergence in probability

    Uniform convergence in probability is a form of convergence in probability in statistical asymptotic theory and probability theory. It means that, under certain conditions, the empirical frequencies of all events in a certain event-family uniformly converge to their theoretical probabilities. Uniform convergence in probability has applications to statistics as well as machine learning as part of statistical learning theory. Specifically, the Glivenko-Cantelli theorem and the homonymous classes of functions are fundamentally related to uniform convergence. The law of large numbers says that, for each single event A {\displaystyle A} , its empirical frequency in a sequence of independent trials converges (with high probability) to its theoretical probability. In many application however, the need arises to judge simultaneously the probabilities of events of an entire class S {\displaystyle S} from one and the same sample. Moreover, it, is required that the relative frequency of the events converge to the probability uniformly over the entire class of events S {\displaystyle S} . The Uniform Convergence Theorem gives a sufficient condition for this convergence to hold. Roughly, if the event-family is sufficiently simple (its VC dimension is sufficiently small) then uniform convergence holds. == Definitions == For a class of predicates H {\displaystyle H} defined on a set X {\displaystyle X} and a set of samples x = ( x 1 , x 2 , … , x m ) {\displaystyle x=(x_{1},x_{2},\dots ,x_{m})} , where x i ∈ X {\displaystyle x_{i}\in X} , the empirical frequency of h ∈ H {\displaystyle h\in H} on x {\displaystyle x} is Q ^ x ( h ) = 1 m | { i : 1 ≤ i ≤ m , h ( x i ) = 1 } | . {\displaystyle {\widehat {Q}}_{x}(h)={\frac {1}{m}}|\{i:1\leq i\leq m,h(x_{i})=1\}|.} The theoretical probability of h ∈ H {\displaystyle h\in H} is defined as Q P ( h ) = P { y ∈ X : h ( y ) = 1 } . {\displaystyle Q_{P}(h)=P\{y\in X:h(y)=1\}.} The Uniform Convergence Theorem states, roughly, that if H {\displaystyle H} is "simple" and we draw samples independently (with replacement) from X {\displaystyle X} according to any distribution P {\displaystyle P} , then with high probability, the empirical frequency will be close to its expected value, which is the theoretical probability. Here "simple" means that the Vapnik–Chervonenkis dimension of the class H {\displaystyle H} is small relative to the size of the sample. In other words, a sufficiently simple collection of functions behaves roughly the same on a small random sample as it does on the distribution as a whole. The Uniform Convergence Theorem was first proved by Vapnik and Chervonenkis using the concept of growth function. == Uniform Convergence Theorem == The statement of the Uniform Convergence Theorem is as follows: If H {\displaystyle H} is a set of { 0 , 1 } {\displaystyle \{0,1\}} -valued functions defined on a set X {\displaystyle X} and P {\displaystyle P} is a probability distribution on X {\displaystyle X} then for ε > 0 {\displaystyle \varepsilon >0} and m {\displaystyle m} a positive integer, we have: P m { | Q P ( h ) − Q x ^ ( h ) | ≥ ε for some h ∈ H } ≤ 4 Π H ( 2 m ) e − ε 2 m / 8 . {\displaystyle P^{m}\{|Q_{P}(h)-{\widehat {Q_{x}}}(h)|\geq \varepsilon {\text{ for some }}h\in H\}\leq 4\Pi _{H}(2m)e^{-\varepsilon ^{2}m/8}.} In the above, for any x ∈ X m , {\displaystyle x\in X^{m},} Q P ( h ) = P { ( y ∈ X : h ( y ) = 1 } , {\displaystyle Q_{P}(h)=P\{(y\in X:h(y)=1\},} Q ^ x ( h ) = 1 m | { i : 1 ≤ i ≤ m , h ( x i ) = 1 } | {\displaystyle {\widehat {Q}}_{x}(h)={\frac {1}{m}}|\{i:1\leq i\leq m,h(x_{i})=1\}|} and | x | = m . {\displaystyle |x|=m.} P m {\displaystyle P^{m}} indicates that the probability is taken over x {\displaystyle x} consisting of m {\displaystyle m} i.i.d. draws from the distribution P . {\displaystyle P.} Finally, the growth function Π H {\displaystyle \Pi _{H}} is defined in the following way, for any { 0 , 1 } {\displaystyle \{0,1\}} -valued functions H {\displaystyle H} over X {\displaystyle X} and for any natural number m {\displaystyle m} : Π H ( m ) = max | { h ∩ D : D ⊆ X , | D | = m , h ∈ H } | . {\displaystyle \Pi _{H}(m)=\max |\{h\cap D:D\subseteq X,|D|=m,h\in H\}|.} From the point of view of Learning Theory one can consider H {\displaystyle H} to be the Concept/Hypothesis class defined over the instance set X {\displaystyle X} . Crucially, the Sauer–Shelah lemma implies that Π H ( m ) ≤ m d {\displaystyle \Pi _{H}(m)\leq m^{d}} , where d {\displaystyle d} is the VC dimension of H {\displaystyle H} . == Proof of the Uniform Convergence Theorem == and are the sources of the proof below. Before we get into the details of the proof of the Uniform Convergence Theorem we will present a high level overview of the proof. Symmetrization: We transform the problem of analyzing | Q P ( h ) − Q ^ x ( h ) | ≥ ε {\displaystyle |Q_{P}(h)-{\widehat {Q}}_{x}(h)|\geq \varepsilon } into the problem of analyzing | Q ^ r ( h ) − Q ^ s ( h ) | ≥ ε / 2 {\displaystyle |{\widehat {Q}}_{r}(h)-{\widehat {Q}}_{s}(h)|\geq \varepsilon /2} , where r {\displaystyle r} and s {\displaystyle s} are i.i.d samples of size m {\displaystyle m} drawn according to the distribution P {\displaystyle P} . One can view r {\displaystyle r} as the original randomly drawn sample of length m {\displaystyle m} , while s {\displaystyle s} may be thought as the testing sample which is used to estimate Q P ( h ) {\displaystyle Q_{P}(h)} . Permutation: Since r {\displaystyle r} and s {\displaystyle s} are picked identically and independently, so swapping elements between them will not change the probability distribution on r {\displaystyle r} and s {\displaystyle s} . So, we will try to bound the probability of | Q ^ r ( h ) − Q ^ s ( h ) | ≥ ε / 2 {\displaystyle |{\widehat {Q}}_{r}(h)-{\widehat {Q}}_{s}(h)|\geq \varepsilon /2} for some h ∈ H {\displaystyle h\in H} by considering the effect of a specific collection of permutations of the joint sample x = r | | s {\displaystyle x=r||s} . Specifically, we consider permutations σ ( x ) {\displaystyle \sigma (x)} which swap x i {\displaystyle x_{i}} and x m + i {\displaystyle x_{m+i}} in some subset of 1 , 2 , . . . , m {\displaystyle {1,2,...,m}} . The symbol r | | s {\displaystyle r||s} means the concatenation of r {\displaystyle r} and s {\displaystyle s} . Reduction to a finite class: We can now restrict the function class H {\displaystyle H} to a fixed joint sample and hence, if H {\displaystyle H} has finite VC Dimension, it reduces to the problem to one involving a finite function class. We present the technical details of the proof. It should be stressed that this proof glosses over details like the measurability of the events V {\displaystyle V} and R {\displaystyle R} ; measurability is granted in the case of H {\displaystyle H} being finite or countable, but this is not normally the case in standard applications of the theorem (e.g. for statistical learning theory or to prove the Glivenko-Cantelli theorem). To get measurability, one needs to use a notion of separability of the underlying space, possibly related to H {\displaystyle H} . === Symmetrization === Lemma: Let V = { x ∈ X m : | Q P ( h ) − Q ^ x ( h ) | ≥ ε for some h ∈ H } {\displaystyle V=\{x\in X^{m}:|Q_{P}(h)-{\widehat {Q}}_{x}(h)|\geq \varepsilon {\text{ for some }}h\in H\}} and R = { ( r , s ) ∈ X m × X m : | Q r ^ ( h ) − Q ^ s ( h ) | ≥ ε / 2 for some h ∈ H } . {\displaystyle R=\{(r,s)\in X^{m}\times X^{m}:|{\widehat {Q_{r}}}(h)-{\widehat {Q}}_{s}(h)|\geq \varepsilon /2{\text{ for some }}h\in H\}.} Then for m ≥ 2 ε 2 {\displaystyle m\geq {\frac {2}{\varepsilon ^{2}}}} , P m ( V ) ≤ 2 P 2 m ( R ) {\displaystyle P^{m}(V)\leq 2P^{2m}(R)} . Proof: By the triangle inequality, if | Q P ( h ) − Q ^ r ( h ) | ≥ ε {\displaystyle |Q_{P}(h)-{\widehat {Q}}_{r}(h)|\geq \varepsilon } and | Q P ( h ) − Q ^ s ( h ) | ≤ ε / 2 {\displaystyle |Q_{P}(h)-{\widehat {Q}}_{s}(h)|\leq \varepsilon /2} then | Q ^ r ( h ) − Q ^ s ( h ) | ≥ ε / 2 {\displaystyle |{\widehat {Q}}_{r}(h)-{\widehat {Q}}_{s}(h)|\geq \varepsilon /2} . Therefore, P 2 m ( R ) ≥ P 2 m { ∃ h ∈ H , | Q P ( h ) − Q ^ r ( h ) | ≥ ε and | Q P ( h ) − Q ^ s ( h ) | ≤ ε / 2 } = ∫ V P m { s : ∃ h ∈ H , | Q P ( h ) − Q ^ r ( h ) | ≥ ε and | Q P ( h ) − Q ^ s ( h ) | ≤ ε / 2 } d P m ( r ) = A {\displaystyle {\begin{aligned}&P^{2m}(R)\\[5pt]\geq {}&P^{2m}\{\exists h\in H,|Q_{P}(h)-{\widehat {Q}}_{r}(h)|\geq \varepsilon {\text{ and }}|Q_{P}(h)-{\widehat {Q}}_{s}(h)|\leq \varepsilon /2\}\\[5pt]={}&\int _{V}P^{m}\{s:\exists h\in H,|Q_{P}(h)-{\widehat {Q}}_{r}(h)|\geq \varepsilon {\text{ and }}|Q_{P}(h)-{\widehat {Q}}_{s}(h)|\leq \varepsilon /2\}\,dP^{m}(r)\\[5pt]={}&A\end{aligned}}} since r {\displaystyle r} and s {\displaystyle s} are independent. Now for r ∈ V {\displaystyle r\in V} fix an h ∈ H {\displaystyle h\in H} such that | Q P ( h ) − Q ^ r ( h ) | ≥ ε {\displaystyle |Q_{P}(h)-{\widehat {Q}}_{r}(h)|\geq \varepsilon } . For this h {\displaystyle h} , we shall

    Read more →
  • Navigational database

    Navigational database

    A navigational database is a type of database in which records or objects are found primarily by following references from other objects. The term was popularized by the title of Charles Bachman's 1973 Turing Award paper, The Programmer as Navigator. This paper emphasized the fact that the new disk-based database systems allowed the programmer to choose arbitrary navigational routes following relationships from record to record, contrasting this with the constraints of earlier magnetic-tape and punched card systems where data access was strictly sequential. One of the earliest navigational databases was Integrated Data Store (IDS), which was developed by Bachman for General Electric in the 1960s. IDS became the basis for the CODASYL database model in 1969. Although Bachman described the concept of navigation in abstract terms, the idea of navigational access came to be associated strongly with the procedural design of the CODASYL Data Manipulation Language. Writing in 1982, for example, Tsichritzis and Lochovsky state that "The notion of currency is central to the concept of navigation." By the notion of currency, they refer to the idea that a program maintains (explicitly or implicitly) a current position in any sequence of records that it is processing, and that operations such as GET NEXT and GET PRIOR retrieve records relative to this current position, while also changing the current position to the record that is retrieved. Navigational database programming thus came to be seen as intrinsically procedural; and moreover to depend on the maintenance of an implicit set of global variables (currency indicators) holding the current state. As such, the approach was seen as diametrically opposed to the declarative programming style used by the relational model. The declarative nature of relational languages such as SQL offered better programmer productivity and a higher level of data independence (that is, the ability of programs to continue working as the database structure evolves.) Navigational interfaces, as a result, were gradually eclipsed during the 1980s by declarative query languages. During the 1990s it started becoming clear that for certain applications handling complex data (for example, spatial databases and engineering databases), the relational calculus had limitations. At that time, a reappraisal of the entire database market began, with several companies describing the new systems using the marketing term NoSQL. Many of these systems introduced data manipulation languages which, while far removed from the CODASYL DML with its currency indicators, could be understood as implementing Bachman's "navigational" vision. Some of these languages are procedural; others (such as XPath) are entirely declarative. Offshoots of the navigational concept, such as the graph database, found new uses in modern transaction processing workloads. == Description == Navigational access is traditionally associated with the network model and hierarchical model of database, and conventionally describes data manipulation APIs in which records (or objects) are processed one at a time, iteratively. The essential characteristic as described by Bachman, however, is finding records by virtue of their relationship to other records: so an interface can still be navigational if it has set-oriented features. From this viewpoint, the key difference between navigational data manipulation languages and relational languages is the use of explicit named relationships rather than value-based joins: for department with name="Sales", find all employees in set department-employees versus find employees, departments where employee.department-code = department.code and department.name="Sales". In practice, however, most navigational APIs have been procedural: the above query would be executed using procedural logic along the lines of the following pseudo-code: On this viewpoint, the key difference between navigational APIs and the relational model (implemented in relational databases) is that relational APIs use "declarative" or logic programming techniques that ask the system what to fetch, while navigational APIs instruct the system in a sequence of steps how to reach the required records. Most criticisms of navigational APIs fall into one of two categories: Usability: application code quickly becomes unreadable and difficult to debug Data independence: application code needs to change whenever the data structure changes For many years the primary defence of navigational APIs was performance. Database systems that support navigational APIs often use internal storage structures that contain physical links or pointers from one record to another. While such structures may allow very efficient navigation, they have disadvantages because it becomes difficult to reorganize the physical placement of data. It is quite possible to implement navigational APIs without low-level pointer chasing (Bachman's paper envisaged logical relationships being implemented just as in relational systems, using primary keys and foreign keys), so the two ideas should not be conflated. But without the performance benefits of low-level pointers, navigational APIs become harder to justify. Hierarchical models often construct primary keys for records by concatenating the keys that appear at each level in the hierarchy. Such composite identifiers are found in computer file names (/usr/david/docs/index.txt), in URIs, in the Dewey decimal system, and for that matter in postal addresses. Such a composite key can be considered as representing a navigational path to a record; but equally, it can be considered as a simple primary key allowing associative access. As relational systems came to prominence in the 1980s, navigational APIs (and in particular, procedural APIs) were criticized and fell out of favour. The 1990s, however, brought a new wave of object-oriented databases that often provided both declarative and procedural interfaces. One explanation for this is that they were often used to represent graph-structured information (for example spatial data and engineering data) where access is inherently recursive: the mathematics originally underpinning SQL (specifically, first-order predicate calculus) does not have sufficient power to support recursive queries, even those as simple as a transitive closure. More recent SQL implementations do support hierarchical and recursive queries. A current example of a popular navigational API can be found in the Document Object Model (DOM) often used in web browsers and closely associated with JavaScript. The DOM is essentially an in-memory hierarchical database with an API that is both procedural and navigational. By contrast, the same data (XML or HTML) can be accessed using XPath, which can be categorized as declarative and navigational: data is accessed by following relationships, but the calling program does not issue a sequence of instructions to be followed in order. Languages such as SPARQL used to retrieve Linked Data from the Semantic Web are also simultaneously declarative and navigational. == Examples == IBM Information Management System IDMS

    Read more →
  • In-place algorithm

    In-place algorithm

    In computer science, an in-place algorithm is an algorithm that operates directly on the input data structure without requiring extra space proportional to the input size. In other words, it modifies the input in place, without creating a separate copy of the data structure. An algorithm which is not in-place is sometimes called not-in-place or out-of-place. In-place can have slightly different meanings. In its strictest form, the algorithm can only have a constant amount of extra space, counting everything including function calls and pointers. However, this form is very limited as simply having an index to a length n array requires O(log n) bits. More broadly, in-place means that the algorithm does not use extra space for manipulating the input but may require a small though non-constant extra space for its operation. Usually, this space is O(log n), though sometimes anything in o(n) is allowed. Note that space complexity also has varied choices in whether or not to count the index lengths as part of the space used. Often, the space complexity is given in terms of the number of indices or pointers needed, ignoring their length. In this article, we refer to total space complexity (DSPACE), counting pointer lengths. Therefore, the space requirements here have an extra log n factor compared to an analysis that ignores the lengths of indices and pointers. An algorithm may or may not count the output as part of its space usage. Since in-place algorithms usually overwrite their input with output, no additional space is needed. When writing the output to write-only memory or a stream, it may be more appropriate to only consider the working space of the algorithm. In theoretical applications such as log-space reductions, it is more typical to always ignore output space (in these cases it is more essential that the output is write-only). == Examples == Given an array a of n items, suppose we want an array that holds the same elements in reversed order and to dispose of the original. One seemingly simple way to do this is to create a new array of equal size, fill it with copies from a in the appropriate order and then delete a. function reverse(a[0..n - 1]) allocate b[0..n - 1] for i from 0 to n - 1 b[n − 1 − i] := a[i] return b Unfortunately, this requires O(n) extra space for having the arrays a and b available simultaneously. Also, allocation and deallocation are often slow operations. Since we no longer need a, we can instead overwrite it with its own reversal using this in-place algorithm which will only need constant number (2) of integers for the auxiliary variables i and tmp, no matter how large the array is. function reverse_in_place(a[0..n-1]) for i from 0 to floor((n-2)/2) tmp := a[i] a[i] := a[n − 1 − i] a[n − 1 − i] := tmp As another example, many sorting algorithms rearrange arrays into sorted order in-place, including: bubble sort, comb sort, selection sort, insertion sort, heapsort, and Shell sort. These algorithms require only a few pointers, so their space complexity is O(log n). Quicksort operates in-place on the data to be sorted. However, quicksort requires O(log n) stack space pointers to keep track of the subarrays in its divide and conquer strategy. Consequently, quicksort needs O(log2 n) additional space. Although this non-constant space technically takes quicksort out of the in-place category, quicksort and other algorithms needing only O(log n) additional pointers are usually considered in-place algorithms. Most selection algorithms are also in-place, although some considerably rearrange the input array in the process of finding the final, constant-sized result. Some text manipulation algorithms such as trim and reverse may be done in-place. == In computational complexity == In computational complexity theory, the strict definition of in-place algorithms includes all algorithms with O(1) space complexity, the class DSPACE(1). This class is very limited; it equals the regular languages. In fact, it does not even include any of the examples listed above. Algorithms are usually considered in L, the class of problems requiring O(log n) additional space, to be in-place. This class is more in line with the practical definition, as it allows numbers of size n as pointers or indices. This expanded definition still excludes quicksort, however, because of its recursive calls. Identifying the in-place algorithms with L has some interesting implications; for example, it means that there is a (rather complex) in-place algorithm to determine whether a path exists between two nodes in an undirected graph, a problem that requires O(n) extra space using typical algorithms such as depth-first search (a visited bit for each node). This in turn yields in-place algorithms for problems such as determining if a graph is bipartite or testing whether two graphs have the same number of connected components. == Role of randomness == In many cases, the space requirements of an algorithm can be drastically cut by using a randomized algorithm. For example, if one wishes to know if two vertices in a graph of n vertices are in the same connected component of the graph, there is no known simple, deterministic, in-place algorithm to determine this. However, if we simply start at one vertex and perform a random walk of about 20n3 steps, the chance that we will stumble across the other vertex provided that it is in the same component is very high. Similarly, there are simple randomized in-place algorithms for primality testing such as the Miller–Rabin primality test, and there are also simple in-place randomized factoring algorithms such as Pollard's rho algorithm. == In functional programming == Functional programming languages often discourage or do not support explicit in-place algorithms that overwrite data, since this is a type of side effect; instead, they only allow new data to be constructed. However, good functional language compilers will often recognize when an object very similar to an existing one is created and then the old one is thrown away, and will optimize this into a simple mutation "under the hood". Note that it is possible in principle to carefully construct in-place algorithms that do not modify data (unless the data is no longer being used), but this is rarely done in practice.

    Read more →