AI Chat On Google

AI Chat On Google — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • ConEmu

    ConEmu

    ConEmu (short for Console emulator) is a free and open-source tabbed terminal emulator for Windows. ConEmu presents multiple consoles and simple GUI applications as one customizable GUI window with tabs and a status bar. It also provides emulation for ANSI escape codes for color, bypassing the capabilities of the standard Windows Console Host to provide 256 and 24-bit color in Windows. The program has a large range of customization, including custom color palettes for the standard 16 colors, hotkeys, transparency, an auto-hideable mode (similar to the way Quake originally displayed its developer console). Initially, the program was created as a companion to Far Manager, bringing some features common for graphical file managers to this console application (thumbnails and tiles, drag and drop with other windows, true color interface, and others). As of 2012, ConEmu could be used with any other Win32 console application or simple GUI tool (such as Notepad, PuTTY or DOSBox). ConEmu doesn't provide any shell itself, but rather allows using any other shell. It does provide a limited macro language, to control the hosted applications startup.

    Read more →
  • Linguistic categories

    Linguistic categories

    Linguistic categories include Lexical category, a part of speech such as noun, preposition, etc. Syntactic category, a similar concept which can also include phrasal categories Grammatical category, a grammatical feature such as tense, gender, etc. The definition of linguistic categories is a major concern of linguistic theory, and thus, the definition and naming of categories varies across different theoretical frameworks and grammatical traditions for different languages. The operationalization of linguistic categories in lexicography, computational linguistics, natural language processing, corpus linguistics, and terminology management typically requires resource-, problem- or application-specific definitions of linguistic categories. In Cognitive linguistics it has been argued that linguistic categories have a prototype structure like that of the categories of common words in a language. == Linguistic category inventories == To facilitate the interoperability between lexical resources, linguistic annotations and annotation tools and for the systematic handling of linguistic categories across different theoretical frameworks, a number of inventories of linguistic categories have been developed and are being used, with examples as given below. The practical objective of such inventories is to perform quantitative evaluation (for language-specific inventories), to train NLP tools, or to facilitate cross-linguistic evaluation, querying or annotation of language data. At a theoretical level, the existence of universal categories in human language has been postulated, e.g., in Universal grammar, but also heavily criticized. === Part-of-Speech tagsets === Schools commonly teach that there are 9 parts of speech in English: noun, verb, article, adjective, preposition, pronoun, adverb, conjunction, and interjection. However, there are clearly many more categories and sub-categories. For nouns, the plural, possessive, and singular forms can be distinguished. In many languages words are also marked for their case (role as subject, object, etc.), grammatical gender, and so on; while verbs are marked for tense, aspect, and other things. In some tagging systems, different inflections of the same root word will get different parts of speech, resulting in a large number of tags. For example, NN for singular common nouns, NNS for plural common nouns, NP for singular proper nouns (see the POS tags used in the Brown Corpus). Other tagging systems use a smaller number of tags and ignore fine differences or model them as features somewhat independent from part-of-speech. In part-of-speech tagging by computer, it is typical to distinguish from 50 to 150 separate parts of speech for English. POS tagging work has been done in a variety of languages, and the set of POS tags used varies greatly with language. Tags usually are designed to include overt morphological distinctions, although this leads to inconsistencies such as case-marking for pronouns but not nouns in English, and much larger cross-language differences. The tag sets for heavily inflected languages such as Greek and Latin can be very large; tagging words in agglutinative languages such as Inuit languages may be virtually impossible. Work on stochastic methods for tagging Koine Greek (DeRose 1990) has used over 1,000 parts of speech and found that about as many words were ambiguous in that language as in English. A morphosyntactic descriptor in the case of morphologically rich languages is commonly expressed using very short mnemonics, such as ncmsan for category = noun, type = common, gender = masculine, number = singular, case = accusative, animate = no. The most popular tag set for POS tagging for American English is probably the Penn tag set, developed in the Penn Treebank project. === Multilingual annotation schemes === For Western European languages, cross-linguistically applicable annotation schemes for parts-of-speech, morphosyntax and syntax have been developed with the EAGLES Guidelines. The "Expert Advisory Group on Language Engineering Standards" (EAGLES) was an initiative of the European Commission that ran within the DG XIII Linguistic Research and Engineering programme from 1994 to 1998, coordinated by Consorzio Pisa Ricerche, Pisa, Italy. The EAGLES guidelines provide guidance for markup to be used with text corpora, particularly for identifying features relevant in computational linguistics and lexicography. Numerous companies, research centres, universities and professional bodies across the European Union collaborated to produce the EAGLES Guidelines, which set out recommendations for de facto standards and rules of best practice for: Large-scale language resources (such as text corpora, computational lexicons and speech corpora); Means of manipulating such knowledge, via computational linguistic formalisms, mark up languages and various software tools; Means of assessing and evaluating resources, tools and products. The Eagles guidelines have inspired subsequent work on other regions, as well, e.g., Eastern Europe. A generation later, a similar effort was initiated by the research community under the umbrella of Universal Dependencies. Petrov et al. have proposed a "universal", but highly reductionist, tag set, with 12 categories (for example, no subtypes of nouns, verbs, punctuation, etc.; no distinction of "to" as an infinitive marker vs. preposition (hardly a "universal" coincidence), etc.). Subsequently, this was complemented with cross-lingual specifications for dependency syntax (Stanford Dependencies), and morphosyntax (Interset interlingua, partially building on the Multext-East/Eagles tradition) in the context of the Universal Dependencies (UD), an international cooperative project to create treebanks of the world's languages with cross-linguistically applicable ("universal") annotations for parts of speech, dependency syntax, and (optionally) morphosyntactic (morphological) features. Core applications are automated text processing in the field of natural language processing (NLP) and research into natural language syntax and grammar, especially within linguistic typology. The annotation scheme has it roots in three related projects: The UD annotation scheme uses a representation in the form of dependency trees as opposed to a phrase structure trees. At as of February 2019, there are just over 100 treebanks of more than 70 languages available in the UD inventory. The project's primary aim is to achieve cross-linguistic consistency of annotation. However, language-specific extensions are permitted for morphological features (individual languages or resources can introduce additional features). In a more restricted form, dependency relations can be extended with a secondary label that accompanies the UD label, e.g., aux:pass for an auxiliary (UD aux) used to mark passive voice. The Universal Dependencies have inspired similar efforts for the areas of inflectional morphology, frame semantics and coreference. For phrase structure syntax, a comparable effort does not seem to exist, but the specifications of the Penn Treebank have been applied to (and extended for) a broad range of languages, e.g., Icelandic, Old English, Middle English, Middle Low German, Early Modern High German, Yiddish, Portuguese, Japanese, Arabic and Chinese. === Conventions for interlinear glosses === In linguistics, an interlinear gloss is a gloss (series of brief explanations, such as definitions or pronunciations) placed between lines (inter- + linear), such as between a line of original text and its translation into another language. When glossed, each line of the original text acquires one or more lines of transcription known as an interlinear text or interlinear glossed text (IGT)—interlinear for short. Such glosses help the reader follow the relationship between the source text and its translation, and the structure of the original language. There is no standard inventory for glosses, but common labels are collected in the Leipzig Glossing Rules. Wikipedia also provides a List of glossing abbreviations that draws on this and other sources. === General Ontology for Linguistic Description (GOLD) === GOLD ("General Ontology for Linguistic Description") is an ontology for descriptive linguistics. It gives a formalized account of the most basic categories and relations used in the scientific description of human language, e.g., as a formalization of interlinear glosses. GOLD was first introduced by Farrar and Langendoen (2003). Originally, it was envisioned as a solution to the problem of resolving disparate markup schemes for linguistic data, in particular data from endangered languages. However, GOLD is much more general and can be applied to all languages. In this function, GOLD overlaps with the ISO 12620 Data Category Registry (ISOcat); it is, however, more stringently structured. GOLD was maintained by the LINGUIST List and others from 2007 to 2010. The RELISH project created a mirro

    Read more →
  • Operational historian

    Operational historian

    In manufacturing, an operational historian is a time-series database application that is developed for operational process data. Historian software is often embedded or used in conjunction with standard DCS and PLC control systems to provide enhanced data capture, validation, compression, and aggregation capabilities. Historians have been deployed in almost every industry and contribute to functions such as supervisory control, performance monitoring, quality assurance, and, more recently, machine learning applications which can learn from vast quantities of historical data. These systems were originally developed to capture instrumentation and control data, which led many to use the term "tag" for a stream of process data, referring to the physical "tags" which had been placed on instrumentation for manually capturing data. Raw data may be accessed via OPC HDA, SQL, or REST API interfaces. == Operational Support == Operational historians are typically used within the manufacturing facility by engineers and operators for supervisory functions and analysis. An operational historian will typically capture all instrumentation and control data, whereas an enterprise historian that is deployed to support business functions will capture only a subset of the plant data. Typically, these applications offer data access through dedicated APIs (Application Programming Interfaces) and SDKs (Software Development Kits) which offer high-performance read and write operations. These operate through vendor-specific or custom applications. Front-end tools for trending process data over time are the most common interfaces to these databases. Because these applications are typically deployed next to or near the source of their process data, they are often marketed and sold as 'real-time database systems.' This distinction varies among vendors, who often have to make tradeoffs in performance between data capture and presentation, and application and analysis functionality. The following is a list of typical challenges for operational historians: data collection from instrumentation and controls storage and archiving of very large volumes of data organization of data in the form of "tags" or "points" limiting of monitoring (alarms) and validation aggregation and interpolation manual data entry (MDE) == Data access == As opposed to enterprise historians, the data access layer in the operational historian is designed to offer sophisticated data fetching modes without complex information analysis facilities. The following settings are typically available for data access operations: Data scope (single point or tag, history based on time range, history based on sample count) Request modes (raw data, last-known value, aggregation, interpolation) Sampling (single point, all points without sampling, all points with interval sampling) Data omission (based on the sample quality, based on the sample value, based on the count) Even though the operational historians are rarely relational database management systems, they often offer SQL-based interfaces to query the database. In most of such implementations, the dialect does not follow the SQL standard in order to provide syntax for specifying data access operations parameters.

    Read more →
  • Literature review

    Literature review

    A literature review is an overview of previously published works on a particular topic. The term can refer to a full scholarly paper or a section of a scholarly work such as books or articles. Either way, a literature review provides the researcher/author and the audiences with general information of an existing knowledge of a particular topic. A good literature review has a proper research question, a proper theoretical framework, and/or a chosen research method. It serves to situate the current study within the body of the relevant literature and provides context for the reader. In such cases, the review usually precedes the methodology and results sections of the work. Producing a literature review is often part of a graduate and post-graduate requirement, included in the preparation of a thesis, dissertation, or a journal article. Literature reviews are also common in a research proposal or prospectus (the document approved before a student formally begins a dissertation or thesis). A literature review can be a type of a review article. In this sense, it is a scholarly paper that presents the current knowledge including substantive findings as well as theoretical and methodological contributions to a particular topic. Literature reviews are secondary sources and do not report new or original experimental work. Most often associated with academic-oriented literature, such reviews are found in academic journals and are not to be confused with book reviews, which may also appear in the same publication. Literature reviews are a basis for research in nearly every academic field. == Types == Since the concept of a systematic review was formalized in the 1970s, a basic division among types of reviews is the dichotomy of narrative reviews versus systematic reviews. The main types of narrative reviews are evaluative, exploratory, and instrumental. A fourth type of review of literature (the scientific literature) is the systematic review but it is not called a literature review, which absent further specification, conventionally refers to narrative reviews. A systematic review focuses on a specific research question to identify, appraise, select, and synthesize all high-quality research evidence and arguments relevant to that question. A meta-analysis is typically a systematic review using statistical methods to effectively combine the data used on all selected studies to produce a more reliable result. Torraco (2016) describes an integrative literature review. The purpose of an integrative literature review is to generate new knowledge on a topic through the process of review, critique, and synthesis of the literature under investigation. George et al (2023) offer an extensive overview of review approaches. They also propose a model for selecting an approach by looking at the purpose, object, subject, community, and practices of the review. They describe six different types of review, each with their own unique purposes: Exploratory or scoping reviews focus on breadth as opposed to depth Systematic or integrative reviews integrate empirical studies on a topic Meta-narrative reviews are qualitative and use literature to compare research or practice communities Problematizing or critical reviews propose new perspectives on a concept by association with other literature Meta-analyses and meta-regressions integrate quantitative studies and identify moderators Mixed research syntheses combine other review approaches in the same paper == Process and product == Shields and Rangarajan (2013) distinguish between the process of reviewing the literature and a finished work or product known as a literature review. The process of reviewing the literature is often ongoing and informs many aspects of the empirical research project. The process of reviewing the literature requires different kinds of activities and ways of thinking. Shields and Rangarajan (2013) and Granello (2001) link the activities of doing a literature review with Benjamin Bloom's revised taxonomy of the cognitive domain (ways of thinking: remembering, understanding, applying, analyzing, evaluating, and creating). === Use of artificial intelligence in a literature review === Artificial intelligence (AI) is reshaping traditional literature reviews across various disciplines. Generative pre-trained transformers, such as ChatGPT, are often used by students and academics for review purposes. Since 2023, an increasing number of tools powered by large language models and other artificial intelligence technologies have been developed to assist, automate, or generate literature reviews. Nevertheless, the employment of ChatGPT in academic reviews is problematic due to ChatGPT's propensity to "hallucinate". In response, efforts are being made to mitigate these hallucinations through the integration of plugins. For instance, Rad et al. (2023) used ScholarAI for review in cardiothoracic surgery.

    Read more →
  • Pydio

    Pydio

    Pydio Cells, previously known as just Pydio and formerly known as AjaXplorer, is an open-source file-sharing and synchronisation software that runs on the user's own server or in the cloud. == Presentation == The project was created by musician Charles Du Jeu (current CEO and CTO) in 2007 under the name AjaXplorer. The name was changed in 2013 and became Pydio (an acronym for Put Your Data in Orbit). In May 2018, Pydio switched from PHP to Go with the release of Pydio Cells. The PHP version reached end-of-life state on 31 December 2019. Pydio Cells runs on any server supporting a recent Go version. Windows/Linux/macOS on the Intel architecture are directly supported; a fully functional working ARM implementation is under active development. Pydio Cells has been developed from scratch using the Go programming language; release 4.0.0 introduced code refactoring to fully support the Go modular structure as well as grid computing. Nevertheless, the web-based interface of Cells is very similar to the one from Pydio 8 (in PHP), and it successfully replicates most of its features, while adding a few more. There is also a new synchronisation client (also written in Go). The PHP version has been phased out as the company's focus is moving to Pydio Cells, with community feedback on the new features. According to the company, the switch to the new environment was made "to overcome inherent PHP limitations and provide you with a future-proof and modern solution for collaborating on documents". From a technical point of view, Pydio differs from solutions such as Google Drive or Dropbox. Pydio is not based on a public cloud; instead, the software connects to the user's existing storage (such as SAN / Local FS, SAMBA / CIFS, (s)FTP, NFS, S3-compatible cloud storage, Azure Blob Storage, Google Cloud Storage) as well as to the existing user directories (LDAP / AD, OAuth2 / OIDC SSO, SAML / Azure ADFS SSO, RADIUS, Shibboleth...), which allows companies to keep their data inside their infrastructure, according to their data security policy and user rights management. The software is built in a modular perspective; up to Pydio 8, various plugins allowed administrators to implement extra features. On the server side, Pydio Cells is deployed as a collection of independent microservices communicating among themselves using gRPC and logging user actions via Activity Streams 2.0 (AS2). Pydio Cells microservices are built with the Go Micro framework (using an embedded NATS server). A standard installation will deploy all required services on the same physical server, but for the purposes of performance, reliability and high availability, these can now be spread across several different servers (even in geographically separate locations) according to the 12-factors architecture pattern. Pydio Cells is available either through a free and open-source community distribution (Pydio Cells Home), or a commercially-licensed enterprise distribution (in two variants, Pydio Cells Connect and Pydio Cells Enterprise), which add features not available in the community distribution as well as additional levels of support beyond the community forums. == Features == File sharing between different internal users and across other Pydio instances SSL/TLS Encryption WebDAV file server Creation of dedicated workspaces, for each line of business / project / client, with a dedicated user rights management for each workspace. File-sharing with external users (private links, public links, password protection, download limitation, etc.) Online viewing and editing of documents with Collabora Office (Pydio Cells Enterprise also offers OnlyOffice integration) Preview and editing of image files Integrated audio and video reader Activity stream ('timeline') for all actions taken by users Integrated chat platform Client applications are available for all major desktop and mobile platforms.

    Read more →
  • Kinodynamic planning

    Kinodynamic planning

    In robotics and motion planning, kinodynamic planning is a class of problems for which velocity, acceleration, and force/torque bounds must be satisfied, together with kinematic constraints such as avoiding obstacles. The term was coined by Bruce Donald, Pat Xavier, John Canny, and John Reif. Donald et al. developed the first polynomial-time approximation schemes (PTAS) for the problem. By providing a provably polynomial-time ε-approximation algorithm, they resolved a long-standing open problem in optimal control. Their first paper considered time-optimal control ("fastest path") of a point mass under Newtonian dynamics, amidst polygonal (2D) or polyhedral (3D) obstacles, subject to state bounds on position, velocity, and acceleration. Later they extended the technique to many other cases, for example, to 3D open-chain kinematic robots under full Lagrangian dynamics. == Modern approaches == Since the foundational theoretical work of the 1990s, the field has evolved significantly with new algorithmic approaches that address the computational and practical limitations of early methods. === Sampling-based methods === Many practical heuristic algorithms based on stochastic optimization and iterative sampling have been developed by a wide range of authors to address the kinodynamic planning problem. Popular approaches include extensions of RRT algorithms such as RRT for kinodynamic systems, and sampling-based methods like Model Predictive Path Integral (MPPI) control. These stochastic techniques have been shown to work well in practice and can handle complex, high-dimensional state spaces more efficiently than deterministic methods. However, all motion planning methods are subject to the PSPACE-hardnesss of classical motion planning even without dynamics, which means (assuming the usual structural complexity conjectures) they all can be worst-case exponential-time in the state-space dimension (the number of degrees of freedom). On the other hand, the deterministic methods have provable guarantees of completeness, accuracy, and complexity (for fixed dimension, they are polynomial-time not only in the geometric complexity, but also in ( 1 / ε ) {\displaystyle (1/\varepsilon )} , the closeness of the desired approximation), whereas most of the recent heuristic/stochastic methods sacrifice at least one of these criteria. === Mixed-integer optimization approaches === Recent advances in mixed-integer programming have enabled new deterministic approaches to kinodynamic planning. These methods formulate the planning problem as an optimization task that simultaneously determines the spatial path and control sequence while respecting all kinodynamic constraints. By using techniques such as McCormick envelopes to handle bilinear constraints, these approaches can provide globally optimal solutions with mathematical guarantees while achieving significant computational speedups over traditional methods. === Genetic algorithm approaches === Genetic algorithms have also been adapted for kinodynamic planning, particularly for gradient-free optimization in challenging terrain. These methods use evolutionary computation to optimize trajectories over receding horizons, with specialized mutation operators that ensure vehicle controls remain within operational limits. This approach is particularly useful when dealing with non-differentiable cost functions or when gradient information is unavailable or unreliable. === Three-dimensional terrain planning === The foundational theoretical work of the 1990s was extended to higher degrees of freedom, and even to n {\displaystyle n} -link, 3D open-chain kinematic robots under full Lagrangian dynamics. However, many of the subsequent heuristic techniques (typically employing stochastic optimization) were confined to planar environments. More recent kinodynamic planning has extended beyond these planar environments to handle complex 3D terrains represented as simplicial complexes or triangular meshes. This advancement is particularly important for applications such as autonomous vehicle navigation in off-road environments, where elevation changes and terrain geometry significantly impact vehicle dynamics. These methods must account for pitch angles, surface curvature, and the coupling between terrain geometry and vehicle kinodynamic constraints. == Performance and guarantees == The landscape of performance guarantees in kinodynamic planning has evolved considerably. While early heuristic methods could not guarantee optimality, recent mixed-integer approaches have demonstrated the ability to find globally optimal solutions with proven constraint satisfaction. Experimental comparisons have shown that modern optimization-based planners can achieve execution times several orders of magnitude faster than sampling-based methods while maintaining strict adherence to kinodynamic constraints. However, the choice of method often depends on the specific application requirements. Sampling-based methods remain valuable for their ability to quickly find feasible solutions in high-dimensional spaces and their robustness to modeling uncertainties. Optimization-based methods excel when optimality guarantees and constraint compliance are critical, particularly in safety-critical applications. == Applications == Kinodynamic planning finds applications across numerous domains including: Autonomous vehicles: Path planning for cars, trucks, and other ground vehicles that must respect acceleration, steering, and velocity limits Aerial robotics: Trajectory planning for quadrotors and other unmanned aerial vehicles with dynamic constraints Manipulation: Planning for robotic arms where joint velocities, accelerations, and torques are limited Legged locomotion: Footstep and trajectory planning for walking and running robots Space robotics: Planning under thrust and fuel constraints for spacecraft and rovers

    Read more →
  • Sardinas–Patterson algorithm

    Sardinas–Patterson algorithm

    In coding theory, the Sardinas–Patterson algorithm is a classical algorithm for determining in polynomial time whether a given variable-length code is uniquely decodable, named after August Albert Sardinas and George W. Patterson, who published it in 1953. The algorithm carries out a systematic search for a string which admits two different decompositions into codewords. As Knuth reports, the algorithm was rediscovered about ten years later in 1963 by Floyd, despite the fact that it was at the time already well known in coding theory. == Idea of the algorithm == Consider the code { a ↦ 1 , b ↦ 011 , c ↦ 01110 , d ↦ 1110 , e ↦ 10011 } {\displaystyle \{\,{\texttt {a}}\mapsto {\texttt {1}},{\texttt {b}}\mapsto {\texttt {011}},{\texttt {c}}\mapsto {\texttt {01110}},{\texttt {d}}\mapsto {\texttt {1110}},{\texttt {e}}\mapsto {\texttt {10011}}\,\}} . This code, which is based on an example by Berstel, is an example of a code which is not uniquely decodable, since the string 011101110011 can be interpreted as the sequence of codewords 01110 – 1110 – 011, but also as the sequence of codewords 011 – 1 – 011 – 10011. Two possible decodings of this encoded string are thus given by cdb and babe. In general, a codeword can be found by the following idea: In the first round, we choose two codewords x 1 {\displaystyle x_{1}} and y 1 {\displaystyle y_{1}} such that x 1 {\displaystyle x_{1}} is a prefix of y 1 {\displaystyle y_{1}} , that is, x 1 w = y 1 {\displaystyle x_{1}w=y_{1}} for some "dangling suffix" w {\displaystyle w} . If one tries first x 1 = 011 {\displaystyle x_{1}={\texttt {011}}} and y 1 = 01110 {\displaystyle y_{1}={\texttt {01110}}} , the dangling suffix is w = 10 {\displaystyle {\texttt {w}}={\texttt {10}}} . If we manage to find two sequences x 2 , … , x p {\displaystyle x_{2},\ldots ,x_{p}} and y 2 , … , y q {\displaystyle y_{2},\ldots ,y_{q}} of codewords such that x 2 ⋯ x p = w y 2 ⋯ y q {\displaystyle x_{2}\cdots x_{p}=wy_{2}\cdots y_{q}} , then we are finished: For then the string x = x 1 x 2 ⋯ x p {\displaystyle x=x_{1}x_{2}\cdots x_{p}} can alternatively be decomposed as y 1 y 2 ⋯ y q {\displaystyle y_{1}y_{2}\cdots y_{q}} , and we have found the desired string having at least two different decompositions into codewords. In the second round, we try out two different approaches: the first trial is to look for a codeword that has w as prefix. Then we obtain a new dangling suffix w, with which we can continue our search. If we eventually encounter a dangling suffix that is itself a codeword (or the empty word), then the search will terminate, as we know there exists a string with two decompositions. The second trial is to seek for a codeword that is itself a prefix of w. In our example, we have w = 10 {\displaystyle w={\texttt {10}}} , and the sequence 1 is a codeword. We can thus also continue with w = 0 {\displaystyle w={\texttt {0}}} as the new dangling suffix. == Precise description of the algorithm == The algorithm is described most conveniently using quotients of formal languages. In general, for two sets of strings D and N, the (left) quotient N − 1 D {\displaystyle N^{-1}D} is defined as the residual words obtained from D by removing some prefix in N. Formally, N − 1 D = { y ∣ x y ∈ D and x ∈ N } {\displaystyle N^{-1}D=\{\,y\mid xy\in D~{\textrm {and}}~x\in N\,\}} . Now let C {\displaystyle C} denote the (finite) set of codewords in the given code. The algorithm proceeds in rounds, where we maintain in each round not only one dangling suffix as described above, but the (finite) set of all potential dangling suffixes. Starting with round i = 1 {\displaystyle i=1} , the set of potential dangling suffixes will be denoted by S i {\displaystyle S_{i}} . The sets S i {\displaystyle S_{i}} are defined inductively as follows: S 1 = C − 1 C ∖ { ε } {\displaystyle S_{1}=C^{-1}C\setminus \{\varepsilon \}} . Here, the symbol ε {\displaystyle \varepsilon } denotes the empty word. S i + 1 = C − 1 S i ∪ S i − 1 C {\displaystyle S_{i+1}=C^{-1}S_{i}\cup S_{i}^{-1}C} , for all i ≥ 1 {\displaystyle i\geq 1} . The algorithm computes the sets S i {\displaystyle S_{i}} in increasing order of i {\displaystyle i} . As soon as one of the S i {\displaystyle S_{i}} contains a word from C or the empty word, then the algorithm terminates and answers that the given code is not uniquely decodable. Otherwise, once a set S i {\displaystyle S_{i}} equals a previously encountered set S j {\displaystyle S_{j}} with j < i {\displaystyle j Read more →

  • Knowledge organization system

    Knowledge organization system

    Knowledge organization system (KOS), concept system, or concept scheme is the generic term used in knowledge organization (KO) for the selection of concepts with an indication of selected semantic relations. Despite their differences in type, coverage, and application, all KOS aim to support the organization of knowledge and information to facilitate their management and retrieval. KOS vary in complexity from simple sorted lists to complex relational networks. They represent both structural and functional features, and serve to eliminate ambiguity, control synonyms, establish relationships, and present properties. From their origins in library and information science (LIS), KOS have been applied to other domains and disciplines within science and industry, although scholarly research and debate remain primarily within the KO field. Challenges of KOS include ambiguity of terminology, repercussions of biased systems, and potential obsolescence. KOS can be expressed in RDF and RDFS as per the Simple Knowledge Organization System (SKOS) recommendation by W3C, which aims to enable the sharing and linking of KOS via the Web. One of the largest collections of KOS is the BARTOC registry. == Types == While different schema of KOS have been proposed, most are generally arranged in terms of the complexity of their construction and maintenance. Some scholars argue that organizing KOS on a spectrum oversimplifies the shared characteristics among them, and may even result in a non-ideal structure being chosen. The following types are not exhaustive, and are often not mutually-exclusive in practice. === Term lists === Term lists are the least structured form of KOS. They include lists, glossaries, dictionaries, and synonym rings. Authority files and gazetteers may also be considered term lists, however other scholars categorize them and directories as "metadata-like models". Examples include the Union List of Artist Names name authority file and the GeoNames gazetteer. === Categorization and classification === KOS that emphasize specific (and often hierarchical) structures include subject headings, taxonomies, categorization schema, and classification schema & systems. Despite inconsistent use of the terms "categorization" and "classification" in some literature, categorization is generally loosely-assembled grouping schema and may include attributes that are not mutually exclusive (or having fuzzy boundaries), while classification is related to the arrangement of non-overlapping and mutually-exclusive classes. Classification schema may be universal (such as Dewey Decimal Classification and Information Coding Classification) or domain-specific (such as the National Library of Medicine Classification). === Relationship models === The types of KOS with greatest complexity and which utilize connections between concepts include thesauri, semantic networks, and ontologies. One of the most prominent examples of a semantic network is WordNet. === Others === Certain structures proposed to be considered types of KOS—but are not consistently included in schema—include folksonomies, topic maps, web directory structures, publication organization systems, and bibliometric maps. Some KOS organize other KOS themselves—for instance, PeriodO is a gazetteer of periodization categories. == Applications == Some early KOS were developed as a support system for abstracting and indexing services to be used by specially-trained searchers. With the growth of information digitization, usability became increasingly accessible, and more complex structures were developed. Prominent examples of KOS outside of LIS include organism taxonomy in biology, the periodic table of elements in chemistry, SIC and NAICS classification systems for industry & business, and AGROVOC agricultural controlled vocabulary. == Challenges == The study and design of KOS is an ongoing topic of discussion among KO scholars. === Terminology === [There is] a serious lack of vocabulary control in the literature on controlled vocabulary. Inconsistency of terminology within the study of KOS is a common issue. For instance, "ontology" is used for both a specific type of KOS as well as a generic term for any KOS. The terms "taxonomy", "classification", and "categorization" are also sometimes used interchangeably. === Bias === As knowledge can be historically and culturally biased, scholars have also discussed how KOS themselves can perpetuate harmful practices or stereotypes. For example, a number of concerns and criticisms about the classification of mental disorders in the Diagnostic and Statistical Manual of Mental Disorders have been raised, contributing to ongoing revisions. Ethical and intentional design approaches have been proposed for multi-perspective KOS in efforts to mitigate bias and other harmful practices. === Obsolescence === The possible obsolescence of the thesaurus and other simpler KOS has been the topic of debate, especially in the face of increasingly complex ontologies, the growing usage of "Google-like retrieval systems", and the move of KO theory and research away from LIS and toward computer science. Supporters of thesauri argue its continued usefulness for metadata enrichment, vocabulary mapping, and web services, as well as its usage in specific domains such as corporate intranets and digital image libraries.

    Read more →
  • Opponent process

    Opponent process

    The opponent process is a hypothesis of color vision that states that the human visual system interprets information about color by processing signals from the three types of photoreceptor cells in an antagonistic manner. The three types of cones are called L, M, and S. The names stand for "Long wavelength sensitive,” "middle wavelength sensitive," and "short wavelength sensitive." The opponent-process theory implicates three opponent channels: L versus M, S versus (L+M), and a luminance channel (+ versus -). These cone-opponent mechanisms were at one time thought to be the neural substrate for a psychological theory called Hering's Opponent Colors Theory, which calls for three psychologically important opponent color processes: red versus green, blue versus yellow, and black versus white (luminance). The Opponent Colors Theory is named for the German physiologist Ewald Hering who proposed the idea in the late 19th century. However, it has been argued that Hering’s Opponent Colors Theory lacks adequate phenomenological and empirical support, and may not be a necessary feature of normal human color experience. Correspondingly, considerable physiological and behavioral evidence proves that the physiological cone opponent mechanisms do not constitute the neurobiological basis for Hering's Opponent Colors Theory. == Color theory == === Complementary colors === When staring at a bright color for a while (e.g. red), then looking away at a white field, an afterimage is perceived, such that the original color will evoke its complementary color (cyan, in the case of red input). When complementary colors are combined or mixed, they "cancel each other out" and become neutral (white or gray). That is, complementary colors are never perceived as a mixture; there is no "greenish red" or "yellowish blue", despite claims to the contrary. The strongest color contrast that a color can have is its complementary color. Complementary colors may also be called "opposite colors" and they were originally considered the primary evidence in support of Hering's Opponent Colors Theory. There are two fatal problems with this evidence. First, the complement of red is not green, as called for by Hering's theory; it is bluish-green. And second, there exists a complementary color for every color, so there is nothing special about the set of complementary pairs picked out by Hering's theory. === Unique hues === The colors that define the extremes for each opponent channel are called unique hues, as opposed to composite (mixed) hues. Ewald Hering first defined the unique hues as red, green, blue, and yellow, and based them on the concept that these colors could not be simultaneously perceived. For example, a color cannot appear both red and green. These definitions have been experimentally refined and are represented today by average hue angles of 353° (carmine red), 128° (cobalt green), 228° (cobalt blue), 58° (yellow). The unique hues are a defining feature of many psychological color spaces, but there is substantial evidence showing that the unique hues are not hard wired in the nervous system, contrary to the stipulations of Hering's Opponent Colors Theory. Unique hues can differ between individuals and are often used in psychophysical research to measure variations in color perception due to color-vision deficiencies or color adaptation. While there is considerable inter-subject variability when defining unique hues experimentally, an individual's unique hues are very consistent, to within a few nanometers of wavelength. == Physiological basis == === Relation to LMS color space === The trichromatic theory is in conflict with Hering's Opponent Colors Theory, although it is compatible with a physiological opponent process that compares the outputs of the different classes of cone types. The poles of these cone opponent mechanisms do not correspond to the unique hues of Hering's Opponent Colors Theory and unlike the unique hues, have no privilege in color perception. Most humans have three different cone cells in their retinas that facilitate trichromatic color vision. Colors are determined by the proportional excitation of these three cone types, i.e. their quantum catch. The levels of excitation of each cone type are the parameters that define LMS color space. To calculate the opponent process tristimulus values from the LMS color space, the cone excitations must be compared: The luminous (achromatic) opponent channel is a weighted sum of all three cone cells (plus the rod cells in some conditions). The red–green opponent channel is equal to the difference of the L- and M-cones. The blue–yellow opponent channel is equal to the difference of the S-cone and the average/weighted sum of the L- and M-cones. Most mammals have no L cone (the primate L cone arose from a gene duplication of the M cone opsin gene). These mammals still show two kinds of opponent channels in their retinal ganglion cells: the achromatic channel and the blue-yellow opponency channel. === Cone opponent mechanisms are encoded in the retina === The output of different types of cones are compared by cells in the retina including retina bipolar cells (which compare signals from L and M cones) and bistratified retinal ganglion cells (which compare S cone signals with L and M cone signals). The output of bipolar cells is relayed to the visual cortex by the retinal ganglion cells (RGCs) by way of a thalamic relay station called the lateral geniculate nucleus (LGN) of the thalamus. Much of the scientific knowledge of retinal ganglion cell physiology was obtained by neural recordings of cells in the LGN. The cone-opponent mechanisms in the retina and LGN represent a fundamental physiological opponent process but do not represent the unique hues (or Hering's Opponent Colors Theory). For example, the colors that best elicit responses of the bistratified S-(L+M)-opponent neurons are best described as purplish (or lavender) and lime-green, not "blue" and "yellow". The neurons are sometimes referred to as "blue–yellow" neurons, but this is a historical artifact dating to the time when it was thought that Hering's Opponent Colors Theory was hardwired by the retina and the mismatch between the colors to which they are optimally tuned and Hering's Opponent Colors was overlooked. Cone opponent mechanisms exist in the retinas of many mammals, including monkeys, mice, and cats. In primates, the LGN contains three major classes of layers: Magnocellular layers (M, large-cell) – responsible largely for the luminance channel Parvocellular layers (P, small-cell) – responsible largely for red–green opponency Koniocellular layers (K) – responsible largely for blue–yellow opponency, poor spatial resolution, long latency Other mammals such as cats also have three cell types denoted as X (magno), Y (parvo), and W (konio). The W type is beyond most doubt homologous to the primate K type. There are some subtle differences between the M and X types as well as the Y and P types to make the correspondence unclear. === Advantage === Transmitting information in opponent-channel color space could be advantageous over transmitting it in LMS color space ("raw" signals from each cone type). There is some overlap in the wavelengths of light to which the three types of cones (L for long-wave, M for medium-wave, and S for short-wave light) respond, so it is more efficient for the visual system (from a perspective of dynamic range) to record differences between the responses of cones, rather than each type of cone's individual response. Hurvich and Jameson argued that the use of opponent-channel color space would increase color contrast, making the information easier to process by later stages of vision. === Color blindness === Color blindness can be classified by the cone cell that is affected (protan, deutan, tritan) or by the opponent channel that is affected (red–green or blue–yellow). In either case, the channel can either be inactive (in the case of dichromacy) or have a lower dynamic range (in the case of anomalous trichromacy). For example, individuals with deuteranopia see little difference between the red and green unique hues. == History == Johann Wolfgang von Goethe first studied the physiological effect of opposed colors in his Theory of Colours in 1810. Goethe arranged his color wheel symmetrically "for the colours diametrically opposed to each other in this diagram are those which reciprocally evoke each other in the eye. Thus, yellow demands purple; orange, blue; red, green; and vice versa: Thus again all intermediate gradations reciprocally evoke each other." Ewald Hering proposed opponent color theory in 1892. He thought that the colors red, yellow, green, and blue are special in that any other color can be described as a mix of them, and that they exist in opposite pairs. That is, either red or green is perceived and never greenish-red: Even though yellow is a mixture of red and green in the RGB color theory, humans

    Read more →
  • Australian Geoscience Data Cube

    Australian Geoscience Data Cube

    The Australian Geoscience Data Cube (AGDC) is an approach to storing, processing and analyzing large collections of Earth observation data. The technology is designed to meet challenges of national interest by being agile and flexible with vast amounts of layered grid data. The AGDC reduces processing time of traditional image analysis by calibrating, pre-computing known extents, pixel alignment and storing metadata in a cell lattice structure. The temporal-pixel aligned data can often be analysed faster across space and time dimensions than previous scene based techniques. This allows the AGDC to be flexible in tackling future challenges and improve analysis times on every-increasing data repositories of earth observation. The AGDC has also been used internationally to allow countries to maintain ecologically sustainable programs and reduce the difficulty curve of utilizing Remote Sensing data. == Background == The AGDC was originally conceived by Geoscience Australia but is now maintained in a partnership between Geoscience Australia, Commonwealth Scientific and Industrial Research Organisation (CSIRO) and National Computational Infrastructure National Facility (Australia) (NCI). This is made possible by the funding from the partnership and a number of organisations such as National Collaborative Research Infrastructure Strategy (NCRIS). == Analysis ready data, ingestion and indexing == The data processed in the cube is made analysis ready before being ingested and indexed into the AGDC. Analysis ready data is pre-processed data that has applied corrections for instrument calibration (gains and offsets), geolocation (spatial alignment) and radiometry (solar illumination, incidence angle, topography, atmospheric interference). The ingestion process manages the translation of datasets into the storage units while maintaining a database index. The data within the storage and index can be accessed via API calls often compiled within code such as Python (programming language). Example: s2a_l1c = dc.load(product='s2a_level1c_granule',x=(147.36, 147.41), y=(-35.1, -35.15), measurements=['04','03','02'], output_crs='EPSG:4326', resolution=(-0.00025,0.00025)) === Datasets currently stored === Geoscience Australia Landsat Surface Reflectance (1987 to present) Landsat Pixel Quality Landsat Fractional Cover Landsat NDVI === Datasets that have been piloted === USGS Landsat Surface Reflectance SRTM DEM Himawari 8 MODIS Sentinel-2 L1C / S2A Australian Gridded Climate Data == Open source == The AGDC code base is situated in GitHub as an open repository. The core code base moved to the Open Data Cube in early 2017 as part of an international collaboration. Whilst the code base is the Open Data Cube, individual cubes exist as their own right such as the AGDC on the National Computational Infrastructure National Facility (Australia) (NCI) using the High-Performance Computing Cluster HPCC. The core code can be installed on personal computers or public computers (using git) and has many unit tests. Documentation for the code base exists on Read the Docs. == Challenges of the AGDC == The AGDC is designed to meet nationally significant challenges such as the following. Sustainability Environment Water resource management Disaster assist Policy development Community planning Forest preservation Carbon measurement == International awards == The AGDC won the 2016 Content Platform of the Year award from Geospatial World Forum.

    Read more →
  • Ontology-based data integration

    Ontology-based data integration

    Ontology-based data integration involves the use of one or more ontologies to effectively combine data or information from multiple heterogeneous sources. It is one of the multiple data integration approaches and may be classified as Global-As-View (GAV). The effectiveness of ontology‑based data integration is closely tied to the consistency and expressivity of the ontology used in the integration process. == Background == Data from multiple sources are characterized by multiple types of heterogeneity. The following hierarchy is often used: Syntactic heterogeneity: is a result of differences in representation format of data Schematic or structural heterogeneity: the native model or structure to store data differ in data sources leading to structural heterogeneity. Schematic heterogeneity that particularly appears in structured databases is also an aspect of structural heterogeneity. Semantic heterogeneity: differences in interpretation of the 'meaning' of data are source of semantic heterogeneity System heterogeneity: use of different operating system, hardware platforms lead to system heterogeneity Ontologies, as formal models of representation with explicitly defined concepts and named relationships linking them, are used to address the issue of semantic heterogeneity in data sources. In domains like bioinformatics and biomedicine, the rapid development, adoption and public availability of ontologies [1] Archived 2007-06-16 at the Wayback Machine has made it possible for the data integration community to leverage them for semantic integration of data and information. == The role of ontologies == Ontologies enable the unambiguous identification of entities in heterogeneous information systems and assertion of applicable named relationships that connect these entities together. Specifically, ontologies play the following roles: Content Explication The ontology enables accurate interpretation of data from multiple sources through the explicit definition of terms and relationships in the ontology. Query Model In some systems like SIMS, the query is formulated using the ontology as a global query schema. Verification The ontology verifies the mappings used to integrate data from multiple sources. These mappings may either be user specified or generated by a system. == Approaches using ontologies for data integration == There are three main architectures that are implemented in ontology‑based data integration applications, namely, Single ontology approach A single ontology is used as a global reference model in the system. This is the simplest approach as it can be simulated by other approaches. SIMS is a prominent example of this approach. The Structured Knowledge Source Integration component of Research Cyc is another prominent example of this approach. (Title = Harnessing Cyc to Answer Clinical Researchers' Ad Hoc Queries). The Gellish Taxonomic Dictionary-Ontology follows this approach as well. Multiple ontologies Multiple ontologies, each modeling an individual data source, are used in combination for integration. Though, this approach is more flexible than the single ontology approach, it requires creation of mappings between the multiple ontologies. Ontology mapping is a challenging issue and is focus of large number of research efforts in computer science [2]. The OBSERVER system is an example of this approach. Hybrid approaches The hybrid approach involves the use of multiple ontologies that subscribe to a common, top-level vocabulary. The top-level vocabulary defines the basic terms of the domain. Thus, the hybrid approach makes it easier to use multiple ontologies for integration in presence of the common vocabulary.

    Read more →
  • Broadcast (parallel pattern)

    Broadcast (parallel pattern)

    Broadcast is a collective communication primitive in parallel programming to distribute programming instructions or data to nodes in a cluster. It is the reverse operation of reduction. The broadcast operation is widely used in parallel algorithms, such as matrix-vector multiplication, Gaussian elimination and shortest paths. The Message Passing Interface implements broadcast in MPI_Bcast. == Definition == A message M [ 1.. m ] {\displaystyle M[1..m]} of length m {\displaystyle m} should be distributed from one node to all other p − 1 {\displaystyle p-1} nodes. T byte {\displaystyle T_{\text{byte}}} is the time it takes to send one byte. T start {\displaystyle T_{\text{start}}} is the time it takes for a message to travel to another node, independent of its length. Therefore, the time to send a package from one node to another is t = s i z e × T byte + T start {\displaystyle t=\mathrm {size} \times T_{\text{byte}}+T_{\text{start}}} . p {\displaystyle p} is the number of nodes and the number of processors. == Binomial Tree Broadcast == With Binomial Tree Broadcast the whole message is sent at once. Each node that has already received the message sends it on further. This grows exponentially as each time step the amount of sending nodes is doubled. The algorithm is ideal for short messages but falls short with longer ones as during the time when the first transfer happens only one node is busy. Sending a message to all nodes takes log 2 ⁡ ( p ) t {\displaystyle \log _{2}(p)t} time which results in a runtime of log 2 ⁡ ( p ) ( m T byte + T start ) {\displaystyle \log _{2}(p)(mT_{\text{byte}}+T_{\text{start}})} == Linear Pipeline Broadcast == The message is split up into k {\displaystyle k} packages and sent piecewise from node n {\displaystyle n} to node n + 1 {\displaystyle n+1} . The time needed to distribute the first message piece is p t = m k T byte + T start {\textstyle pt={\frac {m}{k}}T_{\text{byte}}+T_{\text{start}}} whereby t {\displaystyle t} is the time needed to send a package from one processor to another. Sending a whole message takes ( p + k ) ( m T byte k + T start ) = ( p + k ) t = p t + k t {\displaystyle (p+k)\left({\frac {mT_{\text{byte}}}{k}}+T_{\text{start}}\right)=(p+k)t=pt+kt} . Optimal is to choose k = m ( p − 2 ) T byte T start {\displaystyle k={\sqrt {\frac {m(p-2)T_{\text{byte}}}{T_{\text{start}}}}}} resulting in a runtime of approximately m T byte + p T start + m p T start T byte {\displaystyle mT_{\text{byte}}+pT_{\text{start}}+{\sqrt {mpT_{\text{start}}T_{\text{byte}}}}} The run time is dependent on not only message length but also the number of processors that play roles. This approach shines when the length of the message is much larger than the amount of processors. == Pipelined Binary Tree Broadcast == This algorithm combines Binomial Tree Broadcast and Linear Pipeline Broadcast, which makes the algorithm work well for both short and long messages. The aim is to have as many nodes work as possible while maintaining the ability to send short messages quickly. A good approach is to use Fibonacci trees for splitting up the tree, which are a good choice as a message cannot be sent to both children at the same time. This results in a binary tree structure. We will assume in the following that communication is full-duplex. The Fibonacci tree structure has a depth of about d ≈ log Φ ⁡ ( p ) {\displaystyle d\approx \log _{\Phi }(p)} whereby Φ = 1 + 5 2 {\displaystyle \Phi ={\frac {1+{\sqrt {5}}}{2}}} the golden ratio. The resulting runtime is ( m k T byte + T start ) ( d + 2 k − 2 ) {\textstyle ({\frac {m}{k}}T_{\text{byte}}+T_{\text{start}})(d+2k-2)} . Optimal is k = n ( d − 2 ) T byte 3 T start {\displaystyle k={\sqrt {\frac {n(d-2)T_{\text{byte}}}{3T_{\text{start}}}}}} . This results in a runtime of 2 m T byte + T start log Φ ⁡ ( p ) + 2 m log Φ ⁡ ( p ) T start T byte {\displaystyle 2mT_{\text{byte}}+T_{\text{start}}\log _{\Phi }(p)+{\sqrt {2m\log _{\Phi }(p)T_{\text{start}}T_{\text{byte}}}}} . == Two Tree Broadcast (23-Broadcast) == === Definition === This algorithm aims to improve on some disadvantages of tree structure models with pipelines. Normally in tree structure models with pipelines (see above methods), leaves receive just their data and cannot contribute to send and spread data. The algorithm concurrently uses two binary trees to communicate over. Those trees will be called tree A and B. Structurally in binary trees there are relatively more leave nodes than inner nodes. Basic Idea of this algorithm is to make a leaf node of tree A be an inner node of tree B. It has also the same technical function in opposite side from B to A tree. This means, two packets are sent and received by inner nodes and leaves in different steps. === Tree construction === The number of steps needed to construct two parallel-working binary trees is dependent on the amount of processors. Like with other structures one processor can is the root node who sends messages to two trees. It is not necessary to set a root node, because it is not hard to recognize that the direction of sending messages in binary tree is normally top to bottom. There is no limitation on the number of processors to build two binary trees. Let the height of the combined tree be h = ⌈log(p + 2)⌉. Tree A and B can have a height of h − 1 {\displaystyle h-1} . Especially, if the number of processors correspond to p = 2 h − 1 {\displaystyle p=2^{h}-1} , we can make both sides trees and a root node. To construct this model efficiently and easily with a fully built tree, we can use two methods called "Shifting" and "Mirroring" to get second tree. Let assume tree A is already modeled and tree B is supposed to be constructed based on tree A. We assume that we have p {\displaystyle p} processors ordered from 0 to p − 1 {\displaystyle p-1} . ==== Shifting ==== The "Shifting" method, first copies tree A and moves every node one position to the left to get tree B. The node, which will be located on -1, becomes a child of processor p − 2 {\displaystyle p-2} . ==== Mirroring ==== "Mirroring" is ideal for an even number of processors. With this method tree B can be more easily constructed by tree A, because there are no structural transformations in order to create the new tree. In addition, a symmetric process makes this approach simple. This method can also handle an odd number of processors, in this case, we can set processor p − 1 {\displaystyle p-1} as root node for both trees. For the remaining processors "Mirroring" can be used. === Coloring === We need to find a schedule in order to make sure that no processor has to send or receive two messages from two trees in a step. The edge, is a communication connection to connect two nodes, and can be labelled as either 0 or 1 to make sure that every processor can alternate between 0 and 1-labelled edges. The edges of A and B can be colored with two colors (0 and 1) such that no processor is connected to its parent nodes in A and B using edges of the same color- no processor is connected to its children nodes in A or B using edges of the same color. In every even step the edges with 0 are activated and edges with 1 are activated in every odd step. === Time complexity === In this case the number of packet k is divided in half for each tree. Both trees are working together the total number of packets k = k / 2 + k / 2 {\displaystyle k=k/2+k/2} (upper tree + bottom tree) In each binary tree sending a message to another nodes takes 2 i {\displaystyle 2i} steps until a processor has at least a packet in step i {\displaystyle i} . Therefore, we can calculate all steps as d := log 2 ⁡ ( p + 1 ) ⇒ log 2 ⁡ ( p + 1 ) ≈ log 2 ⁡ ( p ) {\displaystyle d:=\log _{2}(p+1)\Rightarrow \log _{2}(p+1)\approx \log _{2}(p)} . The resulting run time is T ( m , p , k ) ≈ ( m k T byte + T start ) ( 2 d + k − 1 ) {\textstyle T(m,p,k)\approx ({\frac {m}{k}}T_{\text{byte}}+T_{\text{start}})(2d+k-1)} . (Optimal k = m ( 2 d − 1 ) T byte / T start {\textstyle k={\sqrt {{m(2d-1)T_{\text{byte}}}/{T_{\text{start}}}}}} ) This results in a run time of T ( m , p ) ≈ m T byte + T start ⋅ 2 log 2 ⁡ ( p ) + m ⋅ 2 log 2 ⁡ ( p ) T start T byte {\displaystyle T(m,p)\approx mT_{\text{byte}}+T_{\text{start}}\cdot 2\log _{2}(p)+{\sqrt {m\cdot 2\log _{2}(p)T_{\text{start}}T_{\text{byte}}}}} . == ESBT-Broadcasting (Edge-disjoint Spanning Binomial Trees) == In this section, another broadcasting algorithm with an underlying telephone communication model will be introduced. A Hypercube creates network system with p = 2 d ( d = 0 , 1 , 2 , 3 , . . . ) {\displaystyle p=2^{d}(d=0,1,2,3,...)} . Every node is represented by binary 0 , 1 {\displaystyle {0,1}} depending on the number of dimensions. Fundamentally ESBT(Edge-disjoint Spanning Binomial Trees) is based on hypercube graphs, pipelining( m {\displaystyle m} messages are divided by k {\displaystyle k} packets) and binomial trees. The Processor 0 d {\displaystyle 0^{d}} cyclically spreads packets to roots of ESB

    Read more →
  • IOS SDK

    IOS SDK

    The iOS SDK (iOS Software Development Kit), formerly the iPhone SDK, is a software development kit (SDK) developed by Apple Inc. The kit allows for the development of mobile apps on Apple's iOS 17 and iPadOS operating systems. The iOS SDK is a free download for users of Macintosh (or Mac) personal computers. It is not available for Microsoft Windows PCs. The SDK contains sets giving developers access to various functions and services of iOS devices, such as hardware and software attributes. It also contains an iPhone simulator to mimic the look and feel of the device on the computer while developing. New versions of the SDK accompany new versions of iOS. In order to test applications, get technical support, and distribute apps through App Store, developers are required to subscribe to the Apple Developer Program. Combined with Xcode, the iOS SDK helps developers write iOS apps using officially supported programming languages, including Swift and Objective-C. Other companies have also created tools that allow for the development of native iOS apps using their respective programming languages. == History == While originally developing iPhone prior to its unveiling in 2007, Apple's then-CEO Steve Jobs did not intend to let third-party developers build native apps for the iOS operating system, instead directing them to make web applications for the Safari web browser. However, backlash from developers prompted the company to reconsider, with Jobs announcing on October 17, 2007, that Apple would have a software development kit (SDK) available for developers by February 2008. The SDK was released on March 6, 2008. == Features == The iOS SDK is a free download for Mac users. It is not available for Microsoft Windows. To test the application, get technical support, and distribute applications through App Store, developers are required to subscribe to the Apple Developer Program. The SDK contents are separated into the following sets: UIKit Multi-touch events and controls Accelerometer support View hierarchy Localization (i18n) Camera support Media OpenAL audio mixing and recording Video playback Image file formats Quartz Core Animation OpenGL ES Core Services Networking Embedded SQLite database Core Location Threads CoreMotion Mac OS X Kernel TCP/IP Sockets Power management File system Security The SDK also contains an iPhone simulator, a program used to simulate the look and feel of iPhone on the developer's computer. New SDK versions accompany new iOS versions. == Programming languages == The iOS SDK, combined with Xcode, helps developers write iOS applications using officially supported programming languages, including Swift and Objective-C. An .ipa (iOS App Store Package) file is an iOS application archive file which stores an iOS app. === Java === In 2008, Sun Microsystems announced plans to release a Java Virtual Machine (JVM) for iOS, based on the Java Platform, Micro Edition version of Java. This would enable Java applications to run on iPhone and iPod Touch. Soon after the announcement, developers familiar with the SDK's terms of agreement believed that by not allowing third-party applications to run in the background (answer a phone call and still run the application, for example), and not allowing an application to download code from another source, nor allowing an application to interact with a third-party application, Sun's development efforts could be hindered without Apple's cooperation. Sun also worked with a third-party company called Innaworks in attempts to get Java on iPhone. Despite the apparent lack of interest from Apple, a firmware leak of the 2007 iPhone release revealed an ARM chip with a processor with Jazelle support for embedded Java execution. === .NET === Novell announced in September 2009 that they had successfully developed MonoTouch, a software framework that let developers write native iPhone applications in the C# and .NET programming languages, while still maintaining compatibility with Apple's requirements. === Flash === iOS does not support Adobe Flash, and although Adobe has two versions of its software: Flash and Flash Lite, Apple views neither as suitable for the iPhone, claiming that full Flash is "too slow to be useful", and Flash Lite to be "not capable of being used with the Web". In October 2009, Adobe announced that an upcoming update to its Creative Suite would feature a component to let developers build native iPhone apps using the company's Flash development tools. The software was officially released as part of the company's Creative Suite 5 collection of professional applications. === 2010 policy on development tools === In April 2010, Apple made controversial changes to its iPhone Developer Agreement, requiring developers to use only "approved" programming languages in order to publish apps on App Store, and banning applications that used third-party development tools; the ban affected Adobe's Packager tool, which converted Flash apps into iOS apps. After developer backlash and news of a potential anti-trust investigation, Apple again revised its agreement in September, allowing the use of third-party development tools. === Mac Catalyst === Originally called "Project Marzipan", Mac Catalyst helps developers bring iPadOS app experiences to macOS, and make it easier to take apps developed for iPadOS devices to Macs by avoiding the need to write the underlying software code twice.

    Read more →
  • Certifying algorithm

    Certifying algorithm

    In theoretical computer science, a certifying algorithm is an algorithm that outputs, together with a solution to the problem it solves, a proof that the solution is correct. A certifying algorithm is said to be efficient if the combined runtime of the algorithm and a proof checker is slower by at most a constant factor than the best known non-certifying algorithm for the same problem. The proof produced by a certifying algorithm should be in some sense simpler than the algorithm itself, for otherwise any algorithm could be considered certifying (with its output verified by running the same algorithm again). Sometimes this is formalized by requiring that a verification of the proof take less time than the original algorithm, while for other problems (in particular those for which the solution can be found in linear time) simplicity of the output proof is considered in a less formal sense. For instance, the validity of the output proof may be more apparent to human users than the correctness of the algorithm, or a checker for the proof may be more amenable to formal verification. Implementations of certifying algorithms that also include a checker for the proof generated by the algorithm may be considered to be more reliable than non-certifying algorithms. For, whenever the algorithm is run, one of three things happens: it produces a correct output (the desired case), it detects a bug in the algorithm or its implication (undesired, but generally preferable to continuing without detecting the bug), or both the algorithm and the checker are faulty in a way that masks the bug and prevents it from being detected (undesired, but unlikely as it depends on the existence of two independent bugs). == Examples == Many examples of problems with checkable algorithms come from graph theory. For instance, a classical algorithm for testing whether a graph is bipartite would simply output a Boolean value: true if the graph is bipartite, false otherwise. In contrast, a certifying algorithm might output a 2-coloring of the graph in the case that it is bipartite, or a cycle of odd length if it is not. Any graph is bipartite if and only if it can be 2-colored, and non-bipartite if and only if it contains an odd cycle. Both checking whether a 2-coloring is valid and checking whether a given odd-length sequence of vertices is a cycle may be performed more simply than testing bipartiteness. Analogously, it is possible to test whether a given directed graph is acyclic by a certifying algorithm that outputs either a topological order or a directed cycle. It is possible to test whether an undirected graph is a chordal graph by a certifying algorithm that outputs either an elimination ordering (an ordering of all vertices such that, for every vertex, the neighbors that are later in the ordering form a clique) or a chordless cycle. And it is possible to test whether a graph is planar by a certifying algorithm that outputs either a planar embedding or a Kuratowski subgraph. The extended Euclidean algorithm for the greatest common divisor of two integers x and y is certifying: it outputs three integers g (the divisor), a, and b, such that ax + by = g. This equation can only be true of multiples of the greatest common divisor, so testing that g is the greatest common divisor may be performed by checking that g divides both x and y and that this equation is correct.

    Read more →
  • Reverse data management

    Reverse data management

    Reverse data management describes a branch and set of research questions in relational database theory that aim to reverse the common focus of standard data management. Instead of focusing on the "forward" transformation of an input databases (a set of relational tables) to an output table, which is the main focus of standard query evaluation, reverse data management reverses that focus and studies the possible input database transformations that would achieve a desired output. Usually the objective is to find an intervention (a deletion, addition, or change of tuples) of minimal size, in order to achieve a particular change in the output. The problem has been studied at least since the 1980s, but has received renewed attention due to an influential paper in the early 2000s that made a connection between provenance and view propagation. The term was coined in a VLDB 2011 vision paper. The problem has been receiving significant attention in recent years due to its connection to computational fairness. == Topics in reverse data management problems == Example topics in reverse data management include: Deletion propagation with source side-effects: Find a minimal number of tuples to delete in the database in order to delete a particular tuple in the output. Deletion propagation with view side-effects: Find a set of tuples to delete in the database in order to delete a particular tuple in the output, while removing the minimal number of other output tuples. Causal responsibility: Find a minimal number of tuples to delete in the database in order to make a particular input tuple counterfactual. This notion is inspired by the notions of actual cause and causal responsibility from the work of Halpern and Pearl. Resilience: Find a minimal number of tuples to delete in the database in order to make a Boolean query false. The complexity of this problem is identical to the problem of deletion propagation with source-side effects over a different database. Smallest witness problem: Find a minimal number of tuples to keep in the a database (or equivalently, delete a maximal number of tuples) while keeping a particular tuple in the output. Minimum repair: Given a database that violates certain integrity constraints, find a minimal number of tuples to delete in the database in order to fulfill all constraints (also called to "repair" the database).

    Read more →