AI Data Usage

AI Data Usage — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Language identification

    Language identification

    In natural language processing, language identification or language guessing is the problem of determining which natural language a given content is in. Computational approaches to this problem view it as a special case of text categorization, solved with various statistical methods. == Overview == === Logical approach === A common non-statistical intuitive approach (though highly uncertain) is to look for common letter combinations, or distinctive diacritics or punctuation. === Statistical approach === There are several statistical approaches to language identification. An older statistical method by Grefenstette was based on the frequency of short n-grams, which are often function morphemes. For example, "ing" is more common in English than in French, while the sequence "que" is more common in French. Given a new page found on the Web, one counts the number of occurrences of each such short sequence and picks the language whose frequency table it matches the most. One technique is to compare the compressibility of the text to the compressibility of texts in a set of known languages. This approach is known as mutual information based distance measure. The same technique can also be used to empirically construct family trees of languages which closely correspond to the trees constructed using historical methods. Mutual information based distance measure is essentially equivalent to more conventional model-based methods and is not generally considered to be either novel or better than simpler techniques. Another technique, as described by Cavnar and Trenkle (1994) and Dunning (1994) is to create a language n-gram model from a "training text" for each of the languages. These models can be based on characters (Cavnar and Trenkle) or encoded bytes (Dunning); in the latter, language identification and character encoding detection are integrated. Then, for any piece of text needing to be identified, a similar model is made, and that model is compared to each stored language model. The most likely language is the one with the model that is most similar to the model from the text needing to be identified. This approach can be problematic when the input text is in a language for which there is no model. In that case, the method may return another, "most similar" language as its result. Also problematic for any approach are pieces of input text that are composed of several languages, as is common on the Web. As of 2025, a commonly used baseline method is via the fastText library, which has comparable classification accuracy as deep learning techniques, but much faster. == Identifying similar languages == One of the great bottlenecks of language identification systems is to distinguish between closely related languages. Similar languages like Bulgarian and Macedonian or Indonesian and Malay present significant lexical and structural overlap, making it challenging for systems to discriminate between them. In 2014 the DSL shared task has been organized providing a dataset (Tan et al., 2014) containing 13 different languages (and language varieties) in six language groups: Group A (Bosnian, Croatian, Serbian), Group B (Indonesian, Malaysian), Group C (Czech, Slovak), Group D (Brazilian Portuguese, European Portuguese), Group E (Peninsular Spanish, Argentine Spanish), Group F (American English, British English). The best system reached performance of over 95% results (Goutte et al., 2014). Results of the DSL shared task are described in Zampieri et al. 2014. == Software == Apache OpenNLP includes char n-gram based statistical detector and comes with a model that can distinguish 103 languages Apache Tika contains a language detector for 18 languages

    Read more →
  • Communications security

    Communications security

    Communications security is the discipline of preventing unauthorized interceptors from accessing telecommunications in an intelligible form, while still delivering content to the intended recipients. In the North Atlantic Treaty Organization culture, including United States Department of Defense culture, it is often referred to by the abbreviation COMSEC. The field includes cryptographic security, transmission security, emissions security and physical security of COMSEC equipment and associated keying material. COMSEC is used to protect both classified and unclassified traffic on military communications networks, including voice, video, and data. It is used for both analog and digital applications, and both wired and wireless links. Voice over secure internet protocol VOSIP has become the de facto standard for securing voice communication, replacing the need for Secure Terminal Equipment (STE) in much of NATO, including the U.S.A. USCENTCOM moved entirely to VOSIP in 2008. == Specialties == Cryptographic security: The component of communications security that results from the provision of technically sound cryptosystems and their proper use. This includes ensuring message confidentiality and authenticity. Emission security (EMSEC): The protection resulting from all measures taken to deny unauthorized persons information of value that might be derived from communications systems and cryptographic equipment intercepts and the interception and analysis of compromising emanations from cryptographic equipment, information systems, and telecommunications systems. Transmission security (TRANSEC): The component of communications security that results from the application of measures designed to protect transmissions from interception and exploitation by means other than cryptanalysis (e.g. frequency hopping and spread spectrum). Physical security: The component of communications security that results from all physical measures necessary to safeguard classified equipment, material, and documents from access thereto or observation thereof by unauthorized persons. == Related terms == ACES – Automated Communications Engineering Software AEK – Algorithmic Encryption Key AKMS – the Army Key Management System CCI – Controlled Cryptographic Item - equipment which contains COMSEC embedded devices CT3 – Common Tier 3 DTD – Data Transfer Device ICOM – Integrated COMSEC, e.g. a radio with built in encryption KEK – Key Encryption Key KG-30 – family of COMSEC equipment KOI-18 – Tape Reader General Purpose KPK – Key production key KYK-13 – Electronic Transfer Device KYX-15 – Electronic Transfer Device LCMS – Local COMSEC Management Software OTAR – Over the Air Rekeying OWK – Over the Wire Key SKL – Simple Key Loader SOI – Signal operating instructions STE – Secure Terminal Equipment (secure phone) STU-III – (obsolete secure phone, replaced by STE) TED – Trunk Encryption Device such as the WALBURN/KG family TEK – Traffic Encryption Key TPI – Two person integrity TSEC – Telecommunications Security (sometimes referred to in error transmission security or TRANSEC) Types of COMSEC equipment: Authentication equipment Crypto equipment: Any equipment that embodies cryptographic logic or performs one or more cryptographic functions (key generation, encryption, and authentication). Crypto-ancillary equipment: Equipment designed specifically to facilitate efficient or reliable operation of crypto-equipment, without performing cryptographic functions itself. Crypto-production equipment: Equipment used to produce or load keying material == DoD Electronic Key Management System == The Electronic Key Management System (EKMS) is a United States Department of Defense (DoD) key management, COMSEC material distribution, and logistics support system. The National Security Agency (NSA) established the EKMS program to supply electronic key to COMSEC devices in securely and timely manner, and to provide COMSEC managers with an automated system capable of ordering, generation, production, distribution, storage, security accounting, and access control. The Army's platform in the four-tiered EKMS, AKMS, automates frequency management and COMSEC management operations. It eliminates paper keying material, hardcopy Signal operating instructions (SOI) and saves the time and resources required for courier distribution. It has 4 components: LCMS provides automation for the detailed accounting required for every COMSEC account, and electronic key generation and distribution capability. ACES is the frequency management portion of AKMS. ACES has been designated by the Military Communications Electronics Board as the joint standard for use by all services in development of frequency management and crypto-net planning. CT3 with DTD software is in a fielded, ruggedized hand-held device that handles, views, stores, and loads SOI, Key, and electronic protection data. DTD provides an improved net-control device to automate crypto-net control operations for communications networks employing electronically keyed COMSEC equipment. SKL is a hand-held PDA that handles, views, stores, and loads SOI, Key, and electronic protection data. == Key Management Infrastructure (KMI) Program == KMI is intended to replace the legacy Electronic Key Management System to provide a means for securely ordering, generating, producing, distributing, managing, and auditing cryptographic products (e.g., asymmetric keys, symmetric keys, manual cryptographic systems, and cryptographic applications). This system is currently being fielded by Major Commands and variants will be required for non-DoD Agencies with a COMSEC Mission.

    Read more →
  • Content-oriented workflow models

    Content-oriented workflow models

    In data management, a content-oriented workflow model seeks to articulate workflow progression by the presence of content units (like data-records/objects/documents). Most content-oriented workflow approaches provide a life-cycle model for content units, such that workflow progression can be qualified by conditions on the state of the units. Most approaches are research and work in progress and the content models and life-cycle models are more or less formalized. The term content-oriented workflows is an umbrella term for several scientific workflow approaches, namely "data-driven", "resource-driven", "artifact-centric", "object-aware", and "document-oriented". Thus, the meaning of "content" ranges from simple data attributes to self-contained documents; the term "content-oriented workflows" appeared at first in as an umbrella term. Such a general term, independent from a specific approach, is necessary to contrast the content-oriented modelling principle with traditional activity-oriented workflow models (like Petri nets or BPMN) where a workflow is driven by a control flow and where the content production perspective is neglected or even missing. The term "content" was chosen to subsume the different levels in granularity of the content units in the respective workflow models; it was also chosen to make associations with content management. Both terms "artifact-centric" and "data-driven" would also be good candidates for an umbrella term, but each is closely related to a specific approach of a single working group. The "artifact-centric" group itself (i.e. IBM Research) has generalized the characteristics of their approach and has used "information-centric" as an umbrella term in. Yet, the term information is too unspecific in the context of computer science, thus, "content-orientated workflows" is considered as good compromise. == Workflow Model Approaches == === Data-driven === The data-driven process structures provides a sophisticated workflow model being specialized on hierarchical write-and-review-processes. The approach provides interleaved synchronization of sub-processes and extends activity diagrams. Unfortunately, the COREPRO prototype implementation is not publicly available. Research on the project had been ceased. The general idea has been continued by Reichert in form of the #Object-aware approach. Synonyms data-driven process structures / data-driven modeling and coordination Protagonists Dr. Dominic Müller (University of Twente), Joachim Herbst (DaimlerChrysler Research), and Manfred Reichert (at this time Assoc. Prof. at Univ. of Twente, currently Prof. at Ulm Univ.) Organization(s) University of Twente, DaimlerChrysler Period 2005 - 2007 Selected publications Implementation COREPRO === Resource-driven === The resource-driven workflow system is an early approach that considered workflows from a content-oriented perspective and emphasizes on the missing support for plain document-driven processes by traditional activity-oriented workflow engines. The resource-driven approach demonstrated the application of database triggers for handling workflow events. Still the system implementation is centralized and the workflow schema is statically defined. The project appeared in 2005 but many aspects are considered future work by the authors. Research did not continue on the project. Wang completed his PhD thesis in 2009, yet, his thesis does not mention the resource-driven approach to workflow modelling but is about discrete event simulation. Synonyms Resource-based Workflows / Document-Driven Workflow Systems Protagonists Jianrui Wang and Prof. Akhil Kumar Organization Pennsylvania State University Period 2005 - today Selected publications Implementation N/A === Artifact-centric === The artifact-centric approach provides a framework for content-oriented workflows. In this model, the enterprise application landscape includes distributed business services, while the workflow engine is centralized. Process enactment is integrated with database management system infrastructure, and the project is funded by IBM. Synonyms artifact-centric business process models / artifact-based business process (ACP) / artifact-centric workflows Protagonists Richard Hull and Dr. Kamal Bhattacharya as well as Cagdas E. Gerede and Jianwen Su Organization IBM (T.J. Watson Research Center, NY) Period 2007 - today Selected publications Implementation ArtiFact === Object-aware === The object-aware approach manages a set of object types and generates forms for creating object instances. The form completion flow is controlled by transitions between object configurations each describing a progressing set of mandatory attributes. Each object configuration is named by an object state. The data production flow is user-shifting and it is discrete by defining a sequence of object states. The discussion is currently limited to a centralized system, without any workflows across different organizations. However, the approach is of great relevance to many domains like concurrent engineering. Finally, the object-aware approach and its PHILharmonicFlows system are going to provide general-purpose workflow systems for generic enactment of data production processes. Synonyms object-aware process management / datenorientiertes Prozess-Management-System Protagonists Vera Künzle and Prof. Manfred Reichert Organization Ulm University Period 2009 - today Selected publications Implementation PHILharmonicFlows === Distributed Document-oriented === Distributed document-oriented process management (dDPM) enables distributed case handling in heterogeneous system environments and it is based on document-oriented integration. The workflow model reflects the paper-based working practice in inter-institutional healthcare scenarios. It targets distributed knowledge-driven ad hoc workflows, wherein distributed information systems are required to coordinate work with initially unknown sets of actors and activities. The distributed workflow engine supports process planning & process history as well as participant management and process template creation with import/export. The workflow engine embeds a functional fusion of 1) group-based instant messaging 2) with a shared work list editor 3) with version control. The software implementation of dDPM is α-Flow which is available as open source. dDPM and α-Flow provide a content-oriented approach to schema-less workflows. The complete distributed case handling application is provided in form of a single active Document ("α-Doc"). The α-Doc is a case file (as information carrier) with an embedded workflow engine (in form of active properties). Inviting process participants is equivalent to providing them with a copy of an α-Doc, copying it like an ordinary desktop file. All α-Docs that belong to the same case can synchronize each other, based on the participant management, electronic postboxes, store-and-forward messaging, and an offline-capable synchronization protocol. Synonyms distributed document-oriented process management (dDPM), distributed case handling via active documents Protagonists Christoph P. Neumann and Prof. Richard Lenz Organization Friedrich-Alexander-Universität Erlangen-Nürnberg Period 2009 - 2012 Selected Publications and a PhD thesis Implementation α-Flow (open source) == Related Concepts == === Content Management === The bandwidth of Content management systems (CMS) reaches from Web content management systems (WCMS) and Document management system (DMS) to Enterprise Content Management (ECM). Mature DMS products support document production workflows in a basic form, primarily focusing on review cycle workflows concerning a single document. === Groupware and Computer-Supported Cooperative Work === Groupware focuses on messaging (like E-Mail, Chat, and Instant Messaging), shared calendars (e.g. Lotus Notes, Microsoft Outlook with Exchange Server), and conferencing (e.g. Skype). Groupware overlaps with Computer-supported cooperative work (CSCW), that originated from shared multimedia editors (for live drawing/sketching) and synchronous multi-user applications like desktop sharing. The extensive conceptual claim of CSWC must be put into perspective by its actual solution scope, that is available as the CSCW Matrix. === Case Handling === The case handling paradigm stems from Prof. van der Aalst and gained momentum in 2005. The core features are: (a) provide all information available, i.e. present the case as a whole rather than showing bits and pieces, (b) decide about activities on the basis of the information available rather than the activities already executed, (c) separate work distribution from authorization and allow for additional types of roles, not just the execute role, and (d) allow workers to view and add/modify data before or after the corresponding activities have been executed. In healthcare, the flow of a patient between healthcare professionals is considered as a workflow - with activities that inc

    Read more →
  • Data hub

    Data hub

    A data hub is a center of data exchange that is supported by data science, data engineering, and data warehouse technologies to interact with endpoints such as applications and algorithms. == Features == A data hub differs from a data warehouse in that it is generally unintegrated and often at different grains. It differs from an operational data store because a data hub does not need to be limited to operational data. A data hub differs from a data lake by homogenizing data and possibly serving data in multiple desired formats, rather than simply storing it in one place, and by adding other value to the data such as de-duplication, quality, security, and a standardized set of query services. A data lake tends to store data in one place for availability, and allow/require the consumer to process or add value to the data. Data hubs are ideally the "go-to" place for data within an enterprise, so that many point-to-point connections between callers and data suppliers do not need to be made, and so that the data hub organization can negotiate deliverables and schedules with various data enclave teams, rather than being an organizational free-for-all as different teams try to get new services and features from many other teams.

    Read more →
  • WIPO GREEN

    WIPO GREEN

    WIPO GREEN is a World Intellectual Property Organization program established in 2013 that supports global efforts to address climate change and food security through sharing of sustainable technology innovations. == WIPO GREEN database == The WIPO GREEN database is the foundation of the platform. The database is a free, solutions-oriented, global innovation catalog that connects needs for solving environmental or climate change problems with sustainable solutions from prototypes to marketable products available for sale, license, collaborations, knowledge transfer, joint ventures, or collaborations. Green technology innovators can promote their products, businesses, organizations, and governments looking for green technologies can explain their needs and seek collaboration with providers. As of July 2022, WIPO GREEN has over 120,000 technologies, needs and experts, more than 2000 users in 110 countries, and has recorded over 1000 connections made between technology providers and seekers. The database utilizes AI-assisted auto-matching, user uploads tracing and alerts, full-text search for solutions based on long need descriptions, and the Patent2Solution search function for finding commercial applications of a patent, which are some of the unique features of the database. Free registration is required for detailed record view and uploading. All technologies uploaded to the WIPO GREEN database remain the property of the rights holder. It is up to the rights holder and the collaborating parties to structure agreements in the manner they feel is most appropriate and effective. WIPO GREEN does not require that technologies or innovations uploaded to the database be patented or in the process of being patented. Therefore, technology providers can upload their technology while related patent applications are pending. Technology providers are encouraged to upload technology solutions on the WIPO GREEN database and connect with other users to explore partnerships, technology transfers, including funding and licensing opportunities. == Acceleration projects == Acceleration projects work with WIPO GREEN partners and local organizations to explore local challenges and green opportunities for particular environmental needs. These projects are organized annually in different countries or regions around and connect providers and seekers of green technologies. For example, the Latin America Acceleration Project explores innovative new technologies in the region and facilitates green technology exchange between providers and seekers in green opportunities in intensified crop rotation, soil re-carbonization, and forest management in Argentina; zero-till or conservation agriculture in Brazil; and wine production in Chile. In October 2021, a project in Indonesia on palm oil mill effluent (POME), a by-product of palm oil production that emits greenhouse gases and reportedly harms flora and fauna in local rivers, identified viable green solutions to turn the high organic content of POME wastewater into biogas and other environmentally friendly uses. Former projects took place in Cambodia, Indonesia, and the Philippines around wastewater treatment, agriculture, and water technologies. == The Green Technology Book == In November 2022 at UNFCCC COP27, WIPO introduced its new Flagship publication the Green Technology Book. This digital-first publication aims to put innovation, technology and intellectual property at the forefront in the fight against climate change. The inaugural edition of this annual publication focused on available solutions for climate-change adaptation to reduce vulnerability as well as to increase resilience to the impacts of climate change. The book was created in cooperation with the Climate Technology Center and Network (CTCN) and the Egyptian Academy of Scientific Research and Technology (ASTR). It features 200 adaptation technologies, which are also available in the WIPO GREEN database of innovative technologies and needs. == Partners Network == WIPO GREEN partners are public or private institutions that wish to collaborate to advance WIPO GREEN’s mission. The network is aimed at helping the implementation and diffusion of green technology innovations around the world. Partners include government institutions, intergovernmental organizations, academia, and businesses – from small and medium-sized enterprises to Fortune 500 companies. As of 2022, WIPO GREEN has a network of over 146 partner organizations involved in green technology.

    Read more →
  • Why We Post

    Why We Post

    Why We Post is a research project funded by the European Research Council and launched in 2012 by Daniel Miller with the objective of examining the global impact of new social media. The study is based on ethnographic data collected through the course of 15 months in China, India, Turkey, Italy, United Kingdom, Trinidad, Chile and Brazil. The results of this project were released on 29 February 2016. This included the first three of eleven Open Access books (available via UCL Press), a five-week e-course (MOOC) on FutureLearn in English, also available in Chinese, Portuguese, Hindi, Tamil, Italian, Turkish, and Spanish on UCLeXtend. In addition a website containing key discoveries, stories and over 100 films is available in the same 8 languages.

    Read more →
  • Copyright

    Copyright

    A copyright is a type of intellectual property that gives its owner the exclusive legal right to copy, distribute, adapt, display, and perform a creative work, usually for a limited time. The creative work may be in a literary, artistic, educational, or musical form. Copyright is intended to protect the original expression of an idea in the form of a creative work, but not the idea itself. A copyright is subject to limitations based on public interest considerations, such as the fair use doctrine in the United States and fair dealing doctrine in the United Kingdom. Some jurisdictions require "fixing" copyrighted works in a tangible form. It is often shared among multiple authors, each of whom holds a set of rights to use or license the work, and who are commonly referred to as rights holders. These rights normally include reproduction, control over derivative works, distribution, public performance, and moral rights such as attribution. Copyrights can be granted by public law and are in that case considered "territorial rights". This means that copyrights granted by the law of a certain state do not extend beyond the territory of that specific jurisdiction. Copyrights of this type vary by country; many countries, and sometimes a large group of countries, have made agreements with other countries on procedures applicable when works "cross" national borders or national rights are inconsistent. Typically, the public law duration of a copyright expires 50 to 100 years after the creator dies, depending on the jurisdiction. Some countries require certain copyright formalities to establishing copyright, others recognize copyright in any completed work, without a formal registration. When the copyright of a work expires, it enters the public domain. == History == === Background === The concept of copyright developed after the printing press came into use in Europe in the 15th and 16th centuries. It was associated with a common law and rooted in the civil law system. The printing press made it much cheaper to produce works, but as there was initially no copyright law, anyone could buy or rent a press and print any text. Popular new works were immediately re-set and re-published by competitors, so printers needed a constant stream of new material. Fees paid to authors for new works were high and significantly supplemented the incomes of many academics. Printing brought profound social changes. The rise in literacy across Europe led to a dramatic increase in the demand for reading matter. Prices of reprints were low, so publications could be bought by poorer people, creating a mass audience. In German-language markets before the advent of copyright, technical materials, like academic papers and handbooks, were inexpensive and widely available; it has been suggested this contributed to Germany's industrial and economic success. === Conception === The concept of copyright first developed in England. In reaction to the printing of "scandalous books and pamphlets", the English Parliament passed the Licensing of the Press Act 1662, which required all intended publications to be registered with the government-approved Stationers' Company, giving the Stationers the right to regulate what material could be printed. The Statute of Anne, enacted in 1710 in England and Scotland, provided the first legislation to protect copyrights (but not authors' rights). The Copyright Act 1814 extended more rights for authors but did not protect British publications from being reprinted in the US. The Berne International Copyright Convention of 1886 finally provided protection for authors among the countries who signed the agreement, although the US did not join the Berne Convention until 1989. In the US, the Constitution grants Congress the right to establish copyright and patent laws. Shortly after the Constitution was passed, Congress enacted the Copyright Act of 1790, modeling it after the Statute of Anne. While the national law protected authors' published works, authority was granted to the states to protect authors' unpublished works. The most recent major overhaul of copyright in the US, the Copyright Act of 1976, extended federal copyright to works as soon as they are created and "fixed", without requiring publication or registration. State law continues to apply to unpublished works that are not otherwise copyrighted by federal law. This act also changed the calculation of copyright term from a fixed term (then a maximum of fifty-six years) to "life of the author plus 50 years". These changes brought the US closer to conformity with the Berne Convention, and in 1989 the United States further revised its copyright law and joined the Berne Convention officially. Copyright laws allow products of creative human activities, such as literary and artistic production, to be preferentially exploited and thus incentivized. Different cultural attitudes, social organizations, economic models and legal frameworks are seen to account for why copyright emerged in Europe and not, for example, in Asia. In the Middle Ages in Europe, there was generally a lack of any concept of literary property due to the general relations of production, the specific organization of literary production and the role of culture in society. The latter refers to the tendency of oral societies, such as that of Europe in the medieval period, to view knowledge as the product and expression of the collective, rather than to see it as individual property. However, with copyright laws, intellectual production comes to be seen as a product of an individual, with attendant rights. The most significant point is that patent and copyright laws support the expansion of the range of creative human activities that can be commodified. This parallels the ways in which capitalism led to the commodification of many aspects of social life that earlier had no monetary or economic value perse. Copyright has developed into a concept that has a significant effect on nearly every modern industry, including not just literary work, but also forms of creative work such as sound recordings, films, photographs, software, and architecture. === National copyrights === Often seen as the first real copyright law, the 1709 British Statute of Anne gave authors and the publishers to whom they did chose to license their works, the right to publish the author's creations for a fixed period, after which the copyright expired. It was "An Act for the Encouragement of Learning, by Vesting the Copies of Printed Books in the Authors or the Purchasers of such Copies, during the Times therein mentioned." The act also alluded to individual rights of the artist. It began: "Whereas Printers, Booksellers, and other Persons, have of late frequently taken the Liberty of Printing ... Books, and other Writings, without the Consent of the Authors ... to their very great Detriment, and too often to the Ruin of them and their Families:". A right to benefit financially from the work is articulated, and court rulings and legislation have recognized a right to control the work, such as ensuring that the integrity of it is preserved. An irrevocable right to be recognized as the work's creator appears in some countries' copyright laws. The Copyright Clause of the United States, Constitution (1787) authorized copyright legislation: "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries." That is, by guaranteeing them a period of time in which they alone could profit from their works, they would be enabled and encouraged to invest the time required to create them, and this would be good for society as a whole. A right to profit from the work has been the philosophical underpinning for much legislation extending the duration of copyright, to the life of the creator and beyond, to their heirs. Yet scholars like Lawrence Lessig have argued that copyright terms have been extended beyond the scope imagined by the Framers. Lessig refers to the Copyright Clause as the "Progress Clause" to emphasize the social dimension of intellectual property rights. The original length of copyright in the United States was 14 years, and it had to be explicitly applied for. If the author wished, they could apply for a second 14‑year monopoly grant, but after that the work entered the public domain, so it could be used and built upon by others. === Continental law === In many jurisdictions of the European continent, comparable legal concepts to copyright did exist from the 16th century on but did change under Napoleonic rule into another legal concept: authors' rights or creator's right laws, from French: droits d'auteur and German Urheberrecht. In many modern-day publications the terms copyright and authors' rights are being mixed, or used as translations, but in a juridical sense the legal concepts do essentially differ. Authors' rights are, generally speaking,

    Read more →
  • Tableau de Concordance

    Tableau de Concordance

    The Tableau de Concordance was the main French diplomatic code used during World War I; the term also refers to any message sent using the code. It was a superenciphered four-digit code that was changed three times between 1 August 1914 and 15 January 1915. The Tableau de Concordance is considered superenciphered because there is more than one step required to use it. First, each word in a message is replaced by four digits via a codebook. These four digits are divided into three groups (one digit, two digits, one digit) so that when the whole message has been translated into code, the four-digit sets can be put together so it looks like the entire message is made up of two-digit pairs. This is called a "Straddle Gimmick." Then, in turn, each of these two digit pairs (and the single digits at the beginning and end) are replaced by two letters. The letters are then combined with no spaces for the final ciphertext. The manual for the Tableau de Concordance included the instruction that if there was not adequate time for completely enciphering the message, it should simply be sent in clear, because a partially enciphered message would have provided insight into the inner workings of the code.

    Read more →
  • Wetware (brain)

    Wetware (brain)

    Wetware is a term drawn from the computer-related idea of hardware or software, but applied to biological life forms. == Usage == The prefix "wet" is a reference to the water found in living creatures. Wetware is used to describe the elements equivalent to hardware and software found in a person, especially the central nervous system (CNS) and the human mind. The term wetware finds use in works of fiction, in scholarly publications and in popularizations. The "hardware" component of wetware concerns the bioelectric and biochemical properties of the CNS, specifically the brain. If the sequence of impulses traveling across the various neurons are thought of symbolically as software, then the physical neurons would be the hardware. The amalgamated interaction of this software and hardware is manifested through continuously changing physical connections, and chemical and electrical influences that spread across the body. The process by which the mind and brain interact to produce the collection of experiences that we define as self-awareness is in question. == History == Although the exact definition has shifted over time, the term Wetware and its fundamental reference to "the physical mind" has been around at least since the mid-1950s. Mostly used in relatively obscure articles and papers, it was not until the heyday of cyberpunk, however, that the term found broad adoption. Among the first uses of the term in popular culture was the Bruce Sterling novel Schismatrix (1985) and the Michael Swanwick novel Vacuum Flowers (1987). Rudy Rucker references the term in a number of books, including one entitled Wetware (1988): ... all sparks and tastes and tangles, all its stimulus/response patterns – the whole bio-cybernetic software of mind. Rucker did not use the word to simply mean a brain, nor in the human-resources sense of employees. He used wetware to stand for the data found in any biological system, analogous perhaps to the firmware that is found in a ROM chip. In Rucker's sense, a seed, a plant graft, an embryo, or a biological virus are all wetware. DNA, the immune system, and the evolved neural architecture of the brain are further examples of wetware in this sense. Rucker describes his conception in a 1992 compendium The Mondo 2000 User's Guide to the New Edge, which he quotes in a 2007 blog entry. Early cyber-guru Arthur Kroker used the term in his blog. With the term getting traction in trendsetting publications, it became a buzzword in the early 1990s. In 1991, Dutch media theorist Geert Lovink organized the Wetware Convention in Amsterdam, which was supposed to be an antidote to the "out-of-body" experiments conducted in high-tech laboratories, such as experiments in virtual reality. Timothy Leary, in an appendix to Info-Psychology originally written in 1975–76 and published in 1989, used the term wetware, writing that "psychedelic neuro-transmitters were the hot new technology for booting-up the 'wetware' of the brain". Another common reference is: "Wetware has 7 plus or minus 2 temporary registers." The numerical allusion is to a classic 1957 article by George A. Miller, The magical number 7 plus or minus two: some limits in our capacity for processing information, which later gave way to Miller's law.

    Read more →
  • AS1 (networking)

    AS1 (networking)

    AS1 (Applicability Statement 1) is a specification about how to transport structured business-to-business data securely and reliably over the Internet. Security is achieved by using digital certificates and encryption. == AS1 technical overview == The AS1 protocol is based on SMTP and S/MIME. It was the first AS protocol developed and uses signing, encryption and MDN conventions. In other words: Files are sent as "attachments" in a specially coded SMIME email message Messages can be signed, but do not have to be Messages can be encrypted, but do not have to be Messages may request an MDN back if all went well, but do not have to request such a message If the original AS1 message requested an MDN... Upon the receipt of the message and its successful decryption or signature validation (as necessary) a "success" MDN will be sent back to the original sender. This MDN is typically signed but not encrypted. Upon the receipt and successful verification of the signature on the MDN, the original sender will "know" that the recipient got their message (this provides the "Non-repudiation" element of AS1) If there are any problems receiving or interpreting the original AS1 message, a "failed" MDN may be sent back. Like any other AS file transfer, AS1 file transfers typically require both sides of the exchange to trade X.509 certificates and specific "trading partner" names before any transfers can take place.

    Read more →
  • Kurzsignale

    Kurzsignale

    The Short Signal Code, also known as the Short Signal Book (German: Kurzsignalbuch), was a short code system used by the Kriegsmarine (German Navy) during World War II to minimize the transmission duration of messages. == Description == The transmission of radio messages had the potential risks of revealing the submarine's presence and direction; if decoded the content was also revealed. Submarines need to provide information, mostly in standard form (position of convoy to attack and of submarine, weather information), to their bases. Initially Morse code transmissions could be used. To inhibit detection, the duration of messages needed to be minimised; for this, Kurzsignale short-coding was used. To prevent interception, messages needed to be encrypted by the Enigma machine. To shorten transmission even further, the message could be sent by a fast machine instead of a human radio operator. For example, the Kurier system – not implemented in time – decreased the time to send a Morse dot from around 50 milliseconds for a human to 1 millisecond. == Short Signal book == The Kurzsignale code was intended to shorten transmission time to below the time required to get a directional fix. It was not primarily intended to hide signal contents; protection was intended to be achieved by encoding with the Enigma machine. A copy of the Kurzsignale code book was captured from German submarine U-110 on 9 May 1941. In August 1941, Dönitz began addressing U-boats by the names of their commanders, instead of boat numbers. The method of defining U-boat meeting points in the Short Signal Book was regarded as compromised, so a method was defined by B-Dienst cryptanalysts to disguise their positions on the Kriegsmarine German Naval Grid System (German:Gradnetzmeldeverfahren) was introduced and used until the end of the war == Radio direction finding == Aware of the danger presented by radio direction finding (RDF), the Kriegsmarine developed various systems to speed up broadcast. The Kurzsignale code system condensed messages into short codes consisting of short sequences for common terms such as "convoy location" so that additional descriptions would not be needed in the message. The resulting Kurzsignal was then encoded with the Enigma machine and subsequently transmitted as rapidly as possible, typically taking about 20 seconds. Typical length of an information or weather signal was about 25 characters. Conventional RDF needed about a minute to fix the bearing of a radio signal, and the Kurzsignale protected against this. However, the huff-duff system which was in use by the Allies could cope with these short transmissions. The fully automated burst transmission Kurier system, in testing from August 1944, could send a Kurzsignal in not more than 460 milliseconds; this was short enough to prevent location even by huff-duff and, if deployed, would have been a serious setback for Allied anti-submarine and code-breaking activities. By late 1944 the Kurier program was a top priority, but the war ended before the system was operational. == Short Weather cipher == A similar coding system was used for weather reports from U-boats, the Wetterkurzschlüssel (Short Weather Cipher). Code books were captured from U-559 on 30 October 1942.

    Read more →
  • Data Management Association

    Data Management Association

    The Data Management Association (DAMA), formerly known as the Data Administration Management Association, is a global not-for-profit organization which aims to advance concepts and practices about information management and data management. It describes itself as vendor-independent, all-volunteer organization, and has a membership consisting of technical and business professionals. Its international branch is called DAMA International (or DAMA-I), and DAMA also has various continental and national branches around the world. == History == The Data Management Association International was founded in 1980 in Los Angeles. Other early chapters were: San Francisco, Portland, Seattle, Minneapolis, New York, and Washington D.C. == Data Management Body of Knowledge == DAMA has published the Data Management Body of Knowledge (DMBOK), which contains suggestions on best practices and suggestions of a common vernacular for enterprise data management. The first edition (DAMA-DMBOK) was published on 2009 November 1, the second edition (DAMA-DMBOK2) was published on 2017 July 1., and the Revised second edition (DAMA-DMBOK2 rev.2) was published on 2019 March 19. DMBOK has been described by the authors as being an "equivalent" to the Project Management Body of Knowledge (PMBOK) and Business Analysis Body of Knowledge (BABOK). It encompasses topics such as data architecture, security, quality, modelling, governance, big data, data science, and more. DMBOK also includes the DAMA Data Wheel, an infographic which represents core data management practices. The center of the infographic is data governance, and the surrounding segments each represent a different aspect of data management: Data architecture Data modeling and design Data storage and operations Data security Data integration and interoperability Document management Content management Master data management Reference data and master data Data warehousing Metadata management Data quality Business intelligence Data science == Professional Accreditation == DAMA also provides a professional data management certification for individuals known as a Certified Data Management Professional (CDMP), which is based on the DMBOK as a study reference. There are four levels of certification based on career experience and exam results. The highest level, Fellow, requires 25 years of experience and nomination by DAMA members. It is an example of one of many competing certifications for data management professionals.

    Read more →
  • InstallCore

    InstallCore

    InstallCore (stylized as installCore) was an installation and content distribution platform created by ironSource, considered potentially unwanted programs (PUP) by a number of anti-malware vendors. It included a software development kit (SDK) for Windows and Mac OS X. The program allowed those using it for distribution to include monetization by advertisements or charging for installation, and made its installations invisible to the user and its anti-virus software. The platform and its programs have been rated potentially unwanted programs (PUP) or potentially unwanted applications (PUA) by anti-malware product vendors since 2014, and by Windows Defender Antivirus since 2015. The platform was primarily designed for efficient web-based deployment of various types of application software. As of August 2012, InstallCore was managing 100 million installations every month, offering services for paid, unpaid, and free software by using the SDK version. == History == The InstallCore team introduced the first version of the SDK at the beginning of 2011. The SDK was a fork of the FoxTab installer and had only basic Installation features. InstallCore was discontinued as part of a company flotation in late 2020. == Criticism and malware classification == InstallCore and its software packages have been classified as potentially unwanted programs (PUP) or potentially unwanted applications (PUA), by anti-malware product vendors and Windows Defender Antivirus from 2014–2015 onwards, with many stating that it installs adware and other additional PUPs. Malwarebytes identified the program as "a family of bundlers that installs more than one application on the user's computer". It has been described as "crossing the line into full-blown malware" and a "nasty Trojan".

    Read more →
  • Customer data management

    Customer data management

    Customer data management (CDM) is the ways in which businesses keep track of their customer information and survey their customer base in order to obtain feedback. CDM includes a range of software or cloud computing applications designed to give large organizations rapid and efficient access to customer data. Surveys and data can be centrally located and widely accessible within a company, as opposed to being warehoused in separate departments. CDM encompasses the collection, analysis, organizing, reporting and sharing of customer information throughout an organization. Businesses need a thorough understanding of their customers’ needs if they are to retain and increase their customer base. Efficient CDM solutions provide companies with the ability to deal instantly with customer issues and obtain immediate feedback. As a result, customer retention and customer satisfaction can show marked improvement. According to a study by Aberdeen Group, "above-average and best-in-class companies... attain greater than 20% annual improvement in retention rates, revenues, data accuracy and partner/customer satisfaction rates." == Customer data management and cloud computing == Cloud computing offers an attractive choice for CDM in many companies due to its accessibility and cost-effectiveness. Businesses can decide who, within their company, should have the ability to create, adjust, analyze or share customer information. In December 2010, 52% of Information Technology (IT) professionals worldwide were deploying, or planning to deploy, cloud computing; this percentage is far higher in many countries. == Background == Customer data management, as a term, was coined in the 1990s, pre-dating the alternative term enterprise feedback management (EFM). CDM was introduced as a software solution that would replace earlier disc-based or paper-based surveys and spreadsheet data. Initially, CDM solutions were marketed to businesses as software, which were specific to one company, and often to one department within that company. This was superseded by application service providers (ASPs) where software was hosted for end user organizations, thus avoiding the necessity for IT professionals to deploy and support software. However, ASPs with their single-tenancy architecture were, in turn, superseded by software as a service (SaaS), engineered for multi-tenancy. By 2007 SaaS applications, giving businesses on-demand access to their customer information, were rapidly gaining popularity compared with ASPs. Cloud computing now includes SaaS and many prominent CDM providers offer cloud-based applications to their clients. In recent years, there has been a push away from the term EFM, with many of those working in this area advocating the slightly updated use of CDM. The return to the term CDM is largely based on the greater need for clarity around the solutions offered by companies, and on the desire to retire terminology veering on techno-jargon that customers may have a hard time understanding.

    Read more →
  • Why We Post

    Why We Post

    Why We Post is a research project funded by the European Research Council and launched in 2012 by Daniel Miller with the objective of examining the global impact of new social media. The study is based on ethnographic data collected through the course of 15 months in China, India, Turkey, Italy, United Kingdom, Trinidad, Chile and Brazil. The results of this project were released on 29 February 2016. This included the first three of eleven Open Access books (available via UCL Press), a five-week e-course (MOOC) on FutureLearn in English, also available in Chinese, Portuguese, Hindi, Tamil, Italian, Turkish, and Spanish on UCLeXtend. In addition a website containing key discoveries, stories and over 100 films is available in the same 8 languages.

    Read more →