AI Analytics Ranking

AI Analytics Ranking — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Service-oriented software engineering

    Service-oriented software engineering

    Service-oriented software engineering (SOSE), also referred to as service engineering, is a software engineering methodology focused on the development of software systems by composition of reusable services (service-orientation) often provided by other service providers. Since it involves composition, it shares many characteristics of component-based software engineering, the composition of software systems from reusable components, but it adds the ability to dynamically locate necessary services at run-time. These services may be provided by others as web services, but the essential element is the dynamic nature of the connection between the service users and the service providers. == Service-oriented interaction pattern == There are three types of actors in a service-oriented interaction: service providers, service users and service registries. They participate in a dynamic collaboration which can vary from time to time. Service providers are software services that publish their capabilities and availability with service registries. Service users are software systems (which may be services themselves) that accomplish some task through the use of services provided by service providers. Service users use service registries to discover and locate the service providers they can use. This discovery and location occurs dynamically when the service user requests them from a service registry.

    Read more →
  • Documentalist

    Documentalist

    A documentalist is a professional, trained in documentation science and specializing in assisting researchers in their search for scientific and technical documentation. With the development of bibliographical databases such as MEDLINE, documentalists were professionals who searched such databases on the behalf of users. When the field of documentation changed its name to information science, the terms information specialist or information professional often replaced the term documentalist.

    Read more →
  • OpenWSN

    OpenWSN

    OpenWSN aims to build an open standard-based and open source implementation of a complete constrained network protocol stack for wireless sensor networks and Internet of Things. The project was created at the University of California Berkeley and extended at the INRIA and at the Open University of Catalonia (UOC). The root of OpenWSN is a deterministic MAC layer implementing the IEEE 802.15.4e TSCH based on the concept of Time Slotted Channel Hopping (TSCH). Above the MAC layer, the Low Power Lossy Network stack is based on IETF standards including the IETF 6TiSCH management and adaptation layer (a minimal configuration profile, 6top protocol and different scheduling functions). The stack is complemented by an implementation of 6LoWPAN, RPL in non-storing mode, UDP and CoAP, enabling access to devices running the stack from the native IPv6 through open standards. OpenWSN is related to other projects including the following: RIOT OpenMote OpenWSN is available for Linux, Windows and OS X platforms. Current release of OpenWSN is 1.14.0.

    Read more →
  • Information audit

    Information audit

    The information audit (IA) extends the concept of auditing from a traditional scope of accounting and finance to the organisational information management system. Information is representative of a resource which requires effective management and this led to the development of interest in the use of an IA. Prior the 1990s and the methodologies of Orna, Henczel, Wood, Buchanan and Gibb, IA approaches and methodologies focused mainly upon an identification of formal information resources (IR). Later approaches included an organisational analysis and the mapping of the information flow. This gave context to analysis within an organisation's information systems and a holistic view of their IR and as such could contribute to the development of the information systems architecture (ISA). In recent years the IA has been overlooked in favour of the systems development process which can be less expensive than the IA, yet more heavily technically focused, project specific (not holistic) and does not favour the top-down analysis of the IA. == Definition == A definition for the Information Audit cannot be universally agreed-upon amongst scholars, however the definition offered by ASLIB received positive support from a few notable scholars including Henczel, Orna and Wood; “(the IA is a) systematic examination of information use, resources and flows, with a verification by reference to both people and existing documents, in order to establish the extent to which they are contributing to an organisation’s objectives” In summary, the term audit itself implies a counting, the IA being much the same yet it counts IR and analyses how they are used and how critical they are to the success of a given task. == Role and scope of an IA == In much the same way as the IA is difficult to define, it can be utilised in a range of contexts by the information professional, from complying with freedom of information legislation to identifying any existing gaps, duplications, bottlenecks or other inefficiencies in information flows and to understand how existing channels can be used for knowledge transfer In 2007 Buchanan and Gibb developed upon their 1998 examination of the IA process by outlining a summary of its main objectives: To identify an organisation’s information resource To identify an organisation’s information needs Furthermore, Buchanan and Gibb went on to state that the IA also had to meet the following additional objectives: To identify the cost/benefits of information resources To identify the opportunities to use the information resources for strategic competitive advantage To integrate IT investment with strategic business initiatives To identify information flow and processes To develop an integrated information strategy and/or policy To create an awareness of the importance of Information Resource Management (IRM) To monitor/evaluate conformance to information related standards, legislations, policy and guidelines. == Methodology evolution == === Overview === In 1976 Riley first published a definition of IA as a way of analysing IR based on a cost-benefit model. Since Riley, scholars have outlined further developed methodologies. Henderson took a cost-benefit approach hoping to draw focus from manpower-costing to information storage and acquisition which he felt was being overlooked. In 1985 Gillman focused upon identifying the relationships which existed between various components in order to map them to one another. Neither Henderson nor Gillman’s methods offered alternative approaches beyond the existing organisational frameworks. Quinn took a hybrid-approach combining Gillman and Henderson’s methods to identify the purpose of existing IR and to position them within the organisation, as did Worlock. The differentiator between Quinn and Worlock lay in Worlock’s consideration of solutions outside of the current organisational structure. These approaches had thus far had paid little attention to the needs of the user or in making structured recommendations for the development of a corporate information strategy. Therefore, here follows a brief outline and overall comparison of four published strategic approaches in order that one might understand the development of the IA methodology. === Burk and Horton === In 1988 Burk and Horton developed InfoMap, the first IA methodology developed for widespread use. It aimed to discover, map and evaluate the IR within an organisation using a 4-stage process: Survey staff using questionnaires/interviews Measure the IR against cost/value Analyse resources Synthesise the findings and map the strengths and weaknesses of the IR against the objectives of the organisation. Although the method inventoried all IR (and therefore met standard ISO 1779) this bottom-up approach revealed limited analysis of the organisation holistically and the steps were not explicit enough. === Orna === Orna produced a top-down methodology in contrast to Burk and Horton, placing emphasis upon the importance of organisational analysis and aimed to assist in the production of a corporate information policy. Initially the method had just 4-stages, this later revised to a 10-stage process which included pre and post-audit stages as below: Conduct a preliminary review to confirm operational/strategic direction Gain support/resource from management Gain commitment from the other stakeholders (staff) Planning including the project, team, tools and techniques Identify the IR, information flow and produce a cost/value assessment Interpret findings based upon current versus desired state Produce a report to present findings Implement recommendations Monitor effects of change Repeat the IA Orna’s method introduced the need for a cyclical IA to be put in place in order for the IR to be continually tracked and improvements made regularly. Again this method was criticised for lacking some practical application and in 2004 Orna revised the methodology once more to try to rectify this problem === Buchanan and Gibb === In 1998, similarly to Orna's earlier publication, Buchanan and Gibb took a top-down approach, drawing techniques from established management disciplines to provide a framework and a level of familiarity for information professionals. This set of techniques was a notable contribution to IA methodologies and understood the need to be flexible for each organisation. Theirs was a 5-stage process: Promote benefits of the IA through seminars/surveys/CEO letter for cooperation Identify the mission objectives of the organisation, define environment (PEST), map information flow and examine organisation culture. Analyse and formulate action plan for problem areas, flow diagrams and a report of findings and recommendations Account for cost of IR and related services using Activity Based Costing (ABC) and Output Based Specification (OBS). Synthesise the whole process in final audit report and provide an information strategy (strategic direction) in relation to the organisation’s mission statement. This was the introduction of a new approach to costing the IR and had an integrated strategic direction, yet the scholars admitted that this method may be impractical for smaller organisations. === Henczel === Henczel’s methodology drew upon the strengths of Orna and Buchanan and Gibb to produce a 7-stage process: Planning and submission of business case for approval to proceed Data collection and development of an IR database and population through survey techniques Structured data analysis Data evaluation, interpretation and formulation of recommendations Communication of recommendations through a report Implementing recommendations through a devised programme The IA as a continuum-establishment of a cyclical process Focus was made once more on the strategic direction of the organisation conducting the IA. Furthermore, Henczel made examination into the use of the IA as a first-step in the development of a knowledge audit or knowledge management strategy as discussed in the later section. == Case studies == Scholars and information professionals have since tested the above methodologies with varied results. An early case study produced by Soy and Bustelo in a Spanish financial institution in 1999 aimed to identify the use of information resources for qualitative and quantitative data analysis due to the rapid expansion of the organisation within a six-year period. Although the methodology was not explicitly credited to any of the above-mentioned scholars, it did follow a strategic (post 1990's) IA process including gaining support from management, the use of questionnaires for data collection, analysis and evaluation of the data, identification and mapping of the IR, cost-analysis and outlining recommendations to assist with the establishment of an Information policy. In addition the IA report suggested that the process would need to be continual (cyclical as Orna, Henczel and Buchanan and Gibb suggest). Conclusions of this case-study stated that th

    Read more →
  • Outline of natural language processing

    Outline of natural language processing

    Natural language processing is computer activity in which computers are entailed to analyze, understand, alter, or generate natural language. This includes the automation of any or all linguistic forms, activities, or methods of communication, such as conversation, correspondence, reading, written composition, dictation, publishing, translation, lip reading, and so on. Natural-language processing is also the name of the branch of computer science, artificial intelligence, and linguistics concerned with enabling computers to engage in communication using natural language(s) in all forms, including but not limited to speech, print, writing, and signing. The following outline is provided as an overview of and topical guide to natural-language processing: == Natural-language processing == Natural-language processing can be described as all of the following: A field of science – systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the universe. An applied science – field that applies human knowledge to build or design useful things. A field of computer science – scientific and practical approach to computation and its applications. A branch of artificial intelligence – intelligence of machines and robots and the branch of computer science that aims to create it. A subfield of computational linguistics – interdisciplinary field dealing with the statistical or rule-based modeling of natural language from a computational perspective. An application of engineering – science, skill, and profession of acquiring and applying scientific, economic, social, and practical knowledge, in order to design and also build structures, machines, devices, systems, materials and processes. An application of software engineering – application of a systematic, disciplined, quantifiable approach to the design, development, operation, and maintenance of software, and the study of these approaches; that is, the application of engineering to software. A subfield of computer programming – process of designing, writing, testing, debugging, and maintaining the source code of computer programs. This source code is written in one or more programming languages (such as Java, C++, C#, Python, etc.). The purpose of programming is to create a set of instructions that computers use to perform specific operations or to exhibit desired behaviors. A subfield of artificial intelligence programming – A type of system – set of interacting or interdependent components forming an integrated whole or a set of elements (often called 'components' ) and relationships which are different from relationships of the set or its elements to other elements or sets. A system that includes software – software is a collection of computer programs and related data that provides the instructions for telling a computer what to do and how to do it. Software refers to one or more computer programs and data held in the storage of the computer. In other words, software is a set of programs, procedures, algorithms and its documentation concerned with the operation of a data processing system. A type of technology – making, modification, usage, and knowledge of tools, machines, techniques, crafts, systems, methods of organization, in order to solve a problem, improve a preexisting solution to a problem, achieve a goal, handle an applied input/output relation or perform a specific function. It can also refer to the collection of such tools, machinery, modifications, arrangements and procedures. Technologies significantly affect human as well as other animal species' ability to control and adapt to their natural environments. A form of computer technology – computers and their application. NLP makes use of computers, image scanners, microphones, and many types of software programs. Language technology – consists of natural-language processing (NLP) and computational linguistics (CL) on the one hand, and speech technology on the other. It also includes many application oriented aspects of these. It is often called human language technology (HLT). == Prerequisite technologies == The following technologies make natural-language processing possible: Communication – the activity of a source sending a message to a receiver Language – Speech – Writing – Computing – Computers – Computer programming – Information extraction – User interface – Software – Text editing – program used to edit plain text files Word processing – piece of software used for composing, editing, formatting, printing documents Input devices – pieces of hardware for sending data to a computer to be processed Computer keyboard – typewriter style input device whose input is converted into various data depending on the circumstances Image scanners – == Subfields of natural-language processing == Information extraction (IE) – field concerned in general with the extraction of semantic information from text. This covers tasks such as named-entity recognition, coreference resolution, relationship extraction, etc. Ontology engineering – field that studies the methods and methodologies for building ontologies, which are formal representations of a set of concepts within a domain and the relationships between those concepts. Speech processing – field that covers speech recognition, text-to-speech and related tasks. Statistical natural-language processing – Statistical semantics – a subfield of computational semantics that establishes semantic relations between words to examine their contexts. Distributional semantics – a subfield of statistical semantics that examines the semantic relationship of words across a corpora or in large samples of data. == Related fields == Natural-language processing contributes to, and makes use of (the theories, tools, and methodologies from), the following fields: Automated reasoning – area of computer science and mathematical logic dedicated to understanding various aspects of reasoning, and producing software which allows computers to reason completely, or nearly completely, automatically. A sub-field of artificial intelligence, automatic reasoning is also grounded in theoretical computer science and philosophy of mind. Linguistics – scientific study of human language. Natural-language processing requires understanding of the structure and application of language, and therefore it draws heavily from linguistics. Applied linguistics – interdisciplinary field of study that identifies, investigates, and offers solutions to language-related real-life problems. Some of the academic fields related to applied linguistics are education, linguistics, psychology, computer science, anthropology, and sociology. Some of the subfields of applied linguistics relevant to natural-language processing are: Bilingualism / Multilingualism – Computer-mediated communication (CMC) – any communicative transaction that occurs through the use of two or more networked computers. Research on CMC focuses largely on the social effects of different computer-supported communication technologies. Many recent studies involve Internet-based social networking supported by social software. Contrastive linguistics – practice-oriented linguistic approach that seeks to describe the differences and similarities between a pair of languages. Conversation analysis (CA) – approach to the study of social interaction, embracing both verbal and non-verbal conduct, in situations of everyday life. Turn-taking is one aspect of language use that is studied by CA. Discourse analysis – various approaches to analyzing written, vocal, or sign language use or any significant semiotic event. Forensic linguistics – application of linguistic knowledge, methods and insights to the forensic context of law, language, crime investigation, trial, and judicial procedure. Interlinguistics – study of improving communications between people of different first languages with the use of ethnic and auxiliary languages (lingua franca). For instance by use of intentional international auxiliary languages, such as Esperanto or Interlingua, or spontaneous interlanguages known as pidgin languages. Language assessment – assessment of first, second or other language in the school, college, or university context; assessment of language use in the workplace; and assessment of language in the immigration, citizenship, and asylum contexts. The assessment may include analyses of listening, speaking, reading, writing or cultural understanding, with respect to understanding how the language works theoretically and the ability to use the language practically. Language pedagogy – science and art of language education, including approaches and methods of language teaching and study. Natural-language processing is used in programs designed to teach language, including first- and second-language training. Language planning – Language policy – Lexicography – Literacies – Pragmatics – Second-language acquisition – Stylistics – Translation – Comp

    Read more →
  • Authoritative Legal Entity Identifier

    Authoritative Legal Entity Identifier

    An Authoritative Legal Entity Identifier (ALEI) is the identifier assigned by a government jurisdiction authorized by statute or decree to create a legal entity and to maintain the authoritative registries of legal entities. ALEIs are used within supply chain data, ERP applications and master data management systems to support accurate and consistent identification of entities in digital records, supply chains, and government databases. ALEIs are described in the international standard ISO 8000-116, which outlines a structured format that makes the locally unique identifier into a globally unique one and ensures global interoperability and data quality. == Structure == An ALEI is composed of three main components: a prefix that identifies the jurisdiction and register, a subdomain element (optional), and the local registration number of the entity. For example, the identifier "US-DE.BER:3031657" refers to an entity registered in the Delaware Business Entity Register in the United States. The standardization of this structure is governed by ISO 8000-116, which is designed to ensure each ALEI is globally unique and resolvable. == Comparison with other identifiers == ALEIs differ from proxy identifiers such as the DUNS number, NCAGE code, or the Legal Entity Identifier (LEI) managed by GLEIF. While proxy identifiers can be issued by institutions that do not create legal entities, ALEIs are created and maintained by public bodies with the authority to form and register legal entities. This authoritative origin makes ALEIs particularly suitable for applications involving legal traceability, government regulation, and international transparency efforts. == Usage == ALEIs are increasingly utilized to identify legal entities in public and private datasets. The identifiers support supply chain accuracy, regulatory compliance, and the unification of master data. The first practical implementation of an ALEI was the International Business Registration Number (IBRN), developed to provide globally unique identifiers for registered business entities. IBRNs are issued by authorized government jurisdictions and are used to verify entities across borders, particularly in the context of trade facilitation and data exchange systems. For instance, business directories and registration systems in U.S. states like Connecticut provide structured registration documents that can be used to verify the ALEIs they issue. The use of ALEIs has been recommended by international organizations such as the Extractive Industries Transparency Initiative (EITI) and Open ownership to improve beneficial ownership registries. == Policy and regulation == ALEIs have been referenced in policy consultations such as those related to the U.S. Financial Data Transparency Act. Federal institutions including the Federal Reserve and FDIC have examined the potential for ALEIs to unify entity identification across regulatory databases.

    Read more →
  • Literature review

    Literature review

    A literature review is an overview of previously published works on a particular topic. The term can refer to a full scholarly paper or a section of a scholarly work such as books or articles. Either way, a literature review provides the researcher/author and the audiences with general information of an existing knowledge of a particular topic. A good literature review has a proper research question, a proper theoretical framework, and/or a chosen research method. It serves to situate the current study within the body of the relevant literature and provides context for the reader. In such cases, the review usually precedes the methodology and results sections of the work. Producing a literature review is often part of a graduate and post-graduate requirement, included in the preparation of a thesis, dissertation, or a journal article. Literature reviews are also common in a research proposal or prospectus (the document approved before a student formally begins a dissertation or thesis). A literature review can be a type of a review article. In this sense, it is a scholarly paper that presents the current knowledge including substantive findings as well as theoretical and methodological contributions to a particular topic. Literature reviews are secondary sources and do not report new or original experimental work. Most often associated with academic-oriented literature, such reviews are found in academic journals and are not to be confused with book reviews, which may also appear in the same publication. Literature reviews are a basis for research in nearly every academic field. == Types == Since the concept of a systematic review was formalized in the 1970s, a basic division among types of reviews is the dichotomy of narrative reviews versus systematic reviews. The main types of narrative reviews are evaluative, exploratory, and instrumental. A fourth type of review of literature (the scientific literature) is the systematic review but it is not called a literature review, which absent further specification, conventionally refers to narrative reviews. A systematic review focuses on a specific research question to identify, appraise, select, and synthesize all high-quality research evidence and arguments relevant to that question. A meta-analysis is typically a systematic review using statistical methods to effectively combine the data used on all selected studies to produce a more reliable result. Torraco (2016) describes an integrative literature review. The purpose of an integrative literature review is to generate new knowledge on a topic through the process of review, critique, and synthesis of the literature under investigation. George et al (2023) offer an extensive overview of review approaches. They also propose a model for selecting an approach by looking at the purpose, object, subject, community, and practices of the review. They describe six different types of review, each with their own unique purposes: Exploratory or scoping reviews focus on breadth as opposed to depth Systematic or integrative reviews integrate empirical studies on a topic Meta-narrative reviews are qualitative and use literature to compare research or practice communities Problematizing or critical reviews propose new perspectives on a concept by association with other literature Meta-analyses and meta-regressions integrate quantitative studies and identify moderators Mixed research syntheses combine other review approaches in the same paper == Process and product == Shields and Rangarajan (2013) distinguish between the process of reviewing the literature and a finished work or product known as a literature review. The process of reviewing the literature is often ongoing and informs many aspects of the empirical research project. The process of reviewing the literature requires different kinds of activities and ways of thinking. Shields and Rangarajan (2013) and Granello (2001) link the activities of doing a literature review with Benjamin Bloom's revised taxonomy of the cognitive domain (ways of thinking: remembering, understanding, applying, analyzing, evaluating, and creating). === Use of artificial intelligence in a literature review === Artificial intelligence (AI) is reshaping traditional literature reviews across various disciplines. Generative pre-trained transformers, such as ChatGPT, are often used by students and academics for review purposes. Since 2023, an increasing number of tools powered by large language models and other artificial intelligence technologies have been developed to assist, automate, or generate literature reviews. Nevertheless, the employment of ChatGPT in academic reviews is problematic due to ChatGPT's propensity to "hallucinate". In response, efforts are being made to mitigate these hallucinations through the integration of plugins. For instance, Rad et al. (2023) used ScholarAI for review in cardiothoracic surgery.

    Read more →
  • Algorithm engineering

    Algorithm engineering

    Algorithm engineering focuses on the design, analysis, implementation, optimization, profiling and experimental evaluation of computer algorithms, bridging the gap between algorithmics theory and practical applications of algorithms in software engineering. It is a general methodology for algorithmic research. == Origins == In 1995, a report from an NSF-sponsored workshop "with the purpose of assessing the current goals and directions of the Theory of Computing (TOC) community" identified the slow speed of adoption of theoretical insights by practitioners as an important issue and suggested measures to reduce the uncertainty by practitioners whether a certain theoretical breakthrough will translate into practical gains in their field of work, and tackle the lack of ready-to-use algorithm libraries, which provide stable, bug-free and well-tested implementations for algorithmic problems and expose an easy-to-use interface for library consumers. But also, promising algorithmic approaches have been neglected due to difficulties in mathematical analysis. The term "algorithm engineering" was first used with specificity in 1997, with the first Workshop on Algorithm Engineering (WAE97), organized by Giuseppe F. Italiano. == Difference from algorithm theory == Algorithm engineering does not intend to replace or compete with algorithm theory, but tries to enrich, refine and reinforce its formal approaches with experimental algorithmics (also called empirical algorithmics). This way it can provide new insights into the efficiency and performance of algorithms in cases where the algorithm at hand is less amenable to algorithm theoretic analysis, formal analysis pessimistically suggests bounds which are unlikely to appear on inputs of practical interest, the algorithm relies on the intricacies of modern hardware architectures like data locality, branch prediction, instruction stalls, instruction latencies which the machine model used in Algorithm Theory is unable to capture in the required detail, the crossover between competing algorithms with different constant costs and asymptotic behaviors needs to be determined. == Methodology == Some researchers describe algorithm engineering's methodology as a cycle consisting of algorithm design, analysis, implementation and experimental evaluation, joined by further aspects like machine models or realistic inputs. They argue that equating algorithm engineering with experimental algorithmics is too limited, because viewing design and analysis, implementation and experimentation as separate activities ignores the crucial feedback loop between those elements of algorithm engineering. === Realistic models and real inputs === While specific applications are outside the methodology of algorithm engineering, they play an important role in shaping realistic models of the problem and the underlying machine, and supply real inputs and other design parameters for experiments. === Design === Compared to algorithm theory, which usually focuses on the asymptotic behavior of algorithms, algorithm engineers need to keep further requirements in mind: Simplicity of the algorithm, implementability in programming languages on real hardware, and allowing code reuse. Additionally, constant factors of algorithms have such a considerable impact on real-world inputs that sometimes an algorithm with worse asymptotic behavior performs better in practice due to lower constant factors. === Analysis === Some problems can be solved with heuristics and randomized algorithms in a simpler and more efficient fashion than with deterministic algorithms. Unfortunately, this makes even simple randomized algorithms difficult to analyze because there are subtle dependencies to be taken into account. === Implementation === Huge semantic gaps between theoretical insights, formulated algorithms, programming languages and hardware pose a challenge to efficient implementations of even simple algorithms, because small implementation details can have rippling effects on execution behavior. The only reliable way to compare several implementations of an algorithm is to spend an considerable amount of time on tuning and profiling, running those algorithms on multiple architectures, and looking at the generated machine code. === Experiments === See: Experimental algorithmics === Application engineering === Implementations of algorithms used for experiments differ in significant ways from code usable in applications. While the former prioritizes fast prototyping, performance and instrumentation for measurements during experiments, the latter requires thorough testing, maintainability, simplicity, and tuning for particular classes of inputs. === Algorithm libraries === Stable, well-tested algorithm libraries like LEDA play an important role in technology transfer by speeding up the adoption of new algorithms in applications. Such libraries reduce the required investment and risk for practitioners, because it removes the burden of understanding and implementing the results of academic research. == Conferences == Two main conferences on Algorithm Engineering are organized annually, namely: Symposium on Experimental Algorithms (SEA), established in 1997 (formerly known as WEA). SIAM Meeting on Algorithm Engineering and Experiments (ALENEX), established in 1999. The 1997 Workshop on Algorithm Engineering (WAE'97) was held in Venice (Italy) on September 11–13, 1997. The Third International Workshop on Algorithm Engineering (WAE'99) was held in London, UK in July 1999. The first Workshop on Algorithm Engineering and Experimentation (ALENEX99) was held in Baltimore, Maryland on January 15–16, 1999. It was sponsored by DIMACS, the Center for Discrete Mathematics and Theoretical Computer Science (at Rutgers University), with additional support from SIGACT, the ACM Special Interest Group on Algorithms and Computation Theory, and SIAM, the Society for Industrial and Applied Mathematics.

    Read more →
  • Outline of computer security

    Outline of computer security

    The following outline is provided as an overview of and topical guide to computer security: Computer security (also cybersecurity, digital security, or information technology (IT) security) is a subdiscipline within the field of information security. It focuses on protecting computer software, systems, and networks from threats that can lead to unauthorized information disclosure, theft, or damage to hardware, software, or data, as well as to the disruption or misdirection of the services they provide. The growing significance of computer security reflects the increasing dependence on computer systems, the Internet, and evolving wireless network standards. This reliance has expanded with the proliferation of smart devices, including smartphones, televisions, and other components of the Internet of things (IoT). (yes) == Essence of computer security == Computer security can be described as all of the following: a branch of security Network security application security == Areas of computer security == Access control – selective restriction of access to a place or other resource. The act of accessing may mean consuming, entering, or using. Permission to access a resource is called authorization. Computer access control – includes authorization, authentication, access approval, and audit. Authentication Knowledge-based authentication Integrated Windows Authentication Password Password length parameter Secure Password Authentication Secure Shell Kerberos (protocol) SPNEGO NTLMSSP AEGIS SecureConnect TACACS Cyber security and countermeasure Device fingerprint Physical security – protecting property and people from damage or harm (such as from theft, espionage, or terrorist attacks). It includes security measures designed to deny unauthorized access to facilities, (such as a computer room), equipment (such as your computer), and resources (like the data storage devices, and data, in your computer). If a computer gets stolen, then the data goes with it. In addition to theft, physical access to a computer allows for ongoing espionage, like the installment of a hardware keylogger device, and so on. Data security – protecting data, such as a database, from destructive forces and the unwanted actions of unauthorized users. Information privacy – relationship between collection and dissemination of data, technology, the public expectation of privacy, and the legal and political issues surrounding them. Privacy concerns exist wherever personally identifiable information or other sensitive information is collected and stored – in digital form or otherwise. Improper or non-existent disclosure control can be the root cause for privacy issues. Internet privacy – involves the right or mandate of personal privacy concerning the storing, repurposing, provision to third parties, and displaying of information pertaining to oneself via the Internet. Privacy can entail either Personally Identifying Information (PII) or non-PII information such as a site visitor's behavior on a website. PII refers to any information that can be used to identify an individual. For example, age and physical address alone could identify who an individual is without explicitly disclosing their name, as these two factors relate to a specific person. Mobile security – security pertaining to smartphones, especially with respect to the personal and business information stored on them. Network security – provisions and policies adopted by a network administrator to prevent and monitor unauthorized access, misuse, modification, or denial of a computer network and network-accessible resources. Network security involves the authorization of access to data in a network, which is controlled by the network administrator. Network Security Toolkit Internet security – computer security specifically related to the Internet, often involving browser security but also network security on a more general level as it applies to other applications or operating systems on a whole. Its objective is to establish rules and measures to use against attacks over the Internet. The Internet represents an insecure channel for exchanging information leading to a high risk of intrusion or fraud, such as phishing. Different methods have been used to protect the transfer of data, including encryption. World Wide Web Security – dealing with the vulnerabilities of users who visit websites. Cybercrime on the Web can include identity theft, fraud, espionage and intelligence gathering. For criminals, the Web has become the preferred way to spread malware. == Computer security threats == Methods of Computer Network Attack and Computer Network Exploitation Social engineering is a frequent method of attack, and can take the form of phishing, or spear phishing in the corporate or government world, as well as counterfeit websites. Password sharing and insecure password practices Poor patch management Computer crime – Computer criminals – Hackers – in the context of computer security, a hacker is someone who seeks and exploits weaknesses in a computer system or computer network. Password cracking – Software cracking – Script kiddies – List of computer criminals – Identity theft – Computer malfunction – Operating system failure and vulnerabilities Hard disk drive failure – occurs when a hard disk drive malfunctions and the stored information cannot be accessed with a properly configured computer. A disk failure may occur in the course of normal operation, or due to an external factor such as exposure to fire or water or high magnetic fields, or suffering a sharp impact or environmental contamination, which can lead to a head crash. Data recovery from a failed hard disk is problematic and expensive. Backups are essential Computer and network surveillance – Man in the Middle Loss of anonymity – when one's identity becomes known. Identification of people or their computers allows their activity to be tracked. For example, when a person's name is matched with the IP address they are using, their activity can be tracked thereafter by monitoring the IP address. HTTP Cookie Local Shared Object Web bug Spyware Adware Cyber spying – obtaining secrets without the permission of the holder of the information (personal, sensitive, proprietary or of classified nature), from individuals, competitors, rivals, groups, governments and enemies for personal, economic, political or military advantage using methods on the Internet, networks or individual computers through the use of cracking techniques and malicious software including Trojan horses and spyware. It may be done online from by professionals sitting at their computer desks on bases in far away countries, or it may involve infiltration at home by computer trained conventional spies and moles, or it may be the criminal handiwork of amateur malicious hackers, software programmers, or thieves. Computer and network eavesdropping Lawful Interception War Driving Packet analyzer (aka packet sniffer) – mainly used as a security tool (in many ways, including for the detection of network intrusion attempts), packet analyzers can also be used for spying, to collect sensitive information (e.g., login details, cookies, personal communications) sent through a network, or to reverse engineer proprietary protocols used over a network. One way to protect data sent over a network such as the Internet is by using encryption software. Cyberwarfare – Exploit – piece of software, a chunk of data, or a sequence of commands that takes advantage of a bug, glitch or vulnerability in order to cause unintended or unanticipated behavior to occur on computer software, hardware, or something electronic (usually computerized). Such behavior frequently includes things like gaining control of a computer system, allowing privilege escalation, or a denial-of-service attack. Trojan Computer virus Computer worm Denial-of-service attack – an attempt to make a machine or network resource unavailable to its intended users, usually consisting of efforts to temporarily or indefinitely interrupt or suspend services of a host connected to the Internet. One common method of attack involves saturating the target machine with external communications requests, so much so that it cannot respond to legitimate traffic, or responds so slowly as to be rendered essentially unavailable. Distributed denial-of-service attack (DDoS) – DoS attack sent by two or more persons. Hacking tool Malware Computer virus Computer worm Keylogger – program that does keystroke logging, which is the action of recording (or logging) the keys struck on a keyboard, typically in a covert manner so that the person using the keyboard is unaware that their actions are being monitored. There are also HID spoofing hardware keyloggers, like a USB device inserting stored keystores when connected. Rootkit – stealthy type of software, typically malicious, designed to hide the existence of certain processes or programs from normal methods of detection and enable contin

    Read more →
  • Algorithmic mechanism design

    Algorithmic mechanism design

    Algorithmic mechanism design (AMD) lies at the intersection of economic game theory, optimization, and computer science. The prototypical problem in mechanism design is to design a system for multiple self-interested participants, such that the participants' self-interested actions at equilibrium lead to good system performance. Typical objectives studied include revenue maximization and social welfare maximization. Algorithmic mechanism design differs from classical economic mechanism design in several respects. It typically employs the analytic tools of theoretical computer science, such as worst case analysis and approximation ratios, in contrast to classical mechanism design in economics which often makes distributional assumptions about the agents. It also considers computational constraints to be of central importance: mechanisms that cannot be efficiently implemented in polynomial time are not considered to be viable solutions to a mechanism design problem. This often, for example, rules out the classic economic mechanism, the Vickrey–Clarke–Groves auction. == History == Noam Nisan and Amir Ronen first coined "Algorithmic mechanism design" in a research paper published in 1999.

    Read more →
  • XOR swap algorithm

    XOR swap algorithm

    In computer programming, the exclusive or swap (sometimes shortened to XOR swap) is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable which is normally required. The algorithm is primarily a novelty and a way of demonstrating properties of the exclusive or operation. It is sometimes discussed as a program optimization, but there are almost no cases where swapping via exclusive or provides benefit over the standard, obvious technique. == The algorithm == Conventional swapping requires the use of a temporary storage variable. Using the XOR swap algorithm, however, no temporary storage is needed. The algorithm is as follows: Since XOR is a commutative operation, either X XOR Y or Y XOR X can be used interchangeably in any of the foregoing three lines. Note that on some architectures the first operand of the XOR instruction specifies the target location at which the result of the operation is stored, preventing this interchangeability. The algorithm typically corresponds to three machine-code instructions, represented by corresponding pseudocode and assembly instructions in the three rows of the following table: In the above System/370 assembly code sample, R1 and R2 are distinct registers, and each XR operation leaves its result in the register named in the first argument. Using x86 assembly, values X and Y are in registers eax and ebx (respectively), and xor places the result of the operation in the first register (Note: x86 supports XCHG instruction so using triple XOR do not make sense on this architecture). In RISC-V assembly, value X and Y are in registers x10 and x11, and xor places the result of the operation in the first operand. However, in the pseudocode or high-level language version or implementation, the algorithm fails if x and y use the same storage location, since the value stored in that location will be zeroed out by the first XOR instruction, and then remain zero; it will not be "swapped with itself". This is not the same as if x and y have the same values. The trouble only comes when x and y use the same storage location, in which case their values must already be equal. That is, if x and y use the same storage location, then the line: sets x to zero (because x = y so X XOR Y is zero) and sets y to zero (since it uses the same storage location), causing x and y to lose their original values. == Proof of correctness == The binary operation XOR over bit strings of length N {\displaystyle N} exhibits the following properties (where ⊕ {\displaystyle \oplus } denotes XOR): L1. Commutativity: A ⊕ B = B ⊕ A {\displaystyle A\oplus B=B\oplus A} L2. Associativity: ( A ⊕ B ) ⊕ C = A ⊕ ( B ⊕ C ) {\displaystyle (A\oplus B)\oplus C=A\oplus (B\oplus C)} L3. Identity exists: there is a bit string, 0, (of length N) such that A ⊕ 0 = A {\displaystyle A\oplus 0=A} for any A {\displaystyle A} L4. Each element is its own inverse: for each A {\displaystyle A} , A ⊕ A = 0 {\displaystyle A\oplus A=0} . Suppose that we have two distinct registers R1 and R2 as in the table below, with initial values A and B respectively. We perform the operations below in sequence, and reduce our results using the properties listed above. === Linear algebra interpretation === As XOR can be interpreted as binary addition and a pair of bits can be interpreted as a vector in a two-dimensional vector space over the field with two elements, the steps in the algorithm can be interpreted as multiplication by 2×2 matrices over the field with two elements. For simplicity, assume initially that x and y are each single bits, not bit vectors. For example, the step: which also has the implicit: corresponds to the matrix ( 1 1 0 1 ) {\displaystyle \left({\begin{smallmatrix}1&1\\0&1\end{smallmatrix}}\right)} as ( 1 1 0 1 ) ( x y ) = ( x + y y ) . {\displaystyle {\begin{pmatrix}1&1\\0&1\end{pmatrix}}{\begin{pmatrix}x\\y\end{pmatrix}}={\begin{pmatrix}x+y\\y\end{pmatrix}}.} The sequence of operations is then expressed as: ( 1 1 0 1 ) ( 1 0 1 1 ) ( 1 1 0 1 ) = ( 0 1 1 0 ) {\displaystyle {\begin{pmatrix}1&1\\0&1\end{pmatrix}}{\begin{pmatrix}1&0\\1&1\end{pmatrix}}{\begin{pmatrix}1&1\\0&1\end{pmatrix}}={\begin{pmatrix}0&1\\1&0\end{pmatrix}}} (working with binary values, so 1 + 1 = 0 {\displaystyle 1+1=0} ), which expresses the elementary matrix of switching two rows (or columns) in terms of the transvections (shears) of adding one element to the other. To generalize to where X and Y are not single bits, but instead bit vectors of length n, these 2×2 matrices are replaced by 2n×2n block matrices such as ( I n I n 0 I n ) . {\displaystyle \left({\begin{smallmatrix}I_{n}&I_{n}\\0&I_{n}\end{smallmatrix}}\right).} These matrices are operating on values, not on variables (with storage locations), hence this interpretation abstracts away from issues of storage location and the problem of both variables sharing the same storage location. == Code example == A C function that implements the XOR swap algorithm: The code first checks if the addresses are distinct and uses a guard clause to exit the function early if they are equal. Without that check, if they were equal, the algorithm would fold to a triple x ^= x resulting in zero. == Reasons for avoidance in practice == On modern CPU architectures, the XOR technique can be slower than using a temporary variable to do swapping. At least on recent x86 CPUs, both by AMD and Intel, moving between registers regularly incurs zero latency. (This is called MOV-elimination.) Even if there is not any architectural register available to use, the XCHG instruction will be at least as fast as the three XORs taken together. Another reason is that modern CPUs strive to execute instructions in parallel via instruction pipelines. In the XOR technique, the inputs to each operation depend on the results of the previous operation, so they must be executed in strictly sequential order, negating any benefits of instruction-level parallelism. === Aliasing === The XOR swap is also complicated in practice by aliasing. If an attempt is made to XOR-swap the contents of some location with itself, the result is that the location is zeroed out and its value lost. Therefore, XOR swapping must not be used blindly in a high-level language if aliasing is possible. This issue does not apply if the technique is used in assembly to swap the contents of two registers. Similar problems occur with call by name, as in Jensen's Device, where swapping i and A[i] via a temporary variable yields incorrect results due to the arguments being related: swapping via temp = i; i = A[i]; A[i] = temp changes the value for i in the second statement, which then results in the incorrect i value for A[i] in the third statement. == Variations == The underlying principle of the XOR swap algorithm can be applied to any operation meeting criteria L1 through L4 above. Replacing XOR by addition and subtraction gives various slightly different, but largely equivalent, formulations. For example: Unlike the XOR swap, this variation requires that the underlying processor or programming language uses a method such as modular arithmetic or bignums to guarantee that the computation of X + Y cannot cause an error due to integer overflow. Therefore, it is seen even more rarely in practice than the XOR swap. However, the implementation of AddSwap above in the C programming language always works even in case of integer overflow, since, according to the C standard, addition and subtraction of unsigned integers follow the rules of modular arithmetic, i. e. are done in the cyclic group Z / 2 s Z {\displaystyle \mathbb {Z} /2^{s}\mathbb {Z} } where s {\displaystyle s} is the number of bits of unsigned int. Indeed, the correctness of the algorithm follows from the fact that the formulas ( x + y ) − y = x {\displaystyle (x+y)-y=x} and ( x + y ) − ( ( x + y ) − y ) = y {\displaystyle (x+y)-((x+y)-y)=y} hold in any abelian group. This generalizes the proof for the XOR swap algorithm: XOR is both the addition and subtraction in the abelian group ( Z / 2 Z ) s {\displaystyle (\mathbb {Z} /2\mathbb {Z} )^{s}} (which is the direct sum of s copies of Z / 2 Z {\displaystyle \mathbb {Z} /2\mathbb {Z} } ). This doesn't hold when dealing with the signed int type (the default for int). Signed integer overflow is an undefined behavior in C and thus modular arithmetic is not guaranteed by the standard, which may lead to incorrect results. The sequence of operations in AddSwap can be expressed via matrix multiplication as: ( 1 − 1 0 1 ) ( 1 0 1 − 1 ) ( 1 1 0 1 ) = ( 0 1 1 0 ) {\displaystyle {\begin{pmatrix}1&-1\\0&1\end{pmatrix}}{\begin{pmatrix}1&0\\1&-1\end{pmatrix}}{\begin{pmatrix}1&1\\0&1\end{pmatrix}}={\begin{pmatrix}0&1\\1&0\end{pmatrix}}} == Application to register allocation == On architectures lacking a dedicated swap instruction, because it avoids the extra temporary register, the XOR swap algorithm is required for optimal register allocatio

    Read more →
  • Ontology engineering

    Ontology engineering

    In computer science, information science and systems engineering, ontology engineering is a field which studies the methods and methodologies for building ontologies, which encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities of a given domain of interest. In a broader sense, this field also includes a knowledge construction of the domain using formal ontology representations such as OWL/RDF. A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering. Ontology engineering is one of the areas of applied ontology, and can be seen as an application of philosophical ontology. Core ideas and objectives of ontology engineering are also central in conceptual modeling. Ontology engineering aims at making explicit the knowledge contained within software applications, and within enterprises and business procedures for a particular domain. Ontology engineering offers a direction towards solving the inter-operability problems brought about by semantic obstacles, i.e. the obstacles related to the definitions of business terms and software classes. Ontology engineering is a set of tasks related to the development of ontologies for a particular domain. Automated processing of information not interpretable by software agents can be improved by adding rich semantics to the corresponding resources, such as video files. One of the approaches for the formal conceptualization of represented knowledge domains is the use of machine-interpretable ontologies, which provide structured data in, or based on, RDF, RDFS, and OWL. Ontology engineering is the design and creation of such ontologies, which can contain more than just the list of terms (controlled vocabulary); they contain terminological, assertional, and relational axioms to define concepts (classes), individuals, and roles (properties) (TBox, ABox, and RBox, respectively). Ontology engineering is a relatively new field of study concerning the ontology development process, the ontology life cycle, the methods and methodologies for building ontologies, and the tool suites and languages that support them. A common way to provide the logical underpinning of ontologies is to formalize the axioms with description logics, which can then be translated to any serialization of RDF, such as RDF/XML or Turtle. Beyond the description logic axioms, ontologies might also contain SWRL rules. The concept definitions can be mapped to any kind of resource or resource segment in RDF, such as images, videos, and regions of interest, to annotate objects, persons, etc., and interlink them with related resources across knowledge bases, ontologies, and LOD datasets. This information, based on human experience and knowledge, is valuable for reasoners for the automated interpretation of sophisticated and ambiguous contents, such as the visual content of multimedia resources. Application areas of ontology-based reasoning include, but are not limited to, information retrieval, automated scene interpretation, and knowledge discovery. == Languages == An ontology language is a formal language used to encode the ontology. There are a number of such languages for ontologies, both proprietary and standards-based: Common logic is ISO standard 24707, a specification for a family of ontology languages that can be accurately translated into each other. The Cyc project has its own ontology language called CycL, based on first-order predicate calculus with some higher-order extensions. The Gellish language includes rules for its own extension and thus integrates an ontology with an ontology language. IDEF5 is a software engineering method to develop and maintain usable, accurate, domain ontologies. KIF is a syntax for first-order logic that is based on S-expressions. Rule Interchange Format (RIF), F-Logic and its successor ObjectLogic combine ontologies and rules. OWL is a language for making ontological statements, developed as a follow-on from RDF and RDFS, as well as earlier ontology language projects including OIL, DAML and DAML+OIL. OWL is intended to be used over the World Wide Web, and all its elements (classes, properties and individuals) are defined as RDF resources, and identified by URIs. OntoUML is a well-founded language for specifying reference ontologies. SHACL (RDF SHapes Constraints Language) is a language for describing structure of RDF data. It can be used together with RDFS and OWL or it can be used independently from them. XBRL (Extensible Business Reporting Language) is a syntax for expressing business semantics. == Methodologies and tools == DOGMA KAON OntoClean HOZO Protégé (software) Large language models == In life sciences == Life sciences is flourishing with ontologies that biologists use to make sense of their experiments. For inferring correct conclusions from experiments, ontologies have to be structured optimally against the knowledge base they represent. The structure of an ontology needs to be changed continuously so that it is an accurate representation of the underlying domain. Recently, an automated method was introduced for engineering ontologies in life sciences such as Gene Ontology (GO), one of the most successful and widely used biomedical ontology. Based on information theory, it restructures ontologies so that the levels represent the desired specificity of the concepts. Similar information theoretic approaches have also been used for optimal partition of Gene Ontology. Given the mathematical nature of such engineering algorithms, these optimizations can be automated to produce a principled and scalable architecture to restructure ontologies such as GO. Open Biomedical Ontologies (OBO), a 2006 initiative of the U.S. National Center for Biomedical Ontology, provides a common 'foundry' for various ontology initiatives, amongst which are: The Generic Model Organism Project (GMOD) Gene Ontology Consortium Sequence Ontology Ontology Lookup Service The Plant Ontology Consortium Standards and Ontologies for Functional Genomics and more

    Read more →
  • Doubao

    Doubao

    Doubao (Chinese: 豆包) is an artificial intelligence assistant developed by ByteDance. == History == The chatbot was launched in August 2023. By November 2024, it had become China's most popular AI chatbot, with approximately 60 million monthly active users according to industry analytics. == Design == Doubao is powered by Volcano Engine (Volcengine), 120 trillion tokens consumed per day. == Variants == === Dola === The international version of Doubao is Dola which was launched in August 2023 as Cici. Dola is powered by OpenAI's GPT series of large language models and by Google's Gemini.

    Read more →
  • Principles for a Data Economy

    Principles for a Data Economy

    The Principles for a Data Economy – Data Rights and Transactions is a transatlantic legal project carried out jointly by the American Law Institute (ALI) and the European Law Institute (ELI). The Principles for a Data Economy deals with a range of different legal questions that arise in the data economy. Since data is different from other tradeable items, the Principles draw up legal rules for data transactions and data rights that take into account the interests of different stakeholders involved in the data economy. The Principles are designed to facilitate contractual relations as well as the drafting of model agreements and can guide courts and legislators worldwide. The project proposes a set of principles that can be implemented in any legal system and is designed to work in conjunction with any kind of data privacy/data protection law, intellectual property law or trade secret law. The Principles do not address or seek to change any of the substantive rules of these bodies of law. The Project Team consists of Neil B Cohen and Christiane Wendehorst (as Project Reporters) and Lord John Thomas as well as Steven O. Weise (as Project Chairs). == Characteristics of data == The law governing trades in commerce has historically focused on trade in items that are tangible like goods or on intangible assets, such as shares or licenses. However, data does not fit into any of these traditional categories, nor does it qualify as a service. It is often unclear how traditional legal rules and doctrines can apply to data, as data is different from other assets in many ways. For example, data can be multiplied at basically no cost and can be used in parallel for a variety of different purposes by many different people at the same time (data is a “non-rivalrous” resource). Uncertainty regarding the applicable rules to govern the data economy may inhibit innovation and growth and trouble stakeholders like data-driven industries, start-ups, and consumers. == Stakeholders in the data economy == The Principles have taken the basic types of players and relations which can be found in data ecosystems as a starting point to provide guidance in different situations. The central actors in the data economy are data controllers (also called “data holders”). They are in a position to access the data and decide for which purposes and means this data should be processed. A controller may exercise control all by itself or share it with co-controllers, such as under a data pooling arrangement. Data processors provide the processing of data on a controller’s behalf as a service. Another important group of stakeholders includes those that contribute to the generation of data (e.g. data subjects). Other players in the data economy include data assemblers or data intermediaries (e.g. data trusts). == History of the project and timeline == Before the official adoption of the project by ALI and ELI bodies in 2018, the project team carried out a Feasibility Study from October 2016 to February 2018. In the following years, the project team produced a number of drafts (e.g. “Preliminary Drafts” No. 1 to 4, “Tentative Draft No. 1”) and project progress were regularly discussed with advisory bodies and members of both the ALI and the ELI. The project reporters also included feedback and insights from industry stakeholders and experts that was gained after several meetings and workshops, hosted, inter alia by UNCITRAL, UNIDROIT and several national governmental institutions. Tentative Draft No. 2 was presented at the ALI Annual Meeting in May 2021 and approved by ALI membership. The latest draft ("Final Council Draft") was also approved by the ELI Council and ELI Membership. The Principles for a Data Economy were presented at an international conference with representatives from institutions such as the Uniform Law Commission (ULC), the European Commission, UNIDROIT, the OECD, the International Chamber of Commerce (ICC) and the World Economic Forum (WEF) in October 2021. == Project structure == The current draft (“Tentative Draft No. 2”) of the Principles consists of five Parts that each governs different aspects of the data economy: General Provisions, Data Contracts, Data Rights, Third Party Aspects of Data Activities, and Multi-State Issues. === General Provisions === Part I includes general provisions that apply to all other Parts of the Principles for a Data Economy. This Part sets out the purpose of the Principles: they aim to make existing law in the field of the data economy more coherent and support the development of the law in this field by courts and legislators worldwide. It is also clarified that the Principles have a wide scope of application and can be used in a variety of ways by stakeholders in the data economy. The Principles may, for example, serve private parties as a basis for contract formation, guide the deliberations of arbitral tribunals or inspire national legislation. Part I then defines several key terms, such as ‘digital data’ and ‘data right’. The scope of the Principles is limited to matters where information is recorded as an asset, resource or tradeable commodity and where large amounts of data, rather than single pieces of information, are concerned. This Part also clarifies that remedies with respect to data contracts and data rights are left to the applicable national law. === Data Contracts === Part II lists different types of contracts that often occur in the data economy and establishes two broad categories, namely contracts for the supply and sharing of data and contracts for services with regard to data. Contracts for the supply and sharing of data include, e.g. data transfer contracts or data pooling arrangements, while contracts for services with regard to data cover contracts for the processing of data or data intermediary contracts. The Principles provide default terms for each contract type, on issues such as the manner in which data should supply or which characteristics the data supplied should meet. These default terms 'automatically' become part of the contract unless the parties agree otherwise. === Data Rights === Part III governs legally protected interests of players in the data economy that stem from the characteristics of data as a resource (e.g. its non-rivalrous nature) or from public interest considerations. Such data rights may include the right to data access, the right to require the controller to desist from data activities or to correct incorrect/incomplete data, or even to receive an economic share in profits derived from the use of data. For example, the Principles deal with data rights of stakeholders that had a share in the co-generation of data and identify different factors to be considered in determining whether to afford a party a data right. The underlying idea that parties who have contributed to the generation of data should have some rights in the utilization of the data is also recognized by governmental institutions, such as by the Japanese Ministry of Economy, Trade and Industry (METI), and the term co-generated data, which was coined by the Principles for a Data Economy, has been adopted, inter alia by the European Commission, the German Data Ethics Commission and the Global Partnership on Artificial Intelligence (GPAI). This Part also deals with data rights for the public interest, such as data sharing rights in the field of innovation. === Third Party Aspects === Part IV governs different situations in which data transactions interfere with the rights of third parties. Such rights include intellectual property rights or rights derived from data privacy or data protection law. This Part sets out under which circumstances data activities should be considered wrongful vis à vis another party. For example, a data activity (like data processing or the onward supply of data) could be considered wrongful, if a controller interferes with the rights of data subjects that are protected by data-protection law. A data activity could also be wrongful if the controller is non-compliant with contractual limitations on data activities, enforceable by the protected party (e.g. a controller may only process data for a certain purpose). If someone obtained access to data by unauthorized means (i.e. data “theft”) this could also be considered wrongful. The Part on Third-Party Aspects also takes a detailed look at the effects of the onward supply of data can have on third parties, while balancing the protection of third parties on the one hand, with the interests of data recipients and the desire to encourage data sharing on the other. === Multi-State Issues === As transactions in the data economy are international by nature and hardly occur within one legal system alone, the Part V of the Principles also briefly touches upon the applicability of the rules and doctrines of private international law to such transactions. == Links == Website of the “Principles for a Data Economy – Data Rights and Transaction

    Read more →
  • Transaction data

    Transaction data

    Transaction data or transaction information is a category of data describing transactions. Transaction data/information gather variables generally referring to reference data or master data – e.g. dates, times, time zones, currencies. Typical transactions are: Financial transactions about orders, invoices, payments; Work transactions about plans, activity records; Logistic transactions about deliveries, storage records, travel records, etc.. == Management == Recording and storing transactions is called records management. The record of the transaction is stored in a place where the retention can be guaranteed and where data is archived or removed following a retention period. Formats of recorded transactions can be digital data in databases and spreadsheets, or handwritten texts in physical documents like former bankbooks. Transaction processing systems are application software that generate transactions and manage transaction data/information, e.g. SAP and Oracle Financials. == Data warehousing == Transaction data can be summarised in a data warehouse, which helps accessibility and analysis of the data.

    Read more →