AI Assistant Zia Spark Icon


  • Parkerian Hexad

    The Parkerian Hexad is a set of six elements of information security proposed by Donn B. Parker in 1998. It adds three attributes to the three classic security attributes of the CIA triad (confidentiality, integrity, availability). The Parkerian Hexad attributes are the following:

      • Confidentiality
      • Possession or control
      • Integrity
      • Authenticity
      • Availability
      • Utility

    These attributes of information are atomic in that they are not broken down into further constituents; they are non-overlapping in that they refer to unique aspects of information. Any information security breach can be described as affecting one or more of these fundamental attributes of information.

    == Attributes from the CIA triad ==

    === Confidentiality ===

    Confidentiality refers to the "quality or state of being private or secret; known only to a limited few", or "the property that information is not made available or disclosed to unauthorized individuals, entities, or processes". For example:

      • If an enterprise's strategic plans are leaked to competitors, then this is a breach of confidentiality.
      • If unauthorized persons gain access to an individual's financial records, then that individual's confidentiality is breached.

    === Integrity ===

    Integrity refers to being correct or consistent with the intended state of information. Any unauthorized modification of data, whether deliberate or accidental, is a breach of data integrity. For example:

      • Data stored on disk are expected to be stable. If the data are changed at random by problems with a disk controller, then this is a breach of integrity.
      • Data generated by a medical device and then transmitted to and stored in a healthcare center are expected to be neither altered nor tampered with.
      • Application programs are supposed to record information correctly. If the application introduces deviations from the intended values, then this is a breach of integrity.

    From Donn Parker: "My definition of information integrity comes from the dictionaries. Integrity means that the information is whole, sound, and unimpaired (not necessarily correct). It means nothing is missing from the information it is complete and in intended good order".

    === Availability ===

    Availability means having timely access to information. For example:

      • A disk crash and a denial-of-service attack both cause a breach of availability.
      • Any delay in the response of a system that exceeds the expected service levels for that system can be described as a breach of availability.
      • GPS jamming can lead to loss of availability of the GPS system.

    == Parker's added attributes ==

    === Authenticity ===

    Authenticity is the "quality of being authentic or of established authority for truth and correctness". Parker defines it thus: "is the information genuine and accurate? Does it conform to reality and have validity?" and "authoritative, valid, true, real, genuine, or worthy of acceptance or belief by reason of conformity to fact and reality".

    === Possession or control ===

    A breach of possession or control is the loss of data by the authorized user (even if the "thief" cannot access the data). From a control systems perspective, it is any loss of control (the ability to change settings and functions) or loss of view (the ability to monitor the system's operation and its response to controls).

    Suppose a thief were to steal a sealed envelope containing a bank debit card and its personal identification number. Even if the thief did not open that envelope, it is reasonable for the victim to be concerned that the thief could do so at any time. That situation illustrates a loss of control or possession of information but does not involve a breach of confidentiality.

    === Utility ===

    Utility refers to the data's usefulness. For example:

      • Suppose someone encrypted data on disk to prevent unauthorized access or undetected modifications, and then lost the decryption key: that would be a breach of utility. The data would be confidential, controlled, integral, authentic, and available; they just wouldn't be useful in that form.
      • The conversion of salary data from one currency into an inappropriate currency would be a breach of utility, as would the storage of data in a format inappropriate for a specific computer architecture; e.g., EBCDIC instead of ASCII or 9-track magnetic tape instead of DVD-ROM.
      • A tabular representation of data substituted for a graph could be described as a breach of utility if the substitution made it more difficult to interpret the data.

    Utility is often confused with availability because breaches such as those described in these examples may also require time to work around the change in data format or presentation. However, the concept of usefulness is distinct from that of availability.
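
    The claim that any breach affects one or more of six non-overlapping attributes can be sketched as a small data structure. The following is only an illustration, not part of Parker's work: the class name `Hexad`, the helper `affected`, and the scenario variables are hypothetical names chosen here, and the scenarios are the ones described in the text above.

```python
from enum import Flag, auto


class Hexad(Flag):
    """The six Parkerian Hexad attributes as combinable flags (illustrative only)."""
    CONFIDENTIALITY = auto()
    POSSESSION = auto()      # possession or control
    INTEGRITY = auto()
    AUTHENTICITY = auto()
    AVAILABILITY = auto()
    UTILITY = auto()


def affected(breach: Hexad) -> list:
    """List the names of the attributes that a breach touches."""
    return [attr.name for attr in Hexad if attr in breach]


# Scenarios taken from the text; the variable names are hypothetical.
stolen_sealed_envelope = Hexad.POSSESSION      # control is lost, nothing disclosed
lost_decryption_key = Hexad.UTILITY            # data intact but not usable
leaked_strategic_plans = Hexad.CONFIDENTIALITY

# A single incident may affect several attributes at once.
combined = Hexad.POSSESSION | Hexad.CONFIDENTIALITY

print(affected(stolen_sealed_envelope))
print(affected(combined))
```

    Using flags rather than a single-valued enumeration reflects the statement that a breach may affect more than one attribute while each attribute stays distinct.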

  • Long division

    In arithmetic, long division is a standard division algorithm suitable for dividing multi-digit numbers that is simple enough to perform by hand. It breaks down a division problem into a series of easier steps. As in all division problems, one number, called the dividend, is divided by another, called the divisor, producing a result called the quotient. Long division enables computations involving arbitrarily large numbers to be performed by following a series of simple steps. The abbreviated form of long division is called short division, which is almost always used instead of long division when the divisor has only one digit.

    == History ==

    Related algorithms have existed since the 12th century. Al-Samawal al-Maghribi (1125–1174) performed calculations with decimal numbers that essentially require long division, leading to infinite decimal results, but without formalizing the algorithm. Caldrini (1491) is the earliest printed example of long division, known as the Danda method in medieval Italy, and it became more practical with the introduction of decimal notation for fractions by Pitiscus (1608). The specific algorithm in modern use was introduced by Henry Briggs c. 1600.

    == Education ==

    Inexpensive calculators and computers have become the most common tools for performing division in educational and professional contexts worldwide, reducing reliance on traditional paper-and-pencil techniques. Internally, these devices implement various division algorithms, many of which rely on iterative approximations and multiplication to improve computational efficiency.

    Educational approaches to teaching division vary across countries and regions, reflecting differing curricular priorities. In North America, long division has been de-emphasized or, in some cases, removed from portions of the curriculum as part of reform mathematics, which emphasizes conceptual understanding and the use of technology. In contrast, many education systems in Europe and Asia continue to emphasize mastery of standard algorithms, including long division, as a foundational arithmetic skill. For example, curricula in countries such as Japan and Germany typically introduce and reinforce long division during primary education, often alongside mental arithmetic strategies and problem-solving techniques.

    International assessments such as the Trends in International Mathematics and Science Study (TIMSS) highlight these differences, showing variation in how procedural fluency and conceptual understanding are balanced across educational systems. These differing approaches reflect broader educational philosophies regarding the balance between procedural fluency, conceptual understanding, and the role of technology in mathematics education.

    == Method ==

    In English-speaking countries, long division does not use the division slash ⟨∕⟩ or division sign ⟨÷⟩ symbols but instead constructs a tableau. The divisor is separated from the dividend by a right parenthesis ⟨)⟩ or vertical bar ⟨|⟩; the dividend is separated from the quotient by a vinculum (i.e., an overbar). The combination of these two symbols is sometimes known as a long division symbol, division bracket, or even a bus stop. It developed in the 18th century from an earlier single-line notation separating the dividend from the quotient by a left parenthesis.

    The process is begun by dividing the left-most digit of the dividend by the divisor. The quotient (rounded down to an integer) becomes the first digit of the result, and the remainder is calculated (this step is notated as a subtraction). This remainder carries forward when the process is repeated on the following digit of the dividend (notated as 'bringing down' the next digit to the remainder). When all digits have been processed and no remainder is left, the process is complete.

    An example is shown below, representing the division of 500 by 4 (with a result of 125).

          125      (Explanations)
        4)500
          4        ( 4 ×  1 =  4)
          10       ( 5 -  4 =  1)
           8       ( 4 ×  2 =  8)
           20      (10 -  8 =  2)
           20      ( 4 ×  5 = 20)
            0      (20 - 20 =  0)

    A more detailed breakdown of the steps goes as follows:

      1. Find the shortest sequence of digits starting from the left end of the dividend, 500, that the divisor 4 goes into at least once. In this case, this is simply the first digit, 5. The largest number that the divisor 4 can be multiplied by without exceeding 5 is 1, so the digit 1 is put above the 5 to start constructing the quotient.
      2. Next, the 1 is multiplied by the divisor 4, to obtain the largest whole number that is a multiple of the divisor 4 without exceeding the 5 (4 in this case). This 4 is then placed under and subtracted from the 5 to get the remainder, 1, which is placed under the 4 under the 5.
      3. Afterwards, the first as-yet unused digit in the dividend, in this case the first digit 0 after the 5, is copied directly underneath itself and next to the remainder 1, to form the number 10.
      4. At this point the process is repeated enough times to reach a stopping point: The largest number by which the divisor 4 can be multiplied without exceeding 10 is 2, so 2 is written above as the second leftmost quotient digit. This 2 is then multiplied by the divisor 4 to get 8, which is the largest multiple of 4 that does not exceed 10; so 8 is written below 10, and the subtraction 10 minus 8 is performed to get the remainder 2, which is placed below the 8.
      5. The next digit of the dividend (the last 0 in 500) is copied directly below itself and next to the remainder 2 to form 20. Then the largest number by which the divisor 4 can be multiplied without exceeding 20, which is 5, is placed above as the third leftmost quotient digit. This 5 is multiplied by the divisor 4 to get 20, which is written below and subtracted from the existing 20 to yield the remainder 0, which is then written below the second 20.
      6. At this point, since there are no more digits to bring down from the dividend and the last subtraction result was 0, we can be assured that the process has finished.

    If the last remainder when we ran out of dividend digits had been something other than 0, there would have been two possible courses of action:

      1. We could just stop there and say that the dividend divided by the divisor is the quotient written at the top with the remainder written at the bottom, and write the answer as the quotient followed by a fraction that is the remainder divided by the divisor.
      2. We could extend the dividend by writing it as, say, 500.000... and continue the process (using a decimal point in the quotient directly above the decimal point in the dividend), in order to get a decimal answer, as in the following example.

          31.75
        4)127.00
          12         (12 ÷ 4 = 3)
           07        (0 remainder, bring down next figure)
            4        (7 ÷ 4 = 1 r 3)
            3.0      (bring down 0 and the decimal point)
            2.8      (7 × 4 = 28, 30 ÷ 4 = 7 r 2)
              20     (an additional zero is brought down)
              20     (5 × 4 = 20)
               0

    In this example, the decimal part of the result is calculated by continuing the process beyond the units digit, "bringing down" zeros as being the decimal part of the dividend.

    This example also illustrates that, at the beginning of the process, a step that produces a zero can be omitted. Since the first digit 1 is less than the divisor 4, the first step is instead performed on the first two digits 12. Similarly, if the divisor were 13, one would perform the first step on 127 rather than 12 or 1.

    === Basic procedure for long division of n ÷ m ===

      1. Find the location of all decimal points in the dividend n and divisor m.
      2. If necessary, simplify the long division problem by moving the decimals of the divisor and dividend by the same number of decimal places, to the right (or to the left), so that the decimal of the divisor is to the right of the last digit.
      3. When doing long division, keep the numbers lined up straight from top to bottom under the tableau.
      4. After each step, be sure the remainder for that step is less than the divisor. If it is not, there are three possible problems: the multiplication is wrong, the subtraction is wrong, or a greater quotient is needed.
      5. In the end, the remainder, r, is added to the growing quotient as a fraction, r⁄m.

    === Invariant property and correctness ===

    The basic presentation of the steps of the process (above) focuses on what steps are to be performed, rather than the properties of those steps that ensure the result will be correct (specifically, that q × m + r = n, where q is the final quotient and r the final remainder). A slight variation of presentation requires more writing, and requires that we change, rather than just update, digits of the quotient, but can shed more light on why these steps actually produce the right answer by allowing evaluation of q × m + r at intermediate points in the process. This illustrates the key property used in the derivation of the algorithm (below).

    Specifically, we amend the above basic procedure so that we fill the space after the digits of the quotient under construction with 0's, to at least the 1's place, and include those 0's in the numbers we write below the division bra
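
    The digit-by-digit procedure from the Method section, together with the property q × m + r = n, can be sketched in code. This is a minimal illustration for non-negative integers rather than a reference implementation; the function names `long_division` and `decimal_digits` and the shape of the recorded steps are assumptions made for this example.

```python
def long_division(dividend: int, divisor: int):
    """Divide digit by digit, as in the hand tableau.

    Returns (quotient, remainder, steps). Each step is the tuple
    (partial dividend, quotient digit, product subtracted, new remainder).
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")

    quotient, remainder, prefix, steps = 0, 0, 0, []
    for ch in str(dividend):
        partial = remainder * 10 + int(ch)   # "bring down" the next digit
        digit = partial // divisor           # largest digit with digit * divisor <= partial
        product = digit * divisor
        remainder = partial - product        # the subtraction written in the tableau
        quotient = quotient * 10 + digit
        steps.append((partial, digit, product, remainder))

        # Invariant: for the digits read so far, q * m + r equals that prefix,
        # and the remainder is always smaller than the divisor.
        prefix = prefix * 10 + int(ch)
        assert quotient * divisor + remainder == prefix
        assert 0 <= remainder < divisor

    return quotient, remainder, steps


def decimal_digits(remainder: int, divisor: int, places: int):
    """Continue past the units digit by bringing down zeros."""
    digits = []
    for _ in range(places):
        remainder *= 10
        digits.append(remainder // divisor)
        remainder %= divisor
    return digits, remainder


q, r, steps = long_division(500, 4)
print(q, r)            # 125 0
for step in steps:
    print(step)        # (5, 1, 4, 1), then (10, 2, 8, 2), then (20, 5, 20, 0)

q, r, _ = long_division(127, 4)
print(q, r, decimal_digits(r, 4, 2))   # 31 3 ([7, 5], 0), i.e. 31.75
```

    Unlike the hand method, the sketch does not skip a leading step that produces a zero quotient digit (the first step of 127 ÷ 4 yields digit 0); the result is the same because that zero contributes nothing to the quotient.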

  • Documentation

    Documentation is any communicable material that is used to describe, explain, or instruct regarding some attributes of an object, system, or procedure, such as its parts, assembly, installation, maintenance, and use. As a form of knowledge management and knowledge organization, documentation can be provided on paper, online, or on digital or analog media, such as audio tape or CDs. Examples of such resources include user guides, white papers, online help, and quick-reference guides. Paper or hard-copy documentation has become less common. Contemporary documentation is often distributed through websites, software products, and other online applications.

    Documentation, understood as a set of instructional materials, should not be confused with documentation science, which is the study of the recording and retrieval of information.

    == Principles for producing documentation ==

    While the associated International Organization for Standardization (ISO) standards are not easily available publicly, guides from other sources on this topic may serve the purpose.

    Documentation development may involve document drafting, formatting, submitting, reviewing, approving, distributing, reposting, tracking, and so on, and in a regulated industry it is governed by the associated standard operating procedures. It could also involve creating content from scratch. Documentation should be easy to read and understand. If it is too long and too wordy, it may be misunderstood or ignored. Clear, concise words should be used, and sentences should be limited to a maximum of 15 words. Documentation intended for a general audience should avoid gender-specific terms and cultural biases. In a series of procedures, steps should be clearly numbered.

    == Producing documentation ==

    Technical writers and corporate communicators are professionals whose field and work is documentation. Ideally, technical writers have a background in the subject matter as well as in writing, managing content, and information architecture. More commonly, technical writers collaborate with subject-matter experts, such as engineers, technical experts, and medical professionals, to define and then create documentation that meets the user's needs. Corporate communications includes other types of written documentation, for example:

      • Market communications (MarCom): MarCom writers endeavor to convey the company's value proposition through a variety of print, electronic, and social media. This area of corporate writing is often engaged in responding to proposals.
      • Technical communication (TechCom): Technical writers document a company's product or service. Technical publications can include user guides, installation and configuration manuals, and troubleshooting and repair procedures.
      • Legal writing: This type of documentation is often prepared by attorneys or paralegals.
      • Compliance documentation: This type of documentation codifies standard operating procedures for any regulatory compliance needs, such as safety approval, taxation, financing, and technical approval.
      • Healthcare documentation: This field of documentation encompasses the timely recording and validation of events that have occurred during the course of providing health care.

    == Documentation in computer science ==

    === Types ===

    The following are typical software documentation types:

      • Request for proposal
      • Requirements/statement of work/scope of work
      • Software design and functional specification
      • System design and functional specifications
      • Change management, error and enhancement tracking
      • User acceptance testing
      • Manpages

    The following are typical hardware and service documentation types:

      • Network diagrams
      • Network maps
      • Datasheet for IT systems (e.g., server, switch)
      • Service catalog and service portfolio (Information Technology Infrastructure Library)

    === Software Documentation Folder (SDF) tool ===

    A common type of software document written in the simulation industry is the SDF. When developing software for a simulator, which can range from embedded avionics devices to 3D terrain databases by way of full motion control systems, the engineer keeps a notebook detailing the development (the "build") of the project or module. The document can be a wiki page, a Microsoft Word document, or another environment. It should contain a requirements section and an interface section that details the communication interface of the software. Often a notes section is used to detail the proof of concept and then to track errors and enhancements. Finally, a testing section documents how the software was tested; this documents conformance to the client's requirements. The result is a detailed description of how the software is designed, how to build and install the software on the target device, and any known defects and workarounds. This build document enables future developers and maintainers to come up to speed on the software in a timely manner, and also provides a roadmap for modifying code or searching for bugs.

    === Software tools for network inventory and configuration ===

    These software tools can automatically collect data about network equipment, both for inventory and for configuration information. The Information Technology Infrastructure Library calls for creating such a database as the basis for all information used by those responsible for IT. It is also the basis for IT documentation. Examples include XIA Configuration.

    == Documentation in criminal justice ==

    "Documentation" is the preferred term for the process of populating criminal databases. Examples include the National Counterterrorism Center's Terrorist Identities Datamart Environment, sex offender registries, and gang databases.

    == Documentation in early childhood education ==

    Documentation, as it pertains to the early childhood education field, is "when we notice and value children's ideas, thinking, questions, and theories about the world and then collect traces of their work (drawings, photographs of the children in action, and transcripts of their words) to share with a wider community".

    Thus, documentation is a process used to link the educator's knowledge and learning of the child or children with the families, other collaborators, and even the children themselves.

    Documentation is an integral part of the cycle of inquiry: observing, reflecting, documenting, sharing, and responding.

    Pedagogical documentation, in terms of the teacher documentation, is the "teacher's story of the movement in children's understanding". According to Stephanie Cox Suarez in "Documentation - Transforming our Perspectives", "teachers are considered researchers, and documentation is a research tool to support knowledge building among children and adults".

    Documentation can take many different styles in the classroom. The following exemplifies ways in which documentation can make the research, or learning, visible:

      • Documentation panels (bulletin-board-like presentation with multiple pictures and descriptions about the project or event)
      • Daily log (a log kept every day that records the play and learning in the classroom)
      • Documentation developed by or with the children (when observing children during documentation, the child's lens of the observation is used in the actual documentation)
      • Individual portfolios (documentation used to track and highlight the development of each child)
      • Electronic documentation (using apps and devices to share documentation with families and collaborators)
      • Transcripts or recordings of conversations (using recording in documentation can bring about deeper reflections for both the educator and the child)
      • Learning stories (a narrative used to "describe learning and help children see themselves as powerful learners")
      • The classroom as documentation (reflections and documentation of the physical environment of a classroom)

    Documentation is certainly a process in and of itself, and it is also a process within the educator. The following is the development of documentation as it progresses for and in the educator themselves:

      • Develop(s) habits of documentation
      • Become(s) comfortable with going public with recounting of activities
      • Develop(s) visual literacy skills
      • Conceptualize(s) the purpose of documentation as making learning styles visible, and
      • Share(s) visible theories for interpretation purposes and further design of curriculum.

  • Artificial Intelligence Applications Institute

    The Artificial Intelligence Applications Institute (AIAI) at the School of Informatics at the University of Edinburgh is a non-profit technology transfer organisation that promotes research in the field of artificial intelligence.

    == History ==

    The Artificial Intelligence Applications Institute (AIAI) was founded in 1983 at the University of Edinburgh as a specialist research and technology-transfer unit focusing on the practical uses of artificial intelligence (AI). The institute was established by Professor Jim Howe and colleagues from the Science and Engineering Research Council (SERC) Special Interest Group in AI in the Department of Artificial Intelligence, with a mission to apply AI techniques to solve real-world industrial and governmental problems.

    Under the directorship of Austin Tate, who served from 1985 to 2019, AIAI became one of the leading UK research centres devoted to AI programming systems, intelligent planning systems, decision support, and knowledge-based engineering. It collaborated with both academic partners and international organisations such as the European Space Agency and the UK Ministry of Defence.

    In 2001, AIAI joined the newly created Centre for Intelligent Systems and their Applications (CISA) within the University's School of Informatics. In December 2019, the institute was renamed the Artificial Intelligence and its Applications Institute to reflect a broader integration of fundamental and applied AI research.

    == Research programmes ==

    AIAI's research spans multiple areas of artificial intelligence, including:

      • AI programming systems: Edinburgh Prolog, Edinburgh Common Lisp, Logo;
      • Knowledge representation and reasoning: development of ontologies, rule-based inference, and semantic modelling;
      • Automated planning and scheduling: intelligent task management systems used in aerospace, manufacturing, and emergency response;
      • Natural language processing and intelligent agents: interaction frameworks for human–computer collaboration;
      • AI ethics and decision-making: research into responsible deployment and evaluation of autonomous systems.

    The institute also contributes to interdisciplinary fields such as computational creativity, explainable AI, and human–AI interaction. AIAI maintains close collaboration with the Bayes Centre and the Alan Turing Institute through joint research programmes and doctoral training initiatives.

    == Technology transfer and impact ==

    From its inception, AIAI has combined academic research with technology-transfer activity, offering professional training, industrial consultancy, and bespoke software systems. It pioneered one of the earliest knowledge-based project-management systems, O-Plan, which later evolved into the I-Plan framework used for autonomous planning and workflow management.

  • Human-in-the-loop

    Human-in-the-loop (HITL) is used in multiple contexts. It can be defined as a model requiring human interaction. HITL is associated with modeling and simulation (M&S) in the live, virtual, and constructive taxonomy. HITL, along with the related human-on-the-loop, is also used in relation to lethal autonomous weapons. Further, HITL is used in the context of machine learning. It is also used in conversational AI to manage complex interactions that require human empathy.

    == Machine learning ==

    In machine learning, HITL is used in the sense of humans aiding the computer in making the correct decisions in building a model. HITL improves machine learning over random sampling by selecting the most critical data needed to refine the model.

    == Simulation ==

    In simulation, HITL models may conform to human factors requirements, as in the case of a mockup. In this type of simulation, a human is always part of the simulation and consequently influences the outcome in a way that is difficult, if not impossible, to reproduce exactly. HITL also readily allows for the identification of problems and requirements that may not be easily identified by other means of simulation.

    HITL is often referred to as an interactive simulation, which is a special kind of physical simulation that includes human operators, such as a flight or driving simulator.

    === Benefits ===

    Human-in-the-loop allows the user to change the outcome of an event or process. The immersion effectively contributes to a positive transfer of acquired skills into the real world. This can be demonstrated by trainees utilizing flight simulators in preparation to become pilots.

    HITL also allows for the acquisition of knowledge regarding how a new process may affect a particular event. Utilizing HITL allows participants to interact with realistic models and attempt to perform as they would in an actual scenario. HITL simulations bring to the surface issues that would not otherwise be apparent until after a new process has been deployed. A real-world example of HITL simulation as an evaluation tool is its usage by the Federal Aviation Administration (FAA) to allow air traffic controllers to test new automation procedures by directing the activities of simulated air traffic while monitoring the effect of the newly implemented procedures.

    As with most processes, there is always the possibility of human error, which can only be reproduced using HITL simulation. Although much can be done to automate systems, humans typically still need to take the information provided by a system to determine the next course of action based on their judgment and experience. Intelligent systems can only go so far in certain circumstances to automate a process; only humans in the simulation can accurately judge the final design. Tabletop simulation may be useful in the very early stages of project development for the purpose of collecting data to set broad parameters, but the important decisions require human-in-the-loop simulation. HITL reflects scenarios where human input remains essential despite advances in automation.

    === Within the virtual simulation taxonomy ===

    Virtual simulations inject HITL in a central role by exercising motor control skills (e.g. flying an airplane), decision making skills (e.g. committing fire control resources to action), or communication skills (e.g. as members of a C4I team).

    === Examples ===

      • Flight simulators
      • Driving simulators
      • Marine simulators
      • Video games
      • Supply chain management simulators
      • Digital puppetry

    === Misconceptions ===

    Although human-in-the-loop simulation can include a computer simulation in the form of a synthetic environment, computer simulation is not necessarily a form of human-in-the-loop simulation, and is often considered human-out-of-the-loop simulation. In this particular case, a computer model's behavior is modified according to a set of initial parameters. The results of the model differ from the results stemming from a true human-in-the-loop simulation because the results can easily be replicated time and time again, simply by providing identical parameters.

    == Weapons ==

    === Taxonomy ===

    Three classifications of the degree of human control of autonomous weapon systems were laid out by Bonnie Docherty in a 2012 Human Rights Watch report.

      • human-in-the-loop: a human must instigate the action of the weapon (in other words, not fully autonomous)
      • human-on-the-loop: a human may abort an action
      • human-out-of-the-loop: no human action is involved

    === Positive human action ===

    In discussions of autonomous weapons and nuclear command and control, the phrase positive human action has been used alongside "human-in-the-loop" to emphasize that a human operator must affirmatively authorize the use of force. Descriptions of the United States Navy's Aegis Combat System have used the phrase in characterizing a requirement for affirmative human action to initiate live firing. A survey of autonomous weapons systems described the Aegis "Auto SM" mode as one in which "the system fully develops the engagement process however engagement requires positive human action". The phrase entered United States federal law in the National Defense Authorization Act for Fiscal Year 2025, which stipulates that artificial intelligence systems not compromise "the principle of requiring positive human actions in execution of decisions by the President with respect to the employment of nuclear weapons".
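
    Returning to the machine learning sense of HITL described earlier, the idea of selecting the most critical data for a human to review, rather than sampling at random, can be sketched as follows. The text does not name a selection strategy; this sketch assumes uncertainty sampling for a binary classifier, and the function name `select_for_labeling`, the `budget` parameter, and the example scores are all hypothetical.

```python
def select_for_labeling(probabilities, budget):
    """Pick the samples a human should label next.

    `probabilities` holds a model's predicted probability of the positive
    class for each unlabeled sample. Samples closest to 0.5 are the ones the
    model is least sure about, so they are sent to the human first.
    (Uncertainty sampling is an assumption of this sketch.)
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:budget]


# Hypothetical model scores for five unlabeled samples.
scores = [0.97, 0.52, 0.08, 0.45, 0.81]
print(select_for_labeling(scores, budget=2))   # [1, 3]
```

    In a full loop, the human's labels for the selected samples would be added to the training set and the model retrained before the next selection round.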

  • DIKW pyramid

    The DIKW pyramid (also known as the knowledge pyramid or information hierarchy) is a model describing relationships between data, information, knowledge and wisdom. Sometimes also stylized as a chain, DIKW-type models describe possible structural and functional relationships between a set of components, often four: data, information, knowledge, and wisdom. The concept has roots predating the 1980s. In the latter years of that decade, interest in the models grew after explicit presentations and discussions, including from Milan Zeleny, Russell Ackoff, and Robert W. Lucky. Subsequent important discussions extended along theoretical and practical lines into the coming decades. Debate continues as to the actual meaning of the component terms of DIKW-type models and the actual nature of their relationships, including occasional doubt cast over any simple, linear, unidirectional model; even so, the models have become very popular visual representations in use by business, the military, and others. Across academic and popular presentations, not all versions of the DIKW-type models include all four components: earlier ones exclude data, later ones exclude or downplay wisdom, and several include additional components (for instance, Ackoff inserting "understanding" before, and Zeleny adding "enlightenment" after, the wisdom component). In addition, DIKW-type models are no longer always presented as pyramids, but also as a chart or framework (e.g., by Zeleny), as flow diagrams (e.g., by Liew, and by Chisholm et al.), and sometimes as a continuum (e.g., by Choo et al.). == Short description == As Rowley noted in 2007, the DIKW model "is often quoted, or used implicitly, in definitions of data, information and knowledge in the information management, information systems and knowledge management literatures, but [as of that date] there ha[d] been limited direct discussion of the hierarchy". 
Reviews of textbooks and a survey of scholars in relevant fields indicate that there was no consensus as to the definitions used in the model as of that date, and, as reviewed by Liew in that year, even less "in the description of the processes that transform components lower in the hierarchy into those above them". Zins's work, published in 2007 and drawing on studies in 2003-2005 that documented "130 definitions of data, information, and knowledge formulated by 45 scholars", suggests that the data–information–knowledge components of DIKW refer to a class of no fewer than five models, as a function of whether data, information, and knowledge are each conceived of as subjective, objective (what Zins terms "universal" or "collective") or both. In Zins's usage, subjective and objective "are not related to arbitrariness and truthfulness, which are usually attached to the concepts of subjective knowledge and objective knowledge". Information science, Zins argues, studies data and information, but not knowledge, as knowledge is an internal (subjective) rather than an external (universal–collective) phenomenon. == Representations == === Graphical representation === DIKW is a hierarchical model often depicted as a pyramid, sometimes as a chain, with data at its base and wisdom at its apex (or chain-beginning and -end). Both Zeleny and Ackoff have been credited with originating the pyramid representation, although neither used a pyramid to present their ideas. According to Wallace, Debons and colleagues may have been the first to "present the hierarchy graphically". Many variations of the DIKW-type pyramid have been produced. One, in use by knowledge managers in the United States Department of Defense, attempts to show the DIKW progression to enable effective decisions and consequent activities supporting shared understanding throughout defense organizations, as well as supporting management of risks associated with decisions. 
DIKW-type hierarchical information paradigms have also been represented as two-dimensional charts, and as flow diagrams, where relationships between the components may be presented less hierarchically, with defining aspects of the relationships, feedback loops, etc. === Computational representation === Intelligent decision support systems are trying to improve decision making by introducing new technologies and methods from the domain of modeling and simulation in general, and in particular from the domain of intelligent software agents in the contexts of agent-based modeling. The following example describes a military decision support system, but the architecture and underlying conceptual idea are transferable to other application domains: The value chain starts with data quality describing the information within the underlying command and control systems. Information quality tracks the completeness, correctness, currency, consistency and precision of the data items and information statements available. Knowledge quality deals with procedural knowledge and information embedded in the command and control system such as templates for adversary forces, assumptions about entities such as ranges and weapons, and doctrinal assumptions, often coded as rules. Awareness quality measures the degree of using the information and knowledge embedded within the command and control system. Awareness is explicitly placed in the cognitive domain. By the introduction of a common operational picture, data are put into context, which leads to information instead of data. The next step, which is enabled by service-oriented web-based infrastructures (but not yet operationally used), is the use of models and simulations for decision support. Simulation systems are the prototype for procedural knowledge, which is the basis for knowledge quality. 
Finally, by using intelligent software agents to continually observe the battle sphere, apply models and simulations to analyze what is going on, monitor the execution of a plan, and do all the tasks necessary to make the decision maker aware of what is going on, command and control systems could even support situational awareness, the level in the value chain traditionally limited to pure cognitive methods. == History == Danny P. Wallace, a professor of library and information science, explained that the origin of the DIKW pyramid is uncertain: "The presentation of the relationships among data, information, knowledge, and sometimes wisdom in a hierarchical arrangement has been part of the language of information science for many years. Although it is uncertain when and by whom those relationships were first presented, the ubiquity of the notion of a hierarchy is embedded in the use of the acronym DIKW as a shorthand representation for the data-to-information-to-knowledge-to-wisdom transformation." Many authors think that the idea of the DIKW relationship originated from two lines in the poem "Choruses", by T. S. Eliot, that appeared in the pageant play The Rock, in 1934. === Knowledge, intelligence, and wisdom === In 1927, Clarence W. Barron addressed his employees at Dow Jones & Company on the hierarchy: "Knowledge, Intelligence and Wisdom". === Data, information, knowledge === In 1955, English-American economist and educator Kenneth Boulding presented a variation on the hierarchy consisting of "signals, messages, information, and knowledge". However, "[t]he first author to distinguish among data, information, and knowledge and to also employ the term 'knowledge management' may have been American educator Nicholas L. Henry", in a 1974 journal article. 
=== Data, information, knowledge, wisdom === Other early versions (prior to 1982) of the hierarchy that refer to a data tier include those of Chinese-American geographer Yi-Fu Tuan and sociologist-historian Daniel Bell. In 1980, Irish-born engineer Mike Cooley invoked the same hierarchy in his critique of automation and computerization, in his book Architect or Bee?: The Human / Technology Relationship. Thereafter, in 1987, Czechoslovakia-born educator Milan Zeleny mapped the components of the hierarchy to knowledge forms: know-nothing, know-what, know-how, and know-why. Zeleny "has frequently been credited with proposing the [representation of DIKW as a pyramid]... although he actually made no reference to any such graphical model." The hierarchy appears again in a 1988 address to the International Society for General Systems Research, by American organizational theorist Russell Ackoff, published in 1989. Subsequent authors and textbooks cite Ackoff's as the "original articulation" of the hierarchy or otherwise credit Ackoff with its proposal. Ackoff's version of the model includes an understanding tier (as Adler had, before him), interposed between knowledge and wisdom. Although Ackoff did not present the hierarchy graphically, he has also been credited with its representation as a pyramid. In 1989, Bell Labs veteran Robert W. Lucky wrote about the four-tier "information hierarchy" in the form of a pyramid in his book Silicon Dreams. In the same year as Ackoff presented his a

    Read more →
  • Digital artifact

    Digital artifact

    A digital artifact, in information science, is any undesired or unintended alteration in data introduced in a digital process by an involved technique and/or technology. Digital artifacts can occur in any content type, including text, audio, video, image, animation or a combination. == Information science == In information science, digital artifacts result from: Hardware malfunction: In computer graphics, visual artifacts may be generated whenever a hardware component, such as the processor, a memory chip or cabling, malfunctions and corrupts data. Examples of malfunctions include physical damage, overheating, insufficient voltage and GPU overclocking. Common types of hardware artifacts are texture corruption and T-vertices in 3D graphics, and pixelization in MPEG compressed video. Software malfunction: Artifacts may be caused by algorithm flaws, such as in decoding/encoding audio or video, or by a poor pseudo-random number generator that introduces artifacts distinguishable from the desired noise into statistical models. Compression: Controlled amounts of unwanted information may be generated as a result of the use of lossy compression techniques. One example is the compression artifacts produced by the JPEG and MPEG compression algorithms. Quantization: Digital imprecision generated in the process of converting analog information into digital space is due to the limited granularity of digital numbering space. In computer graphics, quantization is seen as pixelation. Aliasing: As a consequence of sampling or sample-rate conversion, energy from frequencies outside of the signal frequency band of interest is folded across multiples of the Nyquist frequency. This is typically mitigated by using an anti-aliasing filter. Filtering: The process of filtering a signal, such as using an anti-aliasing filter, causes undesired alterations to the signal due to imperfections in the frequency response magnitude and phase, and due to the time domain impulse response. 
Rolling shutter: The line-by-line scanning of an object that is moving too fast for the image sensor to capture a unitary image. Error diffusion: Poorly weighted kernel coefficients result in undesirable visual artifacts.
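
The aliasing entry above can be illustrated with a small sketch. It assumes a pure sine tone and an idealized sampler with no anti-aliasing filter; the helper names `sample_sine` and `alias_frequency` are hypothetical and chosen only for this example. A 7 Hz tone sampled at 10 Hz lies above the 5 Hz Nyquist frequency, so its samples coincide (up to a sign flip) with those of a 3 Hz tone.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, num_samples):
    """Sample sin(2*pi*f*t) at t = n / sample_rate for n = 0 .. num_samples - 1."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency in [0, sample_rate / 2] of a tone after sampling."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


sample_rate = 10.0  # Nyquist frequency is 5 Hz
high_tone = sample_sine(7.0, sample_rate, 20)  # above Nyquist
low_tone = sample_sine(3.0, sample_rate, 20)   # the frequency it folds onto

# The 7 Hz samples equal the negated 3 Hz samples, so the sampled data
# alone cannot tell the two tones apart.
max_difference = max(abs(h + l) for h, l in zip(high_tone, low_tone))
print(alias_frequency(7.0, sample_rate), max_difference < 1e-9)
```

An anti-aliasing filter would remove the 7 Hz component before sampling, which is why it is the usual mitigation.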

    Read more →
  • Information quality

    Information quality

    Information quality (IQ) is a contextual property of, or a perspective on, the content within information systems. There exist two complementary yet partially conflicting definitions of high quality: firstly, information is considered high quality if it is fit for its intended purpose; secondly, it is deemed high quality if it conforms to specified requirements. The primary distinction between these definitions is that the first, Juran's perspective, focuses on the suitability of information for its intended purpose, which can be measured by the success of its application even without direct access to or exact knowledge of the data. For example, a black-box AI with access to English Wikipedia may work well for users' purposes, while the same AI using Estonian Wikipedia fails for the same purposes. Given that the AI remains the same, it can be concluded that the English version's data is of higher quality than the Estonian version's, even without an exact comparison of the data contents and their properties in each version. In contrast, Crosby emphasizes adherence to predefined specifications, assuming specific criteria rather than measuring the success of use; for instance, information in Wikipedia could be shown to be good based on criteria such as existing peer validation and academic references, even if the AI results are poor. This approach runs into problems when data is not completely accessible or when not all quality properties can be known and measured, leading to a false impression of quality due to missing and misleading metrics. Numerous IQ frameworks and methodologies provide tangible approaches for assessing and measuring DQ/IQ in a robust and rigorous manner. == Conceptual problems == Although the foundational definitions are usable for most everyday purposes, specialists often use more complex models for information quality. It has been suggested, however, that the higher the quality, the greater the confidence in meeting more general, less specific contexts. 
== Dimensions and metrics of information quality == "Information quality" is a measure of information's fitness for use or conformance to requirements. In this way, "quality" is considered contextual, and it can vary across users and uses of the information. The exact degree of quality is often described with dimensions such as accuracy, timeliness, completeness, and similar scales. Although a huge amount of academic research has been directed to these dimensions, there is no consensus on their definitions or practical usefulness. Historically, Richard Wang and Diane Strong proposed the following list of dimensions or elements used in assessing information quality: Intrinsic IQ: accuracy, objectivity, believability, reputation; Contextual IQ: relevance, value-added, timeliness, completeness, amount of information; Representational IQ: interpretability, format, coherence, compatibility; Accessibility IQ: accessibility, access security. Other authors propose similar but different lists of dimensions for analysis, and emphasize measurement and reporting as information quality metrics. Larry English prefers the term "characteristics" to dimensions. However, a considerable amount of information quality research involves investigating and describing various categories of desirable attributes (or dimensions) of data. Research has recently shown the huge diversity of terms and classification structures used. === Quality metrics === Source: Authority/verifiability Authority refers to the expertise or recognized official status of a source. Consider the reputation of the author and publisher. When working with legal or government information, consider whether the source is the official provider of the information. Verifiability refers to the ability of a reader to verify the validity of the information irrespective of how authoritative the source is. 
Verifying the facts is part of the duty of care in journalistic deontology, as is, where possible, providing the sources of information so that they can be verified. Scope of coverage Scope of coverage refers to the extent to which a source explores a topic. Consider time periods, geography or jurisdiction and coverage of related or narrower topics. Composition and organization Composition and organization has to do with the ability of the information source to present its particular message in a coherent, logically sequential manner. Objectivity Objectivity concerns the bias or opinion expressed when a writer interprets or analyzes facts. Consider the use of persuasive language, the source's presentation of other viewpoints, its reason for providing the information and advertising. Integrity Adherence to moral and ethical principles; soundness of moral character. The state of being whole, entire, or undiminished. Comprehensiveness Of large scope; covering or involving much; inclusive: a comprehensive study. Comprehending mentally; having an extensive mental grasp. Insurance: covering or providing broad protection against loss. Validity Validity of some information has to do with the degree of obvious truthfulness which the information carries. Uniqueness As much as 'uniqueness' of a given piece of information is intuitive in meaning, it also significantly implies not only the originating point of the information but also the manner in which it is presented and thus the perception which it conjures. The essence of any piece of information we process consists to a large extent of those two elements. Timeliness Timeliness refers to information that is current at the time of publication. Consider publication, creation and revision dates. Beware of Web site scripting that automatically reflects the current day's date on a page. 
Reproducibility (utilized primarily when referring to instructive information) Means that documented methods are capable of being used on the same data set to achieve a consistent result. == Professional associations == IQ International—the International Association for Information and Data Quality IQ International is a not-for-profit, vendor neutral, professional association formed in 2004, dedicated to building the information and data quality profession. CDOIQ Society Chief Data Officers and Information Quality Society is a global professional society supporting data leaders with networking, meetings, best practices, experience, certification, and training. == Information quality conferences == A number of major conferences relevant to information quality are held annually: Annual MIT Chief Data Officer & Information Quality (CDOIQ) Symposium Annual conferences held at the Massachusetts Institute of Technology, Cambridge, MA, USA Data Governance and Information Quality Conference Commercial conferences held each year in the USA Data Quality Asia Pacific Commercial conference held annually in Sydney or Melbourne, Australia Enterprise Data and Business Intelligence Conference Europe Commercial conferences held annually in London, England. Information and Data Quality Conference Not for profit conference run annually by IQ International (the International Association for Information and Data Quality) in the USA International Conference on Information Quality Academic Conference launched through MITIQ held annually at a University Master Data Management & Data Governance Conferences Six major conferences are run annually by the MDM Institute in venues such as London, San Francisco, Sydney, Toronto, Madrid, Frankfurt, Shanghai and New York City.

    Read more →
  • Enterprise mobile application

    Enterprise mobile application

    The term enterprise mobile application is used in the context of mobile apps created or bought by individual organizations for their workers to carry out the functions required to run the organization. It also covers the process of building a mobile application for the requirements of an enterprise. An enterprise mobile application belonging to an organization is expected to be used only by the workers of that organization. The definition of enterprise mobile application does not include the mobile apps that an organization creates for its customers or for consumers of the products or services generated by the organization. == Example == An organization, whether for-profit or non-profit, may create a mobile app for its members to track inventory levels of supplies they distribute to their target communities or of materials used in product manufacturing. Such a mobile app comes under the definition of enterprise mobile application. However, the same organization may also create another mobile app to sell its products to end users or to spread awareness of its services to various communities, and that mobile app would not come under the definition of enterprise mobile application. == Enterprise mobile solution providers == Enterprise mobile solution providers create and develop apps that individual organizations can buy instead of creating the apps themselves. Reasons for organizations buying the apps include time and cost savings and technical expertise. Enterprise mobility now plays a key role in enterprise transformation, as enterprises look for fast ways to improve productivity. Enterprise mobility solutions help business owners develop their work in a progressive way.

    Read more →
  • Lancichinetti–Fortunato–Radicchi benchmark

    Lancichinetti–Fortunato–Radicchi benchmark

    The Lancichinetti–Fortunato–Radicchi benchmark is an algorithm that generates benchmark networks (artificial networks that resemble real-world networks). They have a priori known communities and are used to compare different community detection methods. The advantage of the benchmark over other methods is that it accounts for the heterogeneity in the distributions of node degrees and of community sizes. == The algorithm == The benchmark assumes that both the node degrees and the community sizes have power law distributions with different exponents, $\gamma$ and $\beta$, respectively. $N$ is the number of nodes and the average degree is $\langle k \rangle$. There is a mixing parameter $\mu$, which is the average fraction of neighboring nodes of a node that do not belong to any community that the benchmark node belongs to. This parameter controls the fraction of edges that are between communities. Thus, it reflects the amount of noise in the network. At the extremes, when $\mu = 0$ all links are within-community links, and when $\mu = 1$ all links are between nodes belonging to different communities. One can generate the benchmark network using the following steps.
Step 1: Generate a network whose node degrees follow a power law distribution with exponent $\gamma$, and choose the extremes of the distribution, $k_{\min}$ and $k_{\max}$, to get the desired average degree $\langle k \rangle$.
Step 2: A fraction $(1 - \mu)$ of the links of every node is with nodes of the same community, while a fraction $\mu$ is with the other nodes.
Step 3: Generate community sizes from a power law distribution with exponent $\beta$. The sum of all sizes must be equal to $N$. The minimal and maximal community sizes $s_{\min}$ and $s_{\max}$ must satisfy the definition of community so that every non-isolated node is in at least one community: $s_{\min} > k_{\min}$ and $s_{\max} > k_{\max}$.
Step 4: Initially, no nodes are assigned to communities. Then, each node is randomly assigned to a community. As long as the number of neighboring nodes within the community does not exceed the community size, the node is added to the community; otherwise it stays out. In the following iterations the "homeless" node is randomly assigned to some community. If that community is complete, i.e. the size is exhausted, a randomly selected node of that community must be unlinked. Stop the iteration when all the communities are complete and all the nodes belong to at least one community.
Step 5: Implement rewiring of nodes, keeping the same node degrees but only affecting the fraction of internal and external links, such that the fraction of links outside the community for each node is approximately equal to the mixing parameter $\mu$.
== Testing == Consider a partition into communities that do not overlap. The communities of randomly chosen nodes in each iteration follow a $p(C)$ distribution that represents the probability that a randomly picked node is from the community $C$. Consider a partition of the same network that was predicted by some community finding algorithm and has a $p(C_2)$ distribution. The benchmark partition has a $p(C_1)$ distribution. The joint distribution is $p(C_1, C_2)$. The similarity of these two partitions is captured by the normalized mutual information:
$$I_n = \frac{\sum_{C_1, C_2} p(C_1, C_2) \log_2 \frac{p(C_1, C_2)}{p(C_1)\, p(C_2)}}{\frac{1}{2} H(\{p(C_1)\}) + \frac{1}{2} H(\{p(C_2)\})}$$
If $I_n = 1$, the benchmark and the detected partitions are identical, and if $I_n = 0$, they are independent of each other.
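
The normalized mutual information above can be computed directly from two label lists. The following is a minimal sketch, assuming each partition is given as a list holding one community label per node (same node order in both lists) and taking $H$ to be the Shannon entropy in bits; the function names are hypothetical, and the return value for two trivial single-community partitions is a convention chosen here, since the formula is undefined in that case.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the community-label distribution."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def normalized_mutual_information(benchmark, detected):
    """I_n between two non-overlapping partitions given as per-node labels."""
    if len(benchmark) != len(detected):
        raise ValueError("both partitions must label the same nodes")
    n = len(benchmark)
    p1 = Counter(benchmark)                  # counts behind p(C1)
    p2 = Counter(detected)                   # counts behind p(C2)
    joint = Counter(zip(benchmark, detected))  # counts behind p(C1, C2)
    mutual_info = sum(
        (c / n) * log2((c / n) / ((p1[a] / n) * (p2[b] / n)))
        for (a, b), c in joint.items()
    )
    denominator = 0.5 * entropy(benchmark) + 0.5 * entropy(detected)
    if denominator == 0:
        # Assumed convention: two single-community partitions count as identical.
        return 1.0
    return mutual_info / denominator


# Same grouping with swapped label names, and a grouping unrelated to the benchmark.
same = normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
unrelated = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(same, unrelated)
```

Because only the grouping matters, renaming the community labels leaves the score unchanged, which is why the first call scores the two partitions as identical.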

    Read more →
  • Artificial intelligence in industry

    Artificial intelligence in industry

    Industrial artificial intelligence, or industrial AI, refers to the application of artificial intelligence to industrial business processes. Unlike general artificial intelligence, which is a frontier research discipline aiming to build computerized systems that perform tasks requiring human intelligence, industrial AI is more concerned with the application of such technologies to address industrial pain points for customer value creation, productivity improvement, cost reduction, site optimization, predictive analysis and insight discovery. Artificial intelligence and machine learning have become key enablers to leverage data in production in recent years due to a number of different factors: more affordable sensors and the automated process of data acquisition; more powerful computation capability of computers to perform more complex tasks at a faster speed with lower cost; and faster connectivity infrastructure and more accessible cloud services for data management and computing power outsourcing. == Categories == Possible applications of industrial AI and machine learning in the production domain can be divided into seven application areas: Market and trend analysis, Machinery and equipment, Intralogistics, Production process, Supply chain, Building, and Product. Each application area can be further divided into specific application scenarios that describe concrete AI/ML scenarios in production. While some application areas have a direct connection to production processes, others cover production-adjacent fields like logistics or the factory building. Collaborative robots are an example from the application scenario Process Design & Innovation. Collaborative robotic arms are able to learn the motion and path demonstrated by human operators and perform the same task. Predictive and preventive maintenance through data-driven machine learning are application scenarios from the Machinery & Equipment application area. 
== Challenges == In contrast to entirely virtual systems, in which ML applications are already widespread today, real-world production processes are characterized by the interaction between the virtual and the physical world. Data is recorded using sensors and processed on computational entities and, if desired, actions and decisions are translated back into the physical world via actuators or by human operators. This poses major challenges for the application of ML in production engineering systems. These challenges are attributable to the encounter of process, data and model characteristics: The production domain's high reliability requirements, high risk and loss potential, the multitude of heterogeneous data sources and the non-transparency of ML model functionality impede a faster adoption of ML in real-world production processes. In particular, production data comprises a variety of different modalities, semantics and quality. Furthermore, production systems are dynamic, uncertain and complex, and engineering and manufacturing problems are data-rich but information-sparse. Besides that, due to the variety of use cases and data characteristics, problem-specific data sets are required, which are difficult to acquire, hindering both practitioners and academic researchers in this domain. === Process and industry characteristics === The domain of production engineering can be considered as a rather conservative industry when it comes to the adoption of advanced technology and their integration into existing processes. This is due to high demands on reliability of the production systems resulting from the potentially high economic harm of reduced process effectiveness due to e.g., additional unplanned downtime or insufficient product qualities. In addition, the specifics of machining equipment and products prevent area-wide adoptions across a variety of processes. 
Besides the technical reasons, the reluctant adoption of ML is fueled by a lack of IT and data science expertise across the domain. === Data characteristics === The data collected in production processes mainly stem from frequently sampling sensors to estimate the state of a product, a process, or the environment in the real world. Sensor readings are susceptible to noise and represent only an estimate of the reality under uncertainty. Production data typically comprises multiple distributed data sources resulting in various data modalities (e.g., images from visual quality control systems, time-series sensor readings, or cross-sectional job and product information). The inconsistencies in data acquisition lead to low signal-to-noise ratios, low data quality and great effort in data integration, cleaning and management. In addition, as a result from mechanical and chemical wear of production equipment, process data is subject to various forms of data drifts. === Machine learning model characteristics === ML models are considered as black-box systems given their complexity and intransparency of input-output relation. This reduces the comprehensibility of the system behavior and thus also the acceptance by plant operators. Due to the lack of transparency and the stochasticity of these models, no deterministic proof of functional correctness can be achieved, complicating the certification of production equipment. Given their inherent unrestricted prediction behavior, ML models are vulnerable against erroneous or manipulated data, further risking the reliability of the production system because of lacking robustness and safety. In addition to high development and deployment costs, the data drifts cause high maintenance costs, which is disadvantageous compared to purely deterministic programs. 
== Standard processes for data science in production == The development of ML applications, starting with the identification and selection of the use case and ending with the deployment and maintenance of the application, follows dedicated phases that can be organized in standard process models. The process models assist in structuring the development process and defining requirements that must be met in each phase to enter the next phase. The standard processes can be classified into generic and domain-specific ones. Generic standard processes (e.g., CRISP-DM, ASUM-DM, or knowledge discovery in databases (KDD)) describe a generally valid methodology and are thus independent of individual domains. Domain-specific processes, on the other hand, consider specific peculiarities and challenges of special application areas. The Machine Learning Pipeline in Production is a domain-specific data science methodology that is inspired by the CRISP-DM model and was specifically designed to be applied in fields of engineering and production technology. To address the core challenges of ML in engineering (process, data, and model characteristics), the methodology especially focuses on use-case assessment, achieving a common data and process understanding, data integration, data preprocessing of real-world production data, and the deployment and certification of real-world ML applications. == Industrial data sources == The foundation of most artificial intelligence and machine learning applications in industrial settings is comprehensive datasets from the respective fields. Those datasets act as the basis for training the employed models. In other domains, like computer vision, speech recognition or language models, extensive reference datasets (e.g. ImageNet, Librispeech, The People's Speech) and data scraped from the open internet are frequently used for this purpose. 
Such datasets rarely exist in the industrial context because of high confidentiality requirements and high specificity of the data. Industrial applications of artificial intelligence are therefore often faced with the problem of data availability. For these reasons, existing open datasets applicable to industrial applications often originate from public institutions like governmental agencies or universities, and from data analysis competitions hosted by companies. In addition to this, data sharing platforms exist. However, most of these platforms have no industrial focus and offer limited filtering abilities regarding industrial data sources.

    Read more →
  • List of algorithm general topics

    List of algorithm general topics

    This is a list of algorithm general topics. Analysis of algorithms Ant colony algorithm Approximation algorithm Best and worst cases Big O notation Combinatorial search Competitive analysis Computability theory Computational complexity theory Embarrassingly parallel problem Emergent algorithm Evolutionary algorithm Fast Fourier transform Genetic algorithm Graph exploration algorithm Heuristic Hill climbing Implementation Las Vegas algorithm Lock-free and wait-free algorithms Monte Carlo algorithm Numerical analysis Online algorithm Polynomial time approximation scheme Problem size Pseudorandom number generator Quantum algorithm Random-restart hill climbing Randomized algorithm Running time Sorting algorithm Search algorithm Stable algorithm (disambiguation) Super-recursive algorithm Tree search algorithm

    Read more →
  • Deep Learning Indaba

    Deep Learning Indaba

    The Deep Learning Indaba is an annual conference and educational event that aims to strengthen machine learning and artificial intelligence (AI) capacity across Africa. Launched in 2017, it brings together students, researchers, industry practitioners, and policymakers from across the African continent. == History == The Deep Learning Indaba began in 2017 at the University of the Witwatersrand with over 300 participants from 23 African countries, offering tutorials in advanced AI topics and featuring notable speakers like Nando de Freitas. In 2018, it expanded to 650 delegates at Stellenbosch University, introducing parallel sessions to encourage collaboration. The 2019 edition in Nairobi, Kenya, reflected further growth, with increasing sponsorship and support from major tech companies like Google and Microsoft. === Deep Learning IndabaX ===

    Read more →
  • Chandy–Misra–Haas algorithm resource model

    Chandy–Misra–Haas algorithm resource model

    The Chandy–Misra–Haas algorithm resource model checks for deadlock in a distributed system. It was developed by K. Mani Chandy, Jayadev Misra and Laura M. Haas. == Locally dependent == Consider n processes P1, P2, P3, ..., Pn that run on a single system (controller). P1 is locally dependent on Pn if P1 depends on P2, P2 on P3, and so on, up to Pn−1 on Pn. That is, if $P_1 \rightarrow P_2 \rightarrow P_3 \rightarrow \ldots \rightarrow P_n$, then $P_1$ is locally dependent on $P_n$. P1 is said to be locally dependent on itself if it is locally dependent on Pn and Pn depends on P1, i.e. if $P_1 \rightarrow P_2 \rightarrow P_3 \rightarrow \ldots \rightarrow P_n \rightarrow P_1$, then $P_1$ is locally dependent on itself. == Description == The algorithm uses a message called probe(i,j,k), which is sent from the controller of process Pj to the controller of process Pk. It denotes a deadlock-detection message initiated by process Pi to find out whether a deadlock has occurred. Every process Pj maintains a boolean array dependent, which records the processes that depend on it. Initially, all entries of each array are "false". === Controller sending a probe === Before sending a probe, the controller checks whether Pj is locally dependent on itself. If so, a deadlock has occurred. Otherwise, it checks whether Pi is locally dependent on Pj, whether Pj is waiting for a resource that is locked by Pk, and whether Pj and Pk are on different controllers. Once all the conditions are satisfied, it sends the probe. === Controller receiving a probe === On the receiving side, the controller checks whether Pk is performing a task. If so, it discards the probe. Otherwise, it checks that Pk has not replied to all requests of Pj and that dependentk(i) is false. Once this is verified, it assigns true to dependentk(i). Then it checks whether k is equal to i. 
If both are equal, a deadlock occurs; otherwise it sends the probe to the next dependent process. == Algorithm == In pseudocode, the algorithm works as follows: === Controller sending a probe ===

    if Pj is locally dependent on itself
        then declare deadlock
    else for all Pj, Pk such that
        (i) Pi is locally dependent on Pj,
        (ii) Pj is waiting for Pk, and
        (iii) Pj, Pk are on different controllers
        send probe(i, j, k) to the home site of Pk

=== Controller receiving a probe ===

    if
        (i) Pk is idle / blocked,
        (ii) dependentk(i) = false, and
        (iii) Pk has not replied to all requests of Pj
    then begin
        dependentk(i) = true;
        if k == i
            then declare that Pi is deadlocked
        else for all Pa, Pb such that
            (i) Pk is locally dependent on Pa,
            (ii) Pa is waiting for Pb, and
            (iii) Pa, Pb are on different controllers
            send probe(i, a, b) to the home site of Pb
    end

== Example == P1 initiates deadlock detection. C1 sends the probe saying P2 depends on P3. Once the message is received by C2, it checks whether P3 is idle. P3 is idle because it is locally dependent on P4, so C2 sets dependent3(1) to true. As above, C2 sends a probe to C3 and C3 sends a probe to C1. At C1, P1 is idle, so C1 sets dependent1(1) to true. Therefore, a deadlock can be declared. == Complexity == Suppose there are $n$ controllers and $m$ processes. At most $m(n-1)/2$ messages need to be exchanged to detect a deadlock, with a delay of $O(n)$ messages.
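The probe exchange described above can be sketched as a small single-machine simulation. This is only an illustrative sketch, not a reference implementation: the function names (`locally_dependent`, `outgoing_probes`, `detect_deadlock`), the dictionary-based wait-for graph, and the five-process layout are assumptions made for this example. It also assumes that the initiator Pi is the process checked for local self-dependence, and that a process counts as locally dependent on itself when looking for outgoing cross-controller waits.

```python
from collections import deque


def locally_dependent(waits_for, site, start):
    """Processes reachable from `start` over wait-for edges that stay on start's controller."""
    seen, stack = set(), [start]
    while stack:
        p = stack.pop()
        for q in waits_for.get(p, ()):
            if site[q] == site[start] and q not in seen:
                seen.add(q)
                stack.append(q)
    return seen


def outgoing_probes(waits_for, site, initiator, origin):
    """Build probe(i, a, b) for each a that `origin` locally depends on
    (assumption: including origin itself) and that waits for b on another controller."""
    local = locally_dependent(waits_for, site, origin) | {origin}
    return [(initiator, a, b)
            for a in sorted(local)
            for b in sorted(waits_for.get(a, ()))
            if site[a] != site[b]]


def detect_deadlock(waits_for, site, initiator):
    """Return True if the probes started by `initiator` reveal a deadlock."""
    # Sending side: a purely local cycle is declared immediately.
    if initiator in locally_dependent(waits_for, site, initiator):
        return True
    # dependent[k] holds every i for which dependent_k(i) is true.
    dependent = {p: set() for p in site}
    queue = deque(outgoing_probes(waits_for, site, initiator, initiator))
    while queue:
        i, j, k = queue.popleft()
        blocked = bool(waits_for.get(k))  # Pk is idle/blocked if it waits on something
        # Condition (iii) holds by construction: probes only follow existing
        # wait-for edges, so Pk has not yet replied to Pj.
        if not blocked or i in dependent[k]:
            continue  # discard the probe
        dependent[k].add(i)
        if k == i:
            return True  # the probe returned to its initiator
        queue.extend(outgoing_probes(waits_for, site, i, k))
    return False


# Hypothetical layout loosely modelled on the example: three controllers, one cycle.
example_site = {1: "C1", 2: "C1", 3: "C2", 4: "C2", 5: "C3"}
example_waits = {1: {2}, 2: {3}, 3: {4}, 4: {5}, 5: {1}}
print(detect_deadlock(example_waits, example_site, 1))
```

In this layout the probes travel (1, 2, 3), (1, 4, 5) and (1, 5, 1), so the initiator receives its own probe. Removing the edge from P5 to P1 leaves P5 active, and the probe sent to it is discarded.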

    Read more →
  • Principles for a Data Economy

    Principles for a Data Economy

    The Principles for a Data Economy – Data Rights and Transactions is a transatlantic legal project carried out jointly by the American Law Institute (ALI) and the European Law Institute (ELI). The Principles for a Data Economy deals with a range of different legal questions that arise in the data economy. Since data is different from other tradeable items, the Principles draw up legal rules for data transactions and data rights that take into account the interests of different stakeholders involved in the data economy. The Principles are designed to facilitate contractual relations as well as the drafting of model agreements and can guide courts and legislators worldwide. The project proposes a set of principles that can be implemented in any legal system and is designed to work in conjunction with any kind of data privacy/data protection law, intellectual property law or trade secret law. The Principles do not address or seek to change any of the substantive rules of these bodies of law. The Project Team consists of Neil B Cohen and Christiane Wendehorst (as Project Reporters) and Lord John Thomas as well as Steven O. Weise (as Project Chairs). == Characteristics of data == The law governing trades in commerce has historically focused on trade in items that are tangible like goods or on intangible assets, such as shares or licenses. However, data does not fit into any of these traditional categories, nor does it qualify as a service. It is often unclear how traditional legal rules and doctrines can apply to data, as data is different from other assets in many ways. For example, data can be multiplied at basically no cost and can be used in parallel for a variety of different purposes by many different people at the same time (data is a “non-rivalrous” resource). Uncertainty regarding the applicable rules to govern the data economy may inhibit innovation and growth and trouble stakeholders like data-driven industries, start-ups, and consumers. 
== Stakeholders in the data economy == The Principles take the basic types of players and relations found in data ecosystems as a starting point to provide guidance in different situations. The central actors in the data economy are data controllers (also called “data holders”). They are in a position to access the data and decide for which purposes and by which means this data should be processed. A controller may exercise control all by itself or share it with co-controllers, such as under a data pooling arrangement. Data processors provide the processing of data on a controller’s behalf as a service. Another important group of stakeholders includes those that contribute to the generation of data (e.g. data subjects). Other players in the data economy include data assemblers or data intermediaries (e.g. data trusts). == History of the project and timeline == Before the official adoption of the project by ALI and ELI bodies in 2018, the project team carried out a Feasibility Study from October 2016 to February 2018. In the following years, the project team produced a number of drafts (e.g. “Preliminary Drafts” No. 1 to 4, “Tentative Draft No. 1”), and project progress was regularly discussed with advisory bodies and members of both the ALI and the ELI. The project reporters also incorporated feedback and insights from industry stakeholders and experts that were gained in several meetings and workshops hosted, inter alia, by UNCITRAL, UNIDROIT and several national governmental institutions. Tentative Draft No. 2 was presented at the ALI Annual Meeting in May 2021 and approved by ALI membership. The latest draft ("Final Council Draft") was also approved by the ELI Council and ELI Membership. 
The Principles for a Data Economy were presented at an international conference with representatives from institutions such as the Uniform Law Commission (ULC), the European Commission, UNIDROIT, the OECD, the International Chamber of Commerce (ICC) and the World Economic Forum (WEF) in October 2021. == Project structure == The current draft (“Tentative Draft No. 2”) of the Principles consists of five Parts that each governs different aspects of the data economy: General Provisions, Data Contracts, Data Rights, Third Party Aspects of Data Activities, and Multi-State Issues. === General Provisions === Part I includes general provisions that apply to all other Parts of the Principles for a Data Economy. This Part sets out the purpose of the Principles: they aim to make existing law in the field of the data economy more coherent and support the development of the law in this field by courts and legislators worldwide. It is also clarified that the Principles have a wide scope of application and can be used in a variety of ways by stakeholders in the data economy. The Principles may, for example, serve private parties as a basis for contract formation, guide the deliberations of arbitral tribunals or inspire national legislation. Part I then defines several key terms, such as ‘digital data’ and ‘data right’. The scope of the Principles is limited to matters where information is recorded as an asset, resource or tradeable commodity and where large amounts of data, rather than single pieces of information, are concerned. This Part also clarifies that remedies with respect to data contracts and data rights are left to the applicable national law. === Data Contracts === Part II lists different types of contracts that often occur in the data economy and establishes two broad categories, namely contracts for the supply and sharing of data and contracts for services with regard to data. Contracts for the supply and sharing of data include, e.g. 
data transfer contracts or data pooling arrangements, while contracts for services with regard to data cover contracts for the processing of data or data intermediary contracts. The Principles provide default terms for each contract type on issues such as the manner in which data should be supplied or which characteristics the supplied data should have. These default terms 'automatically' become part of the contract unless the parties agree otherwise. === Data Rights === Part III governs legally protected interests of players in the data economy that stem from the characteristics of data as a resource (e.g. its non-rivalrous nature) or from public interest considerations. Such data rights may include the right to data access, the right to require the controller to desist from data activities or to correct incorrect or incomplete data, or even the right to receive an economic share in profits derived from the use of data. For example, the Principles deal with data rights of stakeholders that had a share in the co-generation of data and identify different factors to be considered in determining whether to afford a party a data right. The underlying idea that parties who have contributed to the generation of data should have some rights in the utilization of the data is also recognized by governmental institutions, such as the Japanese Ministry of Economy, Trade and Industry (METI). The term co-generated data, which was coined by the Principles for a Data Economy, has been adopted, inter alia, by the European Commission, the German Data Ethics Commission and the Global Partnership on Artificial Intelligence (GPAI). This Part also deals with data rights for the public interest, such as data sharing rights in the field of innovation. === Third Party Aspects === Part IV governs different situations in which data transactions interfere with the rights of third parties. Such rights include intellectual property rights or rights derived from data privacy or data protection law. 
This Part sets out under which circumstances data activities should be considered wrongful vis-à-vis another party. For example, a data activity (like data processing or the onward supply of data) could be considered wrongful if a controller interferes with the rights of data subjects that are protected by data protection law. A data activity could also be wrongful if the controller does not comply with contractual limitations on data activities that are enforceable by the protected party (e.g. a controller may only process data for a certain purpose). If someone obtains access to data by unauthorized means (i.e. data “theft”), this could also be considered wrongful. The Part on Third-Party Aspects also takes a detailed look at the effects that the onward supply of data can have on third parties, balancing the protection of third parties on the one hand with the interests of data recipients and the desire to encourage data sharing on the other. === Multi-State Issues === As transactions in the data economy are international by nature and hardly ever occur within one legal system alone, Part V of the Principles also briefly touches upon the applicability of the rules and doctrines of private international law to such transactions. == Links == Website of the “Principles for a Data Economy – Data Rights and Transaction

    Read more →