AI Detector Similar To Turnitin

AI Detector Similar To Turnitin — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Product-family engineering

    Product-family engineering

    Product-family engineering (PFE), also known as product-line engineering (PLE), is based on the ideas of "domain engineering" created by the Software Engineering Institute, a term coined by James Neighbors in his 1980 dissertation at University of California, Irvine. Software product lines are quite common in our daily lives, but before a product family can be successfully established, an extensive process has to be followed. This process is known as product-family engineering. Product-family engineering can be defined as a method that creates an underlying architecture of an organization's product platform. It provides an architecture that is based on commonality as well as planned variabilities. The various product variants can be derived from the basic product family, which creates the opportunity to reuse and differentiate on products in the family. Product-family engineering is conceptually similar to the widespread use of vehicle platforms in the automotive industry. Product-family engineering is a relatively new approach to the creation of new products, recently evolving to Model-Based Product Line Engineering (MBPLE), emphasizing the centrality of a model-centric approach in PLE. It focuses on the process of engineering new products in such a way that it is possible to reuse product components and apply variability with decreased costs and time. Product-family engineering is all about reusing components and structures as much as possible, according to the ISO/IEC 26550/2015 and the latest ISO/IEC 26580/2021 that introduced the concept of feature-based Product Line Engineering. Several studies have proven that using a product-family engineering approach for product development can have several benefits. Here is a list of some of them: Higher productivity Higher quality Faster time-to-market Lower labor needs The Nokia case mentioned below also illustrates these benefits. In 2025 the publishing of the book Model-Based Product Line Engineering (MBPLE): The feature-based path to product lines success by Marco Forlingieri, Tim Weilkiens and Hugo Guillermo Chalé-Gongora formalized the foundation of the discipline, including best practices and new industrial cases. == Overall process == The product family engineering process consists of several phases. The three main phases are: Phase 1: Product management Phase 2: Domain engineering Phase 3: Product engineering The process has been modeled on a higher abstraction level. This has the advantage that it can be applied to all kinds of product lines and families, not only software. The model can be applied to any product family. Figure 1 (below) shows a model of the entire process. Below, the process is described in detail. The process description contains elaborations of the activities and the important concepts being used. All concepts printed in italic are explained in Table 1. === Phase 1: product management === The first phase is the starting up of the whole process. In this phase some important aspects are defined especially with regard to economic aspects. This phase is responsible for outlining market strategies and defining a scope, which tells what should and should not be inside the product family. ==== Evaluate business visioning ==== During this first activity all context information relevant for defining the scope of the product line is collected and evaluated. It is important to define a clear market strategy and take external market information into account, such as consumer demands. The activity should deliver a context document that contains guidelines, constraints and the product strategy. ==== Define product line scope ==== Scoping techniques are applied to define which aspects are within the scope. This is based upon the previous step in the process, where external factors have been taken into account. The output is a product portfolio description, which includes a list of current and future products and also a product roadmap. It can be argued whether phase 1, product management, is part of the product-family-engineering process, because it could be seen as an individual business process that is more focused on the management aspects instead of the product aspect. However phase 2 needs some important input from this phase, as a large piece of the scope is defined in this phase. So from this point of view it is important to include the product-management phase (phase 1) into the entire process as a base for the domain-engineering process. === Phase 2: domain engineering === During the domain-engineering phases, the variable and common requirements are gathered for the whole product line. The goal is to establish a reusable platform. The output of this phase is a set of common and variable requirements for all products in the product line. ==== Analyze domain requirements ==== This activity includes all activities for analyzing the domain with regard to concept requirements. The requirements are categorized and split up into two new activities. The output is a document with the domain analysis. As can be seen in Figure 1 the process of defining common requirements is a parallel process with defining variable requirements. Both activities take place at the same time. ==== Define common requirements ==== Includes all activities for eliciting and documenting the common requirements of the product line, resulting in a document with reusable common requirements. ==== Define variable requirements ==== Includes all activities for eliciting and documenting the variable requirements of the product line, resulting in a document with variable requirements. ==== Design domain ==== This process step consists of activities for defining the reference architecture of the product line. This generates an abstract structure for all products in the product line. ==== Implement domain ==== During this step a detailed design of the reusable components and the implementation of these components are created. ==== Test domain ==== Validates and verifies the reusability of components. Components are tested against their specifications. After successful testing of all components in different use cases and scenarios, the domain engineering phase has been completed. === Phase 3: product engineering === In the final phase a product X is being engineered. This product X uses the commonalities and variability from the domain engineering phase, so product X is being derived from the platform established in the domain engineering phase. It basically takes all common requirements and similarities from the preceding phase plus its own variable requirements. Using the base from the domain engineering phase and the individual requirements of the product engineering phase a complete and new product can be built. After the product has been fully tested and approved, the product X can be delivered. ==== Define product requirements ==== Developing the product requirements specification for the individual product and reuse the requirements from the preceding phase. ==== Design product ==== All activities for producing the product architecture. Makes use of the reference architecture from the step "design domain", it selects and configures the required parts of the reference architecture and incorporates product specific adaptations. ==== Build product ==== During this process the product is built, using selections and configurations of the reusable components. ==== Test product ==== During this step the product is verified and validated against its specifications. A test report gives information about all tests that were carried out, this gives an overview of possible errors in the product. If the product in the next step is not accepted, the process will loop back to "build product", in Figure 1 this is indicated as "[unsatisfied]". ==== Deliver and support product ==== The final step is the acceptance of the final product. If it has been successfully tested and approved to be complete, it can be delivered. If the product does not satisfy to the specifications, it has to be rebuilt and tested again. The next figure shows the overall process of product-family engineering as described above. It is a full process overview with all concepts attached to the different steps. == Process data diagram == On the left side the entire process from the top to bottom has been drawn. All activities on the left side are linked to the concepts on the right side through dotted lines. Every concept has a number, which reflects the association with other concepts. == List of concepts == Below the list with concepts will be explained. Most concept definitions are extracted from Pohl, Bockle, & Linden (2005) and also some new definitions have been added. Table 1: List of concepts == Example == There are some good examples of the use of product family engineering, which were quite successful. The abstract model of product family engineering allows different kinds of uses, most of them are related to the consumer electronics m

    Read more →
  • Brain Imaging Data Structure

    Brain Imaging Data Structure

    The Brain Imaging Data Structure (BIDS) is a standard for organizing, annotating, and describing data collected during neuroimaging experiments. It is based on a formalized file and directory structure and metadata files (based on JSON and TSV) with controlled vocabulary. This standard has been adopted by a multitude of labs around the world as well as databases such as OpenNeuro, SchizConnect, Developing Human Connectome Project, and FCP-INDI, and is seeing uptake in an increasing number of studies. While originally specified for MRI data, BIDS has been extended to several other imaging modalities such as MEG, EEG, and intracranial EEG (see also BIDS Extension Proposals). == History == The project is a community-driven effort. BIDS, originally OBIDS (Open Brain Imaging Data Structure), was initiated during an INCF sponsored data sharing working group meeting (January 2015) at Stanford University. It was subsequently spearheaded and maintained by Chris Gorgolewski. Since October 2019, the project is headed by a Steering Group and maintained by a separate team of maintainers, the Maintainers Group, according to a governance document that was approved of by the BIDS community in a vote. BIDS has advanced under the direction and effort of contributors, the community of researchers that appreciate the value of standardizing neuroimaging data to facilitate sharing and analysis. == BIDS Extension Proposals == BIDS can be extended in a backwards compatible way and is evolving over time. This is accomplished through BIDS Extension Proposals (BEPs), which are community-driven processes following agreed-upon guidelines. A full list of finalized BEPs and BEPs in progress can be found on the BIDS website

    Read more →
  • Payment tokenization

    Payment tokenization

    Payment tokenization is a data security process that replaces sensitive payment information, such as credit card numbers, with a unique identifier or "token." This token can be used in place of actual data during transactions but has no exploitable value if breached, thereby reducing the risk of data theft and fraud. == Overview == Payment tokenization is generally categorized into two types: security tokens and payment tokens. Security tokens, also known as post-authorization tokens, are used to replace sensitive information like Primary Account Numbers (PANs), such as credit card numbers either after a payment is authorized or for storing data securely (data-at-rest), such as in merchant databases. These models have been in use since the mid-2000s, following the introduction of the Payment Card Industry Data Security Standard in 2004, which established standards for safeguarding cardholder data. The Payment Card Industry Security Standards Council's 2011 Tokenization Guidelines and the proposed American National Standards Institute X9 standards emphasize using tokens primarily to secure sensitive information, not as replacements for payment credentials processed over financial networks. Traditionally, merchants stored PANs to support backend operations such as settlements, reconciliations, chargebacks, loyalty programs, and customer service. However, with the adoption of security tokenization, merchants can substitute PANs with tokens in their systems. This not only reduces their exposure to fraud but also helps minimize the scope and cost of PCI-DSS compliance, offering a more secure and efficient way to manage cardholder data. == Applications == Payment tokenization is widely used by mobile wallets such as Apple Pay, Google Pay, and Samsung Pay use tokenization to safely store card data on devices. E-commerce platforms rely on it to securely retain customer payment details for recurring purchases. At the physical point of sale, EMV-enabled systems use tokenization to protect card information during in-store transactions. Also, subscription billing services implement tokenization to manage and safeguard payment credentials for ongoing charges.

    Read more →
  • MDS matrix

    MDS matrix

    An MDS matrix (maximum distance separable) is a matrix representing a function with certain diffusion properties that have useful applications in cryptography. Technically, an m × n {\displaystyle m\times n} matrix A {\displaystyle A} over a finite field K {\displaystyle K} is an MDS matrix if it is the transformation matrix of a linear transformation f ( x ) = A x {\displaystyle f(x)=Ax} from K n {\displaystyle K^{n}} to K m {\displaystyle K^{m}} such that no two different ( m + n ) {\displaystyle (m+n)} -tuples of the form ( x , f ( x ) ) {\displaystyle (x,f(x))} coincide in n {\displaystyle n} or more components. Equivalently, the set of all ( m + n ) {\displaystyle (m+n)} -tuples ( x , f ( x ) ) {\displaystyle (x,f(x))} is an MDS code, i.e., a linear code that reaches the Singleton bound. Let A ~ = ( I n A ) {\displaystyle {\tilde {A}}={\begin{pmatrix}\mathrm {I} _{n}\\\hline \mathrm {A} \end{pmatrix}}} be the matrix obtained by joining the identity matrix I n {\displaystyle \mathrm {I} _{n}} to A {\displaystyle A} . Then a necessary and sufficient condition for a matrix A {\displaystyle A} to be MDS is that every possible n × n {\displaystyle n\times n} submatrix obtained by removing m {\displaystyle m} rows from A ~ {\displaystyle {\tilde {A}}} is non-singular. This is also equivalent to the following: all the sub-determinants of the matrix A {\displaystyle A} are non-zero. Then a binary matrix A {\displaystyle A} (namely over the field with two elements) is never MDS unless it has only one row or only one column with all components 1 {\displaystyle 1} . Reed–Solomon codes have the MDS property and are frequently used to obtain the MDS matrices used in cryptographic algorithms. Serge Vaudenay suggested using MDS matrices in cryptographic primitives to produce what he called multipermutations, not-necessarily linear functions with this same property. These functions have what he called perfect diffusion: changing t {\displaystyle t} of the inputs changes at least m − t + 1 {\displaystyle m-t+1} of the outputs. He showed how to exploit imperfect diffusion to cryptanalyze functions that are not multipermutations. MDS matrices are used for diffusion in such block ciphers as AES, SHARK, Square, Twofish, Anubis, KHAZAD, Manta, Hierocrypt, Kalyna, Camellia and HADESMiMC, and in the stream cipher MUGI and the cryptographic hash function Whirlpool, Poseidon.

    Read more →
  • Computer Graphics International

    Computer Graphics International

    Computer Graphics International (CGI) is one of the oldest annual international conferences on computer graphics. It is organized by the Computer Graphics Society (CGS). Researchers across the whole world are invited to share their experiences and novel achievements in various fields - like computer graphics and human-computer interaction. Former conferences have been held recently in Hong Kong (China), Geneva (Switzerland), Shanghai (China), Geneva (virtually), Calgary (Canada), Bintan (Indonesia) and Yokohama (Japan). == Awards == Starting in the year of 2013, CGI has given yearly a Best Paper Award and a Career Achievement Award. == Venues ==

    Read more →
  • Campus network

    Campus network

    A campus network, campus area network, corporate area network or CAN is a computer network made up of an interconnection of local area networks (LANs) within a limited geographical area. The networking equipments (switches, routers) and transmission media (optical fiber, copper plant, Cat5 cabling etc.) are almost entirely owned by the campus tenant / owner: an enterprise, university, government etc. A campus area network is larger than a local area network but smaller than a metropolitan area network (MAN) or wide area network (WAN). == University campuses == College or university campus area networks often interconnect a variety of buildings, including administrative buildings, academic buildings, laboratories, university libraries, or student centers, residence halls, gymnasiums, and other outlying structures, like conference centers, technology centers, and training institutes. Early examples include the Stanford University Network at Stanford University, Project Athena at MIT, and the Andrew Project at Carnegie Mellon University. == Corporate campuses == Much like a university campus network, a corporate campus network serves to connect buildings. Examples of such are the networks at Googleplex and Microsoft's campus. Campus networks are normally interconnected with high speed Ethernet links operating over optical fiber such as gigabit Ethernet and 10 Gigabit Ethernet. == Area range == The range of CAN is 1 to 5 km (1 to 3 mi). If two buildings have the same domain and they are connected with a network, then it will be considered as CAN only. Though the CAN is mainly used for corporate campuses so the link will be high speed.

    Read more →
  • Kerckhoffs's principle

    Kerckhoffs's principle

    Kerckhoffs's principle (also called Kerckhoffs's desideratum, assumption, axiom, doctrine or law) of cryptography was stated by the Dutch cryptographer Auguste Kerckhoffs in the 19th century. The principle holds that a cryptosystem should be secure, even if everything about the system, except the key, is public knowledge. This concept is widely embraced by cryptographers, in contrast to security through obscurity, which is not. Kerckhoffs's principle was phrased by the American mathematician Claude Shannon as "the enemy knows the system", i.e., "one ought to design systems under the assumption that the enemy will immediately gain full familiarity with them". In that form, it is called Shannon's maxim. Another formulation by American researcher and professor Steven M. Bellovin is: In other words—design your system assuming that your opponents know it in detail. (A former official at NSA's National Computer Security Center told me that the standard assumption there was that serial number 1 of any new device was delivered to the Kremlin.) == Origins == The invention of telegraphy radically changed military communications and increased the number of messages that needed to be protected from the enemy dramatically, leading to the development of field ciphers which had to be easy to use without large confidential codebooks prone to capture on the battlefield. It was this environment which led to the development of Kerckhoffs's requirements. Auguste Kerckhoffs was a professor of German language at Ecole des Hautes Etudes Commerciales (HEC) in Paris. In early 1883, Kerckhoffs's article, La Cryptographie Militaire, was published in two parts in the Journal of Military Science, in which he stated six design rules for military ciphers. Translated from French, they are: The system must be practically, if not mathematically, indecipherable; It should not require secrecy, and it should not be a problem if it falls into enemy hands; It must be possible to communicate and remember the key without using written notes, and correspondents must be able to change or modify it at will; It must be applicable to telegraph communications; It must be portable, and should not require several persons to handle or operate; Lastly, given the circumstances in which it is to be used, the system must be easy to use and should not be stressful to use or require its users to know and comply with a long list of rules. Some are no longer relevant given the ability of computers to perform complex encryption. The second rule, now known as Kerckhoffs's principle, is still critically important. == Explanation of the principle == Kerckhoffs viewed cryptography as a rival to, and a better alternative than, steganographic encoding, which was common in the nineteenth century for hiding the meaning of military messages. One problem with encoding schemes is that they rely on humanly-held secrets such as "dictionaries" which disclose for example, the secret meaning of words. Steganographic-like dictionaries, once revealed, permanently compromise a corresponding encoding system. Another problem is that the risk of exposure increases as the number of users holding the secrets increases. Nineteenth century cryptography, in contrast, used simple tables which provided for the transposition of alphanumeric characters, generally given row-column intersections which could be modified by keys which were generally short, numeric, and could be committed to human memory. The system was considered "indecipherable" because tables and keys do not convey meaning by themselves. Secret messages can be compromised only if a matching set of table, key, and message falls into enemy hands in a relevant time frame. Kerckhoffs viewed tactical messages as only having a few hours of relevance. Systems are not necessarily compromised, because their components (i.e. alphanumeric character tables and keys) can be easily changed. === Advantage of secret keys === Using secure cryptography is supposed to replace the difficult problem of keeping messages secure with a much more manageable one, keeping relatively small keys secure. A system that requires long-term secrecy for something as large and complex as the whole design of a cryptographic system obviously cannot achieve that goal. It only replaces one hard problem with another. However, if a system is secure even when the enemy knows everything except the key, then all that is needed is to manage keeping the keys secret. There are a large number of ways the internal details of a widely used system could be discovered. The most obvious is that someone could bribe, blackmail, or otherwise threaten staff or customers into explaining the system. In war, for example, one side will probably capture some equipment and people from the other side. Each side will also use spies to gather information. If a method involves software, someone could do memory dumps or run the software under the control of a debugger in order to understand the method. If hardware is being used, someone could buy or steal some of the hardware and build whatever programs or gadgets needed to test it. Hardware can also be dismantled so that the chip details can be examined under the microscope. === Maintaining security === A generalization some make from Kerckhoffs's principle is: "The fewer and simpler the secrets that one must keep to ensure system security, the easier it is to maintain system security." Bruce Schneier ties it in with a belief that all security systems must be designed to fail as gracefully as possible: Kerckhoffs's principle applies beyond codes and ciphers to security systems in general: every secret creates a potential failure point. Secrecy, in other words, is a prime cause of brittleness—and therefore something likely to make a system prone to catastrophic collapse. Conversely, openness provides ductility. Any security system depends crucially on keeping some things secret. However, Kerckhoffs's principle points out that the things kept secret ought to be those least costly to change if inadvertently disclosed. For example, a cryptographic algorithm may be implemented by hardware and software that is widely distributed among users. If security depends on keeping that secret, then disclosure leads to major logistic difficulties in developing, testing, and distributing implementations of a new algorithm – it is "brittle". On the other hand, if keeping the algorithm secret is not important, but only the keys used with the algorithm must be secret, then disclosure of the keys simply requires the simpler, less costly process of generating and distributing new keys. == Applications == In accordance with Kerckhoffs's principle, the majority of civilian cryptography makes use of publicly known algorithms. By contrast, ciphers used to protect classified government or military information are often kept secret (see Type 1 encryption). However, it should not be assumed that government/military ciphers must be kept secret to maintain security. It is possible that they are intended to be as cryptographically sound as public algorithms, and the decision to keep them secret is in keeping with a layered security posture. == Security through obscurity == It is moderately common for companies to keep the inner workings of a system secret. Some argue this "security by obscurity" makes the product safer and less vulnerable to attack. A counter-argument is that keeping the innards secret may improve security in the short term, but in the long run, only systems that have been published and analyzed should be trusted. Steven Bellovin and Randy Bush commented: Security Through Obscurity Considered Dangerous Hiding security vulnerabilities in algorithms, software, and/or hardware decreases the likelihood they will be repaired and increases the likelihood that they can and will be exploited. Discouraging or outlawing discussion of weaknesses and vulnerabilities is extremely dangerous and deleterious to the security of computer systems, the network, and its citizens. Open Discussion Encourages Better Security The long history of cryptography and cryptoanalysis has shown time and time again that open discussion and analysis of algorithms exposes weaknesses not thought of by the original authors, and thereby leads to better and more secure algorithms. As Kerckhoffs noted about cipher systems in 1883 [Kerc83], "Il faut qu'il n'exige pas le secret, et qu'il puisse sans inconvénient tomber entre les mains de l'ennemi." (Roughly, "the system must not require secrecy and must be able to be stolen by the enemy without causing trouble.")

    Read more →
  • Caste census

    Caste census

    Caste census is a proposed census to be conducted in India by the Central Government of India. The proposed census was decided under the leadership of Prime Minister Narendra Modi by the cabinet committee of political affairs (CCPA) on 30 April 2025. It has been decided that a caste enumeration should be included with the forthcoming census. The exact time has not been declared yet. It is unclear that when the next census will be held. The decision of the cabinet was announced by the Central Railway Minister Ashwini Vaishnaw. It has been seen as a step that would help in drafting "equitable and targeted" policies by the present Central Government of India led by the Bhartiya Janta Party in India. The Central Home Minister Amit Shah has described the decision as a "historic decision". He has also described that the historic decision as “committed to social justice”. The leader of opposition Rahul Gandhi has welcomed the decision. He said "We have shown we can pressure govt" He has demanded a clear timeline for its completion. He has called it "The first step towards deep social reform". == Description == The caste census is a systematic recording of individuals’ caste identities during the nationwide census in the country. The Central minister Ashwini Vaishnaw expressed his view on the proposed census and said that it would "strengthen the social and economic structure of our society while the nation continues to progress”. The Caste census will happen for the first time in 100 years by the Central Government of India. It will be the part of the upcoming census in India. == History == According to Peabody, the first systematic caste-wise enumeration of households in the Indian subcontinent was conducted between 1658 and 1664 across seven districts of the then Marwar Kingdom, including Jodhpur city which was its capital. It was conducted by the then home minister Munhata Nainsi of the kingdom for the purpose of tax documentation. It was not to for classification of society or creation of social hierarchies but solving a tax related problem. During the period of the British rule in India, caste census was included in the decadal censuses to categorise the population by caste, religion and occupation. In 1871–72, the first detailed caste census was conducted by the government of British Raj in India. It was practiced between the period 1881 to 1931. The last caste census was conducted in the year 1931 in which 4,147 castes were recorded. The largest population in the whole of British India (including Pakistan and Bangladesh) was of Brahmins. The population of Brahmins was recorded more than 1.5 crores. After Brahmin community, the second place was of Jatav (Chamar)community. The population of Jatav was a little more than 1.23 crores. On the third place were Rajputs. The population of Rajputs was 81 lakhs. The Rajput caste was followed by the Kunbi caste of Maharashtra. The population of Kunbi caste was 64 lakhs and 34 thousands. The Kunbi caste was followed by Yadav (Ahir) caste. The population of Yadav (Ahir) community was 56 lakhs and 82 thousands. The Yadav (Ahir) caste was followed by Teli community. The population of Teli community was 42 lakhs and 58 thousands. The Teli community was followed by Gwala community. The population of the Gwala community was 40 lakhs. After the independence of India, the caste enumeration was stopped by the newly independent Government of India led by the prime minister Pandit Jawahar Lal Nehru in 1951. The caste enumeration was stopped to avoid reinforcing social divisions in the Indian society. But, there was an exception made for the enumeration of the Scheduled Castes (SCs) and Scheduled Tribes (STs) in the decadal censuses. Therefore, the enumeration of the Scheduled Castes and the Scheduled Tribes is being conducted in every census since 1951. In 1961, the Government of India permitted states for conducting their own surveys to compile OBC lists, but national caste census was not conducted.

    Read more →
  • Quantexa

    Quantexa

    Quantexa is a UK-based software company that develops artificial intelligence-based applications for data analytics and decision-making. The company was founded in 2016 and is headquartered in London, with operations in North America, Europe, and the Asia-Pacific region. As of 2025, Quantexa reported a valuation of $2.6 billion and provides services to organizations in over 70 countries. Investors include Warburg Pincus, HSBC, and the Ontario Teachers’ Pension Plan. == History == Quantexa was founded in London in 2016 by several co-founders, including Jamie Hutton, Richard Seewald, Imam Hoque, Felix Hoddinott, and Vishal Marria, who also serves as the company's chief executive officer. The company was established to develop tools intended to address limitations in traditional data analysis methods, particularly those related to identifying hidden connections across large datasets. The name "Quantexa" is derived from the company's focus on quantitative methods and data analysis. In 2023, Quantexa acquired Dublin-based AI firm Aylien. In April 2023, the company completed a Series E funding round, raising $129 million at a valuation of approximately $1.8 billion, marking its entry into "unicorn" status. In October 2024, the company reported annual recurring revenue (ARR) exceeding $100 million. In early 2025, Quantexa participated in the World Economic Forum's Unicorn Program, which supports high-growth technology companies. In March 2025, Quantexa completed a Series F funding round of $175 million, led by Teachers' Venture Growth, the venture arm of the Ontario Teachers' Pension Plan. That August, the company was reported to be considering a 2026 IPO. The company formed a partnership with Zurich in October 2025, the first insurer to add its AI-based Decision Intelligence platform to enhance fraud detection.

    Read more →
  • Manufacturing Automation Protocol

    Manufacturing Automation Protocol

    Manufacturing Automation Protocol (MAP) was a computer network standard released in 1982 for interconnection of devices from multiple manufacturers. It was developed by General Motors to combat the proliferation of incompatible communications standards used by suppliers of automation products such as programmable controllers. By 1985 demonstrations of interoperability were carried out and 21 vendors offered MAP products. In 1986 the Boeing corporation merged its Technical Office Protocol with the MAP standard, and the combined standard was referred to as "MAP/TOP". The standard was revised several times between the first issue in 1982 and MAP 3.0 in 1987, with significant technical changes that made interoperation between different revisions of the standard difficult. Although promoted and used by manufacturers such as General Motors, Boeing, and others, it lost market share to the contemporary Ethernet standard and was not widely adopted. Difficulties included changing protocol specifications, the expense of MAP interface links, and the speed penalty of a token-passing network. The token bus network protocol used by MAP became standardized as IEEE standard 802.4 but this committee disbanded in 2004 due to lack of industry attention.

    Read more →
  • Pepper (cryptography)

    Pepper (cryptography)

    In cryptography, a pepper is a secret added to an input such as a password during hashing with a cryptographic hash function. This value differs from a salt in that it is not stored alongside a password hash, but rather the pepper is kept separate using another meachanism, such as a Hardware Security Module. Note that the National Institute of Standards and Technology refers to this value as a secret key rather than a pepper. A pepper is similar in concept to a salt or an encryption key. It is like a salt in that it is a randomized value that is added to a password hash, and it is similar to an encryption key in that it should be kept secret. A pepper performs a comparable role to a salt or an encryption key, but while a salt is not secret (merely unique) and can be stored alongside the hashed output, a pepper is secret and must not be stored with the output. The hash and salt are usually stored in a database, but, if stored, a pepper must be stored separately to prevent it from being obtained by the attacker in case of a database breach. == History == The idea of a site- or service-specific salt (in addition to a per-user salt) has a long history, with Steven M. Bellovin proposing a local parameter in a Bugtraq post in 1995. In 1996 Udi Manber also described the advantages of such a scheme, terming it a secret salt. However, he suggested not storing the value of the secret salt, but instead rediscovering it by trial and error at password verification time. The term pepper has been used, by analogy to salt, but with a variety of meanings. For example, when discussing a challenge-response scheme, pepper has been used for a salt-like quantity, though not used for password storage; it has been used for a data transmission technique where a pepper must be guessed; and even as a part of jokes. The term pepper was proposed for a secret or local parameter stored separately from the password in a discussion of protecting passwords from rainbow table attacks. This usage did not immediately catch on: for example, Fred Wenzel added support to Django password hashing for storage based on a combination of bcrypt and HMAC with separately stored nonces, without using the term. Usage has since become more common. == Types == There are multiple different types of pepper: A shared secret that is common to all users. A randomly-selected number that must be re-discovered on every password input. These mechanisms could be combined with password salting, iterated hashing or even one another. == Shared-secret pepper == Bellovin and Webster suggest prepend a shared secret to the password before hashing, which allows easy use of existing hash functions. For example, consider two users to be added to a database. This table contains two combinations of username and password. The password is not saved, and the 8-byte (64-bit) 44534C70C6883DE2 pepper is saved in a safe place separate from the output values of the hash, in this case SHA256. Unlike the salt, the pepper does not provide protection to users who use the same password, but protects against dictionary attacks, unless the attacker has the pepper value available. Since the same pepper is not shared between different applications, an attacker is unable to reuse the hashes of one compromised database to another. A complete scheme for saving passwords may include both salt and pepper use. For example, it has been suggested to combine the pepper by encrypting salted password hashes, which allows rotation of the pepper. In the case of a shared-secret pepper, a single compromised password (via password reuse or other attack) along with a user's salt can lead to an attack to discover the pepper, rendering it ineffective. If an attacker knows a plaintext password and a user's salt, as well as the algorithm used to hash the password, then discovering the pepper can be a matter of brute forcing the values of the pepper. This is why NIST recommends the secret value be at least 112 bits, so that discovering it by exhaustive search is prohibitively expensive. The pepper must be generated anew for every application it is deployed in, otherwise a breach of one application would result in lowered security of another application. Without knowledge of the pepper, other passwords in the database will be far more difficult to extract from their hashed values, as the attacker would need to guess the password as well as the pepper. A pepper adds security to a database of salts and hashes because unless the attacker is able to obtain the pepper, cracking even a single hash is intractable, no matter how weak the original password. Even with a list of (salt, hash) pairs, an attacker must also guess the secret pepper in order to find the password which produces the hash. The NIST specification for a secret salt suggests using a Password-Based Key Derivation Function (PBKDF) with an approved Pseudorandom Function such as HMAC with SHA-3 as the hash function of the HMAC. The NIST recommendation is also to perform at least 1000 iterations of the PBKDF, and a further minimum 1000 iterations using the secret salt in place of the non-secret salt. == Randomly-selected pepper that must be re-discovered == The aim of this mechanism is to slow down password the password verification step, thus slowing attacks. The aim is similar increasing the iteration count on bcrypt or Argon2, but the mechanism is different. The secret salt or pepper must be rediscovered by the verifier or attacker each time by guessing. In this situation, the password hashing function is calculated using both the password and the pepper. At password storage time, the pepper is chosen randomly from a range between 1 and R, the hash output is calculated using the password and the pepper. The hash output is stored with the username. The pepper is then discarded. At password verification time, the verifier is provided with a username and password to verify. The originally calculated hash is retrieved for the given username, and then the hash of the password and each value between 1 and R is calculated. If any of these hash values match the stored password hash, the password is considered valid. Note, the possible values of the pepper should not be tested in a fixed order known to an attacker, otherwise a timing attack may reveal the pepper. If the password is correct, the correct pepper will be found in R/2 hash evaluations on average. If the password is incorrect, all R values must be tested before the password can be rejected.

    Read more →
  • Critical data studies

    Critical data studies

    Critical data studies is the exploration of and engagement with social, cultural, and ethical challenges that arise when working with big data. It is through various unique perspectives and taking a critical approach that this form of study can be practiced. As its name implies, critical data studies draws heavily on the influence of critical theory, which has a strong focus on addressing the organization of power structures. This idea is then applied to the study of data. Interest in this unique field of critical data studies began in 2011 with scholars danah boyd and Kate Crawford posing various questions for the critical study of big data and recognizing its potential threatening impacts on society and culture. It was not until 2014, and more exploration and conversations, that critical data studies was officially coined by scholars Craig Dalton and Jim Thatcher. They put a large emphasis on understanding the context of big data in order to approach it more critically. Researchers such as David Ribes, Robert Soden, Seyram Avle, Sarah E. Fox, and Phoebe Sengers focus on understanding data as a historical artifact and taking an interdisciplinary approach towards critical data studies. Other key scholars in this discipline include Rob Kitchin and Tracey P. Lauriault who focus on reevaluating data through different spheres. Various critical frameworks that can be applied to analyze big data include Feminist, Anti-Racist, Queer, Indigenous, Decolonial, Anti-Ableist, as well as Symbolic and Synthetic data science. These frameworks help to make sense of the data by addressing power, biases, privacy, consent, and underrepresentation or misrepresentation concerns that exist in data as well as how to approach and analyze this data with a more equitable mindset. == Motivation == In their article in which they coin the term 'critical data studies,' Dalton and Thatcher also provide several justifications as to why data studies is a discipline worthy of a critical approach. First, 'big data' is an important aspect of twenty-first century society, and the analysis of 'big data' allows for a deeper understanding of what is happening and for what reasons. Big data is important to critical data studies because it is the type of data used within this field. Big data does not necessarily refer to a large data set, it can have a data set with millions of rows, but also a data set that just has a wide variety and expansive scope of data with a smaller type of dataset. As well as having whole populations in the data set and not just sample sizes. Furthermore, big data as a technological tool and the information that it yields are not neutral, according to Dalton and Thatcher, making it worthy of critical analysis in order to identify and address its biases. Building off this idea, another justification for a critical approach is that the relationship between big data and society is an important one, and therefore worthy of study. Ribes et. al. argue there is a need for an interdisciplinary understanding of data as a historical artifact as a motivating aspect of critical data studies.The overarching consensus in the Computer-Supported Cooperative Work (CSCW) field, is that people should speak for the data, and not let the data speak for itself. The sources of big data and it’s relationship to varied metadata can be a complicated one, which leads to data disorder and a need for an ethical analysis. Additionally, Iliadis and Russo (2016) have called for studying data assemblages. This is to say, data has innate technological, political, social, and economic histories that should be taken into consideration. Kitchin argues data is almost never raw, and it is almost always cooked, meaning that it is always spoken for by the data scientists utilizing it. Thus, Big Data should be open to a variety of perspectives, especially those of cultural and philosophical nature. Further, data contains hidden histories, ideologies, and philosophies. Big data technology can cause significant changes in society's structure and in the everyday lives of people, and, being a product of society, big data technology is worthy of sociological investigation. Moreover, data sets are almost never completely without any influence. Rather, data are shaped by the vision or goals of those gathering the data, and during the data collection process, certain things are quantified, stored, sorted and even discarded by the research team. A critical approach is thus necessary in order to understand and reveal the intent behind the information being presented.One of these critical approaches has been through feminist data studies. This method applies feminist principles to critical studies and data collecting and analysis. The goal of this is to address the power imbalance in data science and society. According to Catherine D’Ignazio and Lauren F. Klein, a power analysis can be performed by examining power, challenging power, evaluating emotion and embodiment, rethinking binaries and hierarchies, embracing pluralism, considering context, and making labor visible. Feminist data studies is part of the movement towards making data to benefit everyone and not to increase existing inequalities. Moreover, data alone cannot speak for themselves; in order to possess any concrete meaning, data must be accompanied by theoretical insight or alternative quantitative or qualitative research measures. Based on different social topics such as anti-racist data studies, critical data studies give a focus on those social issues concerning data. Specifically in anti-racist data studies they use a classification approach to get representation for those within that community. Desmond Upton Patton and others used their own classification system in the communities of Chicago to help target and reduce violence with young teens on twitter. They had students in those communities help them to decipher the terminology and emojis of these teens to target the language used in tweets that followed with violence outside of the computer screens. This is just one real world example of critical data studies and its application. Dalton and Thatcher argue that if one were to only think of data in terms of its exploitative power, there is no possibility of using data for revolutionary, liberatory purposes. Finally, Dalton and Thatcher propose that a critical approach in studying data allows for 'big data' to be combined with older, 'small data,' and thus create more thorough research, opening up more opportunities, questions and topics to be explored. == Issues and concerns for critical data scholars == Data plays a pivotal role in the emerging knowledge economy, driving productivity, competitiveness, efficiency, sustainability, and capital accumulation. The ethical, political, and economic dimensions of data dynamically evolve across space and time, influenced by changing regimes, technologies, and priorities. Technically, the focus lies on handling, storing, and analyzing vast data sets, utilizing machine learning-based data mining and analytics. This technological advancement raises concerns about data quality, encompassing validity, reliability, authenticity, usability, and lineage. The use of data in modern society brings about new ways of understanding and measuring the world, but also brings with it certain concerns or issues. Data scholars attempt to bring some of these issues to light in their quest to be critical of data. Technical and organizational issues could include the scope of the data set, meaning there is too little or too much data to work with, leading to inaccurate results. It becomes crucial for critical data scholars to carefully consider the adequacy of data volume for their analyses. The quality of the data itself is another facet of concern. The data itself could be of poor quality, such as an incomplete or messy data set with missing or inaccurate data values. This would lead researchers to have to make edits and assumptions about the data itself. Addressing these issues often requires scholars to make edits and assumptions about the data to ensure its reliability and relevance. Data scientists could have improper access to the actual data set, limiting their abilities to analyze it. Linnet Taylor explains how gaps in data can arise when people of varying levels of power have certain rights to their data sources. These people in power can control what data is collected, how it is displayed and how it is analyzed. The capabilities of the research team also play a crucial role in the quality of data analytics. The research team may have inadequate skills or organizational capabilities which leads to the actual analytics performed on the dataset to be biased. This can also lead to ecological fallacies, meaning an assumption is made about an individual based on data or results from a larger group of people. These technical and organizational challenges highlight the complexity of working with data and

    Read more →
  • Knowledge integration

    Knowledge integration

    Knowledge integration is the process of synthesizing multiple knowledge models (or representations) into a common model (representation). Compared to information integration, which involves merging information having different schemas and representation models, knowledge integration focuses more on synthesizing the understanding of a given subject from different perspectives. For example, multiple interpretations are possible of a set of student grades, typically each from a certain perspective. An overall, integrated view and understanding of this information can be achieved if these interpretations can be put under a common model, say, a student performance index. The Web-based Inquiry Science Environment (WISE), from the University of California at Berkeley has been developed along the lines of knowledge integration theory. Knowledge integration has also been studied as the process of incorporating new information into a body of existing knowledge with an interdisciplinary approach. This process involves determining how the new information and the existing knowledge interact, how existing knowledge should be modified to accommodate the new information, and how the new information should be modified in light of the existing knowledge. A learning agent that actively investigates the consequences of new information can detect and exploit a variety of learning opportunities; e.g., to resolve knowledge conflicts and to fill knowledge gaps. By exploiting these learning opportunities the learning agent is able to learn beyond the explicit content of the new information. The machine learning program KI, developed by Murray and Porter at the University of Texas at Austin, was created to study the use of automated and semi-automated knowledge integration to assist knowledge engineers constructing a large knowledge base. A possible technique which can be used is semantic matching. More recently, a technique useful to minimize the effort in mapping validation and visualization has been presented which is based on Minimal Mappings. Minimal mappings are high quality mappings such that i) all the other mappings can be computed from them in time linear in the size of the input graphs, and ii) none of them can be dropped without losing property i). The University of Waterloo operates a Bachelor of Knowledge Integration undergraduate degree program as an academic major or minor. The program started in 2008.

    Read more →
  • Data recovery

    Data recovery

    In computing, data recovery is a process of retrieving deleted, inaccessible, lost, corrupted, damaged, or overwritten data from secondary storage, removable media or files, when the data stored in them cannot be accessed in a usual way. The data is most often salvaged from storage media such as internal or external hard disk drives (HDDs), solid-state drives (SSDs), USB flash drives, magnetic tapes, CDs, DVDs, RAID subsystems, and other electronic devices. Recovery may be required due to physical damage to the storage devices or logical damage to the file system that prevents it from being mounted by the host operating system (OS). Logical failures occur when the hard drive devices are functional but the user or automated-OS cannot retrieve or access data stored on them. Logical failures can occur due to corruption of the engineering chip, lost partitions, firmware failure, or failures during formatting/re-installation. Data recovery can be a very simple or technical challenge. This is why there are specific software companies specialized in this field that help to get back data on your system. == About == The most common data recovery scenarios involve an operating system failure, malfunction of a storage device, logical failure of storage devices, accidental damage or deletion, etc. (typically, on a single-drive, single-partition, single-OS system), in which case the ultimate goal is simply to copy all important files from the damaged media to another new drive. This can be accomplished using a Live CD, or DVD by booting directly from a ROM or a USB drive instead of the corrupted drive in question. Many Live CDs or DVDs provide a means to mount the system drive and backup drives or removable media, and to move the files from the system drive to the backup media with a file manager or optical disc authoring software. Such cases can often be mitigated by disk partitioning and consistently storing valuable data files (or copies of them) on a different partition from the replaceable OS system files. Another scenario involves a drive-level failure, such as a compromised file system or drive partition, or a hard disk drive failure. In any of these cases, the data is not easily read from the media devices. Depending on the situation, solutions involve repairing the logical file system, partition table, or master boot record, or updating the firmware or drive recovery techniques ranging from software-based recovery of corrupted data, to hardware- and software-based recovery of damaged service areas (also known as the hard disk drive's "firmware"), to hardware replacement on a physically damaged drive which allows for the extraction of data to a new drive. If a drive recovery is necessary, the drive itself has typically failed permanently, and the focus is rather on a one-time recovery, salvaging whatever data can be read. In a third scenario, files have been accidentally "deleted" from a storage medium by the users. Typically, the contents of deleted files are not removed immediately from the physical drive; instead, references to them in the directory structure are removed, and thereafter space the deleted data occupy is made available for later data overwriting. In the mind of end users, deleted files cannot be discoverable through a standard file manager, but the deleted data still technically exists on the physical drive. In the meantime, the original file contents remain, often several disconnected fragments, and may be recoverable if not overwritten by other data files. The term "data recovery" is also used in the context of forensic applications or espionage, where data which have been encrypted, hidden, or deleted, rather than damaged, are recovered. Sometimes data present in the computer gets encrypted or hidden due to reasons like virus attacks which can only be recovered by some computer forensic experts. == Physical damage == A wide variety of failures can cause physical damage to storage media, which may result from human errors and natural disasters. CD-ROMs can have their metallic substrate or dye layer scratched off; hard disks can suffer from a multitude of mechanical failures, such as head crashes, PCB failure, and failed motors; tapes can simply break. Physical damage to a hard drive, even in cases where a head crash has occurred, does not necessarily mean permanent data loss. However, in extreme cases, such as prolonged exposure to moisture and corrosion —like the lost Bitcoin hard drive of James Howells, buried in the Newport landfill for over a decade — recovery is usually impossible. In rare cases, forensic techniques such as magnetic force microscopy (MFM) have been explored to detect residual magnetic traces when data holds exceptional value. Other techniques employed by many professional data recovery companies can typically salvage most, if not all, of the data that had been lost when the failure occurred. Of course, there are exceptions to this, such as cases where severe damage to the hard drive platters may have occurred. However, if the hard drive can be repaired and a full image or clone created, then the logical file structure can be rebuilt in most instances. Most physical damage cannot be repaired by end users. For example, opening a hard disk drive in a normal environment can allow airborne dust to settle on the platter and become caught between the platter and the read/write head. During normal operation, read/write heads float 3 to 6 nanometers above the platter surface, and the average dust particles found in a normal environment are typically around 30,000 nanometers in diameter. When these dust particles get caught between the read/write heads and the platter, they can cause new head crashes that further damage the platter and thus compromise the recovery process. Furthermore, end users generally do not have the hardware or technical expertise required to make these repairs. Consequently, data recovery companies are often employed to salvage important data with the more reputable ones using class 100 dust- and static-free cleanrooms. === Recovery techniques === Recovering data from physically damaged hardware can involve multiple techniques. Some damage can be repaired by replacing parts in the hard disk. This alone may make the disk usable, but there may still be logical damage. A specialized disk-imaging procedure is used to recover every readable bit from the surface. Once this image is acquired and saved on a reliable medium, the image can be safely analyzed for logical damage and will possibly allow much of the original file system to be reconstructed. ==== Hardware repair ==== A common misconception is that a damaged printed circuit board (PCB) may be simply replaced during recovery procedures by an identical PCB from a healthy drive. While this may work in rare circumstances on hard disk drives manufactured before 2003, it will not work on newer drives. Electronics boards of modern drives usually contain drive-specific adaptation data (generally a map of bad sectors and tuning parameters) and other information required to properly access data on the drive. Replacement boards often need this information to effectively recover all of the data. The replacement board may need to be reprogrammed. Some manufacturers (Seagate, for example) store this information on a serial EEPROM chip, which can be removed and transferred to the replacement board. Each hard disk drive has what is called a system area or service area; this portion of the drive, which is not directly accessible to the end user, usually contains drive's firmware and adaptive data that helps the drive operate within normal parameters. One function of the system area is to log defective sectors within the drive; essentially telling the drive where it can and cannot write data. The sector lists are also stored on various chips attached to the PCB, and they are unique to each hard disk drive. If the data on the PCB do not match what is stored on the platter, then the drive will not calibrate properly. In most cases the drive heads will click because they are unable to find the data matching what is stored on the PCB. == Logical damage == The term "logical damage" refers to situations in which the error is not a problem in the hardware and requires software-level solutions. === Corrupt partitions and file systems, media errors === In some cases, data on a hard disk drive can be unreadable due to damage to the partition table or file system, or to (intermittent) media errors. In the majority of these cases, at least a portion of the original data can be recovered by repairing the damaged partition table or file system using specialized data recovery software such as TestDisk; software like ddrescue can image media despite intermittent errors, and image raw data when there is partition table or file system damage. This type of data recovery can be performed by people without expertise in drive hardware as it requires no special physica

    Read more →
  • Torus interconnect

    Torus interconnect

    A torus interconnect is a switch-less network topology for connecting processing nodes in a parallel computer system. == Introduction == In geometry, a torus is created by revolving a circle about an axis coplanar to the circle. While this is a general definition in geometry, the topological properties of this type of shape describes the network topology in its essence. === Geometry illustration === In the representations below, the first is a one dimension torus, a simple circle. The second is a two dimension torus, in the shape of a 'doughnut'. The animation illustrates how a two dimension torus is generated from a rectangle by connecting its two pairs of opposite edges. At one dimension, a torus topology is equivalent to a ring interconnect network, in the shape of a circle. At two dimensions, it becomes equivalent to a two dimension mesh, but with extra connection at the edge nodes. === Torus network topology === A torus interconnect is a switch-less topology that can be seen as a mesh interconnect with nodes arranged in a rectilinear array of N = 2, 3, or more dimensions, with processors connected to their nearest neighbors, and corresponding processors on opposite edges of the array connected.[1] In this lattice, each node has 2N connections. This topology is named for the lattice formed in this way, which is topologically homogeneous to an N-dimensional torus. == Visualization == The first 3 dimensions of torus network topology are easier to visualize and are described below: 1D Torus: one dimension, n nodes are connected in closed loop with each node connected to its two nearest neighbors. Communication can take place in two directions, +x and −x. A 1D Torus is the same as ring interconnection. 2D Torus: two dimensions with degree of four, the nodes are imagined laid out in a two-dimensional rectangular lattice of n rows and n columns, with each node connected to its four nearest neighbors, and corresponding nodes on opposite edges connected. Communication can take place in four directions, +x, −x, +y, and −y. The total nodes of a 2D Torus is n2. 3D Torus: three dimensions, the nodes are imagined in a three-dimensional lattice in the shape of a rectangular prism, with each node connected with its six neighbors, with corresponding nodes on opposing faces of the array connected. Each edge consists of n nodes. communication can take place in six directions, +x, −x, +y, −y, +z, −z. Each edge of a 3D Torus consist of n nodes. The total nodes of 3D Torus is n3. ND Torus: N dimensions, each node of an N dimension torus has 2N neighbors, Communication can take place in 2N directions. Each edge consists of n nodes. Total nodes of this torus is nN. The main motivation of having higher dimension of torus is to achieve higher bandwidth, lower latency, and higher scalability. Higher-dimensional arrays are difficult to visualize. The above ruleset shows that each higher dimension adds another pair of nearest neighbor connections to each node. == Performance == A number of supercomputers on the TOP500 list use three-dimensional torus networks, e.g. IBM's Blue Gene/L and Blue Gene/P, and the Cray XT3. IBM's Blue Gene/Q uses a five-dimensional torus network. Fujitsu's K computer and the PRIMEHPC FX10 use a proprietary three-dimensional torus 3D mesh interconnect called Tofu. === 3D Torus performance simulation === Sandeep Palur and Dr. Ioan Raicu from Illinois Institute of Technology conducted experiments to simulate 3D torus performance. Their experiments ran on a computer with 250GB RAM, 48 cores and x86_64 architecture. The simulator they used was ROSS (Rensselaer’s Optimistic Simulation System). They mainly focused on three aspects: Varying network size Varying number of servers Varying message size They concluded that throughput decreases with the increase of servers and network size. Otherwise, throughput increases with the increase of message size. === 6D Torus product performance === Fujitsu Limited developed a 6D torus computer model called "Tofu". In their model, a 6D torus can achieve 100 GB/s off-chip bandwidth, 12 times higher scalability than a 3D torus, and high fault tolerance. The model is used in the K computer and Fugaku. === Cost === While long wrap-around links may be the easiest way to visualize the connection topology, in practice, restrictions on cable lengths often make long wrap-around links impractical. Instead, directly connected nodes—including nodes that the above visualization places on opposite edges of a grid, connected by a long wrap-around link—are physically placed nearly adjacent to each other in a folded torus network. Every link in the folded torus network is very short—almost as short as the nearest-neighbor links in a simple grid interconnect—and therefore low-latency.

    Read more →