AI Assistant Zia Spark Icon

AI Assistant Zia Spark Icon — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Artificial intelligence content detection

    Artificial intelligence content detection

    Artificial intelligence detection software aims to determine whether some content (text, image, video, or audio) was generated using artificial intelligence (AI). This software is often unreliable. == Accuracy issues == Many AI detection tools have been shown to be unreliable in detecting AI-generated text. In a 2023 study conducted by Weber-Wulff et al., researchers evaluated 14 detection tools including Turnitin and GPTZero and found that "all scored below 80% of accuracy and only 5 over 70%." They also found that these tools tend to have a bias for classifying texts more as human than as AI, and that accuracy of these tools worsens upon paraphrasing. === False positives === In AI content detection, a false positive is when human-written work is incorrectly flagged as AI-written. Many AI detection platforms claim to have a minimal level of false positives, with Turnitin claiming a less than 1% false positive rate. However, later research by The Washington Post produced much higher rates of 50%, though they used a smaller sample size. False positives in an academic setting frequently lead to accusations of academic misconduct, which can have serious consequences for a student's academic record. Additionally, studies have shown evidence that many AI detection models are prone to give false positives to work written by people whose first language is not English, and also to neurodivergent people. In June 2023, Janelle Shane wrote that portions of her book You Look Like a Thing and I Love You were flagged as AI-generated. === False negatives === A false negative is a failure to identify documents with AI-written text. False negatives often happen as a result of a detection software's sensitivity level or because evasive techniques were used when generating the work to make it sound more human. False negatives are less of a concern academically, since they aren't likely to lead to accusations and ramifications. Notably, Turnitin stated they have a 15% false negative rate. == Text detection == For text, this is usually done to prevent alleged plagiarism, often by detecting repetition of words as telltale signs that a text was AI-generated (including hallucinations). Detection systems may also rely on stylistic and structural regularities associated with LLM output, such as unusually consistent grammar, formulaic transitions, repeated discourse markers, and recurring rhetorical templates. Some tools are designed less to establish authorship provenance than to flag prose that resembles common LLM-generated style patterns. They are often used by teachers marking their students, usually on an ad hoc basis. Following the release of ChatGPT and similar AI text generative software, many educational establishments have issued policies against the use of AI by students. AI text detection software is also used by those assessing job applicants, as well as online search engines, hiring, online moderation and publishing. Current detectors may sometimes be unreliable and have incorrectly marked work by humans as originating from AI while failing to detect AI-generated work in other instances. MIT Technology Review said that the technology "struggled to pick up ChatGPT-generated text that had been slightly rearranged by humans and obfuscated by a paraphrasing tool". AI text detection software has also been shown to discriminate against non-native speakers of English. Two students from the University of California, Davis, were referred to the university's Office of Student Success and Judicial Affairs (OSSJA) after their professors scanned their essays with positive results; the first with an AI detector called GPTZero, and the second with an AI detector integration in Turnitin. However, following media coverage, and a thorough investigation, the students were cleared of any wrongdoing. In April 2023, Cambridge University and other members of the Russell Group of universities in the United Kingdom opted out of Turnitin's AI text detection tool, after expressing concerns it was unreliable. The University of Texas at Austin opted out of the system six months later. In May 2023, a professor at Texas A&M University–Commerce used ChatGPT to detect whether his students' content was written by it, which ChatGPT said was the case. As such, he threatened to fail the class despite ChatGPT not being able to detect AI-generated writing. No students were prevented from graduating because of the issue, and all but one student (who admitted to using the software) were exonerated from accusations of having used ChatGPT in their content. In July 2023, a paper titled "GPT detectors are biased against non-native English writers" was released, reporting that GPTs discriminate against non-native English authors. The paper compared seven GPT detectors against essays from both non-native English speakers and essays from United States students. The essays from non-native English speakers had an average false positive rate of 61.3%. An article by Thomas Germain, published on Gizmodo in June 2024, reported job losses among freelance writers and journalists due to AI text detection software mistakenly classifying their work as AI-generated. In September 2024, Common Sense Media reported that generative AI detectors had a 20% false positive rate for Black students, compared to 10% of Latino students and 7% of White students. To improve the reliability of AI text detection, researchers have explored digital watermarking techniques. A 2023 paper titled "A Watermark for Large Language Models" presents a method to embed imperceptible watermarks into text generated by large language models (LLMs). This watermarking approach allows content to be flagged as AI-generated with a high level of accuracy, even when text is slightly paraphrased or modified. The technique is designed to be subtle and hard to detect for casual readers, thereby preserving readability, while providing a detectable signal for those employing specialized tools. However, while promising, watermarking faces challenges in remaining robust under adversarial transformations and ensuring compatibility across different LLMs. == Anti text detection == There is software available designed to bypass AI text detection. In practice, evasion may not require specialized bypass tools. Paraphrasing, style editing, and removal of repeated discourse markers can substantially reduce the effectiveness of detectors that rely on recognizable surface patterns. A study published in August 2023 analyzed 20 abstracts from papers published in the Eye Journal, which were then paraphrased using GPT-4.0. The AI-paraphrased abstracts were examined for plagiarism using QueText and for AI-generated content using Originality.AI. The texts were then re-processed through an adversarial software called Undetectable.ai in order to reduce the AI-detection scores. The study found that the AI detection tool, Originality.AI, identified text generated by GPT-4 with a mean accuracy of 91.3%. However, after reprocessing by Undetectable.ai, the detection accuracy of Originality.ai dropped to a mean accuracy of 27.8%. Some experts also believe that techniques like digital watermarking are ineffective because they can be removed or added to trigger false positives. "A Watermark for Large Language Models" paper by Kirchenbauer et al. (2023) also addresses potential vulnerabilities of watermarking techniques. The authors outline a range of adversarial tactics, including text insertion, deletion, and substitution attacks, that could be used to bypass watermark detection. These attacks vary in complexity, from simple paraphrasing to more sophisticated approaches involving tokenization and homoglyph alterations. The study highlights the challenge of maintaining watermark robustness against attackers who may employ automated paraphrasing tools or even specific language model replacements to alter text spans iteratively while retaining semantic similarity. Experimental results show that although such attacks can degrade watermark strength, they also come at the cost of text quality and increased computational resources. == Image, video, and audio detection == Several purported AI image detection software exist, to detect AI-generated images (for example, those originating from Midjourney or DALL-E). They are not completely reliable. Industry analyses have also noted that AI-driven image recognition systems often struggle in real-world environments, where inconsistent lighting, noise and variable visual inputs reduce detection reliability, a challenge highlighted in modern agricultural quality-control research. Others claim to identify video and audio deepfakes, but this technology is also not fully reliable yet either. Despite debate around the efficacy of watermarking, Google DeepMind is actively developing a detection software called SynthID, which works by inserting a digital watermark that is invisible to the human eye into the pixels of an image.

    Read more →
  • Time Warp Edit Distance

    Time Warp Edit Distance

    In the data analysis of time series, Time Warp Edit Distance (TWED) is a measure of similarity (or dissimilarity) between pairs of discrete time series, controlling the relative distortion of the time units of the two series using the physical notion of elasticity. In comparison to other distance measures, (e.g. DTW (dynamic time warping) or LCS (longest common subsequence problem)), TWED is a metric. Its computational time complexity is O ( n 2 ) {\displaystyle O(n^{2})} , but can be drastically reduced in some specific situations by using a corridor to reduce the search space. Its memory space complexity can be reduced to O ( n ) {\displaystyle O(n)} . It was first proposed in 2009 by P.-F. Marteau. == Definition == δ λ , ν ( A 1 p , B 1 q ) = M i n { δ λ , ν ( A 1 p − 1 , B 1 q ) + Γ ( a p ′ → Λ ) d e l e t e i n A δ λ , ν ( A 1 p − 1 , B 1 q − 1 ) + Γ ( a p ′ → b q ′ ) m a t c h o r s u b s t i t u t i o n δ λ , ν ( A 1 p , B 1 q − 1 ) + Γ ( Λ → b q ′ ) d e l e t e i n B {\displaystyle \delta _{\lambda ,\nu }(A_{1}^{p},B_{1}^{q})=Min{\begin{cases}\delta _{\lambda ,\nu }(A_{1}^{p-1},B_{1}^{q})+\Gamma (a_{p}^{'}\to \Lambda )&{\rm {delete\ in\ A}}\\\delta _{\lambda ,\nu }(A_{1}^{p-1},B_{1}^{q-1})+\Gamma (a_{p}^{'}\to b_{q}^{'})&{\rm {match\ or\ substitution}}\\\delta _{\lambda ,\nu }(A_{1}^{p},B_{1}^{q-1})+\Gamma (\Lambda \to b_{q}^{'})&{\rm {delete\ in\ B}}\end{cases}}} whereas Γ ( α p ′ → Λ ) = d L P ( a p ′ , a p − 1 ′ ) + ν ⋅ ( t a p − t a p − 1 ) + λ {\displaystyle \Gamma (\alpha _{p}^{'}\to \Lambda )=d_{LP}(a_{p}^{'},a_{p-1}^{'})+\nu \cdot (t_{a_{p}}-t_{a_{p-1}})+\lambda } Γ ( α p ′ → b q ′ ) = d L P ( a p ′ , b q ′ ) + d L P ( a p − 1 ′ , b q − 1 ′ ) + ν ⋅ ( | t a p − t b q | + | t a p − 1 − t b q − 1 | ) {\displaystyle \Gamma (\alpha _{p}^{'}\to b_{q}^{'})=d_{LP}(a_{p}^{'},b_{q}^{'})+d_{LP}(a_{p-1}^{'},b_{q-1}^{'})+\nu \cdot (|t_{a_{p}}-t_{b_{q}}|+|t_{a_{p-1}}-t_{b_{q-1}}|)} Γ ( Λ → b q ′ ) = d L P ( b p ′ , b p − 1 ′ ) + ν ⋅ ( t b q − t b q − 1 ) + λ {\displaystyle \Gamma (\Lambda \to b_{q}^{'})=d_{LP}(b_{p}^{'},b_{p-1}^{'})+\nu \cdot (t_{b_{q}}-t_{b_{q-1}})+\lambda } Whereas the recursion δ λ , ν {\displaystyle \delta _{\lambda ,\nu }} is initialized as: δ λ , ν ( A 1 0 , B 1 0 ) = 0 , {\displaystyle \delta _{\lambda ,\nu }(A_{1}^{0},B_{1}^{0})=0,} δ λ , ν ( A 1 0 , B 1 j ) = ∞ f o r j ≥ 1 {\displaystyle \delta _{\lambda ,\nu }(A_{1}^{0},B_{1}^{j})=\infty \ {\rm {{for\ }j\geq 1}}} δ λ , ν ( A 1 i , B 1 0 ) = ∞ f o r i ≥ 1 {\displaystyle \delta _{\lambda ,\nu }(A_{1}^{i},B_{1}^{0})=\infty \ {\rm {{for\ }i\geq 1}}} with a 0 ′ = b 0 ′ = 0 {\displaystyle a'_{0}=b'_{0}=0} === Implementations === An implementation of the TWED algorithm in C with a Python wrapper is available at TWED is also implemented into the Time Series Subsequence Search Python package (TSSEARCH for short) available at [1]. An R implementation of TWED has been integrated into the TraMineR, a R package for mining, describing and visualizing sequences of states or events, and more generally discrete sequence data. Additionally, cuTWED is a CUDA- accelerated implementation of TWED which uses an improved algorithm due to G. Wright (2020). This method is linear in memory and massively parallelized. cuTWED is written in CUDA C/C++, comes with Python bindings, and also includes Python bindings for Marteau's reference C implementation. ==== Python ==== Backtracking, to find the most cost-efficient path: ==== MATLAB ==== Backtracking, to find the most cost-efficient path:

    Read more →
  • Vinberg's algorithm

    Vinberg's algorithm

    In mathematics, Vinberg's algorithm is an algorithm, introduced by Ernest Borisovich Vinberg, for finding a fundamental domain of a hyperbolic reflection group. Conway (1983) used Vinberg's algorithm to describe the automorphism group of the 26-dimensional even unimodular Lorentzian lattice II25,1 in terms of the Leech lattice. == Description of the algorithm == Let Γ < I s o m ( H n ) {\displaystyle \Gamma <\mathrm {Isom} (\mathbb {H} ^{n})} be a hyperbolic reflection group. Choose any point v 0 ∈ H n {\displaystyle v_{0}\in \mathbb {H} ^{n}} ; we shall call it the basic (or initial) point. The fundamental domain P 0 {\displaystyle P_{0}} of its stabilizer Γ v 0 {\displaystyle \Gamma _{v_{0}}} is a polyhedral cone in H n {\displaystyle \mathbb {H} ^{n}} . Let H 1 , . . . , H m {\displaystyle H_{1},...,H_{m}} be the faces of this cone, and let a 1 , . . . , a m {\displaystyle a_{1},...,a_{m}} be outer normal vectors to it. Consider the half-spaces H k − = { x ∈ R n , 1 | ( x , a k ) ≤ 0 } . {\displaystyle H_{k}^{-}=\{x\in \mathbb {R} ^{n,1}|(x,a_{k})\leq 0\}.} There exists a unique fundamental polyhedron P {\displaystyle P} of Γ {\displaystyle \Gamma } contained in P 0 {\displaystyle P_{0}} and containing the point v 0 {\displaystyle v_{0}} . Its faces containing v 0 {\displaystyle v_{0}} are formed by faces H 1 , . . . , H m {\displaystyle H_{1},...,H_{m}} of the cone P 0 {\displaystyle P_{0}} . The other faces H m + 1 , . . . {\displaystyle H_{m+1},...} and the corresponding outward normals a m + 1 , . . . {\displaystyle a_{m+1},...} are constructed by induction. Namely, for H j {\displaystyle H_{j}} we take a mirror such that the root a j {\displaystyle a_{j}} orthogonal to it satisfies the conditions (1) ( v 0 , a j ) < 0 {\displaystyle (v_{0},a_{j})<0} ; (2) ( a i , a j ) ≤ 0 {\displaystyle (a_{i},a_{j})\leq 0} for all i < j {\displaystyle i Read more →

  • Known-item search

    Known-item search

    Known-item search is a specialization of information exploration which represents the activities carried out by searchers who have a particular item in mind. In the context of library catalogs, known‐item search means a search for an item for which the author or title is known. Although the concept of known-item search originated in library science, it is now applied in the context of web search and other online search activities. Known-item search is distinguished from exploratory search, in which a searcher is unfamiliar with the domain of their search goal, unsure about the ways to achieve their goal, and/or unsure about what their goal is.

    Read more →
  • Computational semantics

    Computational semantics

    Computational semantics is a subfield of computational linguistics. Its goal is to elucidate the cognitive mechanisms supporting the generation and interpretation of meaning in humans. It usually involves the creation of computational models that simulate particular semantic phenomena, and the evaluation of those models against data from human participants. While computational semantics is a scientific field, it has many applications in real-world settings and substantially overlaps with Artificial Intelligence. Broadly speaking, the discipline can be subdivided into areas that mirror the internal organization of linguistics. For example, lexical semantics and frame semantics have active research communities within computational linguistics. Some popular methodologies are also strongly inspired by traditional linguistics. Most prominently, the area of distributional semantics, which underpins investigations into embeddings and the internals of Large Language Models, has roots in the work of Zellig Harris. Some traditional topics of interest in computational semantics are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution. Methods employed usually draw from formal semantics or statistical semantics. Computational semantics has points of contact with the areas of lexical semantics (word-sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving). Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.

    Read more →
  • Information logistics

    Information logistics

    Information Logistics (IL) deals with the flow of information between human or machine actors within or between any number of organizations that in turn form a value creating network (see, e.g.). IL is closely related to information management, information operations and information technology. == Definition == The term Information Logistics (IL) may be used in either of two ways: Firstly, it can be defined as "managing and controlling information handling processes optimally with respect to time (flow time and capacity), storage, distribution and presentation in such a way that it contributes to company results in concurrence with the costs of capturing (creation, searching, maintenance etc)." (Petri,2017) Thus IL utilizes logistic principles to optimize information handling. Secondly, IL can be seen as a concept using information technology to optimize logistics. A term which is closely related to the first meaning of Information Logistics is Data Logistics, a concept used in Computer Networking. "The study of solutions to problems in Computer Systems that flexibly span resources and services relating to Data Movement, Data Storage and Data Processing." [ref?] Systems that support general Data Logistics solutions thus must span the traditionally separate fields of Networking, File/Database Systems and Process Management. Data Logistics is a more general form of the term Logistical Networking, used as the name of a particular network storage architecture and software stack. == Goal == The goal of Information Logistics is to deliver the right product, consisting of the right information element, in the right format, at the right place at the right time for the right people at the right price and all of this is customer demand driven. If this goal is to be achieved, knowledge workers are best equipped with information for the task at hand for improved interaction with its customers and machines are enabled to respond automatically to meaningful information. Methods for achieving the goal are: the analysis of information demand intelligent information storage the optimization of the flow of information maintaining both security and organizational flexibility integrated information and billing solutions The expression was formed by the Indian mathematician and librarian S. R. Ranganathan . The supply of a product is part of the discipline Logistics. The purpose of this discipline is described as follows: Logistics is the teachings of the plans and the effective and efficient run of supply. The contemporary logistics focuses on the organization, planning, control and implementation of the flow of goods, money, information and people. Information Logistics focusses on information. Information (from Latin informare: "shape, shapes, instruct") means in a general sense everything that adds knowledge and thus reduce ignorance or lack of precision. In a stricter sense, raw data only becomes information to those who can interpret it. Interpreting relevant, related information produces insight that either leads to existing, or eventually builds new, knowledge. == Information element == An information element (IE) is an information component that is located in the organizational value chain. The combination of certain IEs leads to an information product (IP), which is any final product in the form of information that a person needs to have. When a higher number of different IEs are required, it often results in more planning problems in capacity and inherently leads to a non-delivery of the IP. To illustrate the concept of an IP, an example is shown of a bottleneck analysis in HR (by J. Willems 2008). Here, the illustration shows how the information elements (e.g. qualifications) build up the information product (e.g. HR file). == Data logistics == Data logistics is a concept that developed independently of information logistics in the 1990s, in response to the explosion of Internet content and traffic due to the invention of the World Wide Web (WWW). Some motivations for the emergence of interest in Data Logistics included: The incorporation of network hyperlinks into content encoded in HTML encouraged users to freely dereference those links without regard to, or in many cases without even having any knowledge of, the identity (much less the geographical or network topological location of) the target Web server. The growth in the volume of Web hits, combined with the steady increase in the size of Web-delivered objects such as images, audio and video clips resulted in the localized overloading of the bandwidth and processing resources of the local and/or wide area network and/or the Web server infrastructure. The resulting Internet bottleneck can cause Web clients to experience poor performance or complete denial of access to servers that host high volume sites (the so-called Slashdot effect). The growth in all Internet traffic, especially across international telecommunication links, resulted in stress to institutional infrastructure and high costs on networks that billed Internet traffic on a per-use basis. Much of this traffic was redundant, the results of repeated requests by many independent users to access the same stored files and content. Large files and content retrieved from distant Web servers was often delayed due to high delays experienced over long and complex Internet paths. These factors led to interest in the use of large scale storage (and to a lesser extent, processing) resources to cache the response to network requests, first at the Internet endpoint using a Web browser cache and later at intermediate network locations using shared network caches. This line of development also gave rise to Web server replication and other techniques for offloading and distributing the work of delivering large volume Web services to widely dispersed client communities, ultimately resulting in the creation of modern Content delivery networks. At the same time, research efforts in server replication and content delivery gave rise to a number of related projects and strategies, including Logistical Networking (LN). The name LN was intended as an analogy to physical supply chain logistics, in which goods are not only carried from source to destination on networks of roads, but are also stored at warehouses located throughout the transportation infrastructure. This led to a nomenclature in which LN network storage resources are termed "storage depots". The principles that underpin LN have been abstracted into the more general study of scheduling and optimization across the traditional infrastructure silos of Storage, Networking and Processing which was named Data Logistics. === Illustrative examples of data logistics === Data Caching and Replication are classic examples of Data Logistics solutions to problems in Computer Systems and Networking with high data access latencies or data transfer resource limitations. It works mainly across the areas of data transfer and data storage. Dynamic Compression in data transfer is another example which uses computational resources to minimize the bandwidth requirements of data transfer.

    Read more →
  • Principles for a Data Economy

    Principles for a Data Economy

    The Principles for a Data Economy – Data Rights and Transactions is a transatlantic legal project carried out jointly by the American Law Institute (ALI) and the European Law Institute (ELI). The Principles for a Data Economy deals with a range of different legal questions that arise in the data economy. Since data is different from other tradeable items, the Principles draw up legal rules for data transactions and data rights that take into account the interests of different stakeholders involved in the data economy. The Principles are designed to facilitate contractual relations as well as the drafting of model agreements and can guide courts and legislators worldwide. The project proposes a set of principles that can be implemented in any legal system and is designed to work in conjunction with any kind of data privacy/data protection law, intellectual property law or trade secret law. The Principles do not address or seek to change any of the substantive rules of these bodies of law. The Project Team consists of Neil B Cohen and Christiane Wendehorst (as Project Reporters) and Lord John Thomas as well as Steven O. Weise (as Project Chairs). == Characteristics of data == The law governing trades in commerce has historically focused on trade in items that are tangible like goods or on intangible assets, such as shares or licenses. However, data does not fit into any of these traditional categories, nor does it qualify as a service. It is often unclear how traditional legal rules and doctrines can apply to data, as data is different from other assets in many ways. For example, data can be multiplied at basically no cost and can be used in parallel for a variety of different purposes by many different people at the same time (data is a “non-rivalrous” resource). Uncertainty regarding the applicable rules to govern the data economy may inhibit innovation and growth and trouble stakeholders like data-driven industries, start-ups, and consumers. == Stakeholders in the data economy == The Principles have taken the basic types of players and relations which can be found in data ecosystems as a starting point to provide guidance in different situations. The central actors in the data economy are data controllers (also called “data holders”). They are in a position to access the data and decide for which purposes and means this data should be processed. A controller may exercise control all by itself or share it with co-controllers, such as under a data pooling arrangement. Data processors provide the processing of data on a controller’s behalf as a service. Another important group of stakeholders includes those that contribute to the generation of data (e.g. data subjects). Other players in the data economy include data assemblers or data intermediaries (e.g. data trusts). == History of the project and timeline == Before the official adoption of the project by ALI and ELI bodies in 2018, the project team carried out a Feasibility Study from October 2016 to February 2018. In the following years, the project team produced a number of drafts (e.g. “Preliminary Drafts” No. 1 to 4, “Tentative Draft No. 1”) and project progress were regularly discussed with advisory bodies and members of both the ALI and the ELI. The project reporters also included feedback and insights from industry stakeholders and experts that was gained after several meetings and workshops, hosted, inter alia by UNCITRAL, UNIDROIT and several national governmental institutions. Tentative Draft No. 2 was presented at the ALI Annual Meeting in May 2021 and approved by ALI membership. The latest draft ("Final Council Draft") was also approved by the ELI Council and ELI Membership. The Principles for a Data Economy were presented at an international conference with representatives from institutions such as the Uniform Law Commission (ULC), the European Commission, UNIDROIT, the OECD, the International Chamber of Commerce (ICC) and the World Economic Forum (WEF) in October 2021. == Project structure == The current draft (“Tentative Draft No. 2”) of the Principles consists of five Parts that each governs different aspects of the data economy: General Provisions, Data Contracts, Data Rights, Third Party Aspects of Data Activities, and Multi-State Issues. === General Provisions === Part I includes general provisions that apply to all other Parts of the Principles for a Data Economy. This Part sets out the purpose of the Principles: they aim to make existing law in the field of the data economy more coherent and support the development of the law in this field by courts and legislators worldwide. It is also clarified that the Principles have a wide scope of application and can be used in a variety of ways by stakeholders in the data economy. The Principles may, for example, serve private parties as a basis for contract formation, guide the deliberations of arbitral tribunals or inspire national legislation. Part I then defines several key terms, such as ‘digital data’ and ‘data right’. The scope of the Principles is limited to matters where information is recorded as an asset, resource or tradeable commodity and where large amounts of data, rather than single pieces of information, are concerned. This Part also clarifies that remedies with respect to data contracts and data rights are left to the applicable national law. === Data Contracts === Part II lists different types of contracts that often occur in the data economy and establishes two broad categories, namely contracts for the supply and sharing of data and contracts for services with regard to data. Contracts for the supply and sharing of data include, e.g. data transfer contracts or data pooling arrangements, while contracts for services with regard to data cover contracts for the processing of data or data intermediary contracts. The Principles provide default terms for each contract type, on issues such as the manner in which data should supply or which characteristics the data supplied should meet. These default terms 'automatically' become part of the contract unless the parties agree otherwise. === Data Rights === Part III governs legally protected interests of players in the data economy that stem from the characteristics of data as a resource (e.g. its non-rivalrous nature) or from public interest considerations. Such data rights may include the right to data access, the right to require the controller to desist from data activities or to correct incorrect/incomplete data, or even to receive an economic share in profits derived from the use of data. For example, the Principles deal with data rights of stakeholders that had a share in the co-generation of data and identify different factors to be considered in determining whether to afford a party a data right. The underlying idea that parties who have contributed to the generation of data should have some rights in the utilization of the data is also recognized by governmental institutions, such as by the Japanese Ministry of Economy, Trade and Industry (METI), and the term co-generated data, which was coined by the Principles for a Data Economy, has been adopted, inter alia by the European Commission, the German Data Ethics Commission and the Global Partnership on Artificial Intelligence (GPAI). This Part also deals with data rights for the public interest, such as data sharing rights in the field of innovation. === Third Party Aspects === Part IV governs different situations in which data transactions interfere with the rights of third parties. Such rights include intellectual property rights or rights derived from data privacy or data protection law. This Part sets out under which circumstances data activities should be considered wrongful vis à vis another party. For example, a data activity (like data processing or the onward supply of data) could be considered wrongful, if a controller interferes with the rights of data subjects that are protected by data-protection law. A data activity could also be wrongful if the controller is non-compliant with contractual limitations on data activities, enforceable by the protected party (e.g. a controller may only process data for a certain purpose). If someone obtained access to data by unauthorized means (i.e. data “theft”) this could also be considered wrongful. The Part on Third-Party Aspects also takes a detailed look at the effects of the onward supply of data can have on third parties, while balancing the protection of third parties on the one hand, with the interests of data recipients and the desire to encourage data sharing on the other. === Multi-State Issues === As transactions in the data economy are international by nature and hardly occur within one legal system alone, the Part V of the Principles also briefly touches upon the applicability of the rules and doctrines of private international law to such transactions. == Links == Website of the “Principles for a Data Economy – Data Rights and Transaction

    Read more →
  • Leiden algorithm

    Leiden algorithm

    The Leiden algorithm is a community detection algorithm developed by Traag et al at Leiden University. It was developed as a modification of the Louvain method. Like the Louvain method, the Leiden algorithm attempts to optimize modularity in extracting communities from networks; however, it addresses key issues present in the Louvain method, namely poorly connected communities and the resolution limit of modularity. == Improvement over Louvain method == Broadly, the Leiden algorithm uses the same two primary phases as the Louvain algorithm: a local node moving step (though, the method by which nodes are considered in Leiden is more efficient) and a graph aggregation step. However, to address the issues with poorly-connected communities and the merging of smaller communities into larger communities (the resolution limit of modularity), the Leiden algorithm employs an intermediate refinement phase in which communities may be split to guarantee that all communities are well-connected. Consider, for example, the following graph: Three communities are present in this graph (each color represents a community). Additionally, the center "bridge" node (represented with an extra circle) is a member of the community represented by blue nodes. Now consider the result of a node-moving step which merges the communities denoted by red and green nodes into a single community (as the two communities are highly connected): Notably, the center "bridge" node is now a member of the larger red community after node moving occurs (due to the greedy nature of the local node moving algorithm). In the Louvain method, such a merging would be followed immediately by the graph aggregation phase. However, this causes a disconnection between two different sections of the community represented by blue nodes. In the Leiden algorithm, the graph is instead refined: The Leiden algorithm's refinement step ensures that the center "bridge" node is kept in the blue community to ensure that it remains intact and connected, despite the potential improvement in modularity from adding the center "bridge" node to the red community. == Graph components == Before defining the Leiden algorithm, it will be helpful to define some of the components of a graph. === Vertices and edges === A graph is composed of vertices (nodes) and edges. Each edge is connected to two vertices, and each vertex may be connected to zero or more edges. Edges are typically represented by straight lines, while nodes are represented by circles or points. In set notation, let V {\displaystyle V} be the set of vertices, and E {\displaystyle E} be the set of edges: V := { v 1 , v 2 , … , v n } E := { e i j , e i k , … , e k l } {\displaystyle {\begin{aligned}V&:=\{v_{1},v_{2},\dots ,v_{n}\}\\E&:=\{e_{ij},e_{ik},\dots ,e_{kl}\}\end{aligned}}} where e i j {\displaystyle e_{ij}} is the directed edge from vertex v i {\displaystyle v_{i}} to vertex v j {\displaystyle v_{j}} . We can also write this as an ordered pair: e i j := ( v i , v j ) {\displaystyle {\begin{aligned}e_{ij}&:=(v_{i},v_{j})\end{aligned}}} === Community === A community is a unique set of nodes: C i ⊆ V C i ⋂ C j = ∅ ∀ i ≠ j {\displaystyle {\begin{aligned}C_{i}&\subseteq V\\C_{i}&\bigcap C_{j}=\emptyset ~\forall ~i\neq j\end{aligned}}} and the union of all communities must be the total set of vertices: V = ⋃ i = 1 C i {\displaystyle {\begin{aligned}V&=\bigcup _{i=1}C_{i}\end{aligned}}} === Partition === A partition is the set of all communities: P = { C 1 , C 2 , … , C n } {\displaystyle {\begin{aligned}{\mathcal {P}}&=\{C_{1},C_{2},\dots ,C_{n}\}\end{aligned}}} == Partition quality == How communities are partitioned is an integral part on the Leiden algorithm. How partitions are decided can depend on how their quality is measured. Additionally, many of these metrics contain parameters of their own that can change the outcome of their communities. === Modularity === Modularity is a highly used quality metric for assessing how well a set of communities partition a graph. The equation for this metric is defined for an adjacency matrix, A, as: Q = 1 2 m ∑ i j ( A i j − k i k j 2 m ) δ ( c i , c j ) {\displaystyle Q={\frac {1}{2m}}\sum _{ij}(A_{ij}-{\frac {k_{i}k_{j}}{2m}})\delta (c_{i},c_{j})} where: A i j {\displaystyle A_{ij}} represents the edge weight between nodes i {\displaystyle i} and j {\displaystyle j} ; see Adjacency matrix; k i {\displaystyle k_{i}} and k j {\displaystyle k_{j}} are the sum of the weights of the edges attached to nodes i {\displaystyle i} and j {\displaystyle j} , respectively; m {\displaystyle m} is the sum of all of the edge weights in the graph; c i {\displaystyle c_{i}} and c j {\displaystyle c_{j}} are the communities to which the nodes i {\displaystyle i} and j {\displaystyle j} belong; and δ {\displaystyle \delta } is Kronecker delta function: δ ( c i , c j ) = { 1 if c i and c j are the same community 0 otherwise {\displaystyle {\begin{aligned}\delta (c_{i},c_{j})&={\begin{cases}1&{\text{if }}c_{i}{\text{ and }}c_{j}{\text{ are the same community}}\\0&{\text{otherwise}}\end{cases}}\end{aligned}}} === Reichardt Bornholdt Potts Model (RB) === One of the most well used metrics for the Leiden algorithm is the Reichardt Bornholdt Potts Model (RB). This model is used by default in most mainstream Leiden algorithm libraries under the name RBConfigurationVertexPartition. This model introduces a resolution parameter γ {\displaystyle \gamma } and is highly similar to the equation for modularity. This model is defined by the following quality function for an adjacency matrix, A, as: Q = ∑ i j ( A i j − γ k i k j 2 m ) δ ( c i , c j ) {\displaystyle Q=\sum _{ij}(A_{ij}-\gamma {\frac {k_{i}k_{j}}{2m}})\delta (c_{i},c_{j})} where: γ {\displaystyle \gamma } represents a linear resolution parameter === Constant Potts Model (CPM) === Another metric similar to RB is the Constant Potts Model (CPM). This metric also relies on a resolution parameter γ {\displaystyle \gamma } The quality function is defined as: H = − ∑ i j ( A i j w i j − γ ) δ ( c i , c j ) {\displaystyle H=-\sum _{ij}(A_{ij}w_{ij}-\gamma )\delta (c_{i},c_{j})} === Understanding Potts Model resolution parameters/Resolution limit === Typically Potts models such as RB or CPM include a resolution parameter in their calculation. Potts models are introduced as a response to the resolution limit problem that is present in modularity maximization based community detection. The resolution limit problem is that, for some graphs, maximizing modularity may cause substructures of a graph to merge and become a single community and thus smaller structures are lost. These resolution parameters allow modularity adjacent methods to be modified to suit the requirements of the user applying the Leiden algorithm to account for small substructures at a certain granularity. The figure on the right illustrates why resolution can be a helpful parameter when using modularity based quality metrics. In the first graph, modularity only captures the large scale structures of the graph; however, in the second example, a more granular quality metric could potentially detect all substructures in a graph. == Algorithm == The Leiden algorithm starts with a graph of disorganized nodes (a) and sorts it by partitioning them to maximize modularity (the difference in quality between the generated partition and a hypothetical randomized partition of communities). The method it uses is similar to the Louvain algorithm, except that after moving each node it also considers that node's neighbors that are not already in the community it was placed in. This process results in our first partition (b), also referred to as P {\displaystyle {\mathcal {P}}} . Then the algorithm refines this partition by first placing each node into its own individual community and then moving them from one community to another to maximize modularity. It does this iteratively until each node has been visited and moved, and each community has been refined - this creates partition (c), which is the initial partition of P refined {\displaystyle {\mathcal {P}}_{\text{refined}}} . Then an aggregate network (d) is created by turning each community into a node. P refined {\displaystyle {\mathcal {P}}_{\text{refined}}} is used as the basis for the aggregate network while P {\displaystyle {\mathcal {P}}} is used to create its initial partition. Because we use the original partition P {\displaystyle {\mathcal {P}}} in this step, we must retain it so that it can be used in future iterations. These steps together form the first iteration of the algorithm. In subsequent iterations, the nodes of the aggregate network (which each represent a community) are once again placed into their own individual communities and then sorted according to modularity to form a new P refined {\displaystyle {\mathcal {P}}_{\text{refined}}} , forming (e) in the above graphic. In the case depicted by the graph, the nodes were already sorted optimally, so no change too

    Read more →
  • Are You Dead?

    Are You Dead?

    Are You Dead? (Chinese: 死了么; pinyin: Sǐleme), also known by its English name Demumu, is a Chinese application designed for young people living alone. It requires setting up one emergency contact and sends automatic notifications if the user has not checked in via the app for consecutive days. The app was released on the App Store on 10 June 2025. In early January 2026, the application gained popularity due to its name and the issue of safety for people living alone, and ranked high on the list of paid applications in the Chinese region of the Apple App Store before being removed. The app's rise in popularity sparked discussions about taboos about death in China. == History == Are You Dead? was founded and operated independently by three people born in the 1990s, and developed in a way that involved remote collaboration in their spare time. According to the New Yellow River report, Guo, the product manager, said that the application was designed for young people and that the inspiration came from the discussion of netizens on social platforms about "an app that everyone must have and will definitely download" that he observed two or three years ago. The name was also "not their original creation". After realizing its potential demand and social significance, the team successfully registered the name and completed the product development in about a month. Regarding the development entity, the New Yellow River cited information from the Apple App Store that the application was developed by Yuejing (Zhengzhou) Technology Service Co., Ltd. According to Tianyancha information, the company was established in March 2025 with a registered capital of 100,000 yuan. === Rise in popularity === The app has been generating buzz on social media since 9 January 2026, due to its name and the topic of safety for people living alone. Around 10 January, it topped the Apple paid app chart. As of 10:00 a.m. on January 11, it ranked first in the App Store paid app chart. It also ranked highly in the utility app chart; it ranked first or second in the paid utility app charts in the United States, Singapore and Hong Kong, and first or fourth in Australia and Spain. The app was subsequently removed from the Apple App Store in China. In terms of functionality and usage, First Financial praised the product for its "simple interface and single function," but pointed out that the interface lacks a display of consecutive check-in days, and there is also the possibility that users may forget to check in, leading to the mistaken issuance of reminders. In addition, since the application mainly relies on email reminders and lacks SMS or telephone notifications, it does not conform to Chinese social habits; the untimely notifications also make the application more like a "death notification" tool, losing its early warning significance for emergency rescue. Hu Xijin, former editor-in-chief of the Global Times, commented on the application on Weibo that it is "really good and can help many lonely elderly people." The Beijing News Quick Review pointed out that the role of technical tools is limited and needs to be connected with real support such as community patrols and liaison mechanisms. Due to the price increase, there have also been questions about the motivation for the price increase. The app's rise in popularity sparked discussions about taboos about death in China. Regarding the popularity of the application, both Southern Metropolis Daily and The Beijing News commented that it reflects the public issue of the risks of living alone and reflects the general anxiety of the living alone group about dying alone. Shangguan News further pointed out that although such technology products provide a certain "low-cost sense of security", their "cold notifications" may not only cause false alarms, but also highlight the embarrassing reality that "there is no one to fill in the emergency contact". It also emphasized that algorithms or applications cannot bring true happiness and called on society to reconstruct a support network full of humanistic care while relying on technology. The name of the application has also sparked controversy. Most netizens believe that the name "Are You Dead?" is unlucky and makes it awkward to share the application. They suggest changing it to a milder name such as "Are You Alive?". Hu Xijin also said that the name change could "give the elderly who use it more psychological comfort" and "believe that the application will become more popular after the name change". Some people also believe that this straightforward name just points out the real dilemma faced by people living alone and has a special meaning. BBC News commented that the name "Are You Dead" is playing a word game with Ele.me (Chinese: 饿了么; pinyin: Èleme) and the pronunciation is also similar. Legal professionals believe that its name is highly similar to Ele.me and may cause confusion. They also raised the possibility of trademark infringement and unfair competition. However, the developers said that the application is developed for young people and death is not a sensitive topic. They will "consider launching a new application that is more suitable for middle-aged and elderly people". They have not yet received any name change requests from relevant departments. On the evening of 13 January 2026, the Are You Dead? team announced that it would change its name to the English brand name Demumu in the upcoming new version. On 11 January, the development team also issued a statement through its official Weibo account, stating that it would study the renaming suggestion and plan to enrich the SMS reminder function, consider adding the message function and explore the direction of age-friendly products; it also stated that it would launch an 8 yuan paid plan to cover the costs of SMS, servers, etc., and welcomed investors to discuss cooperation. In terms of financing and valuation, it plans to sell 10% of the company's shares for 1 million yuan and proposed a valuation of 10 million yuan. On the evening of January 15, the application was removed from the app store in mainland China. == Functions == The application does not require users to enter phone numbers or other information to register. After filling in their name and setting an emergency contact, users can click the sign-in button every day. If they fail to sign in for two consecutive days, the system will send an email reminder to the emergency contact the next day. In addition, users can also bind a smart bracelet to monitor physiological signs, pre-designate a hearse driver and funeral music, and trigger the "one-click body collection" function when no pulse is detected. The application was initially available for free download, but a one yuan paid download option was introduced at the end of 2025. In January 2026, the application team issued a statement saying that an 8 yuan paid option would be launched based on the costs of SMS, servers, etc.

    Read more →
  • UI data binding

    UI data binding

    UI data binding is a software design pattern to simplify development of GUI applications. UI data binding binds UI elements to an application domain model. Most frameworks employ the Observer pattern as the underlying binding mechanism. To work efficiently, UI data binding has to address input validation and data type mapping. A bound control is a widget whose value is tied or bound to a field in a recordset (e.g., a column in a row of a table). Changes made to data within the control are automatically saved to the database when the control's exit event triggers. == Example == == Data binding frameworks and tools == === Delphi === DSharp third-party data binding tool OpenWire Visual Live Binding - third-party visual data binding tool === Java === JFace Data Binding JavaFX Property === .NET === Windows Forms data binding overview WPF data binding overview Avalonia Unity 3D data binding framework (available in modifications for NGUI, iGUI and EZGUI libraries) === JavaScript === Angular AngularJS Backbone.js Ember.js Datum.js knockout.js Meteor, via its Blaze live update engine OpenUI5 React Vue.js

    Read more →
  • Known-item search

    Known-item search

    Known-item search is a specialization of information exploration which represents the activities carried out by searchers who have a particular item in mind. In the context of library catalogs, known‐item search means a search for an item for which the author or title is known. Although the concept of known-item search originated in library science, it is now applied in the context of web search and other online search activities. Known-item search is distinguished from exploratory search, in which a searcher is unfamiliar with the domain of their search goal, unsure about the ways to achieve their goal, and/or unsure about what their goal is.

    Read more →
  • Discoverability

    Discoverability

    Discoverability is the degree to which something, especially a piece of content or information, can be found in a search of a file, database, or other information system. Discoverability is a concern in library and information science, many aspects of digital media, software and web development, and in marketing, since products and services cannot be used if people cannot find it or do not understand what it can be used for. In human-computer interaction the term is further used to describe the discoverability of interactions, features and interactive systems overall . Metadata, or "information about information", such as a book's title, a product's description, or a website's keywords, affects how discoverable something is on a database or online. Adding metadata to a product that is available online can make it easier for end users to find the product. For example, if a song file is made available online, making the title, band name, genre, year of release, and other pertinent information available in connection with this song means the file can be retrieved more easily. The organization of information through the implementation of alphabetical structures or the integration of content into search engines exemplifies strategies employed to enhance the discoverability of information. The concept of discoverability, while related to but distinct from accessibility and usability, which are other qualities that affect the usefulness of a piece of information, is a critical aspect of information retrieval. == Etymology == The concept of "discoverability" in an information science and online context is a loose borrowing from the concept of the similar name in the legal profession. In law, "discovery" is a pre-trial procedure in a lawsuit in which each party, through the law of civil procedure, can obtain evidence from the other party or parties by means of discovery devices such as a request for answers to interrogatories, request for production of documents, request for admissions and depositions. Discovery can be obtained from non-parties using subpoenas. When a discovery request is objected to, the requesting party may seek the assistance of the court by filing a motion to compel discovery. == Purpose == The usability of any piece of information directly relates to how discoverable it is, either in a "walled garden" database or on the open Internet. The quality of information available on this database or on the Internet depends upon the quality of the meta-information about each item, product, or service. In the case of a service, because of the emphasis placed on service reusability, opportunities should exist for reuse of this service. However, reuse is only possible if information is discoverable in the first place. To make items, products, and services discoverable, the process is as follows: Document the information about the item, product or service (the metadata) in a consistent manner. Store the documented information (metadata) in a searchable repository. while technically a human-searchable repository, such as a printed paper list would qualify, "searchable repository" is usually taken to mean a computer-searchable repository, such as a database that a human user can search using some type of search engine or "find" feature. Enable search for the documented information in an efficient manner. supports number 2, because while reading through a printed paper list by hand might be feasible in a theoretical sense, it is not time and cost-efficient in comparison with computer-based searching. Apart from increasing the reuse potential of the services, discoverability is also required to avoid development of solution logic that is already contained in an existing service. To design services that are not only discoverable but also provide interpretable information about their capabilities, the service discoverability principle provides guidelines that could be applied during the service-oriented analysis phase of the service delivery process. === Specific to digital media === In relation to audiovisual content, according to the meaning given by the Canadian Radio-television and Telecommunications Commission (CRTC) for the purpose of its 2016 Discoverability Summit, discoverability can be summed up to the intrinsic ability of given content to "stand out of the lot", or to position itself so as to be easily found and discovered. A piece of audiovisual content can be a movie, a TV series, music, a book (eBook), an audio book or podcast. When audiovisual content such as a digital file for a TV show, movie, or song, is made available online, if the content is "tagged" with identifying information such as the names of the key artists (e.g., actors, directors and screenwriters for TV shows and movies; singers, musicians and record producers for songs) and the genres (for movies genres, music genres, etc.). When users interact with online content, algorithms typically determine what types of content the user is interested in, and then a computer program suggests "more like this", which is other content that the user may be interested in. Different websites and systems have different algorithms, but one approach, used by Amazon (company) for its online store, is to indicate to a user: "customers who bought x also bought y" (affinity analysis, collaborative filtering). This example is oriented around online purchasing behaviour, but an algorithm could also be programmed to provide suggestions based on other factors (e.g., searching, viewing, etc.). Discoverability is typically referred to in connection with search engines. A highly "discoverable" piece of content would appear at the top, or near the top of a user's search results. A related concept is the role of "recommendation engines", which give a user recommendations based on his/her previous online activity. Discoverability applies to computers and devices that can access the Internet, including various console video game systems and mobile devices such as tablets and smartphones. When producers make an effort to promote content (e.g., a TV show, film, song, or video game), they can use traditional marketing (billboards, TV ads, radio ads) and digital ads (pop-up ads, pre-roll ads, etc.), or a mix of traditional and digital marketing. Even before the user's intervention by searching for a certain content or type of content, discoverability is the prime factor which contributes to whether a piece of audiovisual content will be likely to be found in the various digital modes of content consumption. As of 2017, modes of searching include looking on Netflix for movies, Spotify for music, Audible for audio books, etc., although the concept can also more generally be applied to content found on Twitter, Tumblr, Instagram, and other websites. It involves more than a content's mere presence on a given platform; it can involve associating this content with "keywords" (tags), search algorithms, positioning within different categories, metadata, etc. Thus, discoverability enables as much as it promotes. For audiovisual content broadcast or streamed on digital media using the Internet, discoverability includes the underlying concepts of information science and programming architecture, which are at the very foundation of the search for a specific product, information or content. === Human-Computer Interaction === In human–computer interaction (HCI), discoverability refers to the ability of users to perceive and comprehend a system, function, or input method upon encountering it, despite a lack of prior awareness or knowledge, whether through intentional effort or serendipitously . The concept was popularised by Don Norman, who framed it around whether users can determine what actions are possible and how to perform them . Discoverability is considered a precondition for learnability, though the two concepts are frequently conflated in the literature . == Applications == === Within a webpage === Within a specific webpage or software application ("app"), the discoverability of a feature, content or link depends on a range of factors, including the size, colour, highlighting features, and position within the page. When colour is used to communicate the importance of a feature or link, designers typically use other elements as well, such as shadows or bolding, for individuals, who cannot see certain colours. Just as traditional paper printing created other physical locations that stood out, such as being "above the fold" of a newspaper versus "below the fold", a web page or app's screenview may have certain locations that give features additional visibility to users, such as being right at the bottom of the web page or screen. The positional advantages or disadvantages of various locations depend on different cultures and languages (e.g., left to right vs. right to left). Some locations have become established, such as having toolbars at the top of a screen or webpage. Some designers have argued t

    Read more →
  • ARIS Express

    ARIS Express

    ARIS Express is a free-of-charge modeling tool for business process analysis and management. It supports different modeling notations such as BPMN 2, Event-driven Process Chains (EPC), Organizational charts, process landscapes, whiteboards, etc. ARIS Express was initially developed by IDS Scheer, which was bought by Software AG in December 2010. The tool is provided as freeware on the ARIS Community webpage. ARIS Express is notable - having been mentioned in research published by Schumm, Garcia, Krumnow and Greenwood amongst others. == History == ARIS Express was first announced on April 28, 2009 in a press release by IDS Scheer. The first release was on July 28, 2009 in a public beta test on ARIS Community. Only people, who registered before for the beta test were allowed to download and test this beta version. This closed beta test was followed with another public beta test. The official release of ARIS Express 1.0 was on September 9, 2009. In this first stable version, features such as Microsoft Visio import were added, which were not present in the version for the public beta test. On February 26, 2010, ARIS Express 2.0 was released. Major changes compared to version 1.0 include BPMN 2 support, integrated spellchecking and ARISalign integration. On May 25, 2010, version 2.1 of ARIS Express was released. This update improves BPMN 2 support, provides a new online help system for instant feedback, better ARISalign integration and some new symbols in different diagrams. Along with the release, a poster showing the most important modeling concepts supported by ARIS Express was released. In addition, an executable setup is provided for Microsoft Windows-based systems. Beginning of July, an update was released as ARIS Express 2.2, providing bug fixes only. ARIS Express version 2.2 is the current stable release. An official press release published mid of August 2010 said there are more than 50,000 downloads of ARIS Express. On February 2, 2011, version 2.3 of ARIS Express was released. This new version changes the file format of ARIS Express so that models can be shown in an interactive model viewer in ARIS Community. The release announcement contained no details about additional features or changes. == Functionality == === Overview === ARIS Express is a standalone single-user application. It is divided in a home screen and a modeling environment. The home screen is used to create new models or open recently edited ones. The modeling environment is used to edit diagrams. === Supported notations === The following notations are supported by ARIS Express. Users can create diagrams containing an unlimited number of modeling objects. BPMN 2 Collaboration Diagrams Event-driven Process Chains (EPC) Organizational charts Process landscape (value-added chain diagram) Data model in ERM notation IT infrastructure (network diagram) System landscape (component diagram) Whiteboard General diagram === Noteworthy features === Besides common features such as creating new diagrams, saving them as files or adding objects to the modeling canvas, ARIS Express also provides some noteworthy features, which can't be found in most comparable modeling tools. fragments - Often used modeling constructs such as an exclusive decision in a process model can be stored as fragments so that they are available for direct reuse in another model. smart designs - The flow of a process model or hierarchies of other models can be captured in a spreadsheet-like interface. While entering the data in the spreadsheet, the model is generated and laid out in the background while typing. mini toolbar - While moving the mouse pointer over an object in a diagram, a small toolbar is shown allowing quick access to the most important modeling actions. Microsoft Visio import - Diagrams created with Microsoft Visio 2007 or above can be imported to and edited in ARIS Express. A Microsoft Visio export is not provided. ARISalign import - Models created on the online collaboration platform ARISalign can be opened and edited in ARIS Express. === Exports === ARIS Express can export diagrams to different formats such as: PDF JPEG PNG EMF ADF ADF is the file format of ARIS Express. The professional tools of ARIS Platform are able to import diagrams stored in the ADF format. Yet, there are major limitations during import - namely, each object in diagram will be treated as unique object, despite having same type and name, forcing redrawing large sections of diagrams after import. Besides export formats, it is also possible to use the clipboard to copy and paste an ARIS Express diagram into typical office suites such as Microsoft PowerPoint. == Technology == ARIS Express is a Java-based application, which shares some of the features of ARIS Platform products such as ARIS Business Architect and ARIS Business Designer. In contrast to ARIS Platform products, ARIS Express doesn't use a central database for model storage. Instead, each diagram is stored in an ADF file. ARIS Express uses Java Web Start. After download, the application can be started immediately without installation procedure. For Microsoft Windows based systems, an ordinary setup is provided, too. ARIS Express requires Java 1.6.10 or above. On first startup, the user must enter a valid ARIS Community account to register the application. Creating an ARIS Community account is free-of-charge. After installation, no Internet connection is needed to use ARIS Express. ARIS Express uses a mechanism provided by Java Web Start to automatically update the application as soon as a new version becomes available and the user is connected to the Internet during startup. There are reports that this automated update failed while upgrading from version 1.0 to version 2.0. As ARIS Express is based on Java Web Start, it can be installed on any platform supported by Java. The ARIS Community and other Internet sources have reports of successful deployment of ARIS Express on other operating systems than Microsoft Windows. However, ARIS Express is officially supported only on Microsoft Windows. == Miscellaneous == A quick reference sheet is available for ARIS Express. The poster shows all supported diagrams plus the most important modelling concepts for each supported modelling language. ARIS Express contains a hidden game, a so-called Easter Egg. The game can be started by clicking several times on the product logo in the about dialog. Highscores achieved in the game can be submitted to a special page in ARIS Community. A Firefox Personas is available for ARIS Express.

    Read more →
  • Conceptualization (information science)

    Conceptualization (information science)

    In information science, a conceptualization is an abstract simplified view of some selected parts of the world, containing the objects, concepts, and other entities that are presumed of interest for some particular purpose and the relationships between them. An explicit specification of a conceptualization is an ontology, and it may occur that a conceptualization can be realized by several distinct ontologies. An ontological commitment in describing ontological comparisons is taken to refer to that subset of elements of an ontology shared with all the others. "An ontology is language-dependent", its objects and interrelations described within the language it uses, while a conceptualization is always the same, more general, its concepts existing "independently of the language used to describe it". The relation between these terms is shown in the figure to the right. Not all workers in knowledge engineering use the term "conceptualization", but instead refer to the conceptualization itself, or to the ontological commitment of all its realizations, as an overarching ontology. == Purpose and implementation == As a higher level abstraction, a conceptualization facilitates the discussion and comparison of its various ontologies, facilitating knowledge sharing and reuse. Each ontology based upon the same overarching conceptualization maps the conceptualization into specific elements and their relationships. The question then arises as to how to describe the "conceptualization" in terms that can encompass multiple ontologies. This issue has been called the Tower of Babel problem, that is, how can persons used to one ontology talk with others using a different ontology? This problem is easily grasped, but a general resolution is not at hand. It can be a "bottom-up" or a "top-down" approach, or something in between. However, in more artificial situations, such as information systems, the idea of a "conceptualization" and the "ontological commitment" of various ontologies that realize the "conceptualization" is possible. The formation of a conceptualization and its ontologies involves these steps: specification of the conceptualization ontology concepts: every definition involves the definitions of other terms relationships between the concepts: this step maps conceptual relationships onto the ontology structure groups of concepts: this step may lead to the creation of sub-ontologies formal description of ontology commitments, for example, to make them computer readable An example of moving conception into a language leading to a variety of ontologies is the expression of a process in pseudocode (a strictly structured form of ordinary language) leading to implementation in several different formal computer languages like Lisp or Fortran. The pseudocode makes it easier to understand the instructions and compare implementations, but the formal languages make possible the compilation of the ideas as computer instructions. Another example is mathematics, where a very general formulation (the analog of a conceptualization) is illustrated with "applications" that are more specialized examples. For instance, aspects of a function space can be illustrated using a vector space or a topological space that introduce interpretations of the "elements" of the conceptualization and additional relationships between them but preserve the connections required in the function space.

    Read more →
  • Algorithm characterizations

    Algorithm characterizations

    Algorithm characterizations are attempts to formalize the word algorithm. Algorithm does not have a generally accepted formal definition. Researchers are actively working on this problem. This article will present some of the "characterizations" of the notion of "algorithm" in more detail. == The problem of definition == Over the last 200 years, the definition of the algorithm has become more complicated and detailed as researchers have tried to pin down the term. Indeed, there may be more than one type of "algorithm". But most agree that algorithm has something to do with defining generalized processes for the creation of "output" integers from other "input" integers – "input parameters" arbitrary and infinite in extent, or limited in extent but still variable—by the manipulation of distinguishable symbols (counting numbers) with finite collections of rules that a person can perform with paper and pencil. The most common number-manipulation schemes—both in formal mathematics and in routine life—are: (1) the recursive functions calculated by a person with paper and pencil, and (2) the Turing machine or its Turing equivalents—the primitive register-machine or "counter-machine" model, the random-access machine model (RAM), the random-access stored-program machine model (RASP) and its functional equivalent "the computer". When we are doing "arithmetic" we are really calculating by the use of "recursive functions" in the shorthand algorithms we learned in grade school, for example, adding and subtracting. The proofs that every "recursive function" we can calculate by hand we can compute by machine and vice versa—note the usage of the words calculate versus compute—is remarkable. But this equivalence together with the thesis (unproven assertion) that this includes every calculation/computation indicates why so much emphasis has been placed upon the use of Turing-equivalent machines in the definition of specific algorithms, and why the definition of "algorithm" itself often refers back to "the Turing machine". This is discussed in more detail under Stephen Kleene's characterization. The following are summaries of the more famous characterizations (Kleene, Markov, Knuth) together with those that introduce novel elements—elements that further expand the definition or contribute to a more precise definition. [ A mathematical problem and its result can be considered as two points in a space, and the solution consists of a sequence of steps or a path linking them. Quality of the solution is a function of the path. There might be more than one attribute defined for the path, e.g. length, complexity of shape, an ease of generalizing, difficulty, and so on. ] == Chomsky hierarchy == There is more consensus on the "characterization" of the notion of "simple algorithm". All algorithms need to be specified in a formal language, and the "simplicity notion" arises from the simplicity of the language. The Chomsky (1956) hierarchy is a containment hierarchy of classes of formal grammars that generate formal languages. It is used for classifying of programming languages and abstract machines. From the Chomsky hierarchy perspective, if the algorithm can be specified on a simpler language (than unrestricted), it can be characterized by this kind of language, else it is a typical "unrestricted algorithm". Examples: a "general purpose" macro language, like M4 is unrestricted (Turing complete), but the C preprocessor macro language is not, so any algorithm expressed in C preprocessor is a "simple algorithm". See also Relationships between complexity classes. == Features of a good algorithm == The following are desirable features of a well-defined algorithm, as discussed in Scheider and Gersting (1995): Unambiguous Operations: an algorithm must have specific, outlined steps. The steps should be exact enough to precisely specify what to do at each step. Well-Ordered: The exact order of operations performed in an algorithm should be concretely defined. Feasibility: All steps of an algorithm should be possible (also known as effectively computable). Input: an algorithm should be able to accept a well-defined set of inputs. Output: an algorithm should produce some result as an output, so that its correctness can be reasoned about. Finiteness: an algorithm should terminate after a finite number of instructions. Properties of specific algorithms that may be desirable include space and time efficiency, generality (i.e. being able to handle many inputs), or determinism. == 1881 John Venn's negative reaction to W. Stanley Jevons's Logical Machine of 1870 == In early 1870 W. Stanley Jevons presented a "Logical Machine" (Jevons 1880:200) for analyzing a syllogism or other logical form e.g. an argument reduced to a Boolean equation. By means of what Couturat (1914) called a "sort of logical piano [,] ... the equalities which represent the premises ... are "played" on a keyboard like that of a typewriter. ... When all the premises have been "played", the panel shows only those constituents whose sum is equal to 1, that is, ... its logical whole. This mechanical method has the advantage over VENN's geometrical method..." (Couturat 1914:75). For his part John Venn, a logician contemporary to Jevons, was less than thrilled, opining that "it does not seem to me that any contrivances at present known or likely to be discovered really deserve the name of logical machines" (italics added, Venn 1881:120). But of historical use to the developing notion of "algorithm" is his explanation for his negative reaction with respect to a machine that "may subserve a really valuable purpose by enabling us to avoid otherwise inevitable labor": (1) "There is, first, the statement of our data in accurate logical language", (2) "Then secondly, we have to throw these statements into a form fit for the engine to work with – in this case the reduction of each proposition to its elementary denials", (3) "Thirdly, there is the combination or further treatment of our premises after such reduction," (4) "Finally, the results have to be interpreted or read off. This last generally gives rise to much opening for skill and sagacity." He concludes that "I cannot see that any machine can hope to help us except in the third of these steps; so that it seems very doubtful whether any thing of this sort really deserves the name of a logical engine."(Venn 1881:119–121). == 1943, 1952 Stephen Kleene's characterization == This section is longer and more detailed than the others because of its importance to the topic: Kleene was the first to propose that all calculations/computations—of every sort, the totality of—can equivalently be (i) calculated by use of five "primitive recursive operators" plus one special operator called the mu-operator, or be (ii) computed by the actions of a Turing machine or an equivalent model. Furthermore, he opined that either of these would stand as a definition of algorithm. A reader first confronting the words that follow may well be confused, so a brief explanation is in order. Calculation means done by hand, computation means done by Turing machine (or equivalent). (Sometimes an author slips and interchanges the words). A "function" can be thought of as an "input-output box" into which a person puts natural numbers called "arguments" or "parameters" (but only the counting numbers including 0—the nonnegative integers) and gets out a single nonnegative integer (conventionally called "the answer"). Think of the "function-box" as a little man either calculating by hand using "general recursion" or computing by Turing machine (or an equivalent machine). "Effectively calculable/computable" is more generic and means "calculable/computable by some procedure, method, technique ... whatever...". "General recursive" was Kleene's way of writing what today is called just "recursion"; however, "primitive recursion"—calculation by use of the five recursive operators—is a lesser form of recursion that lacks access to the sixth, additional, mu-operator that is needed only in rare instances. Thus most of life goes on requiring only the "primitive recursive functions." === 1943 "Thesis I", 1952 "Church's Thesis" === In 1943 Kleene proposed what has come to be known as Church's thesis: "Thesis I. Every effectively calculable function (effectively decidable predicate) is general recursive" (First stated by Kleene in 1943 (reprinted page 274 in Davis, ed. The Undecidable; appears also verbatim in Kleene (1952) p.300) In a nutshell: to calculate any function the only operations a person needs (technically, formally) are the 6 primitive operators of "general" recursion (nowadays called the operators of the mu recursive functions). Kleene's first statement of this was under the section title "12. Algorithmic theories". He would later amplify it in his text (1952) as follows: "Thesis I and its converse provide the exact definition of the notion of a calculation (decision) procedure or algorithm, for the

    Read more →