AI Assistant Intellij

AI Assistant Intellij — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Write or Die

    Write or Die

    Write or Die is an online web application designed to combat writer's block by letting users of the application punish themselves if they slow down or stop typing in the application's window. How severe the punishments are depends on the mode the user chooses, which ranges from "Gentle" to "Kamikaze". It was reviewed by publications PCWorld, the Los Angeles Times and The Guardian, and it was most notably used by writers Helen Oyeyemi and David Nicholls. The creator, Jeff Printy, explained that he wrote the application because he wants "to be published and make a living as a writer."

    Read more →
  • Maritime Informatics

    Maritime Informatics

    Maritime Informatics is a thematic topic within the broader discipline of informatics. It can be considered as both a field of study and domain of application. As an application domain, it is the outlet of innovations originating from data science and artificial intelligence; as a field of study, it is positioned between computer science and marine engineering. == Beginnings of maritime informatics == As a result of the increasing levels of digitalisation occurring in the maritime sector starting around 2010 and stimulated by the EU-endorsed MonaLisa project for sea traffic management (STM), a number of academics and shipping industry leaders recognised that the maritime transportation sector would benefit from a specific field of study and application to be known as Maritime Informatics - the use of information systems, data sharing and data analytics in the business and operations of maritime transportation. They considered that it would lead to improvements in efficiency, safety, resilience, and ecological sustainability - all of which are currently lacking for many aspects of sea transport. One of the first public airings of the concept of Maritime Informatics was a presentation delivered on 11 September 2014 in Gothenburg, Sweden. A proposal for an inaugural minitrack on Maritime Informatics was accepted for the 2015 Americas Conference on Information Systems in Puerto Rico where three papers were presented. Since then numerous publications has been brought forward captured at www.maritimeinformatics.org and in late 2020 the first reference book on Maritime Informatics was co-written by 81 expert contributors (47 practitioners and 34 researchers) from 20 countries. Most impactful authors and journals in the domain have been documented in a review paper. Dimitrios Zissis, Luca Cazzanti and Leonardo M. Millefiori are the top three authors; top journals and conferences include Ocean Engineering, Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems, Sensors, the international Conference On Engineering, Technology And Innovation, Expert Systems With Applications, IEEE Access, and Journal of Navigation. == Background == The shipping industry has several particular organisational aspects that are recognised and taken into account in maritime informatics: It is predominantly a self-organising ecosystem Many activities are undertaken as part of episodic tight coupling There is a so-called maritime stack There is increasing pressure to balance capital productivity and energy efficiency There is the potential virtuous interplay between different types of systems == Data sharing == Digital data sharing is key to the all-important, arguably fundamental, data analytics aspects of maritime informatics because it opens the way for better access to relevant and reliable data. As in land-based commerce, digital data sharing is a growing phenomenon in maritime operations - though there is a way to go. It is enabling greater transparency for all those involved in the transportation of goods and passengers, not least being the end-customer. This leads to better and more informed decision-making and planning by all those involved. The push for digitalisation and data sharing is being pursued both by governments and the commercial sector. For example, the Member States of the IMO agreed a mandatory requirement for their governments to introduce electronic information exchange between ships and ports as from 8 April 2019. Meanwhile, commercial operators, particularly in the container lines are putting systems in place for sharing data for mutual benefit in their operations. Data sharing is an important aspect of the Port Collaborative Decision Making (PortCDM) and Port Call Optimization initiatives, both of which seek to improve the coordination, synchronization and efficiency of the port call process by enabling a common and shared situational awareness among all those involved. == Standardisation == The availability and sharing of relevant digital data underpins maritime informatics and is key to more effective and efficient coordination and synchronisation in the predominantly self-organising ecosystem that is maritime transportation. For this to occur, a high priority underpinning maritime informatics is the encouragement of standardised digital data exchange and data sharing, leading, in turn, to improvements in shipping analytics. Improved availability of data will support better historical analysis, now-casting and forecasting. The International Maritime Organization (IMO) FAL Committee is taking the lead in ensuring that the common terms used in the various standards being developed or in use in the maritime sector are compatible and therefore interoperable as far as is practicable, by creating and maintaining The IMO Compendium on Facilitation and Electronic Business. The IMO Compendium consists of an IMO Data Set and IMO Reference Data Model agreed by the main organisations involved in the development of standards for the electronic exchange of information related to the FAL Convention: the World Customs Organization (WCO), the United Nations Economic Commission for Europe (UNECE) and the International Organization for Standardization (ISO). There are several other prominent international governmental and non-governmental organisations actively contributing to the ongoing standardisation and harmonisation process including the UN Electronic Data Interchange for Administration, Commerce and Transport (UN EDIFACT), the Digital Container Shipping Association (DCSA), the International Harbour Masters Association (IHMA) and BIMCO - the world's largest direct-membership organisation for shipowners, charterers, shipbrokers and agents.

    Read more →
  • Systematic review

    Systematic review

    A systematic review is a scholarly synthesis of the evidence on a clearly presented topic using critical methods to identify, define and assess research on the topic. A systematic review extracts and interprets data from published studies on the topic (in the scientific literature), then analyzes, describes, critically appraises and summarizes interpretations into a refined evidence-based conclusion. For example, a systematic review of randomized controlled trials is a way of summarizing and implementing evidence-based medicine. Systematic reviews, sometimes along with meta-analyses, are generally considered the highest level of evidence in medical research. While a systematic review may be applied in the biomedical or health care context, it may also be used where an assessment of a precisely defined subject can advance understanding in a field of research. A systematic review may examine clinical tests, public health interventions, environmental interventions, social interventions, adverse effects, qualitative evidence syntheses, methodological reviews, policy reviews, and economic evaluations. Systematic reviews are closely related to meta-analyses, and often the same instance will combine both (being published with a subtitle of "a systematic review and meta-analysis"). The distinction between the two is that a meta-analysis uses statistical methods to induce a single number from the pooled data set (such as an effect size), whereas the strict definition of a systematic review excludes that step. However, in practice, when one is mentioned, the other may often be involved, as it takes a systematic review to assemble the information that a meta-analysis analyzes, and people sometimes refer to an instance as a systematic review, even if it includes the meta-analytical component. An understanding of systematic reviews and how to implement them in practice is common for professionals in health care, public health, and public policy. Systematic reviews contrast with a type of review often called a narrative review. Systematic reviews and narrative reviews both review the literature (the scientific literature), but the term literature review without further specification refers to a narrative review. == Characteristics == A systematic review can be designed to provide a thorough summary of current literature relevant to a research question. A systematic review uses a rigorous and transparent approach for research synthesis, with the aim of assessing and, where possible, minimizing bias in the findings. While many systematic reviews are based on an explicit quantitative meta-analysis of available data, there are also qualitative reviews and other types of mixed-methods reviews that adhere to standards for gathering, analyzing, and reporting evidence. Systematic reviews of quantitative data or mixed-method reviews sometimes use statistical techniques (meta-analysis) to combine results of eligible studies. Scoring levels are sometimes used to rate the quality of the evidence depending on the methodology used, although this is discouraged by the Cochrane Library. As evidence rating can be subjective, multiple people may be consulted to resolve any scoring differences between how evidence is rated. The EPPI-Centre, Cochrane, and the Joanna Briggs Institute have been influential in developing methods for combining both qualitative and quantitative research in systematic reviews. Several reporting guidelines exist to standardise reporting about how systematic reviews are conducted. Such reporting guidelines are not quality assessment or appraisal tools. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement suggests a standardized way to ensure a transparent and complete reporting of systematic reviews, and is now required for this kind of research by more than 170 medical journals worldwide. The latest version of this commonly used statement corresponds to PRISMA 2020 (the respective article was published in 2021). Several specialized PRISMA guideline extensions have been developed to support particular types of studies or aspects of the review process, including PRISMA-P for review protocols and PRISMA-ScR for scoping reviews. A list of PRISMA guideline extensions is hosted by the EQUATOR (Enhancing the QUAlity and Transparency Of health Research) Network. However, the PRISMA guidelines have been found to be limited to intervention research and the guidelines have to be changed in order to fit non-intervention research. As a result, Non-Interventional, Reproducible, and Open (NIRO) Systematic Reviews was created to counter this limitation. For qualitative reviews, reporting guidelines include ENTREQ (Enhancing transparency in reporting the synthesis of qualitative research) for qualitative evidence syntheses; RAMESES (Realist And MEta-narrative Evidence Syntheses: Evolving Standards) for meta-narrative and realist reviews; and eMERGe (Improving reporting of Meta-Ethnography) for meta-ethnograph. Developments in systematic reviews during the 21st century included realist reviews and the meta-narrative approach, both of which addressed problems of variation in methods and heterogeneity existing on some subjects. == Types == There are over 30 types of systematic review and Table 1 below non-exhaustingly summarises some of these. There is not always consensus on the boundaries and distinctions between the approaches described below. === Scoping reviews === Scoping reviews are distinct from systematic reviews in several ways. A scoping review is an attempt to search for concepts by mapping the language and data which surrounds those concepts and adjusting the search method iteratively to synthesize evidence and assess the scope of an area of inquiry. This can mean that the concept search and method (including data extraction, organisation and analysis) are refined throughout the process, sometimes requiring deviations from any protocol or original research plan. A scoping review may often be a preliminary stage before a systematic review, which 'scopes' out an area of inquiry and maps the language and key concepts to determine if a systematic review is possible or appropriate, or to lay the groundwork for a full systematic review. The goal can be to assess how much data or evidence is available regarding a certain area of interest. This process is further complicated if it is mapping concepts across multiple languages or cultures. As a scoping review should be systematically conducted and reported (with a transparent and repeatable method), some academic publishers categorize them as a kind of 'systematic review', which may cause confusion. Scoping reviews are helpful when it is not possible to carry out a systematic synthesis of research findings, for example, when there are no published clinical trials in the area of inquiry. Scoping reviews are helpful when determining if it is possible or appropriate to carry out a systematic review, and are a useful method when an area of inquiry is very broad, for example, exploring how the public are involved in all stages systematic reviews. There is still a lack of clarity when defining the exact method of a scoping review as it is both an iterative process and is still relatively new. There have been several attempts to improve the standardisation of the method, for example via a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline extension for scoping reviews (PRISMA-ScR). PROSPERO (the International Prospective Register of Systematic Reviews) does not permit the submission of protocols of scoping reviews, although some journals will publish protocols for scoping reviews. == Stages == While there are multiple kinds of systematic review methods, the main stages of a review can be summarised as follows: === Defining the research question === Some reported that the 'best practices' involve 'defining an answerable question' and publishing the protocol of the review before initiating it to reduce the risk of unplanned research duplication and to enable transparency and consistency between methodology and protocol. Clinical reviews of quantitative data are often structured using the mnemonic PICO, which stands for 'Population or Problem', 'Intervention or Exposure', 'Comparison', and 'Outcome', with other variations existing for other kinds of research. For qualitative reviews, PICo is 'Population or Problem', 'Interest', and 'Context'. === Searching for sources === Relevant criteria can include selecting research that is of good quality and answers the defined question. The search strategy should be designed to retrieve literature that matches the protocol's specified inclusion and exclusion criteria. The methodology section of a systematic review should list all of the databases and citation indices that were searched. The titles and abstracts of identified articles can be checked against predetermined criteria for eligibility and r

    Read more →
  • Open Compute Project

    Open Compute Project

    The Open Compute Project (OCP) is an organization that facilitates the sharing of data center product designs and industry best practices among companies. Founded in 2011, OCP has significantly influenced the design and operation of large-scale computing facilities worldwide. As of February 2025, over 400 companies across the world are members of OCP, including Arm, Meta, IBM, Wiwynn, Intel, Nokia, Google, Microsoft, Seagate Technology, Dell, Rackspace, Hewlett Packard Enterprise, NVIDIA, Cisco, Goldman Sachs, Fidelity, Lenovo, Accton Technology Corporation and Alibaba Group. == Structure == The Open Compute Project Foundation is a 501(c)(6) non-profit incorporated in the state of Delaware, United States. OCP has multiple committees, including the board of directors, advisory board and steering committee to govern its operations. As of July 2020, there are seven members who serve on the board of directors which is made up of one individual member and six organizational members. Mark Roenigk (Facebook) is the Foundation's president and chairman. Andy Bechtolsheim is the individual member. In addition to Mark Roenigk who represents Facebook, other organizations on the Open Compute board of directors include Intel (Rebecca Weekly), Microsoft (Kushagra Vaid), Google (Partha Ranganathan), and Rackspace (Jim Hawkins). A list of members can be found on the OCP website. == History == The Open Compute Project began at Facebook (now Meta) in 2009 as an internal project called "Project Freedom". The hardware designs and engineering teams were led by Amir Michael (Manager, Hardware Design) and sponsored by Jonathan Heiliger (VP, Technical Operations) and Frank Frankovsky (Director, Hardware Design and Infrastructure). The three would later open source the designs of Project Freedom and co-found the Open Compute Project. The project was announced at a press event at Facebook's headquarters in Palo Alto on April 7, 2011. == OCP projects == The Open Compute Project Foundation maintains a number of OCP projects, such as: === Server designs === In 2013, two years after the Open Compute Project had started, it was noted that the goal of a more modular server design was "still a long way from live data centers". However, by then some aspects published had been used in Facebook's Prineville data center to improve energy efficiency, as measured by the power usage effectiveness index defined by The Green Grid. Efforts to advance server compute node designs included one for Intel processors and one for AMD processors. Also in 2013, Calxeda contributed a design with ARM architecture processors. Since then, several generations of OCP server designs have been deployed: Wildcat (Intel), Spitfire (AMD), Windmill (Intel E5-2600), Watermark (AMD), Winterfell (Intel E5-2600 v2) and Leopard (Intel E5-2600 v3). === OCP Accelerator Module === OCP Accelerator Module (OAM) is a design specification for hardware architectures that implement artificial intelligence systems that require high module-to-module bandwidth. OAM is used in some of AMD's Instinct accelerator modules. === Rack and power designs === Designs for a mechanical mounting system to replace standard 19-inch racks have been published, with a cabinet the same outside width (600 mm) and depth as existing racks, but with an interior space allowing for wider equipment chassis with a 537 mm width (21 inches). This allows more equipment to fit in the same volume and improves air flow. Compute chassis sizes are defined in multiples of an OpenU or OU, which is 48 mm, slightly taller than the 44 mm rack unit defined for 19-inch racks. As of March 2026, the most current base mechanical definition is the Open Rack V3.1 Specification. At the time the base specification was released, Meta also defined in greater depth the specifications for the rectifiers and power shelf. Specifications for the power monitoring interface (PMI), a communications interface enabling upstream communications between the rectifiers and battery backup unit(BBU) were published by Meta that same year, with Delta Electronics as the main technical contributor to the BBU spec. However, since 2022 the AI boom in the data center has created higher power requirements in order to satisfy the demands of AI accelerators that have been released. As of September 2024, Meta is in the process of updating its Open Rack v3 rectifier, power shelf, battery backup and power management interface specifications to accommodate this increased energy demand. In May 2024, at an Open Compute regional summit, Meta and Rittal outlined their plans for development of their High Power Rack (HPR) ecosystem in conjunction with rack, power and cable partners, increasing power capacity in the rack to 92 kilowatts or more. At the same meeting, Delta Electronics and Advanced Energy reported on their progress in developing new Open Compute standard specifications for power shelf and rectifier designs for HPR applications. Rittal also outlined their collaboration with Meta in designing airflow containment, busbar designs and grounding schemes for the new HPR requirements. === Data storage === Open Vault storage building blocks (also called "Knox") offer high disk densities, with 30 drives in a 2 OU Open Rack chassis designed for easy disk drive replacement. The 3.5 inch disks are stored in two drawers, five across and three deep in each drawer, with connections via serial attached SCSI. There is a "cold storage" variant where idle disks power down to reduce energy consumption. Another design concept was contributed by Hyve Solutions, a division of Synnex, in 2012. At the OCP Summit 2016 Facebook, together with Taiwanese ODM Wistron's spin-off Wiwynn, introduced "Lightning", a flexible NVMe JBOF (just a bunch of flash), based on the existing Open Vault (Knox) design. === Energy efficient data centers === The OCP has published data center designs for energy efficiency. These include power distribution at three-phase 277/480 VAC, which eliminates one transformer stage in typical North American data centers, a single voltage (12.5 VDC) power supply designed to work with 277/480 VAC input, and 48 VDC battery backup. For European (and other 230V countries) datacenters, there is a specification for 230/400 VAC power distribution and its conversion to 12.5 VDC. === Open networking switches === On May 8, 2013, an effort to define an open network switch was announced. The plan was to allow Facebook to load its own operating system software onto its top-of-rack switches. Press reports predicted that more expensive and higher-performance switches would continue to be popular, while less expensive products treated more like a commodity. The first attempt at an open networking switch by Facebook was designed together with Taiwanese ODM Accton using Broadcom Trident II chip and is called "Wedge"; the Linux OS that it runs is called "FBOSS". Later switch contributions include "6-pack" and Wedge-100, based on Broadcom Tomahawk chips. Similar switch hardware designs have been contributed by: Accton Technology Corporation (and its Edgecore Networks subsidiary), Mellanox Technologies, Interface Masters Technologies, Agema Systems. Capable of running Open Network Install Environment (ONIE)-compatible network operating systems such as Cumulus Linux, Switch Light OS by Big Switch Networks, or PICOS by Pica8. A similar project for a custom switch for the Google platform had been rumored, and evolved to use the OpenFlow protocol. === Servers === A sub-project for Mezzanine (NIC) OCP NIC 3.0 specification 1v00 was released in late 2019 establishing three form factors: SFF, TSFF, and LFF. == Litigation == In March, 2015, BladeRoom Group Limited and Bripco (UK) Limited sued Facebook, Emerson Electric Co. and others alleging that Facebook has disclosed BladeRoom and Bripco's trade secrets for prefabricated data centers in the Open Compute Project. Facebook petitioned for the lawsuit to be dismissed, but this was rejected in 2017. A confidential mid-trial settlement was agreed in April 2018.

    Read more →
  • Lossless join decomposition

    Lossless join decomposition

    In database design, a lossless join decomposition is a decomposition of a relation r {\displaystyle r} into relations r 1 , r 2 {\displaystyle r_{1},r_{2}} such that a natural join of the two smaller relations yields back the original relation. This is central in removing redundancy safely from databases while preserving the original data. Lossless join can also be called non-additive. == Definition == A relation r {\displaystyle r} on schema R {\displaystyle R} decomposes losslessly onto schemas R 1 {\displaystyle R_{1}} and R 2 {\displaystyle R_{2}} if π R 1 ( r ) ⋈ π R 2 ( r ) = r {\displaystyle \pi _{R_{1}}(r)\bowtie \pi _{R_{2}}(r)=r} , that is r {\displaystyle r} is the natural join of its projections onto the smaller schemas. A pair ( R 1 , R 2 ) {\displaystyle (R_{1},R_{2})} is a lossless-join decomposition of R {\displaystyle R} or said to have a lossless join with respect to a set of functional dependencies F {\displaystyle F} if any relation r ( R ) {\displaystyle r(R)} that satisfies F {\displaystyle F} decomposes losslessly onto R 1 {\displaystyle R_{1}} and R 2 {\displaystyle R_{2}} . Decompositions into more than two schemas can be defined in the same way. == Criteria == A decomposition R = R 1 ∪ R 2 {\displaystyle R=R_{1}\cup R_{2}} has a lossless join with respect to F {\displaystyle F} if and only if the closure of R 1 ∩ R 2 {\displaystyle R_{1}\cap R_{2}} includes R 1 ∖ R 2 {\displaystyle R_{1}\setminus R_{2}} or R 2 ∖ R 1 {\displaystyle R_{2}\setminus R_{1}} . In other words, one of the following must hold: ( R 1 ∩ R 2 ) → ( R 1 ∖ R 2 ) ∈ F + {\displaystyle (R_{1}\cap R_{2})\to (R_{1}\setminus R_{2})\in F^{+}} ( R 1 ∩ R 2 ) → ( R 2 ∖ R 1 ) ∈ F + {\displaystyle (R_{1}\cap R_{2})\to (R_{2}\setminus R_{1})\in F^{+}} === Criteria for multiple sub-schemas === Multiple sub-schemas R 1 , R 2 , . . . , R n {\displaystyle R_{1},R_{2},...,R_{n}} have a lossless join if there is some way in which we can repeatedly perform lossless joins until all the schemas have been joined into a single schema. Once we have a new sub-schema made from a lossless join, we are not allowed to use any of its isolated sub-schema to join with any of the other schemas. For example, if we can do a lossless join on a pair of schemas R i , R j {\displaystyle R_{i},R_{j}} to form a new schema R i , j {\displaystyle R_{i,j}} , we use this new schema (rather than R i {\displaystyle R_{i}} or R j {\displaystyle R_{j}} ) to form a lossless join with another schema R k {\displaystyle R_{k}} (which may already be joined (e.g., R k , l {\displaystyle R_{k,l}} )). == Example == Let R = { A , B , C , D } {\displaystyle R=\{A,B,C,D\}} be the relation schema, with attributes A, B, C and D. Let F = { A → B C } {\displaystyle F=\{A\rightarrow BC\}} be the set of functional dependencies. Decomposition into R 1 = { A , B , C } {\displaystyle R_{1}=\{A,B,C\}} and R 2 = { A , D } {\displaystyle R_{2}=\{A,D\}} is lossless under F because R 1 ∩ R 2 = A {\displaystyle R_{1}\cap R_{2}=A} and we have a functional dependency A → B C {\displaystyle A\rightarrow BC} . In other words, we have proven that ( R 1 ∩ R 2 → R 1 ∖ R 2 ) ∈ F + {\displaystyle (R_{1}\cap R_{2}\rightarrow R_{1}\setminus R_{2})\in F^{+}} .

    Read more →
  • Collective operation

    Collective operation

    Collective operations are building blocks for interaction patterns, that are often used in SPMD algorithms in the parallel programming context. Hence, there is an interest in efficient realizations of these operations. A realization of the collective operations is provided by the Message Passing Interface (MPI). == Definitions == In all asymptotic runtime functions, we denote the latency α {\displaystyle \alpha } (or startup time per message, independent of message size), the communication cost per word β {\displaystyle \beta } , the number of processing units p {\displaystyle p} and the input size per node n {\displaystyle n} . In cases where we have initial messages on more than one node we assume that all local messages are of the same size. To address individual processing units we use p i ∈ { p 0 , p 1 , … , p p − 1 } {\displaystyle p_{i}\in \{p_{0},p_{1},\dots ,p_{p-1}\}} . If we do not have an equal distribution, i.e. node p i {\displaystyle p_{i}} has a message of size n i {\displaystyle n_{i}} , we get an upper bound for the runtime by setting n = max ( n 0 , n 1 , … , n p − 1 ) {\displaystyle n=\max(n_{0},n_{1},\dots ,n_{p-1})} . A distributed memory model is assumed. The concepts are similar for the shared memory model. However, shared memory systems can provide hardware support for some operations like broadcast (§ Broadcast) for example, which allows convenient concurrent read. Thus, new algorithmic possibilities can become available. == Broadcast == The broadcast pattern is used to distribute data from one processing unit to all processing units, which is often needed in SPMD parallel programs to dispense input or global values. Broadcast can be interpreted as an inverse version of the reduce pattern (§ Reduce). Initially only root r {\displaystyle r} with i d {\displaystyle id} 0 {\displaystyle 0} stores message m {\displaystyle m} . During broadcast m {\displaystyle m} is sent to the remaining processing units, so that eventually m {\displaystyle m} is available to all processing units. Since an implementation by means of a sequential for-loop with p − 1 {\displaystyle p-1} iterations becomes a bottleneck, divide-and-conquer approaches are common. One possibility is to utilize a binomial tree structure with the requirement that p {\displaystyle p} has to be a power of two. When a processing unit is responsible for sending m {\displaystyle m} to processing units i . . j {\displaystyle i..j} , it sends m {\displaystyle m} to processing unit ⌈ ( i + j ) / 2 ⌉ {\displaystyle \left\lceil (i+j)/2\right\rceil } and delegates responsibility for the processing units ⌈ ( i + j ) / 2 ⌉ . . j {\displaystyle \left\lceil (i+j)/2\right\rceil ..j} to it, while its own responsibility is cut down to i . . ⌈ ( i + j ) / 2 ⌉ − 1 {\displaystyle i..\left\lceil (i+j)/2\right\rceil -1} . Binomial trees have a problem with long messages m {\displaystyle m} . The receiving unit of m {\displaystyle m} can only propagate the message to other units, after it received the whole message. In the meantime, the communication network is not utilized. Therefore pipelining on binary trees is used, where m {\displaystyle m} is split into an array of k {\displaystyle k} packets of size ⌈ n / k ⌉ {\displaystyle \left\lceil n/k\right\rceil } . The packets are then broadcast one after another, so that data is distributed fast in the communication network. Pipelined broadcast on balanced binary tree is possible in O ( α log ⁡ p + β n ) {\displaystyle {\mathcal {O}}(\alpha \log p+\beta n)} , whereas for the non-pipelined case it takes O ( ( α + β n ) log ⁡ p ) {\displaystyle {\mathcal {O}}((\alpha +\beta n)\log p)} cost. == Reduce == The reduce pattern is used to collect data or partial results from different processing units and to combine them into a global result by a chosen operator. Given p {\displaystyle p} processing units, message m i {\displaystyle m_{i}} is on processing unit p i {\displaystyle p_{i}} initially. All m i {\displaystyle m_{i}} are aggregated by ⊗ {\displaystyle \otimes } and the result is eventually stored on p 0 {\displaystyle p_{0}} . The reduction operator ⊗ {\displaystyle \otimes } must be associative at least. Some algorithms require a commutative operator with a neutral element. Operators like s u m {\displaystyle sum} , m i n {\displaystyle min} , m a x {\displaystyle max} are common. Implementation considerations are similar to broadcast (§ Broadcast). For pipelining on binary trees the message must be representable as a vector of smaller object for component-wise reduction. Pipelined reduce on a balanced binary tree is possible in O ( α log ⁡ p + β n ) {\displaystyle {\mathcal {O}}(\alpha \log p+\beta n)} . == All-Reduce == The all-reduce pattern (also called allreduce) is used if the result of a reduce operation (§ Reduce) must be distributed to all processing units. Given p {\displaystyle p} processing units, message m i {\displaystyle m_{i}} is on processing unit p i {\displaystyle p_{i}} initially. All m i {\displaystyle m_{i}} are aggregated by an operator ⊗ {\displaystyle \otimes } and the result is eventually stored on all p i {\displaystyle p_{i}} . Analog to the reduce operation, the operator ⊗ {\displaystyle \otimes } must be at least associative. All-reduce can be interpreted as a reduce operation with a subsequent broadcast (§ Broadcast). For long messages a corresponding implementation is suitable, whereas for short messages, the latency can be reduced by using a hypercube (Hypercube (communication pattern) § All-Gather/ All-Reduce) topology, if p {\displaystyle p} is a power of two. All-reduce can also be implemented with a butterfly algorithm and achieve optimal latency and bandwidth. All-reduce is possible in O ( α log ⁡ p + β n ) {\displaystyle {\mathcal {O}}(\alpha \log p+\beta n)} , since reduce and broadcast are possible in O ( α log ⁡ p + β n ) {\displaystyle {\mathcal {O}}(\alpha \log p+\beta n)} with pipelining on balanced binary trees. All-reduce implemented with a butterfly algorithm achieves the same asymptotic runtime. == Prefix-Sum/Scan == The prefix-sum or scan operation is used to collect data or partial results from different processing units and to compute intermediate results by an operator, which are stored on those processing units. It can be seen as a generalization of the reduce operation (§ Reduce). Given p {\displaystyle p} processing units, message m i {\displaystyle m_{i}} is on processing unit p i {\displaystyle p_{i}} . The operator ⊗ {\displaystyle \otimes } must be at least associative, whereas some algorithms require also a commutative operator and a neutral element. Common operators are s u m {\displaystyle sum} , m i n {\displaystyle min} and m a x {\displaystyle max} . Eventually processing unit p i {\displaystyle p_{i}} stores the prefix sum ⊗ i ′ <= i {\displaystyle \otimes _{i'<=i}} m i ′ {\displaystyle m_{i'}} . In the case of the so-called exclusive prefix sum, processing unit p i {\displaystyle p_{i}} stores the prefix sum ⊗ i ′ < i {\displaystyle \otimes _{i' Read more →

  • Artificial intelligence in architecture

    Artificial intelligence in architecture

    Artificial intelligence in architecture is the use of artificial intelligence in automation, design, and planning in the architectural process or in assisting human skills in the field of architecture. AI has been used by some architects for design, and has been proposed as a way to automate planning and routine tasks in the field. == Implications == === Benefits === Artificial intelligence, according to ArchDaily, is said to potentially significantly augment the architectural profession through its ability to improve the design and planning process as well as increasing productivity. Through its ability to handle a large amount of data, AI is said to potentially allow architects a range of design choices with criteria considerations such as budget, requirements adjusted to space, and sustainability goals calculated as part of the design process. ArchDaily said this may allow the design of optimized alternatives that can then undergo human review. AI tools are also said to potentially allow architects to assimilate urban and environmental data to inform their designs, streamlining initial stages of project planning and increasing efficiency and productivity. The advances in generative design through the input of specific prompts allow architects to produce visual designs, including photorealistic images, and thus render and explore various material choices and spatial configurations. ArchDaily noted this could speed the creative process as well as allow for experimentation and sophistication in the design. Additionally, AI's capacity for pattern recognition and coding could aid architects in organizing design resources and developing custom applications, thus enhancing the efficiency and collaboration between both architects and AI. AI is thought to also be able to contribute to the sustainability of buildings by analyzing various factors and following recommended energy-efficient modifications, thus pushing the industry towards greener practices. The use of AI in building maintenance, project management, and the creation of immersive virtual reality experiences are also thought of as potentially augmenting the architectural design process and workflow. Examples include the use of text-to-image systems such as Midjourney to create detailed architectural images, and the use of AI optimization systems from companies such as Finch3D and Autodesk to automatically generate floor plans from simple programmatic inputs. In contrast to digital-only creative practices, the high materiality of architectural outputs requires transitions from ephemeral digital files to permanent physical structures that are subject to strict safety regulations, material constraints, sensory intuition, and site-specific cultural contexts, making full automation difficult. Early adopters such as architect Stephen Coorlas have actively challenged the boundaries of architectural practice through AI. His early experimental initiative, Speculations on AI and Architecture, confronts the discipline's traditional workflows by training text-to-image AI tools such as Midjourney, Luma AI, and PromeAI to generate more nuanced architectural illustrations including construction documents, architectural details, and assembly sequences for various structures. Coorlas inputs precise terminology and architectural language to provoke the AI into producing axonometric drawings that resemble conventional documentation, then experiments with animating the outputs using AI generated depth maps and other AI image-to-3D wireframe tools. Stephen's inventive process invites architects and designers to reconsider authorship, automation, and the future of visual communication in the built environment. Rather than treating AI as a peripheral tool, Stephen has advocated for AI to be a speculative collaborator capable of engaging with discipline-specific challenges. His work contributes to the growing discourse on generative design, parametric optimization, and the philosophical implications of machine-assisted creativity raising urgent questions about how such technologies will reshape architectural agency, precision, and pedagogy. Another prominent advocate is Architect Andrew Kudless, who in an interview to Dezeen recounted that he uses AI to innovate in architectural design by incorporating materials and scenes not usually present in initial plans, which he believes can significantly alter client presentations. He told Dezeen he believes one should show clients renderings from the onset, with AI assisting in this work, arguing that changes in design should be a positive aspect of the client-designer relationship by actively involving clients in the process. Additionally, Kudless highlighted the AI's potential to facilitate labor in architectural firms, particularly in automating rendering tasks, thus reducing the workload on junior staff while maintaining control over the creative output. === Emergent aesthetics === In an interview for the AItopia series to Dezeen, designer Tim Fu discussed the transformative potential of AI in architecture, and proposed a future where AI could herald a "neoclassical futurist" style, blending the grandeur of classical aesthetics with futuristic design. Through his collaborative project, The AI Stone Carver, Fu showcased how AI can innovate traditional practices by generating design concepts that are then realized through human craftsmanship, such as stone carving by mason Till Apfel. This approach, he believed, celebrated the fusion of diverse architectural styles and also emphasized the unique capabilities of AI in enhancing creative design processes. Fu told Dezeen he envisions the integration of AI in design as a means to revive the ornamentation and detailed aesthetics characteristic of classical architecture, moving away from minimalism, which he said dominates contemporary architecture. He argued that AI's involvement in the ideation phase of design allows for a reversal in the roles of machine and human, enabling architects and designers to focus on creating more intricate and ornamental structures. Fu's optimistic outlook extended to the broader impact of AI on the architectural field, seeing it as an indispensable tool that will shift rather than replace human roles, enriching the field with innovative designs that pay homage to the beauty and qualities of classical architecture not present in contemporary architecture while embracing new technologies. This perspective resonates with designers like Manas Bhatia, whose explorations similarly embrace generative AI as a co-creator and a medium to express ideas, blend architectural traditions, and speculate spatial futures. === Concerns === As AI continues to expand its presence across various industries, its impact on the architectural profession has become a topic of growing discussion. These discussions focus on how AI processes may influence traditional architectural practices, potentially altering job roles, and shaping the nature of creativity. While AI-driven processes may increase efficiency in some aspects of the profession, they also raise questions about the potential loss of unique design perspectives. These thoughts have been countered by many prominent creative figures in the realm of AI architecture, such as Stephen Coorlas, Tim Fu, Hassan Ragab, and Manas Bhatia who have showcased the amplification of creativity in design and potential benefits in terms of restoring creative power to the designer. A key concern is that AI-powered tools could diminish the need for human involvement in specific tasks traditionally performed by architects. This has led to speculation that the profession may increasingly shift toward roles focused on oversight, coordination, and strategic decision-making rather than hands-on design work. In some design scenarios, algorithmically generated solutions can be adjusted to prioritize efficiency and cost-effectiveness, which some argue may overshadow the creative and contextual nuances that define individual architectural styles. As with any discipline though, it has been determined that AI can be configured to provide beneficial results based on inputs and end goals the architect or designer assigns it. There are also concerns about the potential for AI to exacerbate inequalities within the architectural profession. For instance, larger firms with greater resources to invest in advanced AI technologies may gain a competitive edge over smaller firms and independent architects. This dynamic could contribute to industry consolidation, potentially limiting the diversity of architectural practice and stifling innovation. Ethical considerations in regard to cultural sensitivity have also been raised due to the datasets used to train AI. Without proper vetting of data or implementing failsafe overrides, AI generated outcomes can trend toward overly documented and prioritized content.

    Read more →
  • Relational data stream management system

    Relational data stream management system

    A relational data stream management system (RDSMS) is a distributed, in-memory data stream management system (DSMS) that is designed to use standards-compliant SQL queries to process unstructured and structured data streams in real-time. Unlike SQL queries executed in a traditional RDBMS, which return a result and exit, SQL queries executed in a RDSMS do not exit, generating results continuously as new data become available. Continuous SQL queries in a RDSMS use the SQL Window function to analyze, join and aggregate data streams over fixed or sliding windows. Windows can be specified as time-based or row-based. == RDSMS SQL Query Examples == Continuous SQL queries in a RDSMS conform to the ANSI SQL standards. The most common RDSMS SQL query is performed with the declarative SELECT statement. A continuous SQL SELECT operates on data across one or more data streams, with optional keywords and clauses that include FROM with an optional JOIN subclause to specify the rules for joining multiple data streams, the WHERE clause and comparison predicate to restrict the records returned by the query, GROUP BY to project streams with common values into a smaller set, HAVING to filter records resulting from a GROUP BY, and ORDER BY to sort the results. The following is an example of a continuous data stream aggregation using a SELECT query that aggregates a sensor stream from a weather monitoring station. The SELECTquery aggregates the minimum, maximum and average temperature values over a one-second time period, returning a continuous stream of aggregated results at one second intervals. RDSMS SQL queries also operate on data streams over time or row-based windows. The following example shows a second continuous SQL query using the WINDOW clause with a one-second duration. The WINDOW clause changes the behavior of the query, to output a result for each new record as it arrives. Hence the output is a stream of incrementally updated results with zero result latency.

    Read more →
  • Environmental impact of AI

    Environmental impact of AI

    The environmental impact of the design, training, deployment and use of artificial intelligence includes the greenhouse gas emissions from generating electricity for data centres and computing hardware, operational and upstream water use, and material impacts from hardware manufacturing, mining and electronic waste. Estimating AI's environmental effects can be difficult because results depend on how impacts are measured, including whether accounting includes only model computation or also data-centre overhead, idle capacity, hardware manufacture, and local electricity supply. As these issues have received greater attention, governments and regulators have increasingly considered data-centre reporting requirements, energy-efficiency standards, and broader transparency measures for AI-related resource use. == Carbon footprint and energy use == AI-related energy use arises at multiple stages, including model training, fine-tuning, inference, storage, networking, and supporting infrastructure such as cooling and power conversion. === Individual level === Published estimates of energy use per AI request vary widely across models, tasks and measurement methods. A benchmark study presented at the 2024 ACM Conference on Fairness, Accountability, and Transparency found substantial differences between task types, with lower energy use for some text tasks and much higher energy use for image generation in the study's test conditions. In that benchmark, simple classification tasks consumed about 0.002–0.007 Wh per prompt on average (about 9% of a smartphone charge for 1,000 prompts), while text generation and text summarisation each used about 0.05 Wh per prompt; image generation averaged 2.91 Wh per prompt, and the least efficient image model in the study used 11.49 Wh per image (roughly equivalent to half a smartphone charge). First-party measurements in production environments have also been published. A 2025 Google study on Gemini assistant serving reported median per-prompt energy, emissions, and water-use estimates under the authors' accounting framework, while noting that different system boundaries can produce substantially different results. The study reported a median text-prompt estimate of about 0.24 Wh, which is roughly as much energy as watching nine seconds of television. The study also stated that software and infrastructure improvements reduced energy use by a factor of 33 and carbon emissions by a factor of 44 for a typical prompt over one year within the authors' framework. Researchers at the University of Michigan measured the energy consumption of various Meta Llama 3.1 models released in 2024 and found that smaller language models (8 billion parameters) use about 114 joules (0.03167 Wh) per response, while larger models (405 billion parameters) require up to 6,700 joules (1.861 Wh) per response. This corresponds to the energy needed to run a microwave oven for roughly one-tenth of a second and eight seconds, respectively. Comparisons between AI systems and human labour for specific tasks have produced mixed results and remain sensitive to assumptions about output quality, workload and system boundaries. A 2024 study in Scientific Reports reported 130 to 2900 times lower estimated carbon emissions for selected AI systems than for human writers and illustrators under its assumptions. A later Scientific Reports paper reported a counterexample for programming tasks under its assumptions, finding 5 to 19 times higher estimated emissions for the evaluated AI system than for human programmers on the benchmark used in that study. === System level === ==== Energy use and efficiency ==== AI electricity intensity depends not only on model architecture but also on hardware and facility efficiency. Data-centre operators commonly report Power usage effectiveness (PUE), which measures the ratio of total facility energy to IT equipment energy; a lower PUE indicates less overhead energy for cooling and other supporting infrastructure. Operators may also publish metrics and case studies on hardware efficiency, cooling systems and power sourcing. In its 2024 environmental report, Google stated that its 2023 total greenhouse gas emissions increased 13% year over year, primarily because of increased data-centre energy consumption and supply-chain emissions, while also reporting lower PUE than industry averages for its own facilities. The International Energy Agency has also reported that data centres remain a relatively small share of global electricity use overall, but that their local effects can be much more pronounced because demand is geographically concentrated. ==== Carbon footprint ==== At system level, AI contributes to rising electricity demand in data centres and related infrastructure. The International Energy Agency estimated that data centres used about 415 TWh of electricity in 2024, or around 1.5% of global electricity consumption, and projected that data-centre electricity use could rise to about 945 TWh by 2030, with AI identified as the main driver of that growth alongside other digital services. The carbon footprint of AI systems depends strongly on electricity sources, hardware efficiency, utilisation rates, and what stages are included in the accounting. Training large models can require substantial electricity, while total lifecycle impacts also depend on deployment scale and the amount of inference performed after training. Early analyses of frontier-model development reported rapid historical growth in training compute for selected systems, although later trends have depended on changes in model design, hardware and efficiency gains. Accounting methods that include upstream or embodied impacts, such as hardware manufacture and facilities construction, can materially affect estimates of AI-related emissions. === Decisions and strategies by individual companies === Large technology companies have reported that the expansion of AI and cloud infrastructure affects their sustainability targets, electricity demand, and resource use. Google, for example, attributed part of its emissions growth in 2023 to increased data-centre energy consumption and supply-chain emissions in its 2024 environmental report. Cloud and AI companies have also announced measures intended to reduce environmental impacts, including investment in more efficient hardware, low-carbon electricity procurement, alternative cooling systems, and water stewardship programmes. The extent, comparability, and third-party verification of such disclosures vary between firms and jurisdictions. == Water usage == Data centres can use water directly for cooling and indirectly through the water used in electricity generation, depending on the local energy mix. Public reporting on data-centre water use has often been inconsistent, making comparisons between operators and regions difficult. To standardise operational reporting, The Green Grid proposed the metric water usage effectiveness (WUE), defined as annual site water use divided by IT equipment energy use. WUE does not by itself measure local water stress, source sustainability, or all upstream water impacts. Studies of AI water use also distinguish between water withdrawal and water consumption. Research on AI-specific water use has argued that the water footprint of AI systems can be difficult to observe and may vary substantially by location, cooling design, and electricity source. A 2025 Communications of the ACM article summarised methods for estimating AI water footprints and emphasised the distinction between water withdrawal and water consumption. Li and colleagues estimated that global AI water withdrawal could reach 4.2–6.6 billion cubic metres in 2027 under the scenarios examined in their article. Using GPT-3, released by OpenAI in 2020, as an example, they estimated that training the model in Microsoft's U.S. data centres could consume about 700,000 litres of onsite water and about 5.4 million litres in total when offsite electricity-related water use was included; they also estimated that 10–50 medium-length GPT-3 responses could consume about 500 mL of water, depending on when and where the model was deployed. Published prompt-level estimates have also varied by system and accounting framework: the 2025 Google study on Gemini assistant serving reported a median text-prompt estimate of about 0.26 mL under its framework. Location can materially affect the significance of data-centre water use. Research on U.S. data centres found that one-fifth of servers' direct water footprint came from moderately to highly water-stressed watersheds, while nearly half of servers were fully or partially powered by plants located in water-stressed regions. A 2025 Reuters report, citing data from Verisk Maplecroft and NatureFinance, said that an average mid-sized data centre uses about 1.4 million litres of water per day for cooling and that Phoenix would experience a 32% increase in annual water stress if currently pl

    Read more →
  • Sardinas–Patterson algorithm

    Sardinas–Patterson algorithm

    In coding theory, the Sardinas–Patterson algorithm is a classical algorithm for determining in polynomial time whether a given variable-length code is uniquely decodable, named after August Albert Sardinas and George W. Patterson, who published it in 1953. The algorithm carries out a systematic search for a string which admits two different decompositions into codewords. As Knuth reports, the algorithm was rediscovered about ten years later in 1963 by Floyd, despite the fact that it was at the time already well known in coding theory. == Idea of the algorithm == Consider the code { a ↦ 1 , b ↦ 011 , c ↦ 01110 , d ↦ 1110 , e ↦ 10011 } {\displaystyle \{\,{\texttt {a}}\mapsto {\texttt {1}},{\texttt {b}}\mapsto {\texttt {011}},{\texttt {c}}\mapsto {\texttt {01110}},{\texttt {d}}\mapsto {\texttt {1110}},{\texttt {e}}\mapsto {\texttt {10011}}\,\}} . This code, which is based on an example by Berstel, is an example of a code which is not uniquely decodable, since the string 011101110011 can be interpreted as the sequence of codewords 01110 – 1110 – 011, but also as the sequence of codewords 011 – 1 – 011 – 10011. Two possible decodings of this encoded string are thus given by cdb and babe. In general, a codeword can be found by the following idea: In the first round, we choose two codewords x 1 {\displaystyle x_{1}} and y 1 {\displaystyle y_{1}} such that x 1 {\displaystyle x_{1}} is a prefix of y 1 {\displaystyle y_{1}} , that is, x 1 w = y 1 {\displaystyle x_{1}w=y_{1}} for some "dangling suffix" w {\displaystyle w} . If one tries first x 1 = 011 {\displaystyle x_{1}={\texttt {011}}} and y 1 = 01110 {\displaystyle y_{1}={\texttt {01110}}} , the dangling suffix is w = 10 {\displaystyle {\texttt {w}}={\texttt {10}}} . If we manage to find two sequences x 2 , … , x p {\displaystyle x_{2},\ldots ,x_{p}} and y 2 , … , y q {\displaystyle y_{2},\ldots ,y_{q}} of codewords such that x 2 ⋯ x p = w y 2 ⋯ y q {\displaystyle x_{2}\cdots x_{p}=wy_{2}\cdots y_{q}} , then we are finished: For then the string x = x 1 x 2 ⋯ x p {\displaystyle x=x_{1}x_{2}\cdots x_{p}} can alternatively be decomposed as y 1 y 2 ⋯ y q {\displaystyle y_{1}y_{2}\cdots y_{q}} , and we have found the desired string having at least two different decompositions into codewords. In the second round, we try out two different approaches: the first trial is to look for a codeword that has w as prefix. Then we obtain a new dangling suffix w, with which we can continue our search. If we eventually encounter a dangling suffix that is itself a codeword (or the empty word), then the search will terminate, as we know there exists a string with two decompositions. The second trial is to seek for a codeword that is itself a prefix of w. In our example, we have w = 10 {\displaystyle w={\texttt {10}}} , and the sequence 1 is a codeword. We can thus also continue with w = 0 {\displaystyle w={\texttt {0}}} as the new dangling suffix. == Precise description of the algorithm == The algorithm is described most conveniently using quotients of formal languages. In general, for two sets of strings D and N, the (left) quotient N − 1 D {\displaystyle N^{-1}D} is defined as the residual words obtained from D by removing some prefix in N. Formally, N − 1 D = { y ∣ x y ∈ D and x ∈ N } {\displaystyle N^{-1}D=\{\,y\mid xy\in D~{\textrm {and}}~x\in N\,\}} . Now let C {\displaystyle C} denote the (finite) set of codewords in the given code. The algorithm proceeds in rounds, where we maintain in each round not only one dangling suffix as described above, but the (finite) set of all potential dangling suffixes. Starting with round i = 1 {\displaystyle i=1} , the set of potential dangling suffixes will be denoted by S i {\displaystyle S_{i}} . The sets S i {\displaystyle S_{i}} are defined inductively as follows: S 1 = C − 1 C ∖ { ε } {\displaystyle S_{1}=C^{-1}C\setminus \{\varepsilon \}} . Here, the symbol ε {\displaystyle \varepsilon } denotes the empty word. S i + 1 = C − 1 S i ∪ S i − 1 C {\displaystyle S_{i+1}=C^{-1}S_{i}\cup S_{i}^{-1}C} , for all i ≥ 1 {\displaystyle i\geq 1} . The algorithm computes the sets S i {\displaystyle S_{i}} in increasing order of i {\displaystyle i} . As soon as one of the S i {\displaystyle S_{i}} contains a word from C or the empty word, then the algorithm terminates and answers that the given code is not uniquely decodable. Otherwise, once a set S i {\displaystyle S_{i}} equals a previously encountered set S j {\displaystyle S_{j}} with j < i {\displaystyle j Read more →

  • Ontology merging

    Ontology merging

    Ontology merging defines the act of bringing together two conceptually divergent ontologies or the instance data associated to two ontologies. This is similar to work in database merging (schema matching). This merging process can be performed in a number of ways, manually, semi automatically, or automatically. Manual ontology merging although ideal is extremely labour-intensive and current research attempts to find semi or entirely automated techniques to merge ontologies. These techniques are statistically driven often taking into account similarity of concepts and raw similarity of instances through textual string metrics and semantic knowledge. These techniques are similar to those used in information integration employing string metrics from open source similarity libraries.

    Read more →
  • Skyline operator

    Skyline operator

    The skyline operator is the subject of an optimization problem and computes the Pareto optimum on tuples with multiple dimensions. This operator is an extension to SQL proposed by Börzsönyi et al. to filter results from a database to keep only those objects that are not dominated by any other point on all dimensions. The name skyline comes from the view on Manhattan from the Hudson River, where those buildings can be seen that are not hidden by any other. A building is visible if it is not dominated by a building that is taller or closer to the river (two dimensions, distance to the river minimized, height maximized). Another application of the skyline operator involves selecting a hotel for a holiday. The user wants the hotel to be both cheap and close to the beach. However, hotels that are close to the beach may also be expensive. In this case, the skyline operator would only present those hotels that are not worse than any other hotel in both price and distance to the beach. == Formal specification == The skyline operator returns tuples that are not dominated by any other tuple. A tuple dominates another if it is at least as good in all dimensions and better in at least one dimension. Formally, we can think of each tuple as a vector p , q ∈ R n {\displaystyle p,q\in \mathbb {R} ^{n}} . p {\displaystyle p} dominates q {\displaystyle q} (written: p ≻ q {\displaystyle p\succ q} ) if p {\displaystyle p} is at least as good as q {\displaystyle q} in every dimension, and superior in at least one: p ≻ q ⇔ ∀ i ∈ [ n ] . p [ i ] ⪰ q [ i ] ∧ ∃ j ∈ [ n ] . p [ j ] ≻ q [ j ] . {\displaystyle p\succ q\Leftrightarrow \forall i\in [n].p[i]\succeq q[i]\wedge \exists j\in [n].p[j]\succ q[j].} Dominance ( p ≻ q {\displaystyle p\succ q} ) can be defined as any strict partial ordering, for example greater (with ≻:=> {\displaystyle \succ :=>} and ⪰:=≥ {\displaystyle \succeq :=\geq } ) or less (with ≻:=< {\displaystyle \succ :=<} and ⪰:=≤ {\displaystyle \succeq :=\leq } ). Assuming two dimensions and defining dominance in both dimensions as greater, we can compute the skyline in SQL-92 as follows: == Proposed syntax == As an extension to SQL, Börzsönyi et al. proposed the following syntax for the skyline operator: where d1, ... dm denote the dimensions of the skyline and MIN, MAX and DIFF specify whether the value in that dimension should be minimised, maximised or simply be different. Without an SQL extension, the SQL query requires an antijoin with not exists: == Implementation == The skyline operator can be implemented directly in SQL using current SQL constructs, but this has been shown to be very slow in disk-based database systems. Other algorithms have been proposed that make use of divide and conquer, indices, MapReduce and general-purpose computing on graphics cards. Skyline queries on data streams (i.e. continuous skyline queries) have been studied in the context of parallel query processing on multicores, owing to their wide diffusion in real-time decision making problems and data streaming analytics. Exasol features a native implementation.

    Read more →
  • Actifsource

    Actifsource

    Actifsource is a domain-specific modeling workbench. It is realized as plug-in for the software development environment Eclipse. Actifsource supports the creation of multiple domain models which can be linked together. It comes with a UML-like graphical editor to create domain-specific languages and a general graphical editor to edit structures in the created languages. It supports code generation using user-defined generic code templates which are directly linked to the domain models. Code generation is integrated into Eclipse's incremental build process. == Interoperability == Actifsource can use models from other modelling tools by importing and exporting the ecore format which is defined by the Eclipse Modeling Framework. == Licensing policy == There are two versions of actifsource available: The free community edition which can be used freely for non-commercial projects and the enterprise edition which contains additional features. The enterprise edition comes with customer support and maintenance for a limited period of time. This package allows the customers to upgrade to new versions and maintenance releases during their support period.

    Read more →
  • Navigational database

    Navigational database

    A navigational database is a type of database in which records or objects are found primarily by following references from other objects. The term was popularized by the title of Charles Bachman's 1973 Turing Award paper, The Programmer as Navigator. This paper emphasized the fact that the new disk-based database systems allowed the programmer to choose arbitrary navigational routes following relationships from record to record, contrasting this with the constraints of earlier magnetic-tape and punched card systems where data access was strictly sequential. One of the earliest navigational databases was Integrated Data Store (IDS), which was developed by Bachman for General Electric in the 1960s. IDS became the basis for the CODASYL database model in 1969. Although Bachman described the concept of navigation in abstract terms, the idea of navigational access came to be associated strongly with the procedural design of the CODASYL Data Manipulation Language. Writing in 1982, for example, Tsichritzis and Lochovsky state that "The notion of currency is central to the concept of navigation." By the notion of currency, they refer to the idea that a program maintains (explicitly or implicitly) a current position in any sequence of records that it is processing, and that operations such as GET NEXT and GET PRIOR retrieve records relative to this current position, while also changing the current position to the record that is retrieved. Navigational database programming thus came to be seen as intrinsically procedural; and moreover to depend on the maintenance of an implicit set of global variables (currency indicators) holding the current state. As such, the approach was seen as diametrically opposed to the declarative programming style used by the relational model. The declarative nature of relational languages such as SQL offered better programmer productivity and a higher level of data independence (that is, the ability of programs to continue working as the database structure evolves.) Navigational interfaces, as a result, were gradually eclipsed during the 1980s by declarative query languages. During the 1990s it started becoming clear that for certain applications handling complex data (for example, spatial databases and engineering databases), the relational calculus had limitations. At that time, a reappraisal of the entire database market began, with several companies describing the new systems using the marketing term NoSQL. Many of these systems introduced data manipulation languages which, while far removed from the CODASYL DML with its currency indicators, could be understood as implementing Bachman's "navigational" vision. Some of these languages are procedural; others (such as XPath) are entirely declarative. Offshoots of the navigational concept, such as the graph database, found new uses in modern transaction processing workloads. == Description == Navigational access is traditionally associated with the network model and hierarchical model of database, and conventionally describes data manipulation APIs in which records (or objects) are processed one at a time, iteratively. The essential characteristic as described by Bachman, however, is finding records by virtue of their relationship to other records: so an interface can still be navigational if it has set-oriented features. From this viewpoint, the key difference between navigational data manipulation languages and relational languages is the use of explicit named relationships rather than value-based joins: for department with name="Sales", find all employees in set department-employees versus find employees, departments where employee.department-code = department.code and department.name="Sales". In practice, however, most navigational APIs have been procedural: the above query would be executed using procedural logic along the lines of the following pseudo-code: On this viewpoint, the key difference between navigational APIs and the relational model (implemented in relational databases) is that relational APIs use "declarative" or logic programming techniques that ask the system what to fetch, while navigational APIs instruct the system in a sequence of steps how to reach the required records. Most criticisms of navigational APIs fall into one of two categories: Usability: application code quickly becomes unreadable and difficult to debug Data independence: application code needs to change whenever the data structure changes For many years the primary defence of navigational APIs was performance. Database systems that support navigational APIs often use internal storage structures that contain physical links or pointers from one record to another. While such structures may allow very efficient navigation, they have disadvantages because it becomes difficult to reorganize the physical placement of data. It is quite possible to implement navigational APIs without low-level pointer chasing (Bachman's paper envisaged logical relationships being implemented just as in relational systems, using primary keys and foreign keys), so the two ideas should not be conflated. But without the performance benefits of low-level pointers, navigational APIs become harder to justify. Hierarchical models often construct primary keys for records by concatenating the keys that appear at each level in the hierarchy. Such composite identifiers are found in computer file names (/usr/david/docs/index.txt), in URIs, in the Dewey decimal system, and for that matter in postal addresses. Such a composite key can be considered as representing a navigational path to a record; but equally, it can be considered as a simple primary key allowing associative access. As relational systems came to prominence in the 1980s, navigational APIs (and in particular, procedural APIs) were criticized and fell out of favour. The 1990s, however, brought a new wave of object-oriented databases that often provided both declarative and procedural interfaces. One explanation for this is that they were often used to represent graph-structured information (for example spatial data and engineering data) where access is inherently recursive: the mathematics originally underpinning SQL (specifically, first-order predicate calculus) does not have sufficient power to support recursive queries, even those as simple as a transitive closure. More recent SQL implementations do support hierarchical and recursive queries. A current example of a popular navigational API can be found in the Document Object Model (DOM) often used in web browsers and closely associated with JavaScript. The DOM is essentially an in-memory hierarchical database with an API that is both procedural and navigational. By contrast, the same data (XML or HTML) can be accessed using XPath, which can be categorized as declarative and navigational: data is accessed by following relationships, but the calling program does not issue a sequence of instructions to be followed in order. Languages such as SPARQL used to retrieve Linked Data from the Semantic Web are also simultaneously declarative and navigational. == Examples == IBM Information Management System IDMS

    Read more →
  • Bibliographic database

    Bibliographic database

    A bibliographic database is a database of bibliographic records. This is an organised online collection of references to published written works like journal and newspaper articles, conference proceedings, reports, government and legal publications, patents and books. In contrast to library catalogue entries, a majority of the records in bibliographic databases describe articles and conference papers rather than complete monographs, and they generally contain very rich subject descriptions in the form of keywords, subject classification terms, or abstracts. A bibliographic database may cover a wide range of topics or one academic field like computer science. A significant number of bibliographic databases are marketed under a trade name by licensing agreement from vendors, or directly from their makers: the indexing and abstracting services. Many bibliographic databases have evolved into digital libraries, providing the full text of the organised contents:for instance CORE also organises and mirrors scholarly articles and OurResearch develops a search engine for open access content in Unpaywall. Others merge with non-bibliographic and scholarly databases to create more complete disciplinary search engine systems, such as Chemical Abstracts or Entrez. == History == Prior to the mid-20th century, individuals searching for published literature had to rely on printed bibliographic indexes, generated manually from index cards. During the early 1960s computers were used to digitize text for the first time; the purpose was to reduce the cost and time required to publish two American abstracting journals, the Index Medicus of the National Library of Medicine and the Scientific and Technical Aerospace Reports of the National Aeronautics and Space Administration (NASA). By the late 1960s, such bodies of digitized alphanumeric information, known as bibliographic and numeric databases, constituted a new type of information resource. Online interactive retrieval became commercially viable in the early 1970s over private telecommunications networks. The first services offered a few databases of indexes and abstracts of scholarly literature. These databases contained bibliographic descriptions of journal articles that were searchable by keywords in author and title, and sometimes by journal name or subject heading. The user interfaces were crude, the access was expensive, and searching was done by librarians on behalf of "end users".

    Read more →