AI Generator Of Trump

AI Generator Of Trump — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Tapingo

    Tapingo

    Tapingo was an American mobile commerce application that offered advance ordering for pickup and food delivery services on college campuses. The company was acquired by Grubhub in September 2018 for approximately $150 million. Following the acquisition, Tapingo’s campus-ordering functionality was integrated into the Grubhub app (Grubhub Campus Dining) and the Tapingo service was discontinued during 2019. Tapingo was differentiated from other on-demand delivery/logistics companies, such as Waiter.com, Postmates, or DoorDash, by its focus on the college market. Through Tapingo, users could browse menus, place orders, pay for meals, and schedule pickup or delivery. On certain campuses, students were able to use their university's meal dollars to pay for food. In the spring of 2012, Tapingo first launched its services on five campuses (Santa Clara University, Loyola Marymount University, Biola University, the University of Maine, and California Lutheran University), and later expanded to more than 200 college campuses across the U.S. and Canada, serving 100 markets. Tapingo received venture funding from Carmel Ventures, Khosla Ventures, Kinzon Capital, DCM Ventures and Qualcomm Ventures. In fall 2015, Tapingo announced expansion plans through major partnership deals with national brands like Chipotle Mexican Grill and 7-Eleven, regional restaurants such as Taco Bueno, and global foodservice provider Aramark.

  • Data management plan

    Data management plan

    A data management plan or DMP is a formal document that outlines how data are to be handled both during a research project, and after the project is completed. The goal of a data management plan is to consider the many aspects of data management, metadata generation, data preservation, and analysis before the project begins; this may lead to data being well-managed in the present, and prepared for preservation in the future. DMPs were originally used in 1966 to manage aeronautical and engineering projects' data collection and analysis, and expanded across engineering and scientific disciplines in the 1970s and 1980s. Up until the early 2000s, DMPs were used "for projects of great technical complexity, and for limited mid-study data collection and processing purposes". In the 2000s and later, E-research and economic policies drove the development and uptake of DMPs. == Importance == Preparing a data management plan before data are collected is claimed to ensure that data are in the correct format, organized well, and better annotated. This could arguably save time in the long term because there is no need to re-organize, re-format, or try to remember details about data. It is also claimed to increase research efficiency since both the data collector and other researchers might be able to understand and use well-annotated data in the future. One component of a data management plan is data archiving and preservation. By deciding on an archive ahead of time, the data collector can format data during collection to make its future submission to a database easier. If data are preserved, they are more relevant since they can be re-used by other researchers. It also allows the data collector to direct requests for data to the database, rather than address requests individually. 
A frequent argument in favor of preservation is that data that are preserved have the potential to lead to new, unanticipated discoveries, and they prevent duplication of scientific studies that have already been conducted. Data archiving also provides insurance against loss by the data collector. In the 2010s, funding agencies increasingly required data management plans as part of the proposal and evaluation process, despite little or no evidence of their efficacy. == Major components == "There is no general and definitive list of topics that should be covered in a DMP for a research project", and researchers are often left to their own devices as to how to fill out a DMP. === Information about data and data format === A description of data to be produced by the project. This might include (but is not limited to) data that are: experimental; observational; raw or derived; physical collections; models; simulations; curriculum materials; software; images. How will the data be acquired? When and where will they be acquired? After collection, how will the data be processed? Include information about: software used; algorithms; scientific workflows. Describe the file formats that will be used, justify those formats, and describe the naming conventions used. Describe the quality assurance and quality control measures that will be taken during sample collection, analysis, and processing. If existing data are used, what are their origins? How will the data collected be combined with existing data? What is the relationship between the data collected and existing data? How will the data be managed in the short term? Consider the following: version control for files; backing up data and data products; security and protection of data and data products; who will be responsible for management. === Metadata content and format === Metadata are the contextual details, including any information important for using data. This may include descriptions of temporal and spatial details, instruments, parameters, units, files, etc. 
Metadata is commonly referred to as "data about data". Issues to be considered include: How detailed must the metadata be to make the data meaningful? How will the metadata be created and/or captured? Examples include lab notebooks, GPS hand-held units, auto-saved files on instruments, etc. What format will be used for the metadata? What are the metadata standards commonly used in the respective scientific discipline? There should be justification for the format chosen. === Policies for access, sharing, and re-use === Describe any obligations that exist for sharing data collected. These may include obligations from funding agencies, institutions, other professional organizations, and legal requirements. Include information about how data will be shared, including when the data will be accessible, how long the data will be available, how access can be gained, and any rights that the data collector reserves for using data. Address any ethical or privacy issues with data sharing. Address intellectual property and copyright issues. Who owns the copyright? What are the institutional, publisher, and/or funding agency policies associated with intellectual property? Are there embargoes for political, commercial, or patent reasons? Describe the intended future uses/users for the data. Indicate how the data should be cited by others. How will the issue of persistent citation be addressed? For example, if the data will be deposited in a public archive, will the dataset have a persistent identifier (e.g., ARK, DOI, Handle, PURL, URN) assigned to it? === Long-term storage and data management === Researchers should identify an appropriate archive for the long-term preservation of their data. Identifying the archive early in the project allows the data to be formatted, transformed, and documented appropriately to meet the requirements of the archive. 
Researchers should consult colleagues and professional societies in their discipline to determine the most appropriate database, and include a backup archive in their data management plan in case their first choice goes out of existence. Early in the project, the primary researcher should identify what data will be preserved in an archive. Usually, preserving the data in its most raw form is desirable, although data derivatives and products can also be preserved. An individual should be identified as the primary contact person for archived data, and should ensure contact information is always kept up-to-date in case there are requests for data or information about data. === Budget === Data management and preservation costs may be considerable, depending on the nature of the project. By anticipating costs ahead of time, researchers can help ensure that the data will be properly managed and archived. Potential expenses that should be considered are: human resources and staff as they handle data preparation, management, documentation, and preservation; hardware and/or software needed for data management, backing up, security, documentation, and preservation; and costs associated with submitting the data to an archive. The data management plan should include how these costs will be paid. == NSF Data Management Plan == All grant proposals submitted to the National Science Foundation (NSF) must include a Data Management Plan that is no more than two pages. This is a supplement (not part of the 15-page proposal) and should describe how the proposal will conform to the Award and Administration Guide policy (see below). 
It may include the following: the types of data; the standards to be used for data and metadata format and content; policies for access and sharing; policies and provisions for re-use; plans for archiving data. Policy summarized from the NSF Award and Administration Guide, Section 4 (Dissemination and Sharing of Research Results): promptly publish with appropriate authorship; share data, samples, physical collections, and supporting materials with others, within a reasonable time frame; share software and inventions. Investigators can keep their legal rights over their intellectual property, but they still have to make their results, data, and collections available to others. Policies will be implemented via proposal review, award negotiations and conditions, and support/incentives. == ESRC Data Management Plan == Since 1995, the UK's Economic and Social Research Council (ESRC) has had a research data policy in place. The current ESRC Research Data Policy states that research data created as a result of ESRC-funded research should be openly available to the scientific community to the maximum extent possible, through long-term preservation and high-quality data management. ESRC requires a data management plan for all research award applications where new data are being created. Such plans are designed to promote a structured approach to data management throughout the data lifecycle, resulting in better quality data that is ready to archive for sharing and re-use. The UK Data Service, the ESRC's flagship data service, provides practical guidance on research data management planning suitable for social science researchers in the UK and around the world. ESRC has a longstanding arrangement with the UK Data A

  • Metadirectory

    Metadirectory

    A metadirectory system provides for the flow of data between one or more directory services and databases in order to maintain synchronization of that data. It is an important part of identity management systems. The data being synchronized typically are collections of entries that contain user profiles and possibly authentication or policy information. Most metadirectory deployments synchronize data into at least one LDAP-based directory server, to ensure that LDAP-based applications such as single sign-on and portal servers have access to recent data, even if the data is mastered in a non-LDAP data source. Metadirectory products support filtering and transformation of data in transit. Most identity management suites from commercial vendors include a metadirectory product, or a user provisioning product.
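The synchronization flow described above (read from a source, filter, transform, write into a directory) can be sketched in a few lines. This is a minimal illustration, not how any particular metadirectory product works: the function names, the HR row layout, and the `dc=example,dc=org` naming are all assumptions, and a plain dictionary stands in for an LDAP server.

```python
from typing import Callable, Iterable

Entry = dict[str, str]

def synchronize(
    source: Iterable[Entry],
    target: dict[str, Entry],
    keep: Callable[[Entry], bool],
    transform: Callable[[Entry], tuple[str, Entry]],
) -> dict[str, int]:
    """Push filtered, transformed source records into target; return change counts."""
    counts = {"added": 0, "updated": 0, "removed": 0}
    seen = set()
    for record in source:
        if not keep(record):              # filtering of data in transit
            continue
        key, entry = transform(record)    # transformation of data in transit
        seen.add(key)
        if key not in target:
            target[key] = entry
            counts["added"] += 1
        elif target[key] != entry:
            target[key] = entry
            counts["updated"] += 1
    # Entries no longer backed by a source record are removed from the target.
    for key in list(target):
        if key not in seen:
            del target[key]
            counts["removed"] += 1
    return counts

def is_active(record: Entry) -> bool:
    """Hypothetical filter: only active staff are published."""
    return record["status"] == "active"

def to_directory_entry(record: Entry) -> tuple[str, Entry]:
    """Hypothetical mapping from an HR row to a directory-style entry."""
    dn = f"uid={record['login']},ou=people,dc=example,dc=org"
    entry = {"cn": f"{record['first']} {record['last']}", "mail": record["email"].lower()}
    return dn, entry

# The data is mastered in a non-LDAP source (here, a list of HR rows).
hr_rows = [
    {"login": "asmith", "first": "Ann", "last": "Smith",
     "email": "Ann.Smith@Example.org", "status": "active"},
    {"login": "bjones", "first": "Bob", "last": "Jones",
     "email": "bob@example.org", "status": "terminated"},
]
directory = {"uid=old,ou=people,dc=example,dc=org": {"cn": "Old User", "mail": "old@example.org"}}

result = synchronize(hr_rows, directory, is_active, to_directory_entry)
print(result)
print(directory)
```

Treating the HR list as authoritative and deleting unmatched target entries is one possible policy; a real deployment might instead disable such entries or leave them untouched.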

  • Master data management

    Master data management

    Master data management (MDM) is a discipline in which business and information technology collaborate to ensure the uniformity, accuracy, stewardship, semantic consistency, and accountability of the enterprise's official shared master data assets. == Reasons for master data management == Data consistency and accuracy: MDM ensures that the organization's critical data is consistent and accurate across all systems, reducing discrepancies and errors caused by multiple, siloed copies of the same data. Improved decision-making: By providing a single version of the truth (SVOT), MDM enables organizations to deliver the right data to decision makers, allowing them to clearly understand business performance and make informed, data-driven decisions. Operational efficiency: With the consistent and accurate data provided by an MDM, operational processes such as reporting and inventory management can be automated to improve efficiency. Employee learning, onboarding, and customer service also become more efficient, as MDM data facilitates rapid, accurate, and thorough information retrieval, permitting more employee time to be spent on work. Regulatory compliance: MDM tries to help organizations comply with industry standards and regulations by ensuring that master data is accurately recorded, maintained, and audited. However, issues with data quality, classification, and reconciliation may require data transformation. As with other Extract, Transform, Load-based data movements, these processes are expensive and inefficient, reducing return on investment for a project. == Business unit and product line segmentation == As a result of business unit and product line segmentation, the same entity (whether a customer, supplier, or product) will be included in different product lines. This leads to data redundancy and even confusion. For example, a customer takes out a mortgage at a bank. 
If the marketing and customer service departments have separate databases, advertisements might still be sent to the customer, even though they've already signed up. The two parts of the bank are unaware, and the customer is sent irrelevant communications. Record linkage can associate different records corresponding to the same entity, mitigating this issue. == Mergers and acquisitions == One of the most common problems for master data management is company growth through mergers or acquisitions. Reconciling these separate master data systems can present difficulties, as existing applications have dependencies on the master databases. Ideally, database administrators resolve this problem through deduplication of the master data as part of the merger. Over time, as further mergers and acquisitions occur, the problem can multiply. Data reconciliation processes can become extremely complex or even unreliable. Some organizations end up with 10, 15, or even 100 separate and poorly integrated master databases. This can cause serious problems in customer satisfaction, operational efficiency, decision support, and regulatory compliance. Another problem involves determining the proper degrees of detail and normalization to include in the master data schema. For example, in a federated Human Resources environment, the enterprise software may focus on storing people's data as current status, adding a few fields to identify the date of hire, date of last promotion, etc. However, this simplification can introduce business-impacting errors into dependent systems for planning and forecasting. The stakeholders of such systems may be forced to build a parallel network of new interfaces to track the onboarding of new hires, planned retirements, and divestment, which works against one of the aims of master data management. == People, processes and technology == Master data management is enabled by technology, but is more than the technologies that enable it. 
An organization's master data management capability will also include people and processes in its definition. === People === Several roles should be staffed within MDM, most prominently the Data Owner and the Data Steward. Several people would likely be allocated to each role, with each person responsible for a subset of master data (e.g. one data owner for employee master data, another for customer master data). The Data Owner is responsible for the requirements for data definition, data quality, data security, etc. as well as for compliance with data governance and data management procedures. The Data Owner should also fund improvement projects in case of deviations from the requirements. The Data Steward runs master data management on behalf of the Data Owner and will probably also act as an advisor to the Data Owner. === Processes === Master data management can be viewed as a "discipline for specialized quality improvement" defined by the policies and procedures put in place by a data governance organization. It has the objective of providing processes for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing master data throughout an organization to ensure a common understanding, consistency, accuracy and control, in the ongoing maintenance and application use of that data. Processes commonly seen in master data management include source identification, data collection, data transformation, normalization, rule administration, error detection and correction, data consolidation, data storage, data distribution, data classification, taxonomy services, item master creation, schema mapping, product codification, data enrichment, hierarchy management, business semantics management and data governance. 
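The matching and consolidating processes named above can be illustrated with a small sketch. It assumes a deliberately simple matching rule (records with the same normalized email address describe the same customer) and a simple survivorship rule (the most recently updated non-empty value wins); both rules, the field names, and the sample records are hypothetical, and real MDM tools typically use far richer matching logic.

```python
from collections import defaultdict

def match_key(record):
    """Hypothetical matching rule: same normalized email means same customer."""
    return record["email"].strip().lower()

def consolidate(records):
    """Group records by match key and merge each group into one consolidated record."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)

    golden = {}
    for key, group in groups.items():
        merged = {}
        # Oldest first, so newer non-empty values overwrite older ones.
        for record in sorted(group, key=lambda r: r["updated"]):
            for field, value in record.items():
                if field == "source":
                    continue
                if value not in ("", None):
                    merged[field] = value
        merged["email"] = key
        # Keep track of which systems contributed to the merged record.
        merged["sources"] = sorted(r["source"] for r in group)
        golden[key] = merged
    return golden

records = [
    {"source": "marketing", "email": "Pat@Example.com ", "name": "Pat Lee",
     "phone": "", "updated": "2023-01-10"},
    {"source": "service", "email": "pat@example.com", "name": "Patricia Lee",
     "phone": "555-0100", "updated": "2024-03-02"},
    {"source": "service", "email": "sam@example.com", "name": "Sam Roy",
     "phone": "555-0199", "updated": "2022-07-19"},
]

golden = consolidate(records)
for key, record in golden.items():
    print(key, record)
```

Here the two records for the same customer, held separately by marketing and customer service, collapse into one entry that both departments could consult.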
=== Technology === A master data management tool can be used to support master data management by removing duplicates, standardizing data (mass maintaining), and incorporating rules to eliminate incorrect data from entering the system to create an authoritative source of master data. Master data are the products, accounts, and parties for which the business transactions are completed. Where the technology approach produces a "golden record" or relies on a "source of record" or "system of record", it is common to talk of where the data is "mastered". This is accepted terminology in the information technology industry, but care should be taken, both with specialists and with the wider stakeholder community, to avoid confusing the concept of "master data" with that of "mastering data". ==== Implementation models ==== There are several models for implementing a technology solution for master data management. These depend on an organization's core business, its corporate structure, and its goals. These include: source of record; registry; consolidation; coexistence; transaction/centralized. ===== Source of record ===== This model identifies a single application, database, or simpler source (e.g. a spreadsheet) as being the "source of record" (or "system of record" where solely application databases are relied on). The benefit of this model is its conceptual simplicity, but it may not fit with the realities of complex master data distribution in large organizations. The source of record can be federated, for example by groups of attributes (so that different attributes of a master data entity may have different sources of record) or geographically (so that different parts of an organization may have different master sources). Federation is only applicable in certain use cases, where there is a clear delineation of which subsets of records will be found in which sources. 
The source of record model can be applied more widely than simply to master data, for example to reference data. ==== Transmission of master data ==== There are several ways in which master data may be collated and distributed to other systems. These include: Data consolidation – The process of capturing master data from multiple sources and integrating it into a single hub (operational data store) for replication to other destination systems. Data federation – The process of providing a single virtual view of master data from one or more sources to one or more destination systems. Data propagation – The process of copying master data from one system to another, typically through point-to-point interfaces in legacy systems. == Change management in implementation == Challenges in adopting master data management within large organizations often arise when the "single version of the truth" concept is not affirmed by stakeholders, who believe that their local definition of the master data is necessary. For example, the product hierarchy used to manage inventory may be entirely different from the product hierarchies used to support marketing efforts or pay sales representatives. It is above all necessary to identify whether different master data is genuinely required. If it is required, then the solution implemented (technology and process) must be able to allow multiple versions of the truth to exist but will prov

  • Spotify Kids

    Spotify Kids

    Spotify Kids is a Swedish kid-friendly music streaming service developed by Spotify. It offers curated content for children, including music, audiobooks, lullabies, and bedtime stories, while providing their parents with parental controls. The service is only available to subscribers to Spotify's Premium Family subscription plan. == Function == Spotify Kids allows children to browse Spotify with parental controls. Using the app, parents can view their children's listening history, block specific songs, and share playlists with their children. The app also includes sing-along songs and playlists designed for young children. Users can configure the app for a specific age group upon first launch. The playlists on Spotify Kids are curated by groups including Discovery Kids, Nickelodeon, Universal Pictures, and The Walt Disney Company. All content on the Spotify Kids app is curated by editors. As of March 2021, there were roughly 8,000 songs available on the platform. The design of the Spotify Kids app is colorful, and the user interface varies depending on the age group for which the app is configured. Spotify Kids is designed to comply with consent and data collection regulations for apps used by children. According to TechCrunch, it is "designed on a grand scale to drive subscriptions to Spotify's top-tier $14.99-per-month Premium Family Plan." == Release == After being beta tested in Ireland in October 2019, it was released as a beta across the United Kingdom on February 11, 2020. It was later released in Sweden, Denmark, Australia, New Zealand, Mexico, Argentina, and Brazil. On March 31, 2021, it was made available in France, Canada, and the United States.

  • DONE

    DONE

    The Data-based Online Nonlinear Extremumseeker (DONE) algorithm is a black-box optimization algorithm. DONE models the unknown cost function and attempts to find an optimum of the underlying function. The DONE algorithm is suitable for optimizing costly and noisy functions and does not require derivatives. An advantage of DONE over similar algorithms, such as Bayesian optimization, is that the computational cost per iteration is independent of the number of function evaluations. == Methods == The DONE algorithm was first proposed by Hans Verstraete and Sander Wahls in 2015. The algorithm fits a surrogate model based on random Fourier features and then uses the well-known L-BFGS algorithm to find an optimum of the surrogate model. == Applications == DONE was first demonstrated for maximizing the signal in optical coherence tomography measurements, but has since been applied in various other settings. For example, it was used to help extend the field of view in light sheet fluorescence microscopy.
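The two ingredients named above, a random Fourier feature surrogate and an L-BFGS search on that surrogate, can be sketched as follows. This is a rough illustration under several assumptions, not the published algorithm: the recursive least-squares update, the exploration noise, the box bounds, and every parameter value are choices made here for the example, and SciPy's bounded `L-BFGS-B` variant stands in for L-BFGS. It requires NumPy and SciPy.

```python
import numpy as np
from scipy.optimize import minimize

def make_features(rng, dim, n_features, scale):
    """Draw random frequencies and phases for a cosine (random Fourier) feature map."""
    W = rng.normal(0.0, scale, size=(n_features, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return lambda x: np.cos(W @ x + b)

def done_sketch(cost, x0, bounds=(-2.0, 2.0), n_iter=60, n_features=50,
                scale=1.0, reg=0.1, explore=0.1, seed=0):
    """Illustrative surrogate-based optimizer; all defaults are assumptions."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = np.clip(np.asarray(x0, dtype=float), lo, hi)
    phi = make_features(rng, x.size, n_features, scale)
    # Recursive least-squares state: weights c and inverse regularized Gram matrix P.
    c = np.zeros(n_features)
    P = np.eye(n_features) / reg
    box = [(lo, hi)] * x.size
    for _ in range(n_iter):
        a = phi(x)
        y = cost(x)  # one (possibly noisy) evaluation of the black-box function
        # Rank-one update: the work per iteration does not grow with the
        # number of evaluations seen so far.
        gain = P @ a / (1.0 + a @ P @ a)
        c = c + gain * (y - a @ c)
        P = P - np.outer(gain, a @ P)
        surrogate = lambda z: float(phi(z) @ c)
        start = np.clip(x + rng.normal(0.0, explore, size=x.shape), lo, hi)
        best = minimize(surrogate, start, method="L-BFGS-B", bounds=box).x
        # Perturb the surrogate optimum to choose the next evaluation point.
        x = np.clip(best + rng.normal(0.0, explore, size=x.shape), lo, hi)
    return best

noise_rng = np.random.default_rng(1)

def noisy_bowl(x):
    """Hypothetical test cost: a quadratic bowl centered at (1.0, -0.5) plus noise."""
    return float(np.sum((x - np.array([1.0, -0.5])) ** 2) + 0.01 * noise_rng.normal())

best = done_sketch(noisy_bowl, [0.0, 0.0])
print("surrogate optimum after 60 evaluations:", best)
```

How close the result lands to the true minimum depends on the feature scale, the regularization, and the number of features, so those values would need tuning for any real problem.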

  • Metadata

    Metadata

    Metadata (or metainformation) is data (or information) that defines and describes the characteristics of other data. It often helps to describe, explain, locate, or otherwise make data easier to retrieve, use, or manage. For example, the title, author, and publication date of a book are metadata about the book. But, while a data asset is finite, its metadata is infinite. As such, efforts to define, classify types, or structure metadata are expressed as examples in the context of its use. The term "metadata" has a history dating to the 1960s where it occurred in computer science and in popular culture. Different types of metadata serve different functions. For example, descriptive metadata for a document might include the author, creation date, file size and keywords. Metadata has various purposes. It can help users find relevant information and discover resources. It can also help organize electronic resources, provide digital identification, and archive and preserve resources. Metadata allows users to access resources by "allowing resources to be found by relevant criteria, identifying resources, bringing similar resources together, distinguishing dissimilar resources, and giving location information". Metadata of telecommunication activities including Internet traffic is very widely collected by various national governmental organizations. This data is used for the purposes of traffic analysis and can be used for mass surveillance. Unique metadata standards exist for different disciplines (e.g., museum collections, digital audio files, websites, etc.). Describing the contents and context of data or data files increases its usefulness. For example, a web page may include metadata specifying what software language the page is written in (e.g., HTML), what tools were used to create it, what subjects the page is about, and where to find more information about the subject. 
This metadata can automatically improve the reader's experience and make it easier for users to find the web page online. A CD may include metadata providing information about the musicians, singers, and songwriters whose work appears on the disc. In many countries, government organizations routinely store metadata about emails, telephone calls, web pages, video traffic, IP connections, and cell phone locations. == Types == There are many distinct types of metadata, including: Descriptive metadata – the descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords. Structural metadata – metadata about containers of data and indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships, and other characteristics of digital materials. Administrative metadata – the information to help manage a resource, like resource type, and permissions, and when and how it was created. Reference metadata – the information about the contents and quality of statistical data. Statistical metadata – also called process data, may describe processes that collect, process, or produce statistical data. Legal metadata – provides information about the creator, copyright holder, and public licensing, if provided. Metadata is not strictly bound to one of these categories, as it can describe a piece of data in many other ways. While the metadata application is manifold, covering a large variety of fields, there are specialized and well-accepted models to specify types of metadata. Bretherton & Singley (1994) distinguish between two distinct classes: structural/control metadata and guide metadata. Structural metadata describes the structure of database objects such as tables, columns, keys and indexes. Guide metadata helps humans find specific items and is usually expressed as a set of keywords in a natural language. 
According to Ralph Kimball, metadata can be divided into three categories: technical metadata (or internal metadata), business metadata (or external metadata), and process metadata. Dan Linstedt, creator of the data vault methodology, says business metadata "...provide[s] definition of the functionality, definition of the data, definition of the elements, and definition of how the data is used within business...business metadata includes business requirements, time-lines, business metrics, business process flows, and business terminology." Business metadata is important because it can greatly facilitate the usefulness of the data to business people. A simple example of business metadata is a glossary entry. Hover functionality in an application or web form can enable a glossary definition to be shown when cursor is on a field or term. Other examples of business metadata include annotation ability within applications. For example, a business user may be viewing a business intelligence (BI) report and notice a trend in the data. The user may have background knowledge as to why this trend occurs. Some business intelligence tools enable the user to create an annotation within the report that explains the trend. Such an annotation can enhance other users' understanding of the data. This example is especially powerful because it is created by a business user for the use of other business people. NISO distinguishes three types of metadata: descriptive, structural, and administrative. Descriptive metadata is typically used for discovery and identification, as information to search and locate an object, such as title, authors, subjects, keywords, and publisher. Structural metadata describes how the components of an object are organized. An example of structural metadata would be how pages are ordered to form chapters of a book. Finally, administrative metadata gives information to help manage the source. 
Administrative metadata refers to the technical information, such as file type, or when and how the file was created. Two sub-types of administrative metadata are rights management metadata and preservation metadata. Rights management metadata explains intellectual property rights, while preservation metadata contains information to preserve and save a resource. Statistical data repositories have their own requirements for metadata in order to describe not only the source and quality of the data but also what statistical processes were used to create the data, which is of particular importance to the statistical community in order to both validate and improve the process of statistical data production. An additional type of metadata beginning to be more developed is accessibility metadata. Accessibility metadata is not a new concept to libraries; however, advances in universal design have raised its profile. Projects like Cloud4All and GPII identified the lack of common terminologies and models to describe the needs and preferences of users and information that fits those needs as a major gap in providing universal access solutions. Those types of information are accessibility metadata. The Schema.org website has incorporated several accessibility properties based on IMS Global Access for All Information Model Data Element Specification. While the efforts to describe and standardize the varied accessibility needs of information seekers are beginning to become more robust, their adoption into established metadata schemas has not been as developed. For example, while Dublin Core (DC)'s "audience" and MARC 21's "reading level" could be used to identify resources suitable for users with dyslexia and DC's "format" could be used to identify resources available in braille, audio, or large print formats, there is more work to be done. 
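The descriptive, structural, and administrative categories described above can be made concrete with a small record type. The sketch below is only an illustration: the class and field names are invented for this example and do not follow Dublin Core, MARC 21, or any other schema mentioned in the text.

```python
import hashlib
from dataclasses import asdict, dataclass, field

@dataclass
class Descriptive:
    """Used for discovery and identification."""
    title: str
    authors: list[str]
    keywords: list[str] = field(default_factory=list)

@dataclass
class Structural:
    """How the parts of the object fit together: ordered page files per chapter."""
    chapters: dict[str, list[str]]

@dataclass
class Administrative:
    """Information that helps manage the resource."""
    file_type: str
    created: str
    rights: str     # rights management metadata
    checksum: str   # preservation metadata

@dataclass
class MetadataRecord:
    descriptive: Descriptive
    structural: Structural
    administrative: Administrative

def find_by_keyword(records, keyword):
    """Discovery relies only on the descriptive part of each record."""
    return [r.descriptive.title for r in records if keyword in r.descriptive.keywords]

content = b"placeholder bytes standing in for the scanned file"
record = MetadataRecord(
    descriptive=Descriptive(
        title="A Sample Report",
        authors=["A. Author"],
        keywords=["metadata", "example"],
    ),
    structural=Structural(
        chapters={"Chapter 1": ["p001.tif", "p002.tif"], "Chapter 2": ["p003.tif"]},
    ),
    administrative=Administrative(
        file_type="image/tiff",
        created="2024-01-15",
        rights="CC BY 4.0",
        checksum=hashlib.sha256(content).hexdigest(),
    ),
)

print(find_by_keyword([record], "metadata"))
print(sorted(asdict(record)))
```

Keeping the three groups separate mirrors the point that each serves a different function: search uses the descriptive fields, page ordering comes from the structural fields, and the checksum and rights statement support preservation and access decisions.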
== History == Metadata was traditionally used in the card catalogs of libraries until the 1980s when libraries converted their catalog data to digital databases. In the 2000s, as data and information were increasingly stored digitally, this digital data was described using metadata standards. An early description of "meta data" for computer systems was written by David Griffel and Stuart McIntosh at the MIT Center for International Studies in 1967: "In summary then, we have statements in an object language about subject descriptions of data and token codes for the data. We also have statements in a meta language describing the data relationships and transformations, and ought/is relations between norm and data." == Definition == Metadata means "data about data". Metadata is defined as the data providing information about one or more aspects of the data; it is used to summarize basic information about data that can make tracking and working with specific data easier. Some examples include: means of creation of the data; source of the data; time and date of creation; creator or author of the data; location on a computer network where the data was created; standards used; data quality. For example, a digital image may include metadata that describes the size of the image, its color depth, resolution,

    Read more →
  • Navigational database

    Navigational database

    A navigational database is a type of database in which records or objects are found primarily by following references from other objects. The term was popularized by the title of Charles Bachman's 1973 Turing Award paper, The Programmer as Navigator. This paper emphasized the fact that the new disk-based database systems allowed the programmer to choose arbitrary navigational routes following relationships from record to record, contrasting this with the constraints of earlier magnetic-tape and punched card systems where data access was strictly sequential. One of the earliest navigational databases was Integrated Data Store (IDS), which was developed by Bachman for General Electric in the 1960s. IDS became the basis for the CODASYL database model in 1969. Although Bachman described the concept of navigation in abstract terms, the idea of navigational access came to be associated strongly with the procedural design of the CODASYL Data Manipulation Language. Writing in 1982, for example, Tsichritzis and Lochovsky state that "The notion of currency is central to the concept of navigation." By the notion of currency, they refer to the idea that a program maintains (explicitly or implicitly) a current position in any sequence of records that it is processing, and that operations such as GET NEXT and GET PRIOR retrieve records relative to this current position, while also changing the current position to the record that is retrieved. Navigational database programming thus came to be seen as intrinsically procedural; and moreover to depend on the maintenance of an implicit set of global variables (currency indicators) holding the current state. As such, the approach was seen as diametrically opposed to the declarative programming style used by the relational model. 
The declarative nature of relational languages such as SQL offered better programmer productivity and a higher level of data independence (that is, the ability of programs to continue working as the database structure evolves.) Navigational interfaces, as a result, were gradually eclipsed during the 1980s by declarative query languages. During the 1990s it started becoming clear that for certain applications handling complex data (for example, spatial databases and engineering databases), the relational calculus had limitations. At that time, a reappraisal of the entire database market began, with several companies describing the new systems using the marketing term NoSQL. Many of these systems introduced data manipulation languages which, while far removed from the CODASYL DML with its currency indicators, could be understood as implementing Bachman's "navigational" vision. Some of these languages are procedural; others (such as XPath) are entirely declarative. Offshoots of the navigational concept, such as the graph database, found new uses in modern transaction processing workloads. == Description == Navigational access is traditionally associated with the network model and hierarchical model of database, and conventionally describes data manipulation APIs in which records (or objects) are processed one at a time, iteratively. The essential characteristic as described by Bachman, however, is finding records by virtue of their relationship to other records: so an interface can still be navigational if it has set-oriented features. From this viewpoint, the key difference between navigational data manipulation languages and relational languages is the use of explicit named relationships rather than value-based joins: for department with name="Sales", find all employees in set department-employees versus find employees, departments where employee.department-code = department.code and department.name="Sales". 
In practice, however, most navigational APIs have been procedural: the above query would be executed using procedural logic that locates the department record and then steps through its employee records one at a time. On this viewpoint, the key difference between navigational APIs and the relational model (implemented in relational databases) is that relational APIs use "declarative" or logic programming techniques that ask the system what to fetch, while navigational APIs instruct the system in a sequence of steps how to reach the required records. Most criticisms of navigational APIs fall into one of two categories: usability (application code quickly becomes unreadable and difficult to debug) and data independence (application code needs to change whenever the data structure changes). For many years the primary defence of navigational APIs was performance. Database systems that support navigational APIs often use internal storage structures that contain physical links or pointers from one record to another. While such structures may allow very efficient navigation, they have the disadvantage that it becomes difficult to reorganize the physical placement of data. It is quite possible to implement navigational APIs without low-level pointer chasing (Bachman's paper envisaged logical relationships being implemented just as in relational systems, using primary keys and foreign keys), so the two ideas should not be conflated. But without the performance benefits of low-level pointers, navigational APIs become harder to justify. Hierarchical models often construct primary keys for records by concatenating the keys that appear at each level in the hierarchy. Such composite identifiers are found in computer file names (/usr/david/docs/index.txt), in URIs, in the Dewey decimal system, and for that matter in postal addresses. Such a composite key can be considered as representing a navigational path to a record; but equally, it can be considered as a simple primary key allowing associative access. 
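The contrast between procedural navigation and value-based access for the "Sales" query can be sketched in Python. This is an illustration only, not CODASYL DML: the `Department` and `Employee` classes, the `employees` list standing in for the named set department-employees, and the sample records are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    name: str
    department_code: str


@dataclass
class Department:
    code: str
    name: str
    # Hypothetical stand-in for the named set "department-employees".
    employees: list = field(default_factory=list)


def navigational_lookup(departments, dept_name):
    """Visit records one at a time, following the named relationship."""
    found = []
    for dept in departments:              # step to the next department record
        if dept.name == dept_name:
            for emp in dept.employees:    # step to the next member of the set
                found.append(emp.name)
    return found


def value_based_lookup(departments, employees, dept_name):
    """Relational-style access: match records on key values, not stored links."""
    return [
        emp.name
        for emp in employees
        for dept in departments
        if emp.department_code == dept.code and dept.name == dept_name
    ]


# Sample data (invented for the sketch).
ada = Employee("Ada", "D1")
grace = Employee("Grace", "D1")
edgar = Employee("Edgar", "D2")
sales = Department("D1", "Sales", [ada, grace])
research = Department("D2", "Research", [edgar])

DEPARTMENTS = [sales, research]
EMPLOYEES = [ada, grace, edgar]
```

Both functions return the same employees; the first depends on the stored owner-to-member links, while the second would keep working if those links were removed, which is the data-independence point made above.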
As relational systems came to prominence in the 1980s, navigational APIs (and in particular, procedural APIs) were criticized and fell out of favour. The 1990s, however, brought a new wave of object-oriented databases that often provided both declarative and procedural interfaces. One explanation for this is that they were often used to represent graph-structured information (for example spatial data and engineering data) where access is inherently recursive: the mathematics originally underpinning SQL (specifically, first-order predicate calculus) does not have sufficient power to support recursive queries, even those as simple as a transitive closure. More recent SQL implementations do support hierarchical and recursive queries. A current example of a popular navigational API can be found in the Document Object Model (DOM) often used in web browsers and closely associated with JavaScript. The DOM is essentially an in-memory hierarchical database with an API that is both procedural and navigational. By contrast, the same data (XML or HTML) can be accessed using XPath, which can be categorized as declarative and navigational: data is accessed by following relationships, but the calling program does not issue a sequence of instructions to be followed in order. Languages such as SPARQL used to retrieve Linked Data from the Semantic Web are also simultaneously declarative and navigational. == Examples == IBM Information Management System IDMS
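As a further illustration of the DOM and XPath comparison made above, the following sketch reads the same XML both ways. It uses Python's standard `xml.etree.ElementTree` module as a stand-in for a browser DOM (it is not the W3C DOM API, and it supports only a subset of XPath); the document content and function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XML = """
<company>
  <department name="Sales">
    <employee>Ada</employee>
    <employee>Grace</employee>
  </department>
  <department name="Research">
    <employee>Edgar</employee>
  </department>
</company>
"""

root = ET.fromstring(XML)


def procedural_navigation(root):
    """Procedural and navigational: walk child links step by step."""
    names = []
    for dept in root:
        if dept.tag == "department" and dept.get("name") == "Sales":
            for child in dept:
                if child.tag == "employee":
                    names.append(child.text)
    return names


def declarative_navigation(root):
    """Declarative and navigational: a path expression names the route,
    and the library decides how to follow it."""
    path = "./department[@name='Sales']/employee"
    return [elem.text for elem in root.findall(path)]
```

Both functions follow parent-child relationships, so both are navigational; only the first spells out the sequence of steps.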

    Read more →
  • Automated decision-making

    Automated decision-making

    Automated decision-making (ADM) is the use of data, machines and algorithms to make decisions in a range of contexts, including public administration, business, health, education, law, employment, transport, media and entertainment, with varying degrees of human oversight or intervention. ADM may involve large-scale data from a range of sources, such as databases, text, social media, sensors, images or speech, that is processed using various technologies including computer software, algorithms, machine learning, natural language processing, artificial intelligence, augmented intelligence and robotics. The increasing use of automated decision-making systems (ADMS) across a range of contexts presents many benefits and challenges to human society requiring consideration of the technical, legal, ethical, societal, educational, economic and health consequences. == Overview == There are different definitions of ADM based on the level of automation involved. Some definitions, such as that in the EU's General Data Protection Regulation (Article 22), suggest that ADM involves decisions made through purely technological means without human input. However, ADM technologies and applications can take many forms ranging from decision-support systems that make recommendations for human decision-makers to act on, sometimes known as augmented intelligence or 'shared decision-making', to fully automated decision-making processes that make decisions on behalf of individuals or organizations without human involvement. Models used in automated decision-making systems can range from simple checklists and decision trees through to artificial intelligence and deep neural networks (DNN). Since the 1950s computers have gone from being able to do basic processing to having the capacity to undertake complex, ambiguous and highly skilled tasks such as image and speech recognition, gameplay, scientific and medical analysis and inferencing across multiple data sources. 
ADM is now being increasingly deployed across all sectors of society and many diverse domains from entertainment to transport. An ADM system (ADMS) may involve multiple decision points, data sets, and technologies (ADMT) and may sit within a larger administrative or technical system such as a criminal justice system or business process. == Data == Automated decision-making involves using data as input to be analyzed within a process, model, or algorithm or for learning and generating new models. ADM systems may use and connect a wide range of data types and sources depending on the goals and contexts of the system, for example, sensor data for self-driving cars and robotics, identity data for security systems, demographic and financial data for public administration, medical records in health, criminal records in law. This can sometimes involve vast amounts of data and computing power. === Data quality === The quality of the available data and its ability to be used in ADM systems is fundamental to the outcomes. It is often highly problematic for many reasons. Datasets are often highly variable; corporations or governments may control large-scale data, restricted for privacy or security reasons, incomplete, biased, limited in terms of time or coverage, measuring and describing terms in different ways, and many other issues. For machines to learn from data, large corpora are often required, which can be challenging to obtain or compute; however, where available, they have provided significant breakthroughs, for example, in diagnosing chest X-rays. == ADM technologies == Automated decision-making technologies (ADMT) are software-coded digital tools that automate the translation of input data to output data, contributing to the function of automated decision-making systems. There are a wide range of technologies in use across ADM applications and systems. 
ADMTs involving basic computational operations: search (includes 1-2-1, 1-2-many, data matching/merge), matching (two different things), and mathematical calculation (formula). ADMTs for assessment and grouping: user profiling, recommender systems, clustering, classification, feature learning, and predictive analytics (includes forecasting). ADMTs relating to space and flows: social network analysis (includes link prediction), mapping, and routing. ADMTs for processing of complex data formats: image processing, audio processing, and natural language processing (NLP). Other ADMTs: business rules management systems, time series analysis, anomaly detection, and modelling/simulation. === Machine learning === Machine learning (ML) involves training computer programs through exposure to large data sets and examples to learn from experience and solve problems. Machine learning can be used to generate and analyse data as well as make algorithmic calculations and has been applied to image and speech recognition, translations, text, data and simulations. While machine learning has been around for some time, it is becoming increasingly powerful due to recent breakthroughs in training deep neural networks (DNNs), and dramatic increases in data storage capacity and computational power with GPU coprocessors and cloud computing. Machine learning systems based on foundation models run on deep neural networks and use pattern matching to train a single huge system on large amounts of general data such as text and images. Early models tended to start from scratch for each new problem; since the early 2020s, however, many can be adapted to new problems. Examples of these technologies include OpenAI's DALL-E (an image creation program) and its various GPT language models, and Google's PaLM language model. 
== Applications == ADM is being used to replace or augment human decision-making by both public and private-sector organisations for a range of reasons including to help increase consistency, improve efficiency, reduce costs and enable new solutions to complex problems. === Debate === Research and development are underway into uses of technology to assess argument quality, assess argumentative essays and judge debates. Potential applications of these argument technologies span education and society. Scenarios to consider, in these regards, include those involving the assessment and evaluation of conversational, mathematical, scientific, interpretive, legal, and political argumentation and debate. === Law === In legal systems around the world, algorithmic tools such as risk assessment instruments (RAI), are being used to supplement or replace the human judgment of judges, civil servants and police officers in many contexts. In the United States RAI are being used to generate scores to predict the risk of recidivism in pre-trial detention and sentencing decisions, evaluate parole for prisoners and to predict "hot spots" for future crime. These scores may result in automatic effects or may be used to inform decisions made by officials within the justice system. In Canada ADM has been used since 2014 to automate certain activities conducted by immigration officials and to support the evaluation of some immigrant and visitor applications. === Economics === Automated decision-making systems are used in certain computer programs to create buy and sell orders related to specific financial transactions and automatically submit the orders in the international markets. Computer programs can automatically generate orders based on predefined set of rules using trading strategies which are based on technical analyses, advanced statistical and mathematical computations, or inputs from other electronic sources. 
=== Business === ==== Continuous auditing ==== Continuous auditing uses advanced analytical tools to automate auditing processes. It can be utilized in the private sector by business enterprises and in the public sector by governmental organizations and municipalities. As artificial intelligence and machine learning continue to advance, accountants and auditors may make use of increasingly sophisticated algorithms which make decisions such as those involving determining what is anomalous, whether to notify personnel, and how to prioritize those tasks assigned to personnel. === Media and entertainment === Digital media, entertainment platforms, and information services increasingly provide content to audiences via automated recommender systems based on demographic information, previous selections, collaborative filtering or content-based filtering. This includes music and video platforms, publishing, health information, product databases and search engines. Many recommender systems also provide some agency to users in accepting recommendations and incorporate data-driven algorithmic feedback loops based on the actions of the system user. Large-scale machine learning language models and image creation programs being developed by companies such as OpenAI and Google in the 2020s have restricted access however they are likely to have widespread application in fields such as advertising, copywriting, stock imagery and gra

    Read more →
  • Document capture software

    Document capture software

    Document capture software refers to applications that provide the ability and feature set to automate the process of scanning paper documents or importing electronic documents, often for the purposes of feeding advanced document classification and data collection processes. Most scanning hardware, both scanners and copiers, provides the basic ability to scan to any number of image file formats, including: PDF, TIFF, JPG, BMP, etc. This basic functionality is augmented by document capture software, which can add efficiency and standardization to the process. == Typical features == Typical features of Document Capture Software include: Barcode recognition Patch Code recognition Separation Optical Character Recognition (OCR) Optical Mark Recognition (OMR) Quality Assurance Indexing Migration === Goal for implementation of a document capture solution === The goal for implementing a document capture solution is to reduce the amount of time spent scanning, separating, enhancing, organizing, classifying, normalizing, and collecting information from document collections, and to produce metadata along with an image/PDF file, and/or OCR text. This information is then migrated to a file share, FTP site, database, Document Management or Enterprise Content Management system. These systems often provide a search function, allowing search of the assets based on the produced metadata, and then viewed using document imaging software. == General document capture system solutions == === Integration with document management system === ECM (Enterprise Content management) and their DMS component (Document Management System) are being adopted by many organizations as a corporate document management system for all types of electronic files, e.g. MS word, PDF ... However, much of the information held by organisations is on paper and this needs to be integrated within the same document repository. 
By converting paper documents into digital format through scanning, organizations convert paper into image formats such as TIF, JPG, and PDF, and also extract valuable index information or business data from the document using OCR technology. Digital documents and associated metadata can easily be stored in the ECM in a variety of formats. The most popular of these formats is PDF which not only provides an accurate representation of the document but also allows all the OCR text in the document to be stored behind the PDF image. This format is known as PDF with hidden text or text-searchable PDF. This allows users to search for documents by using keywords in the metadata fields or by searching the content of PDF files across the repository. ==== Advantages of scanning documents into a ECM/DMS ==== Information held on paper is usually just as valuable to organisations as the electronic documents that are generated internally. Often this information represents a large proportion of the day to day correspondence with suppliers and customers. Having the ability to manage and share this information internally through a document management system such as SharePoint or a CMIS-compatible repository improves collaboration between departments or employees and also eliminates the risk of losing this information through disasters such as floods or fire. Organisations adopting an ECM/DMS often implement electronic workflow which allows the information held on paper to be included as part of an electronic business process and incorporated into a customer record file along with other associated office documents and emails. For business critical documents, such as purchase orders and supplier invoices, digitising documents helps speed up business transactions as well as reduce manual effort involved in keying data into business systems, such as CRM, ERP and Accounting. Scanned invoices can also be routed to managers for payment approval via email or an electronic workflow. 
== Electronic document capture == In the earlier implementations of Document Capture Software, the technology focused solely on the digitization and capture of information from paper documents. Document images were acquired from document scanners via TWAIN/ISIS drivers. Only image-based file formats like TIF, JPG, and BMP were typically compatible with these solutions. But in recent years, as the volume of electronically-created documents and the number of proprietary file formats continue to increase at exponential rates, the need for handling documents existing in electronic formats has grown. The relevant document capture products have adapted to function with non-image file formats with the end-goal of creating a unified processing workflow capable of handling all incoming documents. The ability to import files from a variety of sources is one example of such adaptation. Importing documents from ECM/DMS software solutions, email servers, FTP, and EDI is now as much of a requirement of document capture software as is paper capture. The normalization of output files to text-based PDF format is now another critical factor in long-term archival of proprietary electronic file formats. Normalization expands access and usage of files to users throughout the enterprise, rather than only those who created the original electronic file.

    Read more →
  • Xulvi-Brunet–Sokolov algorithm

    Xulvi-Brunet–Sokolov algorithm

    Xulvi-Brunet and Sokolov's algorithm generates networks with chosen degree correlations. This method is based on link rewiring, in which the desired degree correlation is governed by the parameter ρ. By varying this single parameter it is possible to generate networks ranging from random (when ρ = 0) to perfectly assortative or disassortative (when ρ = 1). The algorithm keeps the network's degree distribution unchanged as the value of ρ changes. == Assortative model == In assortative networks, well-connected nodes are likely to be connected to other highly connected nodes. Social networks are examples of assortative networks. In a strongly assortative network, almost all nodes with the same degree are linked only among themselves. The Xulvi-Brunet–Sokolov algorithm for this type of network is as follows. In a given network, two links connecting four different nodes are chosen randomly. These nodes are ordered by their degrees. Then, with probability ρ, the links are rewired in such a way that one link connects the two nodes with the smaller degrees and the other connects the two nodes with the larger degrees. If one or both of the new links already exist in the network, the step is discarded and repeated. Thus, there will be no self-connected nodes or multiple links connecting the same two nodes. Different degrees of assortativity of a network can be achieved by changing the parameter ρ. Assortative networks are characterized by highly connected groups of nodes with similar degree. As assortativity grows, the average path length and clustering coefficient increase. == Disassortative model == In disassortative networks, highly connected nodes tend to connect to less-well-connected nodes with larger probability than in uncorrelated networks. Examples of such networks include biological networks. 
The Xulvi-Brunet–Sokolov algorithm for this type of network is similar to the one for assortative networks, with one minor change. As before, two links connecting four nodes are chosen at random and the nodes are ordered by their degrees. In this case, however, with probability ρ the links are rewired such that one link connects the node of highest degree with the node of lowest degree and the other link connects the two remaining nodes; with probability 1 − ρ the links are rewired randomly. As before, if the new links already exist, the step is discarded and repeated. The algorithm does not change the degree of any node and thus preserves the degree distribution of the network.
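The rewiring step for both models can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the edge-list representation, and the choice to count a discarded step against the `steps` budget are all hypothetical, and the random rewiring taken with probability 1 − ρ is implemented here as a random re-pairing of the four chosen nodes.

```python
import random


def degree_sequence(edges):
    """Sorted list of node degrees for an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sorted(degree.values())


def xbs_rewire(edges, rho, steps, assortative=True, seed=None):
    """Apply `steps` rewiring attempts to a simple undirected graph."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    for _ in range(steps):
        i, j = rng.sample(range(len(edges)), 2)
        nodes = [*edges[i], *edges[j]]
        if len(set(nodes)) < 4:
            continue  # the two links must involve four different nodes
        if rng.random() < rho:
            a, b, c, d = sorted(nodes, key=lambda n: degree[n])
            if assortative:
                new = [(a, b), (c, d)]   # two smallest together, two largest together
            else:
                new = [(a, d), (b, c)]   # lowest with highest, then the remaining two
        else:
            rng.shuffle(nodes)           # assumed form of the random rewiring
            new = [(nodes[0], nodes[1]), (nodes[2], nodes[3])]

        old = {frozenset(edges[i]), frozenset(edges[j])}
        new_sets = [frozenset(e) for e in new]
        # Discard the step if it would duplicate a link elsewhere in the network.
        if any(s in present and s not in old for s in new_sets):
            continue
        present -= old
        present.update(new_sets)
        edges[i], edges[j] = new
    return edges


# Small example graph (invented): a 12-node ring plus three chords.
RING = [(i, (i + 1) % 12) for i in range(12)] + [(0, 6), (3, 9), (0, 3)]
REWIRED = xbs_rewire(RING, rho=0.8, steps=500, assortative=True, seed=1)
```

Because each of the four nodes gives up exactly one link and receives exactly one, every node's degree is unchanged after any accepted step, which is why the degree distribution is preserved.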

    Read more →
  • Semantic query

    Semantic query

    Semantic queries allow for queries and analytics of associative and contextual nature. Semantic queries enable the retrieval of both explicitly and implicitly derived information based on syntactic, semantic and structural information contained in data. They are designed to deliver precise results (possibly the distinctive selection of one single piece of information) or to answer more fuzzy and wide open questions through pattern matching and digital reasoning. Semantic queries work on named graphs, linked data or triples. This enables the query to process the actual relationships between information and infer the answers from the network of data. This is in contrast to semantic search, which uses semantics (meaning of language constructs) in unstructured text to produce a better search result. (See natural language processing.) From a technical point of view, semantic queries are precise relational-type operations much like a database query. They work on structured data and therefore have the possibility to utilize comprehensive features like operators (e.g. >, < and =), namespaces, pattern matching, subclassing, transitive relations, semantic rules and contextual full text search. The semantic web technology stack of the W3C is offering SPARQL to formulate semantic queries in a syntax similar to SQL. Semantic queries are used in triplestores, graph databases, semantic wikis, natural language and artificial intelligence systems. == Background == Relational databases represent all relationships between data in an implicit manner only. For example, the relationships between customers and products (stored in two content-tables and connected with an additional link-table) only come into existence in a query statement (SQL in the case of relational databases) written by a developer. Writing the query demands exact knowledge of the database schema. Linked-Data represent all relationships between data in an explicit manner. 
In the above example, no query code needs to be written. The correct product for each customer can be fetched automatically. Whereas this simple example is trivial, the real power of linked-data comes into play when a network of information is created (customers with their geo-spatial information like city, state and country; products with their categories within sub- and super-categories). Now the system can automatically answer more complex queries and analytics that look for the connection of a particular location with a product category. The development effort for this query is omitted. Executing a semantic query is conducted by walking the network of information and finding matches (also called Data Graph Traversal). Another important aspect of semantic queries is that the type of the relationship can be used to incorporate intelligence into the system. The relationship between a customer and a product has a fundamentally different nature than the relationship between a neighbourhood and its city. The latter enables the semantic query engine to infer that a customer living in Manhattan is also living in New York City whereas other relationships might have more complicated patterns and "contextual analytics". This process is called inference or reasoning and is the ability of the software to derive new information based on given facts. == Articles == Velez, Golda (2008). "Semantics Help Wall Street Cope With Data Overload". Wall Street & Technology. wallstreetandtech.com. Zhifeng, Xiao (2009). "Spatial information semantic query based on SPARQL". In Liu, Yaolin; Tang, Xinming (eds.). International Symposium on Spatial Analysis, Spatial-Temporal Data Modeling, and Data Mining. Vol. 7492. SPIE. pp. 74921P. Bibcode:2009SPIE.7492E..60X. doi:10.1117/12.838556. S2CID 62191842. Aquin, Mathieu (2010). "Watson, more than a Semantic Web search engine" (PDF). Semantic Web Journal. Dworetzky, Tom (2011). 
"How Siri Works: iPhone's 'Brain' Comes from Natural Language Processing". International Business Times. Horwitt, Elisabeth (2011). "The semantic Web gets down to business". computerworld.com. Rodriguez, Marko (2011). "Graph Pattern Matching with Gremlin". Marko A. Rodriguez. markorodriguez.com on Graph Computing. Sequeda, Juan (2011). "SPARQL Nuts & Bolts". Cambridge Semantics. Freitas, Andre (2012). "Querying Heterogeneous Datasets on the Linked Data Web" (PDF). IEEE Internet Computing. Kauppinen, Tomi (2012). "Using the SPARQL Package in R to handle Spatial Linked Data". linkedscience.org. Lorentz, Alissa (2013). "With Big Data, Context is a Big Issue". Wired.

    Read more →
  • Decision list

    Decision list

    Decision lists are a representation for Boolean functions which can be easily learned from examples. Single-term decision lists are more expressive than disjunctions and conjunctions; however, 1-term decision lists are less expressive than the general disjunctive normal form and the conjunctive normal form. The language specified by a k-length decision list includes as a subset the language specified by a k-depth decision tree. Learning decision lists can be used for attribute efficient learning, a type of machine learning. == Definition == A decision list (DL) of length r is of the form: if f1 then output b1 else if f2 then output b2 ... else if fr then output br where fi is the ith formula and bi is the ith boolean for i ∈ {1, ..., r}. The last if-then-else is the default case, which means formula fr is always equal to true. A k-DL is a decision list where all of the formulas have at most k terms. Sometimes "decision list" is used to refer to a 1-DL, where all of the formulas are either a variable or its negation.
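The definition above can be sketched as a short evaluator. This is an illustrative sketch only: the names `literal` and `evaluate_decision_list`, the dictionary representation of an assignment, and the example list are assumptions, and the always-true final formula fr is modelled as a separate `default` output.

```python
def literal(variable, positive=True):
    """Formula for a 1-DL: a single variable or its negation."""
    return lambda assignment: assignment[variable] == positive


def evaluate_decision_list(rules, default, assignment):
    """Return the output of the first rule whose formula holds.

    `rules` is an ordered list of (formula, output) pairs; `default`
    plays the role of the final rule, whose formula is always true.
    """
    for formula, output in rules:
        if formula(assignment):
            return output
    return default


# Example 1-DL: if x1 then True, else if not x2 then False, else True.
EXAMPLE_DL = [
    (literal("x1"), True),
    (literal("x2", positive=False), False),
]
```

For instance, `evaluate_decision_list(EXAMPLE_DL, True, {"x1": False, "x2": False})` falls through the first rule, matches the second, and yields `False`.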

    Read more →
  • BRS/Search

    BRS/Search

    BRS/Search is a full-text database and information retrieval system. BRS/Search uses a fully inverted indexing system to store, locate, and retrieve unstructured data. It was the search engine that in 1977 powered Bibliographic Retrieval Services (BRS) commercial operations with 20 databases (including the first national commercial availability of MEDLINE); it has changed ownership several times during its development and is currently sold as Livelink ECM Discovery Server by Open Text Corporation. == Early development == Development on what was to become BRS began as Biomedical Communications Network (BCN) at the State University of New York at Albany (SUNY). BCN, which went online in 1968, provided on-line access to nine databases, including MEDLINE and BIOSIS Previews, to large universities and medical schools primarily in the Northeast of the USA. State funding for the project was withdrawn in 1975, and Bibliographic Retrieval Services (BRS) was formed as a non-profit concern the following year. It was incorporated in May 1976 as a for-profit corporation with Ron Quake as president, Jan Egeland as vice president in charge of marketing and training, and Lloyd Palmer as vice president of systems. == BRS commercial operations == In December 1976, the First BRS User Meeting was held in Syracuse, New York, and by January 1977 BRS started commercial operations with 20 databases (including the first national commercial availability of MEDLINE) and 9 million records, using modified IBM STAIRS (STorage And Information Retrieval System) software, Telenet for telecommunications, and timesharing mainframe computers of Carrier Corporation. In October 1980 BRS was sold by Egeland and Quake to Indian Head, Inc., a subsidiary of the Dutch company Thyssen-Bornemisza Group. 
== 1989–1993 == In 1989 Robert Maxwell acquired BRS and the BRS/Search software; he announced the planned incorporation of the ORBIT Search Service and BRS Information Technologies and renamed the whole group Maxwell Online, Inc. At that time BRS Information Technologies was serving the medical and academic library marketplace with over 150 databases. Maxwell later bought the publishing company Macmillan and put Maxwell Online under Macmillan. In the same year BRS/LINK (hypertext connection of databases; first application delivering full text) was announced. The initial BRS/LINK application "relates the citation in a bibliographic database to its full-text article in a second database," and "eliminates the need to re-execute a search strategy in the second database in order to find the corresponding full-text article." Initially BRS/LINK supported linking only selected bibliographic databases: MEDLINE, Health Planning and Administration, and MEDLINE References on AIDS to the full-text Comprehensive Core Medical Library. At the time of Robert Maxwell’s death in 1991, Macmillan brought in Andrew Gregory to represent the company during the 2 years that Maxwell’s affairs were being settled and to prepare Maxwell Online to be able to sell the components. Maxwell Online shortly thereafter underwent yet another name change, this time to InfoPro Technologies. == Dataware Technologies ownership of BRS/SEARCH == Early in 1994, InfoPro Technologies, a subsidiary of MHC Inc. (holding company for Macmillan Inc.), the former Maxwell Online service, sold off all its subsidiaries. ORBIT Search Services went to the French-owned Questel, the dial-up BRS Search Services to CD Plus Technologies (later to become OVID), and BRS Software Products (including BRS/SEARCH) to Dataware Technologies. Almost up to the end of InfoPro Technologies, BRS Software had been the fastest growing segment of the company. 
At the 14th BRS North American Users Group Conference in 1999, Dave Schubmehl of Dataware Technologies presented a paper in which he stated: "The purpose of this presentation is to update BRS users on upcoming releases of BRS/Search, NetAnswer, and other Dataware products. BRS/Search 7.0 will include features specifically requested by customers, as well as other enhancements. Earlier this year, Dataware acquired Sovereign Hill Software, makers of InQuery. In light of that acquisition, and Dataware's other development projects, we'll look at Dataware's plans for all products, including BRS/Search and NetAnswer." == Open Text acquisition of BRS/Search == In 2001 BRS/Search was acquired by Open Text and became Livelink ECM Discovery Server. It is now referred to as Open Text Discovery Server. Open Text still supports both BRS/Search and NetAnswer. The core BRS/Search technology in the Open Text portfolio was augmented with other capabilities through various acquisitions. For example, Dataware's acquisition of Sovereign Hill brought InQuery, “a probabilistic information retrieval system using an inference network” developed by the University of Massachusetts Amherst Center for Intelligent Information Retrieval (CIIR), out of the UMass CIIR and into the marketplace. A product re-branding table shows the range of products, their old names and their new names. InQuery is a concept search engine that uses noun phrases, parts of speech and other co-occurrence relationships in overlapping passages of text rather than single-term inverted indexes of words in documents. Open Text's portfolio has grown to include Hummingbird Content Management, and has always included BASIS. == 2003 == The BRS/Search North America User's Group (BRSNAUG) disincorporated in 2003, and cross-references to BRS/Search on the World Wide Web point to Open Text Livelink. The BRSNAUG website, dated June 8, 2003, listed the following features for BRS/Search.
Engine features include:
      • Rapid query response time
      • Numerical data handling and elementary statistical processing (sum, avg, min, max)
      • Search results weighting and relevancy ranking
      • Left- and right-truncation and expansion of search terms
      • Superior data compression – loaded databases typically use only about 1.5 times the input stream size in disk space
      • Large capacity databases – up to 100 million documents, each with up to 65,000 paragraphs
      • Fine control of indexing and searching – right down to the word, sentence, and paragraph level
      • Fine control over data security: document access can be controlled at the database, document, and paragraph level
      • International language support for all 7/8-bit character sets and customizable language tables
      • Flexible and customizable stop word lists
      • ANSI-compatible thesauri
      • Hypertext links within and between documents and databases (R6.x)
      • Support for natural language parsing of queries
      • Automatic document summarization tools
      • Client/Server development
      • Programming interfaces for World-Wide Web (HTTP, HTML) access to databases
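Several of the listed features (inverted indexing down to the paragraph and word level, customizable stop word lists, and right-truncation of search terms) can be illustrated with a minimal Python sketch. This is not BRS/Search code or its API: the function names `build_index` and `search`, the sample documents, the stop word list, and the use of a trailing `$` as the truncation marker are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stop word list; a real system would make this configurable.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def build_index(documents):
    """Map each non-stop word to {doc_id: [(paragraph_no, word_no), ...]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, paragraphs in documents.items():
        for p_no, paragraph in enumerate(paragraphs):
            words = re.findall(r"[a-z0-9]+", paragraph.lower())
            for w_no, word in enumerate(words):
                if word not in STOP_WORDS:
                    index[word][doc_id].append((p_no, w_no))
    return index


def search(index, term):
    """Return sorted ids of documents matching a term.

    A trailing '$' (an illustrative marker, assumed here) requests
    right-truncation: every indexed word starting with the prefix matches.
    """
    term = term.lower()
    if term.endswith("$"):
        prefix = term[:-1]
        matches = [word for word in index if word.startswith(prefix)]
    else:
        matches = [term] if term in index else []
    doc_ids = set()
    for word in matches:
        doc_ids.update(index[word])
    return sorted(doc_ids)


# Made-up sample data: each document is a list of paragraphs.
DOCS = {
    "d1": ["Searching MEDLINE citations.", "Full text retrieval of articles."],
    "d2": ["The search engine indexes every word."],
    "d3": ["Retrieval of unstructured data."],
}

index = build_index(DOCS)
print(search(index, "search$"))    # matches "searching" and "search"
print(search(index, "retrieval"))  # exact-term lookup
print(search(index, "the"))        # stop words are never indexed
```

Because each posting records a paragraph number and a word position, the same structure could in principle support paragraph-level restrictions or proximity operators; those are omitted here to keep the sketch short.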

    Read more →
  • CENDI

    CENDI

    CENDI (Commerce, Energy, NASA, Defense Information Managers Group) is an interagency group of senior Scientific and Technical Information (STI) managers from 14 United States federal agencies. CENDI managers cooperate by exchanging information and ideas, collaborating to address common issues, and undertaking joint initiatives. CENDI's accomplishments range from impacting federal information policy to educating a broad spectrum of stakeholders on all aspects of federal STI systems, including its value to research and the taxpayer, and to operational improvements in agency and interagency STI operations. == History == CENDI traces its roots to the Committee on Scientific and Technical Information (COSATI) of the Federal Council on Science and Technology. COSATI was established in the early 1960s to coordinate the management of the results from the U.S. government's increasing commitment to scientific research and technology development. The scientific and technical information (STI) managers of the government's major research and development (R&D) agencies worked within COSATI to standardize guidelines for cataloging and indexing technical reports. COSATI ceased formal operations in the early 1970s. To continue the cooperation begun under COSATI, managers of agency STI programs from Commerce (National Technical Information Service), Energy (Office of Scientific and Technical Information), NASA (HQ/STI Division), and Defense (Defense Technical Information Center) began meeting periodically to discuss common topics and stimulate more effective cooperation. In 1985, a Memorandum of Understanding was signed by the four charter agencies and CENDI was established. From this small core of STI managers, CENDI has grown to its current membership, which represents the major science agencies, the national libraries, and agencies involved in the dissemination and long-term management of scientific and technical information. 
The vision of CENDI is to facilitate a cooperative enterprise among federal STI agencies in which capabilities are shared and challenges are faced together, so that the sum of the accomplishments is greater than what each agency could achieve on its own. The abbreviation CENDI refers to the "Commerce, Energy, NASA, Defense Information Managers Group". == Membership == New members from other federal R&D information organizations may be admitted by unanimous agreement of the members. However, it is the intent of the group that membership in CENDI should remain small and focus on organizations with STI or supporting responsibilities. Each agency provides funding to CENDI. == Members == The members of CENDI are:
      • Defense Technical Information Center (United States Department of Defense)
      • Office of Research and Development and Office of Environmental Information (United States Environmental Protection Agency)
      • Government Printing Office
      • Library of Congress
      • NASA Scientific and Technical Information Program
      • National Agricultural Library (United States Department of Agriculture)
      • National Archives and Records Administration
      • National Library of Education (United States Department of Education)
      • National Library of Medicine (United States Department of Health and Human Services)
      • National Science Foundation
      • National Technical Information Service (United States Department of Commerce)
      • National Transportation Library (United States Department of Transportation)
      • Office of Scientific and Technical Information (United States Department of Energy)
      • USGS/Biological Resources Discipline (United States Department of the Interior)
== Mission and operation == CENDI's mission is to help improve the productivity of federal science- and technology-based programs through effective scientific, technical, and related information support systems. In fulfilling its mission, CENDI agencies play an important role in addressing science- and technology-based national priorities and strengthening U.S. competitiveness. === Goals ===
      • STI Coordination and Leadership: Provide coordination and leadership for information exchange on important STI policy issues.
      • Improvement of STI Systems: Promote the development of improved STI systems through the productive interrelationship of content and technology.
      • STI Understanding: Promote better understanding of STI and STI management.
=== Principals and Alternates === CENDI is made up of senior federal STI managers, and each organization appoints a Principal representative. This person is the point of contact for that organization within CENDI. Each Principal has an Alternate. The Principals and Alternates make up the main group, which meets on a regular basis, usually every other month. === Secretariat === A Tennessee-based information management company, Information International Associates, Inc., currently serves as the CENDI Secretariat. The Secretariat handles day-to-day operations for CENDI. It prepares the necessary materials for the Principals' meetings, provides support for the working group and task group meetings, assists in developing papers, and maintains the CENDI files and outreach tools. === Task Groups and Working Groups === The chair or chairs of each working group are appointed by the Principals and have overall responsibility for the group's activities. The Secretariat provides support at the request of the working group chairs.
The Working Groups and Task Groups that are currently operating are:
      • Copyright and Intellectual Property Working Group
      • Distribution Markings Task Group
      • Digital Preservation Task Group
      • Digitization Specifications Task Group
      • Image Metadata Task Group
      • Science.gov (see below)
      • STI Policy Working Group
      • Terminology Resources Task Group
=== Science.gov and Worldwidescience.org === In 2001, in response to the April 2001 workshop on "Strengthening the Public Information Infrastructure for Science", and taking into consideration a request from Firstgov (now USA.gov) to develop specialized topical portals, CENDI formed an alliance to develop an interagency website for access to STI. This website, called Science.gov, is a one-stop source of STI, including both selected, authoritative government websites and deep Web databases of technical reports, journal articles, conference proceedings, and other published materials. The site's content and architecture are developed through the volunteer efforts of members, involving over 100 staff. The Science.gov website is hosted by the Department of Energy (DOE) Office of Scientific and Technical Information (OSTI). The site was formally launched in December 2002. As a result of the success of Science.gov, under DOE leadership and in cooperation with the International Council for Scientific and Technical Information, a worldwide coordination across national portals called WorldWideScience was launched in 2008. === Work with non-member organizations === CENDI works with several cooperating non-member organizations on a regular basis. These organizations span academia, the federal government, legal and policy analysis, and international, non-governmental, and private organizations.

    Read more →