AI Code Visualizer

AI Code Visualizer — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Couchbase Server

    Couchbase Server

    Couchbase Server, originally known as Membase, is a source-available, distributed (shared-nothing architecture) multi-model NoSQL document-oriented database software package optimized for interactive applications. These applications may serve many concurrent users by creating, storing, retrieving, aggregating, manipulating and presenting data. In support of these kinds of application needs, Couchbase Server is designed to provide easy-to-scale key-value or JSON document access with low latency and high sustained throughput. It is designed to be clustered from a single machine to very large-scale deployments spanning many machines. Couchbase Server provided client protocol compatibility with memcached, but added disk persistence, data replication, live cluster reconfiguration, rebalancing and multitenancy with data partitioning. == Product history == Membase was developed by several leaders of the memcached project, who had founded a company, NorthScale, to develop a key-value store with the simplicity, speed, and scalability of memcached, but also the storage, persistence and querying capabilities of a database. The original membase source code was contributed by NorthScale, and project co-sponsors Zynga and Naver Corporation (then known as NHN) to a new project on membase.org in June 2010. On February 8, 2011, the Membase project founders and Membase, Inc. announced a merger with CouchOne (a company with many of the principal players behind CouchDB) with an associated project merger. The merged company was called Couchbase, Inc. In January 2012, Couchbase released Couchbase Server 1.8. In September 2012, Orbitz said it had changed some of its systems to use Couchbase. In December 2012, Couchbase Server 2.0 (announced in July 2011) was released and included a new JSON document store, indexing and querying, incremental MapReduce and replication across data centers. 
== Architecture == Every Couchbase node consists of a data service, index service, query service, and cluster manager component. Starting with the 4.0 release, the three services can be distributed to run on separate nodes of the cluster if needed. In the parlance of Eric Brewer's CAP theorem, Couchbase is normally a CP type system meaning it provides consistency and partition tolerance, or it can be set up as an AP system with multiple clusters. === Cluster manager === The cluster manager supervises the configuration and behavior of all the servers in a Couchbase cluster. It configures and supervises inter-node behavior like managing replication streams and re-balancing operations. It also provides metric aggregation and consensus functions for the cluster, and a RESTful cluster management interface. The cluster manager uses the Erlang programming language and the Open Telecom Platform. ==== Replication and fail-over ==== Data replication within the nodes of a cluster can be controlled with several parameters. In December of 2012, support was added for replication between different data centers. === Data manager === The data manager stores and retrieves documents in response to data operations from applications. It asynchronously writes data to disk after acknowledging to the client. In version 1.7 and later, applications can optionally ensure data is written to more than one server or to disk before acknowledging a write to the client. Parameters define item ages that affect when data is persisted, and how max memory and migration from main-memory to disk is handled. It supports working sets greater than a memory quota per "node" or "bucket". External systems can subscribe to filtered data streams, supporting, for example, full text search indexing, data analytics or archiving. ==== Data format ==== A document is the most basic unit of data manipulation in Couchbase Server. Documents are stored in JSON document format with no predefined schemas. 
Non-JSON documents can also be stored in Couchbase Server (binary, serialized values, XML, etc.). ==== Object-managed cache ==== Couchbase Server includes a built-in multi-threaded object-managed cache that implements memcached-compatible APIs such as get, set, delete, append, and prepend. ==== Storage engine ==== Couchbase Server has a tail-append storage design that is immune to data corruption, OOM killers or sudden loss of power. Data is written to the data file in an append-only manner, which enables Couchbase to do mostly sequential writes for updates and provides optimized access patterns for disk I/O. === Performance === A performance benchmark done by Altoros in 2012 compared Couchbase Server with other technologies. Cisco Systems published a benchmark that measured the latency and throughput of Couchbase Server with a mixed workload in 2012. == Licensing and support == Couchbase Server is a packaged version of Couchbase's open source software technology. It is available in a community edition, without recent bug fixes, under an Apache 2.0 license, and in an edition for commercial use. Couchbase Server builds are available for Ubuntu, Debian, Red Hat, SUSE, Oracle Linux, Microsoft Windows and macOS operating systems. Couchbase has supported software development kits for the programming languages .NET, PHP, Ruby, Python, C, Node.js, Java, Go, and Scala. == SQL++ == A query language called SQL++ (formerly called N1QL) is used for manipulating the JSON data in Couchbase, just as SQL manipulates data in an RDBMS. It has SELECT, INSERT, UPDATE, DELETE, and MERGE statements to operate on JSON data. It was initially announced in March 2015 as "SQL for documents". The SQL++ data model is non-first normal form (N1NF) with support for nested attributes and domain-oriented normalization. The SQL++ data model is also a proper superset and generalization of the relational model. 
=== Example ===
Like query:
    SELECT * FROM `bucket` WHERE email LIKE "%@example.org";
Array query:
    SELECT * FROM `bucket` WHERE ANY x IN friends SATISFIES x.name = "Pavan" END;
== Couchbase Mobile == Couchbase Mobile / Couchbase Lite is a mobile database providing data replication. Couchbase Lite (originally TouchDB) provides native libraries for offline-first NoSQL databases with built-in peer-to-peer or client-server replication mechanisms. Sync Gateway manages secure access and synchronization of data between Couchbase Lite and Couchbase Server. Couchbase Lite added support for vector search in version 3.2, allowing cloud-to-edge support for vector search in mobile applications. == Uses == Couchbase began as an evolution of Memcached, a high-speed data cache, and can be used as a drop-in replacement for Memcached, providing high availability for memcached applications without code changes. Couchbase is used to support applications where a flexible data model, easy scalability, and consistent high performance are required, such as tracking real-time user activity or providing a store of user preferences or online applications. Couchbase Mobile, which stores data locally on devices (usually mobile devices), is used to create “offline-first” applications that can operate when a device is not connected to a network and synchronize with Couchbase Server once a network connection is re-established. The Catalyst Lab at Northwestern University uses Couchbase Mobile to support the Evo application, a healthy lifestyle research program where data is used to help participants improve dietary quality, physical activity, stress, or sleep. Amadeus uses Couchbase with Apache Kafka to support their “open, simple, and agile” strategy to consume and integrate data on loyalty programs for airline and other travel partners. High scalability is needed when disruptive travel events create a need to recognize and compensate high-value customers. 
Starting in 2012, it played a role in LinkedIn's caching systems, including backend caching for recruiter and jobs products, counters for security defense mechanisms, and internal applications. == Alternatives == For caching, Couchbase competes with Memcached and Redis. For document databases, Couchbase competes with other document-oriented database systems. It is commonly compared with MongoDB, Amazon DynamoDB, Oracle RDBMS, DataStax, Google Bigtable, MariaDB, IBM Cloudant, Redis Enterprise, SingleStore, and MarkLogic.
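
The two SQL++ example queries above can be read as ordinary filters over JSON documents. The following is a minimal Python sketch of that reading, not Couchbase code: the sample documents and the `like` helper are made up for illustration, and the helper only approximates SQL `LIKE` (`%` as any run of characters, `_` as one character).

```python
import re


def like(value, pattern):
    """Rough stand-in for SQL LIKE: % matches any run, _ matches one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    regex = "".join(parts)
    # Missing or non-string fields never match, mirroring a filter that skips them.
    return isinstance(value, str) and re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Hypothetical documents standing in for the contents of `bucket`.
docs = [
    {"email": "ann@example.org", "friends": [{"name": "Pavan"}, {"name": "Lee"}]},
    {"email": "bob@example.com", "friends": [{"name": "Pavan"}]},
    {"email": "cy@example.org", "friends": []},
    {"name": "no-email-or-friends"},
]

# WHERE email LIKE "%@example.org"
like_hits = [d for d in docs if like(d.get("email"), "%@example.org")]

# WHERE ANY x IN friends SATISFIES x.name = "Pavan" END
array_hits = [
    d for d in docs
    if any(x.get("name") == "Pavan" for x in d.get("friends", []))
]

print([d["email"] for d in like_hits])
print([d["email"] for d in array_hits])
```

The `ANY ... SATISFIES ... END` form maps naturally onto Python's `any()` over the nested array, which is why documents without a `friends` field simply fail the test here.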

  • Query language

    Query language

    A query language, also known as data query language or database query language (DQL), is a computer language used to make queries in databases and information systems. In database systems, query languages rely on strict theory to retrieve information. A well known example is the Structured Query Language (SQL). == Types == Broadly, query languages can be classified according to whether they are database query languages or information retrieval query languages. The difference is that a database query language attempts to give factual answers to factual questions, while an information retrieval query language attempts to find documents containing information that is relevant to an area of inquiry. Other types of query languages include: Full-text. The simplest query language is treating all terms as bag of words that are to be matched with the postings in the inverted index and where subsequently ranking models are applied to retrieve the most relevant documents. Only tokens are defined in the CFG. Web search engines often use this approach. Boolean. A query language that also supports the use of the Boolean operators AND, OR, NOT. Structured. A language that supports searching within (a combination of) fields when a document is structured and has been indexed using its document structure. Natural language. A query language that supports natural language by parsing the natural language query to a form that can be best used to retrieve relevant documents, for example with Question answering systems or conversational search. == Examples == Attempto Controlled English is a query language that is also a controlled natural language. AQL is a query language for the ArangoDB native multi-model database system. .QL is a proprietary object-oriented query language for querying relational databases; successor of Datalog. CodeQL is the analysis engine used by developers to automate security checks, and by security researchers to perform variant analysis on GitHub. 
Contextual Query Language (CQL) is a formal language for representing queries to information retrieval systems such as web indexes or bibliographic catalogues. Cypher is a query language for the Neo4j graph database. DMX is a query language for data mining models. Datalog is a query language for deductive databases. F-logic is a declarative object-oriented language for deductive databases and knowledge representation. FQL enables you to use a SQL-style interface to query the data exposed by the Graph API. It provides advanced features not available in the Graph API. Gellish English is a language that can be used for queries in Gellish English Databases, for dialogues (requests and responses) as well as for information modeling and knowledge modeling. Gremlin is an Apache Software Foundation graph traversal language for OLTP and OLAP graph systems. GraphQL is a data query language developed by Facebook as an alternative to REST and ad-hoc webservice architectures. HTSQL is a query language that translates HTTP queries to SQL. ISBL is a query language for PRTV, one of the earliest relational database management systems. Jaql is a functional data processing and query language most commonly used for JSON query processing. JPQL is a query language defined as part of Jakarta Persistence (used in Java applications to make queries to a relational DB using entity objects instead of DB tables). jq is a functional programming language often used for processing queries against one or more JSON documents, including very large ones. JSONiq is a declarative query language designed for collections of JSON documents. KQL (Kusto Query Language) is a query language by Microsoft used in Azure Data Explorer. LDAP is an application protocol for querying and modifying directory services running over TCP/IP. LogiQL is a variant of Datalog and is the query language for the LogicBlox system. M Formula language is a mashup query language used in Microsoft's Power Query. 
MQL is a cheminformatics query language for substructure search that allows numerical properties in addition to nominal properties. MDX is a query language for OLAP databases. N1QL is Couchbase's query language for finding data in Couchbase Server. Object Query Language. OCL (Object Constraint Language): despite its name, OCL is also an object query language and an OMG standard. OPath, intended for use in querying WinFS Stores. Poliqarp Query Language is a special query language designed to analyze annotated text. It is used in the Poliqarp search engine. PQL is a special-purpose programming language for managing process models based on information about scenarios that these models describe. PRQL (Pipelined Relational Query Language) is a modern language for transforming data. It consists of a curated set of orthogonal transformations, which are combined to form a pipeline. PTQL is based on relational queries over program traces, allowing programmers to write expressive, declarative queries about program behavior. QUEL is a relational database access language, similar in most ways to SQL. RDQL is an RDF query language. SMARTS is the cheminformatics standard for a substructure search. SPARQL is a query language for RDF graphs. SQL is a well-known query language and data manipulation language for relational databases. XQuery is a query language for XML data sources. XPath is a declarative language for navigating XML documents. YQL is an SQL-like query language created by Yahoo!. Search engine query languages, e.g., as used by Google or Bing.
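
The full-text and Boolean query types described under Types can be illustrated with a toy inverted index. This is a minimal sketch under simplifying assumptions: whitespace tokenization, a ranking model that just counts matched terms, and made-up sample documents. All function names are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each token to the set of document ids that contain it (the postings)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Documents containing every term."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *terms):
    """Documents containing at least one term."""
    return set().union(*(index.get(t, set()) for t in terms))


def boolean_not(index, all_ids, term):
    """Documents that do not contain the term."""
    return set(all_ids) - index.get(term, set())


def rank_bag_of_words(index, query):
    """Full-text style: score each document by how many query tokens it matches."""
    scores = defaultdict(int)
    for token in query.lower().split():
        for doc_id in index.get(token, set()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = {
    1: "graph query language",
    2: "relational query language",
    3: "graph traversal",
}
index = build_index(docs)

print(boolean_and(index, "graph", "query"))
print(boolean_or(index, "graph", "relational"))
print(boolean_not(index, docs.keys(), "graph"))
print(rank_bag_of_words(index, "graph query"))
```

Real search engines replace the term-count score with a proper ranking model, but the lookup against postings lists has the same shape.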

  • Agentic commerce

    Agentic commerce

    Agentic commerce (also referred to as agent-based commerce) describes an emerging form of e-commerce in which autonomous artificial intelligence (AI) agents independently execute purchasing and payment processes on behalf of users or organizations. Unlike conventional digital commerce systems, which require direct human interaction at key decision points, agentic commerce systems are designed to search for products or services, evaluate options, make purchasing decisions, and complete payments without real-time human involvement. An emerging development within the broader fields of e-commerce, fintech, and artificial intelligence, agentic commerce combines advances in generative AI, autonomous agents, application programming interfaces (APIs), and digital payment infrastructures to direct transactions with no direct human interaction. == Characteristics == A defining feature of agentic commerce is the delegation of end-to-end commercial activities to software agents. These agents typically operate according to predefined user preferences, rules, or constraints, such as price limits, quality criteria, delivery times, or preferred payment methods. Based on these parameters, an agent can autonomously perform tasks including product discovery, price comparison, contract selection, order placement, and payment execution. In contrast to decision-support systems, which provide recommendations to human users, agentic commerce systems are designed to act independently. Human involvement may be limited to initial configuration, periodic supervision, or exception handling. == Comparison with traditional and AI-assisted commerce == Traditional e-commerce requires users to manually browse products, select offers, and authorize payments. Generative AI systems used in commerce commonly assist users by answering questions or suggesting options, and do not complete transactions autonomously. 
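
The constraint-driven behaviour described under Characteristics can be sketched in a few lines. This is an illustrative toy, not any vendor's agent API: the `Offer` and `Constraints` types, the sample offers, and the cheapest-eligible selection rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Offer:
    merchant: str
    price: float
    delivery_days: int
    rating: float


@dataclass(frozen=True)
class Constraints:
    """User-defined limits the agent must respect (hypothetical fields)."""
    max_price: float
    max_delivery_days: int
    min_rating: float


def choose_offer(offers: List[Offer], limits: Constraints) -> Optional[Offer]:
    """Return the cheapest offer that satisfies every constraint, or None."""
    eligible = [
        o for o in offers
        if o.price <= limits.max_price
        and o.delivery_days <= limits.max_delivery_days
        and o.rating >= limits.min_rating
    ]
    if not eligible:
        # Nothing qualifies: an exception case to hand back to a human.
        return None
    # Cheapest first; faster delivery breaks price ties.
    return min(eligible, key=lambda o: (o.price, o.delivery_days))


offers = [
    Offer("A", 19.99, 5, 4.6),
    Offer("B", 17.50, 9, 4.8),   # too slow
    Offer("C", 21.00, 2, 4.1),   # too expensive and rated too low
    Offer("D", 18.75, 3, 4.4),
]
limits = Constraints(max_price=20.0, max_delivery_days=5, min_rating=4.3)

selected = choose_offer(offers, limits)
print(selected.merchant if selected else "escalate to user")
```

Returning `None` rather than relaxing a constraint keeps the human in the loop for exception handling, matching the limited supervisory role described above.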
Agentic commerce differs in that decision-making authority is partially or fully transferred to AI agents. As a result, the conventional customer journey, characterized by conscious decision points, may be replaced by continuous, automated micro-decisions performed by software. == Applications and business use cases == Potential applications of agentic commerce include recurring purchases, subscription management, business-to-business procurement, inventory replenishment, and price monitoring. In such contexts, transactions are often predictable and standardized, making them suitable for automation. From a business perspective, agentic commerce systems may be used to optimize supply chains, manage inventory levels, negotiate prices algorithmically, or execute transactions across multiple platforms. Enterprises adopting the new technology include retailers Walmart, Home Depot, Wayfair and Urban Outfitters, and ad tech DSPs, including Google Ads, Amazon, and Yahoo. Chinese tech firms are using apps to provide full-service shopping and payment tools. These include Alibaba, Tencent, and ByteDance, which are currently developing AI-powered shopping apps. The Qwen AI chatbot allows users to complete transactions directly within its interface. US firms are still leading in developing AI models, but integration is slower due to privacy restrictions. == Payments and technical infrastructure == Agentic commerce relies on digital payment systems capable of supporting automated, machine-initiated transactions, including API-based payment processing, tokenization, real-time authorization, and continuous risk monitoring. Typical user interfaces, such as shopping carts, may be replaced by backend integrations between AI agents, merchants, and payment service providers. For example, in 2025, Alibaba launched Alipay AI Pay, which grew and began operating as an application for different retailers. 
In December 2025, Alipay teamed up with Rokid to enable developers to integrate AI payments into AI agents on Rokid's Lingzhu platform. In January 2025, Alipay unveiled the Agentic Commerce Trust Protocol in partnership with Alibaba's consumer AI applications, such as the Qwen App and Taobao Instant Commerce. Qwen adopted the platform first, connecting it to Taobao Instant Commerce and Alipay AI Pay. Users could use Qwen's agentic feature to place food and drink orders within the application instead of having to click outside to an external browser. For merchants, participation in agentic commerce may require products and services to be presented in structured, machine-readable formats to ensure discoverability and interoperability with autonomous agents. == Universal Commerce Protocol (UCP) == In January 2026, Google announced the Universal Commerce Protocol (UCP), an open-source web standard intended to enable interoperability between AI agents and retail systems across the shopping journey, from discovery and checkout to post-purchase support. UCP makes use of REST, JSON-RPC transports, and support for Agent Payments Protocol (AP2), Agent2Agent (A2A), and Model Context Protocol (MCP). == Legal, regulatory, and security considerations == The use of autonomous agents in commerce raises legal and regulatory questions, particularly regarding authorization, liability, consumer protection, and fraud prevention. Existing payment and contract frameworks are generally based on human decision-makers, and their applicability to autonomous agents remains an area of active discussion. Open issues include responsibility for unauthorized or erroneous transactions, mechanisms for dispute resolution, standards for agent authentication, and compliance with data protection and financial regulations. Continuous, automated transaction patterns may also require new approaches to security and risk assessment. 
Traditional fraud models centered on identity verification may be insufficient for agentic commerce, and merchants may need intent-based detection methods using machine learning and behavioral analysis to distinguish legitimate AI agents from malicious automation. === Governance frameworks === The deployment of autonomous AI agents in commercial environments has prompted the development of dedicated governance frameworks. These aim to define operational boundaries, decision authority, oversight mechanisms, and accountability structures for agentic systems. The Agentic Commerce Framework (ACF), created in 2025 by Vincent Dorange, is a governance standard that structures the deployment of autonomous AI agents around four founding principles (Decision Sovereignty, Governance by Design, Ultimate Human Control, Traceable Accountability), four operational layers, and 18 governance KPIs. In January 2026, Singapore's Infocomm Media Development Authority (IMDA) published the Model AI Governance Framework for Agentic AI, extending its existing AI governance guidelines to address agent-specific risks including delegation chains and multi-agent coordination. The Cloud Security Alliance (CSA) has also proposed an Agentic Trust Framework applying zero-trust principles to AI agent governance. == Ecosystem and implementation == The adoption of agentic commerce typically requires changes in commerce architecture, data modeling, identity and permissions, and API-based orchestration of checkout and post-purchase workflows. Management consultancies have identified agentic commerce as a structural evolution of digital commerce, emphasizing the role of AI-driven agents in automating discovery, decision-making, and transaction processes across commerce systems. McKinsey & Company has described agentic commerce as a significant shift in how consumers interact with brands and how enterprises design their commerce operating models. 
In Europe, this ecosystem also includes digital commerce consultancies specializing in the adoption of agentic commerce. Consulting firms such as Horrea support brands in understanding and implementing the technological and organizational shifts associated with agentic commerce. == Market development and outlook == Agentic commerce is generally regarded as an early-stage development. Industry analysts have projected that AI-driven agents could account for a small but growing share of digital payment transactions within the coming years. Due to the scale of global digital commerce, even limited adoption could represent substantial transaction volumes. Analysts expect that by 2029, AI agents could handle between 1% and 4% of all digital payment transactions. With a projected total transaction volume of over $36 trillion a year, even a small share translates into a market worth up to $1.47 trillion. A McKinsey study from October 2025 projects that by 2030 the U.S. business-to-consumer retail market alone could see up to $1 trillion in revenue orchestrated through agentic commerce. On a global scale, the opportunity could range from $3 trillion to $5 trillion. Early experiments and pilot projects have demonstrated both the potential and current limitations of the

  • VMDS

    VMDS

    VMDS (Version Managed Data Store) is the relational database technology provided by GE Energy as part of its Smallworld technology platform. It was designed from the outset to store and analyse the highly complex spatial and topological networks typically used by enterprise utilities such as power distribution and telecommunications. VMDS was originally introduced in 1990 and has been improved and updated over the years. Its current version is 6.0. VMDS has been designed as a spatial database. This gives VMDS a number of distinctive characteristics when compared to conventional attribute-only relational databases. == Distributed server processing == VMDS is composed of two parts: a simple, highly scalable data block server called SWMFS (Smallworld Master File Server) and an intelligent client API written in C and Magik. Spatial and attribute data are stored in data blocks that reside in special files called data store files on the server. When the client application requests data, it has sufficient intelligence to work out the optimum set of data blocks that are required. This request is then made to SWMFS, which returns the data to the client via the network for processing. This approach is particularly efficient and scalable when dealing with spatial and topological data, which tends to flow in larger volumes and require more processing than plain attribute data (for example during a map redraw operation). This approach makes VMDS well suited to enterprise deployment that might involve hundreds or even thousands of concurrent clients. == Support for long transactions == Relational databases support short transactions in which changes to data are relatively small and are brief in terms of duration (the maximum period between the start and the end of a transaction is typically a few seconds or less). 
VMDS supports long transactions in which the volume of data involved in the transaction can be substantial and the duration of the transaction can be significant (days, weeks or even months). These types of transaction are common in advanced network applications used by, for example, power distribution utilities. Due to the time span of a long transaction in this context the amount of change can be significant (not only within the scope of the transaction, but also within the context of the database as a whole). Accordingly, it is likely that the same record might be changed more than once. To cope with this scenario VMDS has inbuilt support for automatically managing such conflicts and allows applications to review changes and accept only those edits that are correct. == Spatial and topological capabilities == As well as conventional relational database features such as attribute querying, join fields, triggers and calculated fields, VMDS has numerous spatial and topological capabilities. This allows spatial data such as points, texts, polylines, polygons and raster data to be stored and analysed. Spatial functions include: find all features within a polygon, calculate the Voronoi polygons of a set of sites and perform a cluster analysis on a set of points. Vector spatial data such as points, polylines and polygons can be given topological attributes that allow complex networks to be modelled. Network analysis engines are provided to answer questions such as find the shortest path between two nodes or how to optimize a delivery route (the travelling salesman problem). A topology engine can be configured with a set of rules that define how topological entities interact with each other when new data is added or existing data edited. == Data abstraction == In VMDS all data is presented to the application as objects. This is different from many relational databases that present the data as rows from a table or query result using say JDBC. 
VMDS provides a data modelling tool and underlying infrastructure as part of the Smallworld technology platform that allows administrators to associate a table in the database with a Magik exemplar (or class). Magik get and set methods for the Magik exemplar can be automatically generated that expose a table's field (or column). Each VMDS row manifests itself to the application as an instance of a Magik object and is known as an RWO (or real world object). Tables are known as collections in Smallworld parlance.

    # all_rwos holds all the rwos in the database and is heterogeneous
    all_rwos << my_application.rwo_set()
    # valves holds the valve collection
    valves << all_rwos.select(:collection, {:valve})
    number_of_valves << valves.size

Queries are built up using predicate objects:

    # find 'open' valves.
    open_valves << valves.select(predicate.eq(:operating_status, "open"))
    number_of_open_valves << open_valves.size
    _for valve _over open_valves.elements()
    _loop
        write(valve.id)
    _endloop

Joins are implemented as methods on the parent RWO. For example, a manager might have several employees who report to him:

    # get the employee collection.
    employees << my_application.database.collection(:gis, :employees)
    # find a manager called 'Steve' and get the first matching element
    steve << employees.select(predicate.eq(:name, "Steve").and(predicate.eq(:role, "manager"))).an_element()
    # display the names of his direct reports. name is a field (or column)
    # on the employee collection (or table)
    _for employee _over steve.direct_reports.elements()
    _loop
        write(employee.name)
    _endloop

Performing a transaction:

    # each key in the hash table corresponds to the name of the field (or column) in
    # the collection (or table)
    valve_data << hash_table.new_with(
        :asset_id, 57648576,
        :material, "Iron")
    # get the valve collection directly
    valve_collection << my_application.database.collection(:gis, :valve)
    # create an insert transaction to insert a new valve record into the collection;
    # a comment can be provided that describes the transaction
    transaction << record_transaction.new_insert(valve_collection, valve_data, "Inserted a new valve")
    transaction.run()
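
For readers unfamiliar with Magik, the predicate-object style of querying shown above can be mimicked in Python. This is only an analogy under stated assumptions: the `Predicate` class, its `eq` and `and_` methods, the `select` helper, and the `Employee` records are hypothetical and are not part of the Smallworld API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass(frozen=True)
class Predicate:
    """A composable test over a record (hypothetical analogue of a Magik predicate)."""
    test: Callable[[Any], bool]

    @staticmethod
    def eq(field: str, value: Any) -> "Predicate":
        # Matches records whose named field equals the given value.
        return Predicate(lambda record: getattr(record, field, None) == value)

    def and_(self, other: "Predicate") -> "Predicate":
        # Both predicates must hold; `and` is a Python keyword, hence `and_`.
        return Predicate(lambda record: self.test(record) and other.test(record))


@dataclass
class Employee:
    name: str
    role: str


def select(collection: Iterable[Any], predicate: Predicate) -> List[Any]:
    """Return the records in the collection that satisfy the predicate."""
    return [record for record in collection if predicate.test(record)]


employees = [
    Employee("Steve", "manager"),
    Employee("Steve", "engineer"),
    Employee("Ana", "manager"),
]

query = Predicate.eq("name", "Steve").and_(Predicate.eq("role", "manager"))
matches = select(employees, query)
print(matches)
```

Building the query as an object before running it is what lets the same predicate be reused across collections, which is the design idea the Magik examples rely on.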

  • Clesh

    Clesh

    Clesh (clip load edit share) is a cloud-based video editing platform, created by Forbidden Technologies plc, designed for consumers, prosumers, and online communities to integrate user-generated content. The core technology is based on FORscene, which is geared towards professionals working in, for example, broadcasting, news media, and post production. Video, audio, and graphical content is uploaded to Clesh via a standard web browser, a mobile device such as a phone or tablet, or desktop software for DV capture over FireWire. The hosted material can then be reviewed, searched, edited, and published online by anyone with a standard web browser or compatible mobile device. Clesh supports storyboard shot selection, frame-accurate editing, transitions and various other functions such as pan, zoom, colour and light correction, and audio levels. Content can be published in formats such as podcast, MPEG-2, and HTML video, or in a proprietary Java format. Cloud-based software provides greater scope for sharing information and collaborating compared to LAN or desktop-based systems. Users of cloud-based software rely on the cloud's owner for adequate security, performance and resilience. Clesh does not assert any rights over uploaded content, in contrast to other platforms (such as YouTube). All rights to any content uploaded to Clesh remain with the author. 
== Features == Some of the services available to Clesh users: access via Java-enabled desktops or Android smartphones or tablets; real-time video rendering including effects and transitions; multiple audio tracks; secured log-on; frame-accurate timeline for fine cut editing; logging / meta-data annotation that assigns text to portions of video (usable by Clesh and web search engines); storyboard that assembles rough cuts using drag-and-drop; import, host, organise and search for media (DV tape and various video, audio, and still image formats); publish content in formats such as podcast, MPEG-2, web (Java Applet), Flash, Ogg, HTML and JPEG; chatrooms to talk to other Clesh users; showreel (a gallery for publishing material visible to internet users); moderation for approval of material prior to distribution downstream; re-branding and integration support for white-label deployment. == Technology == Clesh is based on the same technology as FORscene. An array of servers on the internet backbone provides the cloud computing platform to host Clesh. As a white-label solution, Clesh would be branded and hosted per the client requirement. == User interface == End-users access Clesh on clients such as standard Java-enabled web browsers and/or Android-enabled mobile devices such as tablets and smartphones. == History == Clesh was launched in January 2006 and was subject to several upgrades during the year to extend functionality, including storyboard, podcasting, moderation, chat and a showreel. During 2007 consumers were offered Clesh via a subscription model. Upgrades included Web Start and graphics upload. Mr Paparazzi selected Clesh as the platform to host its video offering, and TrueTube did the same in 2008 by choosing to use Clesh to manage its video portal. Several further upgrades were applied, including better audio quality, image enhancement controls, transitions, fades, titles, and additional publishing options such as JPEG. 
In 2010 a version of Clesh was demonstrated on an Android OS tablet device (Samsung Galaxy S Tab), and several upgrades were applied, including HTML publishing, pan, zoom, and overlays.

  • Artificial intelligence industry in Canada

    Artificial intelligence industry in Canada

    The artificial intelligence industry in Canada is a rapidly expanding sector. Although Canada held a pioneering role in the early development of artificial intelligence, transforming research excellence into broad commercial adoption has proven challenging. Despite globally recognized scientific achievements and a deep pool of skilled experts, by June 2024 Canada recorded the lowest rate of AI integration among OECD countries, with only 12% of firms implementing AI in their products or services. However, AI adoption has shown significant momentum, doubling from 6.1% in mid-2024 to 12.2% in mid-2025. As of September 2025, Statistics Canada indicated that while about one-third of Canadian businesses had no plans to adopt artificial intelligence in the next year, 14.5% reported intentions to begin using AI for producing goods or delivering services. The primary reasons for not moving forward with AI were lack of relevance, insufficient knowledge, and privacy concerns. According to PwC (PricewaterhouseCoopers), the pace of AI adoption in Canada is roughly three-quarters of the United States rate, highlighting a notable gap between the two countries in business integration of this technology. British-Canadian computer scientist Geoffrey Hinton stated in 2025 that Canadian companies are adopting artificial intelligence at a slower pace, which may result in the loss of the country's early advantages in the field. At the "All In AI" conference held in Montreal in September 2025, the Minister of Artificial Intelligence and Digital Innovation, Evan Solomon, described "building digital sovereignty" as the most pressing democratic issue of the time. He introduced a 26-person task force focused on updating Canada's AI strategy.
In its 2024 report "Learning Together for Responsible Artificial Intelligence", Innovation, Science and Economic Development Canada stressed that public awareness, trust, and AI literacy are essential for the responsible adoption and governance of AI in Canada. Montreal workshops in 2021 expanded on the OECD's 2019 definition of AI, describing it as "the set of computer techniques that enable a machine (e.g., a computer or telephone) to perform tasks that typically require intelligence, such as reasoning or learning. It is also referred to as the automation of intelligent tasks. Scientific developments in AI, such as deep-learning techniques, have made it possible to design access to huge amounts of data and ever-increasing computing power. These new techniques have been rapidly deployed on a large scale in all areas of social life, in transport, education, culture and health."

== Federal investments and policy ==

The 2025 federal budget allocates over $1 billion over the next five years to bolster Canada's artificial intelligence and quantum computing ecosystem.

== Industry landscape or research hubs ==

AlexNet, an influential deep convolutional neural network developed at the University of Toronto by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, marked a pivotal turning point in modern artificial intelligence. In 2012, it achieved a dramatic reduction in error rates in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), showcasing the practical power of deep learning and GPU acceleration. The success of AlexNet helped cement Canada's reputation for AI leadership and inspired rapid adoption of deep learning across the technology sector, with ongoing impact in both academic and commercial domains. In healthcare, AlexNet has been adapted for medical imaging to assist with analyzing radiographs, mammograms, and other scans, including identifying abnormalities and supporting clinical diagnosis.
In 2015, the Ottawa-based start-up Advanced Symbolics Inc. (ASI) began developing Polly, an artificial intelligence system designed to analyze and anticipate how target audiences behave, enabling more effective communication strategies and advertising campaigns. Polly was named after its first assignment, analyzing the politics of Brexit. The AI gained widespread attention in 2016 for accurately forecasting both the Brexit referendum and the 2016 U.S. presidential election won by Donald Trump. The company states that Polly is used by organizations in diverse sectors (including healthcare, politics, entertainment, and mental health research) to support decision-making based on predictive analytics. Chartwatch, an AI tool developed in Canada, has been shown to reduce unexpected hospital deaths by 26%, according to a 2024 study. The system analyzes patient data to detect subtle signs of deterioration, supporting healthcare teams in providing timely interventions.

=== Notable figures in AI in Canada ===

Geoffrey Hinton's decades-long work eventually formed the foundation of artificial intelligence and earned him the Nobel Prize in Physics in 2024. Yoshua Bengio, who won the Turing Award in 2018 for his pioneering work in deep learning, founded what would become Mila in 1993. Mila is currently a collaboration between four Montreal-based academic partners. The Pan-Canadian Artificial Intelligence Strategy includes Alberta's Amii, Toronto's Vector Institute, and Mila. Fakhreddine Karray's work on operational AI has had tangible impact across several Canadian-relevant sectors, notably intelligent transportation systems, virtual healthcare, and driver safety.
=== AI in the oil and gas industry === According to a 2020 Ernst & Young report the oil and gas industry in Canada is using AI in automating routine, repetitive, and dangerous tasks with technologies like robotic process automation and machine learning; optimizing production and processing; enhancing transportation logistics; improving equipment operation and monitoring; and enabling preventative maintenance. AI is also deployed for data analysis to improve prediction and decision-making, and is expected to automate up to 50% of job competencies in upstream oil and gas by 2040. Oilsands giant Suncor Energy operates a large fleet of autonomous trucks and has started using AI in its dispatch system at the Mildred Lake mine. As of 2024, AI manages routine tasks such as allocating trucks to dump stations and sending them to refuelling locations. === Indigenous and Inuit Innovation in AI === Indigenous organizations have been working on the creation of new technologies for language revitalization in partnership with National Research Council of Canada since the mid-2010s. In 2025, Inuit researchers and technology partners launched an AI-powered initiative to support the revitalization and preservation of Inuktitut, demonstrating how artificial intelligence can be adapted for Indigenous language and cultural priorities. A 2025 CBC article notes that, while AI can help revitalize Inuktitut, Inuit leaders emphasize concerns about data sovereignty, information ownership, and the need for Indigenous leadership to ensure transparency, privacy, and accountability in AI development. == Regulation == Canada's Artificial Intelligence and Data Act (AIDA) was proposed in November 2022, as part of the Digital Charter Implementation Act (Bill C-27). 
In addition, voluntary codes, such as the September 2023 Code of Conduct for Generative AI, together with landmark investments in advanced computing infrastructure and the Canadian Artificial Intelligence Safety Institute (CAISI), reflect Canada's commitment to both safety and global competitiveness.

== AI infrastructure ==

Canada has undertaken efforts to expand its AI computing infrastructure at both provincial and federal levels. The federal government's Canadian Sovereign AI Compute Strategy, allocated up to C$2 billion in Budget 2024, aims to enhance computing capacity to support domestic AI industry growth and AI adoption across the economy, with up to C$700 million designated to mobilize private sector investment in new or expanded data centres. Alberta has introduced an AI Data Centres Strategy to position itself as a leading North American destination for data centre investment, targeting C$100 billion worth of AI data centres under development by 2030. One major project under Alberta's strategy is the Wonder Valley AI Data Centre Park near Grande Prairie, which was exempted from provincial environmental impact assessment in April 2026 but still requires permits demonstrating safe construction and operation. According to Statista, as of April 2026, Canada has 287 data centres.

  • Digital artifact

    Digital artifact

    A digital artifact, in information science, is any undesired or unintended alteration in data introduced in a digital process by an involved technique and/or technology. Digital artifacts can be of any content type, including text, audio, video, image, animation or a combination.

== Information science ==

In information science, digital artifacts result from:

- Hardware malfunction: In computer graphics, visual artifacts may be generated whenever a hardware component such as the processor, memory chip, or cabling malfunctions and corrupts data. Examples of malfunctions include physical damage, overheating, insufficient voltage and GPU overclocking. Common types of hardware artifacts are texture corruption and T-vertices in 3D graphics, and pixelization in MPEG compressed video.
- Software malfunction: Artifacts may be caused by flaws in algorithms, such as those for decoding or encoding audio or video, or by a poor pseudo-random number generator that introduces artifacts distinguishable from the desired noise into statistical models.
- Compression: Controlled amounts of unwanted information may be generated as a result of the use of lossy compression techniques. One example is the compression artifacts produced by the JPEG and MPEG compression algorithms.
- Quantization: Digital imprecision generated in the process of converting analog information into digital space is due to the limited granularity of digital numbering space. In computer graphics, quantization is seen as pixelation.
- Aliasing: As a consequence of sampling or sample-rate conversion, energy from frequencies outside the signal frequency band of interest is folded across multiples of the Nyquist frequency. This is typically mitigated by using an anti-aliasing filter.
- Filtering: The process of filtering a signal, such as using an anti-aliasing filter, causes undesired alterations to the signal due to imperfections in the frequency response magnitude and phase, and due to the time-domain impulse response.
- Rolling shutter: the line scanning of an object that is moving too fast for the image sensor to capture a unitary image.
- Error diffusion: poorly weighted kernel coefficients result in undesirable visual artifacts.
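
The aliasing and quantization entries above can be illustrated numerically. The following is a minimal Python sketch, not taken from the article: the function names, the normalised input range of [-1, 1) and the example frequencies are all assumptions chosen for illustration. It shows that a tone above the Nyquist frequency produces exactly the same samples as a lower-frequency tone, and that rounding to a fixed number of levels introduces an error bounded by half a quantization step.

```python
import math


def alias_frequency(f_hz, sample_rate_hz):
    """Return the apparent frequency of a tone at f_hz after sampling.

    Frequencies above the Nyquist frequency (sample_rate_hz / 2) fold back
    into the range [0, sample_rate_hz / 2].
    """
    f_mod = f_hz % sample_rate_hz
    return min(f_mod, sample_rate_hz - f_mod)


def sample_cosine(f_hz, sample_rate_hz, count):
    """Sample cos(2*pi*f*t) at t = n / sample_rate_hz for n = 0..count-1."""
    return [math.cos(2 * math.pi * f_hz * n / sample_rate_hz) for n in range(count)]


def quantize(x, bits):
    """Map x in [-1.0, 1.0) to the centre of one of 2**bits equal-width levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = math.floor((x + 1.0) / step)
    index = max(0, min(levels - 1, index))  # clamp out-of-range inputs
    return -1.0 + (index + 0.5) * step


if __name__ == "__main__":
    # Hypothetical example: a 7 Hz tone sampled at 10 Hz (Nyquist = 5 Hz).
    print("apparent frequency:", alias_frequency(7, 10), "Hz")
    high = sample_cosine(7, 10, 20)
    low = sample_cosine(3, 10, 20)
    print("samples identical:", all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high, low)))
    # Quantizing to 3 bits gives 8 levels with a step of 0.25.
    print("quantized 0.3 ->", quantize(0.3, 3))
```

Once the samples are taken, the 7 Hz and 3 Hz tones cannot be told apart, which is why an anti-aliasing filter has to be applied before sampling rather than after.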

  • Organizational metacognition

    Organizational metacognition

    Organizational metacognition is knowing what an organization knows, a concept related to metacognition, organizational learning, the learning organization and sensemaking. It is used to describe how organizations and teams develop an awareness of their own thinking, learning how to learn, where awareness of ignorance can motivate learning. The organizational deutero-learning concept identified by Argyris and Schon defines when organizations learn how to carry out single-loop and double-loop learning. It has also been described as learning how to learn through a process of collaborative inquiry and reflection (evaluative inquiry). "When an organization engages in deutero-learning its members learn about the previous context for learning. They reflect on and inquire into previous episodes of organizational learning, or failure to learn. They discover what they did that facilitated or inhibited learning, they invent new strategies for learning, they produce these strategies, and they evaluate and generalize what they have produced" Learning what facilitates and inhibits learning enables organizations to develop new strategies to develop their knowledge. For example, identification of a gap between perceived performance (such as satisfaction) and actual performance (outcomes) creates an awareness that makes the organization understand that learning needs to occur, driving appropriate changes to the environment and processes. == Learning prototypes == Wijnhoven (2001) grouped four learning prototypes that best meet learning needs, the match between these needs and learning norms dictating an organization's learning capabilities; deutero-learning is the acquisition of these capabilities. 

- knowledge gap analysis
- classification of problems to select operationally required knowledge and skills
- coping with organizational tremors and jolts by anticipation, response and adjustments of behavioural repertoires
- decisional uncertainty measurement

== Terminological ambiguities ==

Organizational metacognition and organizational deutero-learning have both been described as the concept or phenomenon where organizations learn how to learn. Argyris and Schon (1978) place deutero-learning into their cognitive theory of action framework, neglecting aspects of adaptive behaviour and context core to Bateson's (1972) original definitions. In order to resolve terminological ambiguities, Visser (2007) reviewed and reformulated the concept of deutero-learning as "the behavioral adaptation to patterns of conditioning in relationships in organizational contexts, distinguishing it from meta-learning and planned learning" (pg. 659).

== Significance ==

Organizational metacognition is considered a key norm to the prescriptive concept of the learning organization. Its significance has been recognized by industry, the military and in disaster response.

== Examples in practice ==

Examples of poor metacognition (deutero-learning) have been described in knowledge network environments: "Knowledge networking is important to most competitive enterprises today. Enterprise knowledge is becoming ever more specialized in nature, so no single person or organization can know everything in detail. Hence addressing complex, multidisciplinary problems requires developing and accessing a network of knowledgeable people and organizations. The problem is, many otherwise knowledgeable people and organizations are not fully aware of their knowledge networks, and even more problematic, they are not aware that they are not aware. This focuses our attention toward organizational metacognition."

  • Realization (linguistics)

    Realization (linguistics)

    In linguistics, realization is the process by which some kind of surface representation is derived from its underlying representation; that is, the way in which some abstract object of linguistic analysis comes to be produced in actual language. Phonemes are often said to be realized by speech sounds. The different sounds that can realize a particular phoneme are called its allophones. Realization is also a subtask of natural language generation (NLG), which involves creating an actual text in a human language (English, French, etc.) from a syntactic representation. There are a number of software packages available for realization, most of which have been developed by academic research groups in NLG. The remainder of this article concerns realization of this kind.

== Example ==

For example, a short Java program can use the simplenlg system [2] to print out the text "The women do not smoke." In this example, the computer program specifies the linguistic constituents of the sentence (verb, subject) and also linguistic features (plural subject, negated), and from this information the realiser constructs the actual sentence.

== Processing ==

Realisation involves three kinds of processing:

- Syntactic realisation: Using grammatical knowledge to choose inflections, add function words and decide the order of components. For example, in English the subject usually precedes the verb, and the negated form of smoke is do not smoke.
- Morphological realisation: Computing inflected forms. For example, the plural form of woman is women (not womans).
- Orthographic realisation: Dealing with casing, punctuation, and formatting. For example, capitalising The because it is the first word of the sentence.

The above examples are very basic; most realisers are capable of considerably more complex processing.

== Systems ==

A number of realisers have been developed over the past 20 years.
These systems differ in terms of the complexity and sophistication of their processing, their robustness in dealing with unusual cases, and whether they are accessed programmatically via an API or take a textual representation of a syntactic structure as their input. There are also major differences in pragmatic factors such as documentation, support, licensing terms, speed and memory usage. It is not possible to describe all realisers here, but a few of them are:

- Simplenlg [3]: a document realizing engine with an API that is intended to be simple to learn and use, and whose scope is limited to finding the surface form of a document.
- KPML [4]: the oldest realiser, which has been under development under different guises since the 1980s. It comes with grammars for ten different languages.
- FUF/SURGE [5]: a realiser which was widely used in the 1990s and is still used in some projects today.
- OpenCCG [6]: an open-source realiser which has a number of nice features, such as the ability to use statistical language models to make realisation decisions.
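
The three kinds of processing described in the Processing section can be sketched as a toy realiser. The Python below is an illustrative assumption only: it is not the simplenlg API or any of the systems listed above, the function names and the tiny lexicon are hypothetical, and it handles just a determiner, a noun subject and a present-tense verb.

```python
# Toy lexicon of irregular plurals (hypothetical; a real realiser would use
# a full morphological lexicon).
IRREGULAR_PLURALS = {"woman": "women", "man": "men", "child": "children"}

SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")


def pluralise(noun):
    """Morphological realisation: compute the plural form of a noun."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    if noun.endswith(SIBILANT_ENDINGS):
        return noun + "es"
    return noun + "s"


def third_person_singular(verb):
    """Morphological realisation: naive present-tense third-person form."""
    if verb.endswith(SIBILANT_ENDINGS):
        return verb + "es"
    return verb + "s"


def realise(subject, verb, plural=False, negated=False, determiner="the"):
    """Build a simple present-tense sentence from an abstract specification."""
    # Morphological step: inflect the subject noun.
    noun = pluralise(subject) if plural else subject

    # Syntactic step: add function words (do/does, not) and choose inflections.
    if negated:
        auxiliary = "do" if plural else "does"
        verb_group = [auxiliary, "not", verb]
    else:
        verb_group = [verb if plural else third_person_singular(verb)]

    # Syntactic step: in English the subject precedes the verb.
    words = [determiner, noun] + verb_group

    # Orthographic step: capitalise the first word and add final punctuation.
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


if __name__ == "__main__":
    print(realise("woman", "smoke", plural=True, negated=True))
```

Called with a plural, negated specification for "woman" and "smoke", the sketch produces the sentence used in the example section; real realisers cover far more grammar than this.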

  • Magic Quadrant

    Magic Quadrant

    Magic Quadrant (MQ) is a series of market research reports published by research and advisory firm Gartner that rely on proprietary qualitative data analysis methods to demonstrate market trends, such as direction, maturity, and participants. Their analyses are conducted for several specific technology industries and are updated every 1–2 years: once an updated report has been published, its predecessor is "retired". == Rating == Gartner rates vendors upon two criteria: completeness of vision and ability to execute. Completeness of vision – Reflects the vendor's innovation, and whether the vendor drives or follows the market. Ability to execute – Summarizes factors such as the vendor's financial viability, market responsiveness, product development, sales channels and customer base. The two component scores lead to a vendor position in one of four quadrants: === Leaders === Vendors in the "Leaders" quadrant have the highest composite scores for their completeness of vision and ability to execute. A vendor in the Leaders quadrant has the market share, credibility, and marketing & sales capabilities needed to drive the acceptance of new technologies. These vendors demonstrate a clear understanding of market needs, they are innovators and thought leaders, and they have well-articulated plans that customers and prospects can use when designing their infrastructures and strategies. In addition, they have a presence in the five major geographical regions, consistent financial performance, and broad platform support. === Challengers === Vendors in the "Challengers" quadrant have high scores mainly for their ability to execute. They both participate in the market and execute well enough to be a serious threat to vendors in the "Leaders" quadrant. They have strong products, as well as sufficiently credible market position and resources to sustain continued growth. 
Financial viability is not an issue for vendors in the "Challengers" quadrant, but they lack the size and influence of vendors in the "Leaders" quadrant due to their relative lack of vision. === Visionaries === Vendors in the "Visionaries" quadrant have high scores mainly for their completeness of vision. They deliver innovative products that address operationally or financially important end-user problems at a broad scale, but have not yet demonstrated the ability to capture market share or maintain sustainable levels of profitability. Visionary vendors are frequently privately held companies and acquisition targets for larger, established companies. The likelihood of acquisition often reduces the risks associated with installing their systems. === Niche Players === Vendors in the "Niche Players" quadrant have relatively low scores for both their ability to execute and their completeness of vision. They are often narrowly focused on specific market or vertical segments. This quadrant often also includes vendors that are adapting their existing products to enter the market under consideration, or larger vendors having difficulty developing and executing on their vision. == Gartner Critical Capabilities == Gartner Critical Capabilities complement Magic Quadrant analysis to offer deeper insight into the products and services offered by multiple vendors by a comparative analysis that scores competing products or services against a set of critical differentiators identified by Gartner. Gartner has periodically ended Magic Quadrant listings for IT Service Management, Web Content Management, and other industries as those markets have fully matured or other factors rendered the analytic framework inapplicable. == Criticism == The Magic Quadrant, and analysts in general, skew the market: according to research, by applying their methodologies to describe a market, they change that marketplace to fit their tools. 
Another criticism is that open source vendors are not considered sufficiently by analysts like Gartner, as raised in a published online discussion between a VP from Talend and a German research VP from Gartner. On May 29, 2009, software vendor ZL Technologies filed a federal lawsuit against Gartner that challenged the "legitimacy" of Gartner's Magic Quadrant rating system. Gartner filed a motion to dismiss, claiming First Amendment protection on the grounds that its MQ reports contain "pure opinion", which legally means opinions that are not based on fact. The court threw out the ZL case because it lacked a specific complaint. The decision was upheld on appeal.
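
The way the two component scores described under Rating map onto the four quadrants can be sketched in a few lines of Python. Gartner's actual scoring method is proprietary, so everything numeric here is an assumption made for illustration: scores normalised to the range 0 to 1, a midpoint threshold of 0.5, and the hypothetical function name `quadrant`.

```python
def quadrant(vision, execution, threshold=0.5):
    """Place a vendor in a quadrant from two scores in the range 0.0 to 1.0.

    The 0-1 scale and the 0.5 threshold are illustrative assumptions, not
    Gartner's published method.
    """
    high_vision = vision >= threshold
    high_execution = execution >= threshold
    if high_vision and high_execution:
        return "Leaders"
    if high_execution:
        return "Challengers"
    if high_vision:
        return "Visionaries"
    return "Niche Players"


if __name__ == "__main__":
    # Hypothetical vendors with made-up scores.
    for name, vision, execution in [
        ("Vendor A", 0.9, 0.8),
        ("Vendor B", 0.3, 0.7),
        ("Vendor C", 0.8, 0.2),
        ("Vendor D", 0.2, 0.3),
    ]:
        print(name, "->", quadrant(vision, execution))
```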

  • Information literacy

    Information literacy

    The Association of College and Research Libraries defines information literacy as a "set of integrated abilities encompassing the reflective discovery of information, the understanding of how information is produced and valued and the use of information in creating new knowledge and participating ethically in communities of learning". In the United Kingdom, the Chartered Institute of Library and Information Professionals' definition also makes reference to knowing both "when" and "why" information is needed. The 1989 American Library Association (ALA) Presidential Committee on Information Literacy formally defined information literacy (IL) as attributes of an individual, stating that "to be information literate, a person must be able to recognize when information is needed and have the ability to locate, evaluate and use effectively the needed information". In 1990, academic Lori Arp published a paper asking, "Are information literacy instruction and bibliographic instruction the same?" Arp argued that neither term was particularly well defined by theoreticians or practitioners in the field. Further studies were needed to lessen the confusion and continue to articulate the parameters of the question. The Alexandria Proclamation of 2005 defined the term as a human rights issue: "Information literacy empowers people in all walks of life to seek, evaluate, use and create information effectively to achieve their personal, social, occupational and educational goals. It is a basic human right in a digital world and promotes social inclusion in all nations." The United States National Forum on Information Literacy defined information literacy as "the ability to know when there is a need for information, to be able to identify, locate, evaluate, and effectively use that information for the issue or problem at hand." 
Meanwhile, in the UK, the library professional body CILIP, define information literacy as "the ability to think critically and make balanced judgements about any information we find and use. It empowers us as citizens to develop informed views and to engage fully with society." A number of other efforts have been made to better define the concept and its relationship to other skills and forms of literacy. Other pedagogical outcomes related to information literacy include traditional literacy, computer literacy, research skills and critical thinking skills. Information literacy as a sub-discipline is an emerging topic of interest and counter measure among educators and librarians with the prevalence of misinformation, fake news, and disinformation. Scholars have argued that in order to maximize people's contributions to a democratic and pluralistic society, educators should be challenging governments and the business sector to support and fund educational initiatives in information literacy. == History == The phrase "information literacy" first appeared in print in a 1974 report written on behalf of the National Commission on Libraries and Information Science by Paul G. Zurkowski, who was at the time president of the Information Industry Association (now the Software and Information Industry Association). Zurkowski used the phrase to describe the "techniques and skills" learned by the information literate "for utilizing the wide range of information tools as well as primary sources in molding information solutions to their problems" and drew a relatively firm line between the "literates" and "information illiterates." The concept of information literacy appeared again in a 1976 paper by Lee Burchina presented at the Texas A&M University library's symposium. Burchina identified a set of skills needed to locate and use information for problem solving and decision making. In another 1976 article in Library Journal, M.R. 
Owens applied the concept to political information literacy and civic responsibility, stating, "All [people] are created equal but voters with information resources are in a position to make more intelligent decisions than citizens who are information illiterates. The application of information resources to the process of decision-making to fulfill civic responsibilities is a vital necessity." In a literature review published in an academic journal in 2020, Oral Roberts University professor Angela Sample cites several conceptual waves of information literacy definitions as defining information as a way of thinking, a set of skills, and a social practice. The introduction of these concepts led to the adoption of a mechanism called metaliteracy and the creation of threshold concepts and knowledge dispositions, which led to the creation of the ALA's Information Literacy Framework. The American Library Association's Presidential Committee on Information Literacy released a report on January 10, 1989. Titled as the Presidential Committee on Information Literacy: Final Report, the article outlines the importance of information literacy, opportunities to develop it, and the idea of an Information Age School. The recommendations of the Committee led to establishment of the National Forum on Information Literacy, a coalition of more than 90 national and international organizations. In 1998, the American Association of School Librarians and the Association for Educational Communications and Technology published Information Power: Building Partnerships for Learning, which further established specific goals for information literacy education, defining some nine standards in the categories of "information literacy," "independent learning," and "social responsibility." Also in 1998, the Presidential Committee on Information Literacy updated its final report. The report outlined six recommendations from the original report, and examined areas of challenge and progress. 
In 1999, the Society of College, National and University Libraries (SCONUL) in the UK published The Seven Pillars of Information Literacy to model the relationship between information skills and IT skills, and the idea of the progression of information literacy into the curriculum of higher education. In 2003, the National Forum on Information Literacy, along with UNESCO and the National Commission on Libraries and Information Science, sponsored an international conference in Prague. Representatives from twenty-three countries gathered to discuss the importance of information literacy in a global context. The resulting Prague Declaration described information literacy as a "key to social, cultural, and economic development of nations and communities, institutions and individuals in the 21st century" and declared its acquisition as "part of the basic human right of lifelong learning". In the United States specifically, information literacy was prioritized in 2009 during President Barack Obama's first term. In an effort to stress the value of information literacy in everyday communication, he issued a proclamation designating October as National Information Literacy Awareness Month. In 2015, the Association of College and Research Libraries (ACRL) adopted the Framework for Information Literacy for Higher Education, which defines information literacy as "the set of integrated abilities encompassing the reflective discovery of information, the understanding of how information is produced and valued, and the use of information in creating new knowledge and participating ethically in communities of learning".
== Presidential Committee on Information Literacy == The American Library Association's Presidential Committee on Information Literacy defined information literacy as the ability "to recognize when information is needed and have the ability to locate, evaluate, and use effectively the needed information" and highlighted information literacy as a skill essential for lifelong learning and the production of an informed and prosperous citizenry. The committee outlined six principal recommendations. Included were recommendations like "Reconsider the ways we have organized information institutionally, structured information access, and defined information's role in our lives at home in the community, and in the work place"; to promote "public awareness of the problems created by information illiteracy"; to develop a national research agenda related to information and its use; to ensure the existence of "a climate conducive to students' becoming information literate"; to include information literacy concerns in teacher education democracy. In the updated report, the committee ended with an invitation, asking the National Forum and regular citizens to recognize that "the result of these combined efforts will be a citizenry which is made up of effective lifelong learners who can always find the information needed for the issue or decision at hand. This new

  • Information explosion

    Information explosion

    Information explosion is the rapid increase in the amount of published information or data and the effects of this abundance. As the amount of available data grows, the problem of managing the information becomes more difficult, which can lead to information overload. The Online Oxford English Dictionary indicates use of the phrase in a March 1964 New Statesman article. The New York Times first used the phrase in its editorial content in an article by Walter Sullivan on June 7, 1964, in which he described the phrase as "much discussed". The earliest known use of the phrase was in a speech about television by NBC president Pat Weaver at the Institute of Practitioners of Advertising in London on September 27, 1955. The speech was rebroadcast on radio station WSUI in Iowa City and excerpted in the Daily Iowan newspaper two months later. Many sectors, such as healthcare, supermarkets, and governments, are seeing this rapid increase in the amount of information available. Another sector affected by this phenomenon is journalism: a profession that in the past was responsible for the dissemination of information may be suppressed by today's overabundance of information. Techniques to gather knowledge from an overabundance of electronic information (e.g., data fusion may help in data mining) have existed since the 1970s. Another common technique for dealing with such amounts of information is qualitative research. Such approaches aim to organize the information by synthesizing, categorizing and systematizing it so that it is more usable and easier to search.

== Growth patterns ==

The world's technological capacity to store information grew from, optimally compressed, 2.6 exabytes in 1986 to 15.7 in 1993, over 54.5 in 2000, and to 295 exabytes in 2007.
The world's technological capacity to receive information through one-way broadcast networks was 432 exabytes of (optimally compressed) information in 1986, 715 (optimally compressed) exabytes in 1993, 1,200 (optimally compressed) exabytes in 2000, and 1,900 in 2007. The world's effective capacity to exchange information through two-way telecommunications networks was 0.281 exabytes of (optimally compressed) information in 1986, 0.471 in 1993, 2.2 in 2000, and 65 (optimally compressed) exabytes in 2007. A new metric that is being used in an attempt to characterize the growth in person-specific information is the disk storage per person (DSP), which is measured in megabytes/person (where a megabyte is 10^6 bytes and is abbreviated MB). Global DSP (GDSP) is the total rigid disk drive space (in MB) of new units sold in a year divided by the world population in that year. The GDSP metric is a crude measure of how much disk storage could possibly be used to collect person-specific data on the world population. In 1983, one million fixed drives with an estimated total of 90 terabytes were sold worldwide; 30 MB drives had the largest market segment. In 1996, 105 million drives, totaling 160,623 terabytes, were sold, with 1 and 2 gigabyte drives leading the industry. By the year 2000, with 20 GB drives leading the industry, rigid drives sold for the year were projected to total 2,829,288 terabytes. Rigid disk drive sales were projected to top $34 billion in 1997. According to Latanya Sweeney, there are three trends in data gathering today: Type 1. Expansion of the number of fields being collected, known as the “collect more” trend. Type 2. Replacement of an existing aggregate data collection with a person-specific one, known as the “collect specifically” trend. Type 3. Gathering of information by starting a new person-specific data collection, known as the “collect it if you can” trend. 
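The GDSP definition above is a single division, and a short sketch makes the units explicit. The terabyte totals below are the ones quoted in the text; the world population figures are rounded assumptions used only for illustration, and the function name is hypothetical.

```python
MB_PER_TB = 10**6  # decimal units: 1 TB = 10^12 bytes, 1 MB = 10^6 bytes


def gdsp_mb_per_person(total_tb_sold: float, world_population: float) -> float:
    """Global disk storage per person (MB/person) for one year of drive sales."""
    if world_population <= 0:
        raise ValueError("world_population must be positive")
    return total_tb_sold * MB_PER_TB / world_population


# Terabytes of rigid disk capacity sold per year, as quoted in the text.
sales_tb = {1983: 90, 1996: 160_623, 2000: 2_829_288}

# ASSUMPTION: rough world population for each year, not taken from the text.
assumed_population = {1983: 4.7e9, 1996: 5.8e9, 2000: 6.1e9}

for year, tb in sales_tb.items():
    value = gdsp_mb_per_person(tb, assumed_population[year])
    print(f"{year}: about {value:.2f} MB per person")
```

Under these assumed populations the metric rises from a small fraction of a megabyte per person in 1983 to several hundred megabytes per person in 2000, which is the growth the metric is meant to expose.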
== Related terms == Since "information" in electronic media is often used synonymously with "data", the term information explosion is closely related to the concept of data flood (also dubbed data deluge). Sometimes the term information flood is used as well. All of those basically boil down to the ever-increasing amount of electronic data exchanged per time unit. A term that covers the potential negative effects of information explosion is information inflation. Awareness of unmanageable amounts of data has grown along with the advent of ever more powerful data processing since the mid-1960s. == Challenges == Even though the abundance of information can be beneficial on several levels, it raises concerns such as privacy, legal and ethical guidelines, filtering and data accuracy. Filtering refers to finding useful information in the middle of so much data, which relates to the job of data scientists. A typical example of the need for data filtering (data mining) is healthcare, where electronic health records (EHRs) of patients are expected to become available in the coming years. With so much information available, doctors will need to be able to identify patterns and select the data that matters for diagnosing the patient. On the other hand, according to some experts, having so much public data available makes it difficult to provide data that is actually anonymous. Another point to take into account is legal and ethical guidelines, which concern who will own the data, how frequently the owner is obliged to release it, and for how long. With so many sources of data, accuracy is another problem. An untrusted source may be challenged by others who order a new set of data, causing repetition in the information. According to Edward Huth, another concern is the accessibility and cost of such information. 
The accessibility rate could be improved by either reducing the costs or increasing the utility of the information. According to the author, costs could be reduced by associations, which would assess which information is relevant and gather it in a more organized fashion. == Web servers == As of August 2005, there were over 70 million web servers. As of September 2007 there were over 135 million web servers. == Blogs == According to Technorati, the number of blogs doubles about every 6 months with a total of 35.3 million blogs as of April 2006. This is an example of the early stages of logistic growth, where growth is approximately exponential, since blogs are a recent innovation. As the number of blogs approaches the number of possible producers (humans), saturation occurs, growth declines, and the number of blogs eventually stabilizes.
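The logistic pattern described for blogs can be sketched numerically. The starting count and the six-month doubling time come from the text; the saturation ceiling is a purely hypothetical number chosen for illustration, and the function name is an assumption.

```python
import math


def logistic(t_years: float, n0: float, capacity: float, doubling_years: float) -> float:
    """Logistic curve whose early, near-exponential phase doubles every `doubling_years`."""
    rate = math.log(2) / doubling_years
    return capacity / (1 + (capacity - n0) / n0 * math.exp(-rate * t_years))


N0 = 35.3e6               # blogs as of April 2006, from the text
DOUBLING_YEARS = 0.5      # "doubles about every 6 months", from the text
ASSUMED_CAPACITY = 1.0e9  # ASSUMPTION: hypothetical ceiling on possible producers

for t in (0, 0.5, 1, 2, 5, 10):
    blogs = logistic(t, N0, ASSUMED_CAPACITY, DOUBLING_YEARS)
    print(f"t = {t:>4} years: {blogs / 1e6:8.1f} million blogs")
```

Early on the curve nearly doubles every six months; as the count approaches the assumed ceiling, growth slows and the total levels off, matching the saturation behavior described above.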

  • Charge-coupled device

    Charge-coupled device

    A charge-coupled device (CCD) is an integrated circuit containing an array of linked, or coupled, capacitors. Under the control of an external circuit, each capacitor can transfer its electric charge to a neighboring capacitor. CCD sensors are a major technology used in digital imaging. In a CCD image sensor, pixels are represented by p-doped metal–oxide–semiconductor (MOS) capacitors. These MOS capacitors, the basic building blocks of a CCD, are biased above the threshold for inversion when image acquisition begins, allowing the conversion of incoming photons into electron charges at the semiconductor-oxide interface; the CCD is then used to read out these charges. Although CCDs are not the only technology to allow for light detection, CCD image sensors are widely used in professional, medical, and scientific applications where high-quality image data are required. In applications with less exacting quality demands, such as consumer and professional digital cameras, active pixel sensors, also known as CMOS sensors (complementary MOS sensors), are generally used. However, the large quality advantage CCDs enjoyed early on has narrowed over time and since the late 2010s CMOS sensors are the dominant technology, having largely if not completely replaced CCD image sensors. == History == The basis for the CCD is the metal–oxide–semiconductor (MOS) structure, with MOS capacitors being the basic building blocks of a CCD, and a depleted MOS structure used as the photodetector in early CCD devices. In the late 1960s, Willard Boyle and George E. Smith at Bell Labs were researching MOS technology while working on semiconductor bubble memory. They realized that an electric charge was the analog of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. 
This led to the invention of the charge-coupled device by Boyle and Smith in 1969. They conceived of the design of what they termed, in their notebook, "Charge 'Bubble' Devices". The initial paper describing the concept in April 1970 listed possible uses as memory, a delay line, and an imaging device. The device could also be used as a shift register. The essence of the design was the ability to transfer charge along the surface of a semiconductor from one storage capacitor to the next. The first experimental device demonstrating the principle was a row of closely spaced metal squares on an oxidized silicon surface electrically accessed by wire bonds. It was demonstrated by Gil Amelio, Michael Francis Tompsett and George Smith in April 1970. This was the first experimental application of the CCD in image sensor technology, and used a depleted MOS structure as the photodetector. The first patent (U.S. patent 4,085,456) on the application of CCDs to imaging was assigned to Tompsett, who filed the application in 1971. The first working CCD made with integrated circuit technology was a simple 8-bit shift register, reported by Tompsett, Amelio and Smith in August 1970. This device had input and output circuits and was used to demonstrate its use as a shift register and as a crude eight pixel linear imaging device. Development of the device progressed at a rapid rate. By 1971, Bell researchers led by Michael Tompsett were able to capture images with simple linear devices. Several companies, including Fairchild Semiconductor, RCA and Texas Instruments, picked up on the invention and began development programs. Fairchild's effort, led by ex-Bell researcher Gil Amelio, was the first with commercial devices, and by 1974 had a linear 500-element device and a 2D 100 × 100 pixel device. Peter L. P. 
Dillon, a scientist at Kodak Research Labs, invented the first color CCD image sensor by overlaying a color filter array on this Fairchild 100 x 100 pixel Interline CCD starting in 1974. Steven Sasson, an electrical engineer working for the Kodak Apparatus Division, invented a digital still camera using this same Fairchild 100 × 100 CCD in 1975. The interline transfer (ILT) CCD device was proposed by L. Walsh and R. Dyck at Fairchild in 1973 to reduce smear and eliminate a mechanical shutter. To further reduce smear from bright light sources, the frame-interline-transfer (FIT) CCD architecture was developed by K. Horii, T. Kuroda and T. Kunii at Matsushita (now Panasonic) in 1981. The first KH-11 KENNEN reconnaissance satellite equipped with charge-coupled device array (800 × 800 pixels) technology for imaging was launched in December 1976. Under the leadership of Kazuo Iwama, Sony started a large development effort on CCDs involving a significant investment. Eventually, Sony managed to mass-produce CCDs for their camcorders. Before this happened, Iwama died in August 1982. Subsequently, a CCD chip was placed on his tombstone to acknowledge his contribution. The first mass-produced consumer CCD video camera, the CCD-G5, was released by Sony in 1983, based on a prototype developed by Yoshiaki Hagiwara in 1981. Early CCD sensors suffered from shutter lag. This was largely resolved with the invention of the pinned photodiode (PPD). It was invented by Nobukazu Teranishi, Hiromitsu Shiraki and Yasuo Ishihara at NEC in 1980. They recognized that lag can be eliminated if the signal carriers could be transferred from the photodiode to the CCD. This led to their invention of the pinned photodiode, a photodetector structure with low lag, low noise, high quantum efficiency and low dark current. It was first publicly reported by Teranishi and Ishihara with A. Kohono, E. Oda and K. Arai in 1982, with the addition of an anti-blooming structure. 
The new photodetector structure invented at NEC was given the name "pinned photodiode" (PPD) by B.C. Burkey at Kodak in 1984. In 1987, the PPD began to be incorporated into most CCD devices, becoming a fixture in consumer electronic video cameras and then digital still cameras. Since then, the PPD has been used in nearly all CCD sensors and then CMOS sensors. In January 2006, Boyle and Smith were awarded the National Academy of Engineering Charles Stark Draper Prize, and in 2009 they were awarded the Nobel Prize for Physics for their invention of the CCD concept. Michael Tompsett was awarded the 2010 National Medal of Technology and Innovation, for pioneering work and electronic technologies including the design and development of the first CCD imagers. He was also awarded the 2012 IEEE Edison Medal for "pioneering contributions to imaging devices including CCD Imagers, cameras and thermal imagers". == Basics of operation == In a CCD for capturing images, there is a photoactive region (an epitaxial layer of silicon), and a transmission region made out of a shift register (the CCD, properly speaking). An image is projected through a lens onto the capacitor array (the photoactive region), causing each capacitor to accumulate an electric charge proportional to the light intensity at that location. A one-dimensional array, used in line-scan cameras, captures a single slice of the image, whereas a two-dimensional array, used in video and still cameras, captures a two-dimensional picture corresponding to the scene projected onto the focal plane of the sensor. Once the array has been exposed to the image, a control circuit causes each capacitor to transfer its contents to its neighbor (operating as a shift register). The last capacitor in the array dumps its charge into a charge amplifier, which converts the charge into a voltage. By repeating this process, the controlling circuit converts the entire contents of the array in the semiconductor to a sequence of voltages. 
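The shift-register readout just described can be modeled in a few lines. This is a toy sketch for a one-dimensional (line-scan) array only; the function name and the gain value are hypothetical, and real devices add clocking phases, noise, and transfer inefficiency that are ignored here.

```python
def read_out(charges, gain_volts_per_electron=1e-6):
    """Shift a 1-D row of charge packets out through a charge amplifier.

    `charges` lists the accumulated charge per capacitor; the last element
    sits next to the amplifier. The gain is an illustrative assumption.
    """
    wells = list(charges)  # copy so the caller's data is left untouched
    voltages = []
    for _ in range(len(wells)):
        # The last capacitor dumps its charge into the amplifier,
        # which converts that charge into a voltage.
        voltages.append(wells[-1] * gain_volts_per_electron)
        # Every other capacitor passes its charge to its neighbor,
        # and an empty well enters at the far end.
        wells = [0] + wells[:-1]
    return voltages


print(read_out([100, 200, 300], gain_volts_per_electron=1.0))
# → [300.0, 200.0, 100.0]
```

The pixel nearest the amplifier is read first, so the voltage sequence arrives in reverse order of the array positions.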
In a digital device, these voltages are then sampled, digitized, and usually stored in memory; in an analog device (such as an analog video camera), they are processed into a continuous analog signal (e.g. by feeding the output of the charge amplifier into a low-pass filter), which is then processed and fed out to other circuits for transmission, recording, or other processing. == Detailed physics of operation == === Charge generation === Before the MOS capacitors are exposed to light, they are biased into the depletion region; in n-channel CCDs, the silicon under the bias gate is slightly p-doped or intrinsic. The gate is then biased at a positive potential, above the threshold for strong inversion, which will eventually result in the creation of an n channel below the gate as in a MOSFET. However, it takes time to reach this thermal equilibrium: up to hours in high-end scientific cameras cooled at low temperature. Initially after biasing, the holes are pushed far into the substrate, and no mobile electrons are at or near the surface; the CCD thus operates in a non-equilibrium state called deep depletion. Then, when electron–hole pairs are generated in the depletion region, they are separated by the electric field, the elec

  • Media aggregation platform

    Media aggregation platform

    A Media Aggregation Platform or Media Aggregation Portal (MAP) is an over-the-top service for distributing web-based streaming media content from multiple sources to a large audience. MAPs consist of networks of sources that host their own content, which viewers can choose and access directly from a larger variety of content than a single source can offer. The service is used by content providers looking to extend the reach of their content. Unlike multichannel video programming distributors (MVPDs) or multiple-system operators (MSOs), MAPs rely on the Internet rather than cables or satellite. As more network television channels have moved online in the early 21st century, joining web-native channels like Netflix, MAPs aggregate content the way that MSOs and MVPDs have used cable, and to a lesser extent satellite and IPTV infrastructure. Some companies, including Yidio and StreamingMoviesRight, offer a similar service for free, while others, such as FreeCast Inc's Rabbit TV Plus, charge a subscription fee. When compared with MSOs and MVPDs, MAP networks have much lower costs due to the lack of physical infrastructure. The majority of revenue from MAP services is retained by the content creators, and revenue is collected from advertisements, pay-per-view, and subscription-based content offerings rather than from licensing and reselling content. MAP service consumers interact with and purchase content directly from its source, without the markup added by a middleman.

  • Semantic heterogeneity

    Semantic heterogeneity

    Semantic heterogeneity arises when database schemas or datasets for the same domain are developed by independent parties, resulting in differences in meaning and interpretation of data values. Beyond structured data, the problem of semantic heterogeneity is compounded by the flexibility of semi-structured data and the various tagging methods applied to documents or unstructured data. Semantic heterogeneity is one of the more important sources of differences in heterogeneous datasets. Yet, for multiple data sources to interoperate with one another, it is essential to reconcile these semantic differences. Decomposing the various sources of semantic heterogeneities provides a basis for understanding how to map and transform data to overcome these differences. == Classification == One of the first known classification schemes applied to data semantics is from William Kent in the late 1980s. Kent's approach dealt more with structural mapping issues than with differences in meaning, which he suggested data dictionaries could potentially resolve. One of the most comprehensive classifications is from Pluempitiwiriyawej and Hammer, "Classification Scheme for Semantic and Schematic Heterogeneities in XML Data Sources". They classify heterogeneities into three broad classes: Structural conflicts arise when the schemas of the sources representing related or overlapping data exhibit discrepancies. Structural conflicts can be detected when comparing the underlying schemas. The class of structural conflicts includes generalization conflicts, aggregation conflicts, internal path discrepancy, missing items, element ordering, constraint and type mismatch, and naming conflicts between the element types and attribute names. Domain conflicts arise when the semantics of the data sources that will be integrated exhibit discrepancies. Domain conflicts can be detected by looking at the information contained in the schema and using knowledge about the underlying data domains. 
The class of domain conflicts includes schematic discrepancy, scale or unit, precision, and data representation conflicts. Data conflicts refer to discrepancies among similar or related data values across multiple sources. Data conflicts can only be detected by comparing the underlying sources. The class of data conflicts includes ID-value, missing data, incorrect spelling, and naming conflicts between the element contents and the attribute values. Moreover, mismatches or conflicts can occur between set elements (a "population" mismatch) or attributes (a "description" mismatch). Michael Bergman expanded upon this scheme by adding a fourth major explicit category of language, and also added some examples of each kind of semantic heterogeneity, resulting in about 40 distinct potential categories. This table shows the combined 40 possible sources of semantic heterogeneities across sources: A different approach toward classifying semantics and integration approaches is taken by Sheth et al. Under their concept, they split semantics into three forms: implicit, formal and powerful. Implicit semantics are what is either largely present or can easily be extracted; formal languages, though relatively scarce, occur in the form of ontologies or other description logics; and powerful (soft) semantics are fuzzy and not limited to rigid set-based assignments. Sheth et al.'s main point is that first-order logic (FOL) or description logic alone is inadequate to properly capture the needed semantics. == Relevant applications == Besides data interoperability, relevant areas in information technology that depend on reconciling semantic heterogeneities include data mapping, semantic integration, and enterprise information integration, among many others. From the conceptual to actual data, there are differences in perspective, vocabularies, measures and conventions once any two data sources are brought together. 
Explicit attention to these semantic heterogeneities is one means to get the information to integrate or interoperate. A mere twenty years ago, information technology systems expressed and stored data in a multitude of formats and systems. The Internet and Web protocols have done much to overcome these sources of differences. While there is a large number of categories of semantic heterogeneity, these categories are also patterned and can be anticipated and corrected. These patterned sources inform what kind of work must be done to overcome semantic differences where they still reside.
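As a small illustration of how patterned conflicts can be anticipated and corrected, the sketch below reconciles a naming conflict and a scale-or-unit conflict between two sources. All field names, mappings, and records are hypothetical and are not drawn from any of the classification schemes cited above.

```python
# Hypothetical per-source mappings: source field -> (shared field, converter).
# Source A stores height in centimeters; source B uses inches and a different
# field name for the same attribute.
SOURCE_A = {
    "name": ("name", str.strip),
    "height_cm": ("height_m", lambda v: v / 100),
}
SOURCE_B = {
    "full_name": ("name", str.strip),
    "height_in": ("height_m", lambda v: v * 0.0254),
}


def to_shared_schema(record, mapping):
    """Rename fields and convert units so records from both sources line up."""
    out = {}
    for field, value in record.items():
        if field not in mapping:
            continue  # no counterpart in the shared schema; dropped in this sketch
        target, convert = mapping[field]
        out[target] = convert(value)
    return out


a = to_shared_schema({"name": " Ada ", "height_cm": 170}, SOURCE_A)
b = to_shared_schema({"full_name": "Ada", "height_in": 67, "badge": 7}, SOURCE_B)
print(a)
print(b)
```

Keeping the mapping as data rather than code means each new source only needs its own table of renames and converters, which is one simple way to make the anticipated conflict patterns explicit.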
