Inductive probability

Inductive probability attempts to give the probability of future events based on past events. It is the basis for inductive reasoning, and gives the mathematical basis for learning and the perception of patterns. It is a source of knowledge about the world. There are three sources of knowledge: inference, communication, and deduction. Communication relays information found using other methods. Deduction establishes new facts based on existing facts. Inference establishes new facts from data. Its basis is Bayes' theorem. Information describing the world is written in a language. For example, a simple mathematical language of propositions may be chosen. Sentences may be written down in this language as strings of characters. But in the computer it is possible to encode these sentences as strings of bits (1s and 0s). Then the language may be encoded so that the most commonly used sentences are the shortest. This internal language implicitly represents probabilities of statements. Occam's razor says the "simplest theory, consistent with the data is most likely to be correct". The "simplest theory" is interpreted as the representation of the theory written in this internal language. The theory with the shortest encoding in this internal language is most likely to be correct. == History == Probability and statistics was focused on probability distributions and tests of significance. Probability was formal, well defined, but limited in scope. In particular its application was limited to situations that could be defined as an experiment or trial, with a well defined population. Bayes's theorem is named after Rev. Thomas Bayes 1701–1761. Bayesian inference broadened the application of probability to many situations where a population was not well defined. But Bayes' theorem always depended on prior probabilities, to generate new probabilities. It was unclear where these prior probabilities should come from. Ray Solomonoff developed algorithmic probability which gave an explanation for what randomness is and how patterns in the data may be represented by computer programs, that give shorter representations of the data circa 1964. Chris Wallace and D. M. Boulton developed minimum message length circa 1968. Later Jorma Rissanen developed the minimum description length circa 1978. These methods allow information theory to be related to probability, in a way that can be compared to the application of Bayes' theorem, but which give a source and explanation for the role of prior probabilities. Marcus Hutter combined decision theory with the work of Ray Solomonoff and Andrey Kolmogorov to give a theory for the Pareto optimal behavior for an Intelligent agent, circa 1998. === Minimum description/message length === The program with the shortest length that matches the data is the most likely to predict future data. This is the thesis behind the minimum message length and minimum description length methods. At first sight Bayes' theorem appears different from the minimimum message/description length principle. At closer inspection it turns out to be the same. Bayes' theorem is about conditional probabilities, and states the probability that event B happens if firstly event A happens: P ( A ∧ B ) = P ( B ) ⋅ P ( A | B ) = P ( A ) ⋅ P ( B | A ) {\displaystyle P(A\land B)=P(B)\cdot P(A|B)=P(A)\cdot P(B|A)} becomes in terms of message length L, L ( A ∧ B ) = L ( B ) + L ( A | B ) = L ( A ) + L ( B | A ) . {\displaystyle L(A\land B)=L(B)+L(A|B)=L(A)+L(B|A).} This means that if all the information is given describing an event then the length of the information may be used to give the raw probability of the event. So if the information describing the occurrence of A is given, along with the information describing B given A, then all the information describing A and B has been given. ==== Overfitting ==== Overfitting occurs when the model matches the random noise and not the pattern in the data. For example, take the situation where a curve is fitted to a set of points. If a polynomial with many terms is fitted then it can more closely represent the data. Then the fit will be better, and the information needed to describe the deviations from the fitted curve will be smaller. Smaller information length means higher probability. However, the information needed to describe the curve must also be considered. The total information for a curve with many terms may be greater than for a curve with fewer terms, that has not as good a fit, but needs less information to describe the polynomial. === Inference based on program complexity === Solomonoff's theory of inductive inference is also inductive inference. A bit string x is observed. Then consider all programs that generate strings starting with x. Cast in the form of inductive inference, the programs are theories that imply the observation of the bit string x. The method used here to give probabilities for inductive inference is based on Solomonoff's theory of inductive inference. ==== Detecting patterns in the data ==== If all the bits are 1, then people infer that there is a bias in the coin and that it is more likely also that the next bit is 1 also. This is described as learning from, or detecting a pattern in the data. Such a pattern may be represented by a computer program. A short computer program may be written that produces a series of bits which are all 1. If the length of the program K is L ( K ) {\displaystyle L(K)} bits then its prior probability is, P ( K ) = 2 − L ( K ) {\displaystyle P(K)=2^{-L(K)}} The length of the shortest program that represents the string of bits is called the Kolmogorov complexity. Kolmogorov complexity is not computable. This is related to the halting problem. When searching for the shortest program some programs may go into an infinite loop. ==== Considering all theories ==== The Greek philosopher Epicurus is quoted as saying "If more than one theory is consistent with the observations, keep all theories". As in a crime novel all theories must be considered in determining the likely murderer, so with inductive probability all programs must be considered in determining the likely future bits arising from the stream of bits. Programs that are already longer than n have no predictive power. The raw (or prior) probability that the pattern of bits is random (has no pattern) is 2 − n {\displaystyle 2^{-n}} . Each program that produces the sequence of bits, but is shorter than the n is a theory/pattern about the bits with a probability of 2 − k {\displaystyle 2^{-k}} where k is the length of the program. The probability of receiving a sequence of bits y after receiving a series of bits x is then the conditional probability of receiving y given x, which is the probability of x with y appended, divided by the probability of x. ==== Universal priors ==== The programming language affects the predictions of the next bit in the string. The language acts as a prior probability. This is particularly a problem where the programming language codes for numbers and other data types. Intuitively we think that 0 and 1 are simple numbers, and that prime numbers are somehow more complex than numbers that may be composite. Using the Kolmogorov complexity gives an unbiased estimate (a universal prior) of the prior probability of a number. As a thought experiment an intelligent agent may be fitted with a data input device giving a series of numbers, after applying some transformation function to the raw numbers. Another agent might have the same input device with a different transformation function. The agents do not see or know about these transformation functions. Then there appears no rational basis for preferring one function over another. A universal prior insures that although two agents may have different initial probability distributions for the data input, the difference will be bounded by a constant. So universal priors do not eliminate an initial bias, but they reduce and limit it. Whenever we describe an event in a language, either using a natural language or other, the language has encoded in it our prior expectations. So some reliance on prior probabilities are inevitable. A problem arises where an intelligent agent's prior expectations interact with the environment to form a self reinforcing feed back loop. This is the problem of bias or prejudice. Universal priors reduce but do not eliminate this problem. === Universal artificial intelligence === The theory of universal artificial intelligence applies decision theory to inductive probabilities. The theory shows how the best actions to optimize a reward function may be chosen. The result is a theoretical model of intelligence. It is a fundamental theory of intelligence, which optimizes the agents behavior in, Exploring the environment; performing actions to get responses that broaden the agents knowledge. Competing or co-operating with another agent; games. Balancing short and long term rewards. In general no agent will always provi

Domain adaptation

Domain adaptation is a field associated with machine learning and transfer learning. It addresses the challenge of training a model on one data distribution (the source domain) and applying it to a related but different data distribution (the target domain). A common example is spam filtering, where a model trained on emails from one user (source domain) is adapted to handle emails for another user with significantly different patterns (target domain). Domain adaptation techniques can also leverage unrelated data sources to improve learning. When multiple source distributions are involved, the problem extends to multi-source domain adaptation. Domain adaptation is a specific type of transfer learning. According to the taxonomy laid out by Pan and Yang (2010), it falls into the category of transductive transfer learning. In this setting, the source and target tasks are the same (e.g., both are object recognition), but the domains differ (different marginal distributions). This distinguishes it from inductive transfer learning (where labeled data is available for the target task) and unsupervised transfer learning (where labels are unavailable in both domains). == Classification of domain adaptation problems == Domain adaptation setups are classified in two different ways: according to the distribution shift between the domains, and according to the available data from the target domain. === Distribution shifts === Common distribution shifts are classified as follows: Covariate Shift occurs when the input distributions of the source and destination change, but the relationship between inputs and labels remains unchanged. The above-mentioned spam filtering example typically falls in this category. Namely, the distributions (patterns) of emails may differ between the domains, but emails labeled as spam in the one domain should similarly be labeled in another. Prior Shift (Label Shift) occurs when the label distribution differs between the source and target datasets, while the conditional distribution of features given labels remains the same. An example is a classifier of hair color in images from Italy (source domain) and Norway (target domain). The proportions of hair colors (labels) differ, but images within classes like blond and black-haired populations remain consistent across domains. A classifier for the Norway population can exploit this prior knowledge of class proportions to improve its estimates. Concept Shift (Conditional Shift) refers to changes in the relationship between features and labels, even if the input distribution remains the same. For instance, in medical diagnosis, the same symptoms (inputs) may indicate entirely different diseases (labels) in different populations (domains). === Data available during training === Domain adaptation problems typically assume that some data from the target domain is available during training. Problems can be classified according to the type of this available data: Unsupervised: Unlabeled data from the target domain is available, but no labeled data. In the above-mentioned example of spam filtering, this corresponds to the case where emails from the target domain (user) are available, but they are not labeled as spam. Domain adaptation methods can benefit from such unlabeled data, by comparing its distribution (patterns) with the labeled source domain data. Semi-supervised: Most data that is available from the target domain is unlabelled, but some labeled data is also available. In the above-mentioned case of spam filter design, this corresponds to the case that the target user has labeled some emails as being spam or not. Supervised: All data that is available from the target domain is labeled. In this case, domain adaptation reduces to refinement of the source domain predictor. In the above-mentioned example classification of hair-color from images, this could correspond to the refinement of a network already trained on a large dataset of labeled images from Italy, using newly available labeled images from Norway. == Formalization == Let X {\displaystyle X} be the input space (or description space) and let Y {\displaystyle Y} be the output space (or label space). The objective of a machine learning algorithm is to learn a mathematical model (a hypothesis) h : X → Y {\displaystyle h:X\to Y} able to attach a label from Y {\displaystyle Y} to an example from X {\displaystyle X} . This model is learned from a learning sample S = { ( x i , y i ) ∈ ( X × Y ) } i = 1 m {\displaystyle S=\{(x_{i},y_{i})\in (X\times Y)\}_{i=1}^{m}} . Usually in supervised learning (without domain adaptation), we suppose that the examples ( x i , y i ) ∈ S {\displaystyle (x_{i},y_{i})\in S} are drawn i.i.d. from a distribution D S {\displaystyle D_{S}} of support X × Y {\displaystyle X\times Y} (unknown and fixed). The objective is then to learn h {\displaystyle h} (from S {\displaystyle S} ) such that it commits the least error possible for labelling new examples coming from the distribution D S {\displaystyle D_{S}} . The main difference between supervised learning and domain adaptation is that in the latter situation we study two different (but related) distributions D S {\displaystyle D_{S}} and D T {\displaystyle D_{T}} on X × Y {\displaystyle X\times Y} . The domain adaptation task then consists of the transfer of knowledge from the source domain D S {\displaystyle D_{S}} to the target one D T {\displaystyle D_{T}} . The goal is then to learn h {\displaystyle h} (from labeled or unlabelled samples coming from the two domains) such that it commits as little error as possible on the target domain D T {\displaystyle D_{T}} . The major issue is the following: if a model is learned from a source domain, what is its capacity to correctly label data coming from the target domain? == Four algorithmic principles == === Reweighting algorithms === The objective is to reweight the source labeled sample such that it "looks like" the target sample (in terms of the error measure considered). === Iterative algorithms === A method for adapting consists in iteratively "auto-labeling" the target examples. The principle is simple: a model h {\displaystyle h} is learned from the labeled examples; h {\displaystyle h} automatically labels some target examples; a new model is learned from the new labeled examples. Note that there exist other iterative approaches, but they usually need target labeled examples. === Search of a common representation space === The goal is to find or construct a common representation space for the two domains. The objective is to obtain a space in which the domains are close to each other while keeping good performances on the source labeling task. This can be achieved through the use of Adversarial machine learning techniques where feature representations from samples in different domains are encouraged to be indistinguishable. === Hierarchical Bayesian Model === The goal is to construct a Bayesian hierarchical model p ( n ) {\displaystyle p(n)} , which is essentially a factorization model for counts n {\displaystyle n} , to derive domain-dependent latent representations allowing both domain-specific and globally shared latent factors. == Software packages == Several compilations of domain adaptation and transfer learning algorithms have been implemented over the past decades: SKADA (Python) ADAPT (Python) TLlib (Python) Domain-Adaptation-Toolbox (MATLAB)

Variable-message sign

A variable- (also changeable-, electronic-, or dynamic-) message sign or message board, often abbreviated VMS, VMB, CMS, or DMS, and in the UK known as a matrix sign, is an electronic traffic sign often used on roadways to give travelers information about special events. Such signs warn of traffic congestion, accidents, incidents such as terrorist attacks, Amber/Silver/Blue Alerts, roadwork zones, or speed limits on a specific highway segment. In urban areas, VMS are used within parking guidance and information systems to guide drivers to available car parking spaces. They may also ask vehicles to take alternative routes, limit travel speed, warn of duration and location of the incidents, inform of the traffic conditions, or display general public safety messages. == History == VMS systems were deployed at least as early as the 1950s on the New Jersey Turnpike. The road's signs of that period, and up to around 2012, were capable of displaying a few messages in neon, all oriented around warning drivers to slow down: "REDUCE SPEED", followed by a warning of either construction, accident, congestion, ice, snow, or fog at a certain distance ahead. The New Jersey Turnpike Authority replaced those signs (along with 1990s-vintage dot-matrix VMS systems along the Garden State Parkway) with more flexible electronic signs between 2010 and 2016. The current VMS systems are largely deployed on freeways, trunk highways, or in work zones. On the interchange of I-5 and SR 120 in San Joaquin County, California, an automated visibility and speed warning system was installed in 1996 to warn traffic of reduced visibility due to fog (where tule fog is a common problem in the winter), and of slow or stopped traffic. Message Signs were deployed in Ontario during the 1990s and are now being upgraded on 400-series highways as well as two pilot secondary highways in northeastern Ontario. == Technologies and types == Early variable message signs included static signs with words that would illuminate (often using neon tubing) indicating the type of incident that occurred, or signs that used rotating prisms (trilons) to change the message being displayed. These were later replaced by dot matrix displays typically using eggcrate, fiber optic, or flip-disc technology, which were capable of displaying a much wider range of messages than earlier static variable message signs. Since the late 1990s, the most common technology used in new installations for variable message signs are LED displays. In recent years, some newer LED variable message signs have the ability to display colored text and graphics. Dot-matrix variable message signs are divided into three subgroups: character matrix, row matrix, and full matrix. In a character matrix VMS, each character is given its own matrix with equal horizontal spacing between them, typically with two or three rows of characters. In a full matrix VMS, the entire sign is a single large dot matrix display, allowing the display of different fonts and graphics. A row matrix VMS is a hybrid of the two types, divided into two or three rows like a character matrix display, except each row is a single long dot matrix display instead of being split per character horizontally. Overhead variable message signs are today available in three form factors: front access, rear access, and walk-in. In a front access variable message sign, maintenance is performed by lifting the sign open from the front. Most smaller VMS are of the front access form factor, and are typically installed today on major arterials. The rear access form factor is similar to the front access form factor, except that maintenance is performed from the rear of the sign, and are commonly used for medium-sized dynamic message signs installed along the roadside of freeways (instead of overhead). The walk-in form factor is a more recent introduction, where maintenance on the sign is performed from the inside of the sign. A key advantage of the walk-in form factor is that lane closures are generally not required to perform maintenance on the sign. Most of the largest VMS units installed today are walk-in units, and are typically installed overhead on freeways. The NJ Turnpike Authority counts five unique types of variable message signs under its jurisdiction, at least one of which has been replaced by newer signs. They are: "REDUCE SPEED" neon signs (1950s-2010, obsolete, have now been replaced). "Changeable message signs" (trilon/ rotating-drum signs that can be used for closing roads or moving traffic to other roadways). Electronic VMS: signs with remotely controlled messages displayed on them; the messages are sent from the State Traffic Management Center, updating the signs automatically. Variable speed limit signs - used for varying the posted speed limits within work zones and in emergencies. Portable VMS: movable "electronic VMS". A portable VMS has much the same characteristics as a fixed electronic VMS, but can be moved from location to location as the need dictates. == Usage == Early models required an operator to be physically present when programming a message, whereas newer models may be reprogrammed remotely via a wired or wireless network or cellphone connection. A complete message on a panel generally includes a problem statement indicating incident, roadwork, stalled vehicle etc.; a location statement indicating where the incident is located; an effect statement indicating lane closure, delay, etc. and an action statement giving suggestion what to do traffic conditions ahead. These signs are also used for Amber alert messages, and in some states, Silver and Blue Alert messages. In some places, VMSes are set up with permanent, semi-static displays indicating predicted travel times to important traffic destinations such as major cities or interchanges along the route of a highway. Typical messages provide the following information: Promotional messages about services provided by a road authority during non-critical hours, such as carpooling efforts, travelers' information stations and 5-1-1 lines Crashes, including vehicle spin-out or rollover Road Works Incidents affecting normal traffic flow in a lane or on shoulders Non-recurring congestion, often a residual effect of cleared crash Closures of an entire road, e.g. over a mountain pass in winter. Exit ramp closures Debris on roadway Vehicle fires Wildfires Short-term maintenance or construction lasting less than three days Pavement failure alerts AMBER, Silver, and Blue Alerts, as well as weather warnings via the warning infrastructure of NOAA Weather Radio's SAME system Travel times Variable speed limits Car park occupancy levels speed sign, for recommending a speed to approach the next traffic light in its green phase. The information comes from a variety of traffic monitoring and surveillance systems. It is expected that by providing real-time information on special events on the oncoming road, VMS can improve motorists' route selection, reduce travel time, mitigate the severity and duration of incidents and improve the performance of the transportation network. === United Kingdom === Do not enter the motorway when the red lamps are flashing in pairs from side to side. On 27 March 1972, the first motorway computer-controlled warning lights in the UK, with 59 miles on the M6 from Broughton, Lancashire to Barthomley, on the Cheshire boundary, and 26 miles on the M62 east of Whitefield, was switched on by Michael Heseltine and Charles Legh Shuldham Cornwall-Legh, 5th Baron Grey of Codnor at the headquarters of Cheshire Constabulary on Nuns Road. It was centred at a police computer centre at Westhoughton, that connected to police stations in Preston and Chester. The Chester site was soon be connected to the M53 and M57. Four other regional computer centres would be opened at Perry Barr near the M6, Scratchwood near the M1, at Hook near the M3, and at Almondsbury near the M4. Most British motorways would be covered by 1975. The system was designed by GEC and had taken five years to design. == Safety messages for drivers == Increasingly, signs have been used to remind drivers to buckle seat belts ("Click It or Ticket"), obey the speed limit, and stay off the road if impaired ("Drive sober or get pulled over"). In a federal study, a slight majority of drivers reported that public safety messages on dynamic message signs impacted their driving behaviors. The Ohio Department of Transportation began using humorous dynamic message signs in 2015, perplexing some drivers. Examples of humorous signs seen in New Jersey, Arizona, Texas, Pennsylvania, Delaware, Iowa, New York, Minnesota and Ohio include: "Hold on to your butts. Help prevent forest fires." "We'll be blunt. Don't drive high." "Visiting in-laws? Slow down, get there late." "Only sparklers should be lit." and “Don’t drive Star Spangled hammered." (for Fourth of July) "Hocus pocus – drive with focus." and "Slow down in work zones - my mummy works here." (f

Glossary of operating systems terms

This page is a glossary of Operating systems terminology. == A == access token: In Microsoft Windows operating systems, an access token contains the security credentials for a login session and identifies the user, the user's groups, the user's privileges, and, in some cases, a particular application. == B == binary semaphore: See semaphore. booting: In computing, booting (also known as booting up) is the initial set of operations that a computer performs after electrical power is switched on or when the computer is reset. This can take tens of seconds and typically involves performing a power-on self-test, locating and initializing peripheral devices, and then finding, loading and starting the operating system. == C == cache: In computer science, a cache is a component that transparently stores data so that future requests for that data can be served faster. The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. cloud: Cloud computing operating systems are recent, and were not mentioned in Gagne's 8th Edition (2009). In contrast, by Gagne's 9th (2012), cloud o/s received 3 pages of coverage (41, 42, 716). Doeppner (2011) mentions them (p. 3), but only to prove that operating systems "are not a solved problem" and that even if the day of the dedicated PC is waning, cloud computing has created an entirely new opportunity for o/s development ala sharing, networks, memory, parallelism, etc. Gagne (2012) adds that in addition to numerous traditional o/s's at cloud warehouses, Virtual machine o/s (VMMs), Eucalyptus, Vware, vCloud Director and others are being developed specifically for cloud management with numerous traditional o/s features (security, threads, file and memory management, guis, etc.) (p. 42). Microsoft's investment in cloud aspects of o/s tend to support that argument. concurrency == D == daemon: Operating systems often start daemons at boot time and serve the function of responding to network requests, hardware activity, or other programs by performing some task. Daemons can also configure hardware (like udevd on some Linux systems), run scheduled tasks (like cron), and perform a variety of other tasks. == E == == F == == G == == H == == I == == J == == K == kernel: In computing, the kernel is a computer program that manages input/output requests from software and translates them into data processing instructions for the central processing unit and other electronic components of a computer. The kernel is a fundamental part of a modern computer's operating system. == L == lock: In computer science, a lock or mutex (from mutual exclusion) is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy. == M == mutual exclusion: Mutual exclusion is to allow only one process at a time to access the same critical section (a part of code which accesses the critical resource). This helps prevent race conditions. mutex: See lock. == N == == O == == P == paging daemon: See daemon. process == Q == == R == == S == semaphore: In computer science, particularly in operating systems, a semaphore is a variable or abstract data type that is used for controlling access, by multiple processes, to a common resource in a parallel programming or a multi user environment. == T == thread: In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by an operating system scheduler. The scheduler itself is a light-weight process. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process. templating: In an o/s context, templating refers to creating a single virtual machine image as a guest operating system, then saving it as a tool for multiple running virtual machines (Gagne, 2012, p. 716). The technique is used both in virtualization and cloud computing management, and is common in large server warehouses. == U == == V == == W == == Z ==

Atomtronics

Atomtronics is an emerging field concerning the quantum technology of matter-wave circuits which coherently guide propagating ultra-cold atoms. The systems typically include components analogous to those found in electronics, quantum electronics or optical systems, such as beam splitters, transistors, and atomic counterparts of superconducting quantum interference devices (SQUIDs). Applications range from studies of fundamental physics to the development of practical devices such as quantum superfluids for the computation of large models for artificial general intelligence. == Etymology == Atomtronics is a portmanteau of "atom" and "electronics", in reference to the creation of atomic analogues of electronic components, such as transistors and diodes, and also electronic materials such as semiconductors. The field itself has considerable overlap with atom optics and quantum simulation, and is not strictly limited to the development of electronic-like components. However, this field develops into the research of ultra-cold atoms for the applied research implications of computations in quantum science. == Methodology == Three major elements are required for an atomtronic circuit. The first is a Bose-Einstein condensate, which is needed for its coherent and superfluid properties, although an ultracold Fermi gas may also be used for certain applications. The second is a tailored trapping potential, which can be generated optically, magnetically, or using a combination of both. The final element is a method to induce the movement of atoms within the potential, which can be achieved in several ways, for various research advancements around fields not limited to distributed computing, supercomputing, and quantum computing. For example, a transistor-like atomtronic circuit may be realized by a ring-shaped trap divided into two by two moveable weak barriers, with the two separate parts of the ring acting as the drain and the source and the barriers acting as the gate. As the barriers move, atoms flow from the source to the drain. It is now possible to coherently guide matterwaves over distances of up to 40 cm in ring-shaped atomtronic matterwave guide measurement. == Applications == The field of atomtronics is still very nascent and any schemes realized thus far are proof-of-concept. Applications include: gravimetry rotational sensing via the Sagnac effect quantum computing Obstacles to the development of practical sensing devices are largely due to the technical challenges of creating Bose-Einstein condensates. They require bulky lab-based setups not easily suitable for transportation. However, creating portable experimental setups is an active area of research.

Local-first software

Local-first software is a software engineering approach in which an application stores its data primarily on the user's own device rather than on remote servers. Users can read and write data without an Internet connection, and changes are synchronized across devices in the background when connectivity is available. The approach differs from conventional cloud-based applications, where the server holds the authoritative copy of user data and the client acts as a thin client. The term was coined in a 2019 paper published by researchers at Ink & Switch, an independent research lab, and presented at the Onward! conference at ACM SIGPLAN. The paper, sometimes referred to as a manifesto, was authored by Martin Kleppmann, Adam Wiggins, Peter van Hardenberg, and Mark McGranaghan. == Background == Before the widespread adoption of Internet-connected software in the 2000s, most desktop applications stored data as files on the user's local disk. Users had direct access to their files and could copy, back up, or delete them at will. The rise of software as a service (SaaS) and cloud-based applications like Google Docs shifted data storage to centralized servers. While cloud applications made real-time collaboration across devices straightforward, they introduced a dependency on the service provider: if the provider discontinued the service or experienced an outage, users could lose access to their data. A related concept, "offline-first," emerged in the early 2010s and focused on making web applications resilient to network interruptions. The local-first approach built on these earlier efforts while placing greater emphasis on long-term data ownership and end-to-end encryption. == Origins == === Ink & Switch manifesto === Ink & Switch is an industrial research lab co-founded by Adam Wiggins, who had earlier co-founded Heroku. Martin Kleppmann, an associate professor in the Department of Computer Science and Technology at the University of Cambridge, was a co-author of the 2019 paper. The manifesto proposed seven "ideals" for local-first software: Fast — Operations respond without network round-trips. Multi-device — Data synchronizes across a user's devices. Offline — Users can read and write data without a network connection. Collaboration — Multiple users can work on the same data concurrently. Longevity — Data remains accessible even if the software vendor ceases operation. Privacy — End-to-end encryption protects user data. User control — The vendor cannot restrict how users access or use their data. The paper surveyed existing approaches to data storage and collaboration — ranging from email attachments and Dropbox-style file synchronization to web applications and mobile backends — and argued that none of them satisfied all seven ideals simultaneously. === Role of CRDTs === The manifesto identified conflict-free replicated data types (CRDTs) as a promising technical foundation for local-first applications. CRDTs are data structures that allow multiple replicas to be edited independently and then merged without conflicts, a property first formalized in research by Marc Shapiro and colleagues around 2011. Kleppmann and collaborators at Ink & Switch developed Automerge, an open-source CRDT library for JSON documents, to make these algorithms available to application developers. == Adoption and community == Developer interest in the local-first approach grew after the 2019 paper spread on Hacker News and at developer conferences In August 2023, Wired published a feature article on the movement, describing it as an effort to reduce reliance on large cloud providers. The first Local-First Conf took place on 30 May 2024 in Berlin, with talks by Kleppmann and developers from companies including Linear and Anytype. The community has continued to expand, with regular "LoFi" meetups, a podcast (localfirst.fm), and a third edition of the conference planned for Berlin in July 2026. == Criticisms and limitations == Developers and commentators have pointed out practical difficulties with the local-first approach. Synchronizing data between multiple devices that may be offline for extended periods introduces complexity that cloud-based architectures avoid. Conflict resolution, even with CRDTs, can produce results that are technically consistent but semantically unexpected to users. Schema migrations across thousands of client devices running different application versions pose another difficulty that does not arise with server-side databases. Web browsers impose storage limits and may evict locally stored data. Safari, for instance, has been reported to clear IndexedDB data after seven days of inactivity on a given site, which undermines the assumption that local data is persistent. There is also disagreement within the local-first community about whether a fully decentralized architecture is required. The original manifesto described decentralization as the "logical end goal," but a number of products that identify as local-first still depend on centralized servers for authentication, backup, or synchronization. In a talk at Local-First Conf 2024, Kleppmann said the seven ideals are better understood as a "gradient" rather than a strict checklist.

Web3D

Web3D, also called 3D Web, is a group of technologies to display and navigate websites using 3D computer graphics. These technologies enable applications such as online games, virtual reality experiences, interactive product demonstrations, and 3D data visualization directly within web browsers. The emergence of Web3D dates back to 1994, with the advent of VRML, a file format designed to store and display 3D graphical data on the World Wide Web. Modern Web3D is primarily powered by WebGL, a JavaScript API that enables hardware-accelerated 3D graphics rendering in web browsers without requiring plug-ins. == Pre-WebGL era == The emergence of Web3D dates back to 1994, with the advent of VRML, a file format designed to store and display 3D graphical data on the World Wide Web. In October 1995, at Internet World, Template Graphics Software demonstrated a 3D/VRML plug-in for the beta release of Netscape 2.0 by Netscape Communications. The Web3D Consortium was formed to further the collective development of the format. VRML and its successor, X3D, have been accepted as international standards by the International Organization for Standardization and the International Electrotechnical Commission. The main drawback of the technology was the requirement to use third-party browser plug-ins to perform 3D rendering, which slowed the adoption of the standard. Between 2000 and 2010, one of these plug-ins, Adobe Flash Player, was widely installed on desktop computers and was used to display interactive web pages and online games and to play video and audio content. Several Flash-based frameworks appeared that used software rendering and ActionScript 3 to perform 3D computations such as transformations, lighting, and texturing. Most notable among them were Papervision3D and Away3D. Eventually, Adobe developed Stage3D, an API for rendering interactive 3D graphics with GPU-acceleration for its Flash player and AIR products, which was adopted by software vendors. In 2009, an open-source 3D web technology called O3D was introduced by Google. It also required a browser plug-in, but contrary to Flash/Stage3D, was based on JavaScript API. O3D was geared not only for games but also for advertisements, 3D model viewers, product demos, simulations, engineering applications, control and monitoring systems. == WebGL and glTF == WebGL (short for "Web Graphics Library") evolved out of the Canvas 3D experiments started by Vladimir Vukićević at Mozilla Foundation. Vukićević first demonstrated a Canvas 3D prototype in 2006. By the end of 2007, both Mozilla and Opera had made their own separate implementations. In early 2009, the nonprofit technology consortium Khronos Group started the WebGL Working Group, with initial participation from Apple, Google, Mozilla, Opera, and others. Version 1.0 of the WebGL specification was released in March 2011. Major advantages of the new technology include conformity with web standards and near-native 3D performance without the use of any browser plug-ins. Since WebGL is based on OpenGL ES, it works on mobile devices without any additional abstraction layers. For other platforms, WebGL implementations leverage ANGLE to translate OpenGL ES calls to DirectX, OpenGL, or Vulkan API calls. Among notable WebGL frameworks are A-Frame, which uses HTML-based markup for building virtual reality experiences; PlayCanvas, an open-source engine alongside a proprietary cloud-hosted creation platform for building browser games; Three.js, an MIT-licensed framework used to create demoscene from the early 2000s; Unity, which obtained a WebGL back-end in version 5; and Verge3D, which integrates with Blender, 3ds Max, and Maya to create 3D web content. With the rapid adoption of WebGL, a new problem arose—the lack of a 3D file format optimized for the Web. This issue was addressed by glTF, a format that was conceived in 2012 by members of the COLLADA working group. At SIGGRAPH 2012, Khronos presented a demo of glTF, which was then called WebGL Transmissions Format (WebGL TF). On 19 October 2015, the glTF 1.0 specification was released. Version 2.0 glTF uses a physically based rendering material model, proposed by Fraunhofer. Other upgrades include sparse accessors and morph targets for techniques such as facial animation, and schema tweaks and breaking changes for corner cases or performance, such as replacing top-level glTF object properties with arrays for faster index-based access. == Future == "WebGPU" is the working name for a potential web standard and JavaScript API for accelerated graphics and computing, aiming to provide "modern 3D graphics and computation capabilities". It is developed by the W3C "GPU for the Web" Community Group, with engineers from Apple, Mozilla, Microsoft, and Google, among others. WebGPU will not be based on any existing 3D API and will use Rust-like syntax for shaders.