Mobile Fortify is a mobile app used by United States Immigration and Customs Enforcement (ICE) on their government-issued phones. The app allows agents to take a photo in order to gather biometrics, including contactless fingerprints and faceprints, for the purpose of identifying an individual and their potential immigration status. The app was created by NEC. == History == In June 2025, use of Mobile Fortify by ICE was uncovered through leaked emails and the user manual, reported by 404 Media. The app is internally developed, and details of the parent company and developer were initially unknown. In January 2026, the DHS's 2025 AI Use Case Inventory revealed the vendor as NEC Corporation, an international conglomerate with subsidiaries in Argentina, Australia, China, India and Malaysia. Later that month, several senators demanded transparency around the app and its origins, and that ICE stop using it. A second letter was sent again in November, after hearing no response to the previous letter from ICE. == Technology == Unlike other facial recognition software, Fortify uses federally linked databases. By contrast, Clearview AI uses public social media databases for biometric scanning. Federal databases include DHS's automated biometric identification system (IDENT), containing more than 270 million biometric records, and Customs and Border Protection's Traveler Verification Service. The State Department's visa and passport photo database, the FBI's National Crime Information Center, National Law Enforcement Telecommunications Systems, and CBP's TECS and Seized Assets and Case Tracing System (SEACATS). == Oversight == Several senators urged ICE to stop using the app for fear of infringing on fourth amendment and first amendment rights, and requested details on who developed the app, when it was deployed, whether the app was tested for accuracy, and policies and practices governing its use. In June 2025, they sent an open letter to Todd Lyons, ICE acting director, signed by senators Cory Booker, Chris Van Hollen, Ed Markey, Bernie Sanders, Adam Schiff, Tina Smith, Elizabeth Warren, and Ron Wyden. On November 3, a second letter was sent to the ICE by senators, after not receiving answers to questions from the previous letter deadlined for October 2. == Criticism == Mobile Fortify, and ICE's use of similar biometric identification technologies (such as Mobile Identify, an app similar to Mobile Fortify to be used by local or regional law enforcement to assist in immigration enforcement ) has faced scrutiny from a variety of digital rights organizations, politicians, and news outlets. The criticism is already considered to potentially be a reason why the similar Mobile Identify app was pulled from the Google Play Store. Facial recognition technologies are known to produce false-positives and generally unreliable results, especially on those with darker skin tones. ICE has already previously mistakenly arrested a U.S. citizen under the belief he was illegally in the country, and later stated that he "could be deported based on biometric confirmation of his identity" prior to his release. U.S. representative Bennie Thompson, ranking member of the House Homeland Security Committee has previously commented that "ICE officials have told us that an apparent biometric match by Mobile Fortify is a ‘definitive’ determination of a person's status and that an ICE officer may ignore evidence of American citizenship—including a birth certificate—if the app says the person is an alien," and that "Mobile Fortify is a dangerous tool in the hands of ICE, and it puts American citizens at risk of detention and even deportation," On January 19, 2026, 404 Media reported on a case where a woman, identified in court documents as "MJMA", was scanned by Mobile Fortify twice in the same interaction, and two entirely different names were provided by the app. According to the Innovation Law Lab, whose attorneys are representing MJMA, both of the names were incorrect. ICE has stated that they will not allow people to decline to be scanned by Mobile Fortify, and that photos taken, even those of U.S. citizens, will be stored for 15 years, something that has been criticized primarily because ICE has not performed a Privacy Impact Assessment (PIA) for Mobile Fortify, the right to decline other forms of biometric verification to the U.S. government is often available under other circumstances, and the 15 year window is viewed as unnecessarily large.
Reconstruction from projections
The problem of reconstructing a multidimensional signal from its projection is uniquely multidimensional, having no 1-D counterpart. It has applications that range from computer-aided tomography to geophysical signal processing. It is a problem which can be explored from several points of view—as a deconvolution problem, a modeling problem, an estimation problem, or an interpolation problem. == Motivation and applications == Many fields in science and engineering use reconstruction from projections, especially in imaging. It is widely applied geophysical tomography, medical imaging and industrial radiography. For example, in a CT scanner, the 3D structure of the patient’s body being scanned is measured with beams going through the tissue and hitting a detector, giving a flat projection of the body from that angle. Multiple projections are put together to get an image of the position and shape of structures inside in 3D. == Problem statement and basics == A projection is a linear mapping of an M {\displaystyle M} dimensional signal into an N {\displaystyle N} dimensional one, where N ≤ M {\displaystyle N\leq M} . And the objective of reconstruction is to restore the M {\displaystyle M} dimensional signal based on the N {\displaystyle N} dimensional signal. The following case is a 2-D signal projected into 1D signal. The signal in the original coordinate is denoted as d ( u , v ) {\displaystyle d(u,v)} . Now consider a collimated beam of radiation coming from the opposite orientation of v ^ {\displaystyle {\hat {v}}} , producing a projection along u ^ {\displaystyle {\hat {u}}} . v ^ {\displaystyle {\hat {v}}} and u ^ {\displaystyle {\hat {u}}} are normal to each other, and the angle between u {\displaystyle u} and u ^ {\displaystyle {\hat {u}}} is theta. The signal obtained along u ^ {\displaystyle {\hat {u}}} axis is defined to be p θ ( u ^ ) {\displaystyle p_{\theta }({\hat {u}})} . The relationship between the original coordinate and the rotated coordinate is given by [ u ^ v ^ ] = [ cos θ sin θ − sin θ cos θ ] [ u v ] {\displaystyle {\begin{bmatrix}{\hat {u}}\\{\hat {v}}\end{bmatrix}}={\begin{bmatrix}\cos \theta &\sin \theta \\-\sin \theta &\cos \theta \end{bmatrix}}{\begin{bmatrix}u\\v\end{bmatrix}}} or inversely, [ u v ] = [ cos θ − sin θ sin θ cos θ ] [ u ^ v ^ ] {\displaystyle {\begin{bmatrix}u\\v\end{bmatrix}}={\begin{bmatrix}\cos \theta &-\sin \theta \\\sin \theta &\cos \theta \end{bmatrix}}{\begin{bmatrix}{\hat {u}}\\{\hat {v}}\end{bmatrix}}} Then we have p θ ( u ^ ) = ∫ − ∞ ∞ d ( u , v ) d v ^ = ∫ − ∞ ∞ d ( u ^ cos ( θ ) − v ^ sin ( θ ) , u ^ sin ( θ ) + v ^ cos ( θ ) ) d v ^ {\displaystyle p_{\theta }({\hat {u}})=\int _{-\infty }^{\infty }d(u,v)\,\mathrm {d} {\hat {v}}=\int _{-\infty }^{\infty }d({\hat {u}}\cos(\theta )-{\hat {v}}\sin(\theta ),{\hat {u}}\sin(\theta )+{\hat {v}}\cos(\theta ))\,\mathrm {d} {\hat {v}}} By varying theta, a large number of projections can be obtained. Given the projection-slice theorem, D ( Ω , θ ) {\displaystyle D(\Omega ,\theta )} ,the slice of the Fourier transform of d ( u , v ) {\displaystyle d(u,v)} at angle theta, is equivalent to P θ ( Ω ) {\displaystyle P_{\theta }(\Omega )} , the Fourier Transform of the projection p θ ( u ^ ) {\displaystyle p_{\theta }({\hat {u}})} . Therefore, the unknown d ( u , v ) {\displaystyle d(u,v)} can be obtained from its Fourier transform by means of the Fourier transform inversion integral d ( u , v ) = 1 4 π 2 ∫ − ∞ ∞ ∫ − ∞ ∞ D ( Ω 1 , Ω 2 ) e j Ω 1 u e j Ω 2 v d Ω 1 , Ω 2 {\displaystyle \mathrm {d} (u,v)={\frac {1}{4\pi ^{2}}}\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }D(\Omega _{1},\Omega _{2})e^{j\Omega _{1}u}e^{j\Omega _{2}v}\,\mathrm {d} \Omega _{1},\Omega _{2}} = 1 4 π 2 ∫ 0 ∞ ∫ − π π D ( Ω , θ ) e j Ω u cos ( θ ) e j Ω v s i n θ | Ω | d Ω d θ {\displaystyle ={\frac {1}{4\pi ^{2}}}\int _{0}^{\infty }\int _{-\pi }^{\pi }D(\Omega ,\theta )e^{j\Omega u\cos(\theta )}e^{j\Omega vsin\theta }{\begin{vmatrix}\Omega \end{vmatrix}}\,\mathrm {d} \Omega \mathrm {d} \theta } = 1 4 π 2 ∫ − π π ∫ 0 ∞ P θ ( Ω ) e j Ω ( u cos θ + v sin θ ) | Ω | d Ω d θ {\displaystyle ={\frac {1}{4\pi ^{2}}}\int _{-\pi }^{\pi }\int _{0}^{\infty }P_{\theta }(\Omega )e^{j}\Omega (u\cos \theta +v\sin \theta ){\begin{vmatrix}\Omega \end{vmatrix}}\,\mathrm {d} \Omega \mathrm {d} \theta } = 1 4 π 2 ∫ 0 π ( ∫ − ∞ ∞ P θ ( Ω ) | Ω | {\displaystyle ={\frac {1}{4\pi ^{2}}}\int _{0}^{\pi }(\int _{-\infty }^{\infty }P_{\theta }(\Omega ){\begin{vmatrix}\Omega \end{vmatrix}}} e j Ω u ^ d Ω ) d θ {\displaystyle e^{j\Omega {\hat {u}}}\mathrm {d} \Omega )\mathrm {d} \theta } By taking the inverse Fourier Transform and assuming g ( u ^ ) = F − 1 ( | Ω | 2 ) {\displaystyle g({\hat {u}})={\mathcal {F}}^{-1}({{\begin{vmatrix}\Omega \end{vmatrix}}^{2}})} , we get d ( u , v ) = ∑ i △ θ i [ p θ ( u ^ ) ∗ g θ i ( u ^ ) ] {\displaystyle d(u,v)=\sum _{i}\vartriangle \theta _{i}[p_{\theta }({\hat {u}})g_{\theta i}({\hat {u}})]} == Approaches == In practice, there are a wide variety of methods that are utilized, most of which are reconstruct 3-D information (volume) from 2-D signals (image). Typically used methods are CT, MRI, PET and SPECT. And the filtered back projection based on the principles introduced above are commonly applied. === Computed Tomography (CT) === In CT, a volume is formed by stacking the axial slices. The software cuts the volume in a different plane (usually orthogonal). Commonly, slice data is generated using an X-ray source that rotates around the object. X-ray sensors are positioned on the opposite side of the circle from the X-ray source. === Magnetic resonance imaging (MRI) === In MRI, energy from an oscillating magnetic field is temporarily applied to the patient at the appropriate resonance frequency. The protons (hydrogen atoms) emit a radio frequency signal which is measured by a receiving coil. The radio signal can be made to encode position information by varying the main magnetic field using gradient coils. === Positron emission tomography (PET) === The system detects pairs of gamma rays emitted indirectly by a positron-emitting radionuclide (tracer), which is introduced into the body on a biologically active molecule. Three-dimensional images of tracer concentration within the body are then constructed by computer analysis. In modern PET-CT scanners, three dimensional imaging is often accomplished with the aid of a CT X-ray scan performed on the patient during the same session, in the same machine. === Single-photon emission computed tomography (SPECT) === SPECT imaging is performed by using a gamma camera to acquire multiple 2-D images (projections) from multiple angles. Multiple projections are used to yield a 3-D data set. This data set may then be manipulated to show thin slices along any chosen axis of the body. SPECT is similar to PET in its use of radioactive tracer material and detection of gamma rays, while the tracers used in SPECT emit gamma radiation that is measured more directly.
Inductive logic programming
Inductive logic programming (ILP) is a subfield of symbolic artificial intelligence which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. The term "inductive" here refers to philosophical (i.e. suggesting a theory to explain observed facts) rather than mathematical (i.e. proving a property for all members of a well-ordered set) induction. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesised logic program which entails all the positive and none of the negative examples. Schema: positive examples + negative examples + background knowledge ⇒ hypothesis. Bioinformatics and drug design have been highlighted as a principal application area of inductive logic programming techniques. == History == Building on earlier work on Inductive inference, Gordon Plotkin was the first to formalise induction in a clausal setting around 1970, adopting an approach of generalising from examples. In 1981, Ehud Shapiro introduced several ideas that would shape the field in his new approach of model inference, an algorithm employing refinement and backtracing to search for a complete axiomatisation of given examples. His first implementation was the Model Inference System in 1981: a Prolog program that inductively inferred Horn clause logic programs from positive and negative examples. The term Inductive Logic Programming was first introduced in a paper by Stephen Muggleton in 1990, defined as the intersection of machine learning and logic programming. Muggleton and Wray Buntine introduced predicate invention and inverse resolution in 1988. Several inductive logic programming systems that proved influential appeared in the early 1990s. FOIL, introduced by Ross Quinlan in 1990 was based on upgrading propositional learning algorithms AQ and ID3. Golem, introduced by Muggleton and Feng in 1990, went back to a restricted form of Plotkin's least generalisation algorithm. The Progol system, introduced by Muggleton in 1995, first implemented inverse entailment, and inspired many later systems. Aleph, a descendant of Progol introduced by Ashwin Srinivasan in 2001, is still one of the most widely used systems as of 2022. At around the same time, the first practical applications emerged, particularly in bioinformatics, where by 2000 inductive logic programming had been successfully applied to drug design, carcinogenicity and mutagenicity prediction, and elucidation of the structure and function of proteins. Unlike the focus on automatic programming inherent in the early work, these fields used inductive logic programming techniques from a viewpoint of relational data mining. The success of those initial applications and the lack of progress in recovering larger traditional logic programs shaped the focus of the field. Recently, classical tasks from automated programming have moved back into focus, as the introduction of meta-interpretative learning makes predicate invention and learning recursive programs more feasible. This technique was pioneered with the Metagol system introduced by Muggleton, Dianhuan Lin, Niels Pahlavi and Alireza Tamaddoni-Nezhad in 2014. This allows ILP systems to work with fewer examples, and brought successes in learning string transformation programs, answer set grammars and general algorithms. == Setting == Inductive logic programming has adopted several different learning settings, the most common of which are learning from entailment and learning from interpretations. In both cases, the input is provided in the form of background knowledge B, a logical theory (commonly in the form of clauses used in logic programming), as well as positive and negative examples, denoted E + {\textstyle E^{+}} and E − {\textstyle E^{-}} respectively. The output is given as a hypothesis H, itself a logical theory that typically consists of one or more clauses. The two settings differ in the format of examples presented. === Learning from entailment === As of 2022, learning from entailment is by far the most popular setting for inductive logic programming. In this setting, the positive and negative examples are given as finite sets E + {\textstyle E^{+}} and E − {\textstyle E^{-}} of positive and negated ground literals, respectively. A correct hypothesis H is a set of clauses satisfying the following requirements, where the turnstile symbol ⊨ {\displaystyle \models } stands for logical entailment: Completeness: B ∪ H ⊨ E + Consistency: B ∪ H ∪ E − ⊭ false {\displaystyle {\begin{array}{llll}{\text{Completeness:}}&B\cup H&\models &E^{+}\\{\text{Consistency: }}&B\cup H\cup E^{-}&\not \models &{\textit {false}}\end{array}}} Completeness requires any generated hypothesis H to explain all positive examples E + {\textstyle E^{+}} , and consistency forbids generation of any hypothesis H that is inconsistent with the negative examples E − {\textstyle E^{-}} , both given the background knowledge B. In Muggleton's setting of concept learning, "completeness" is referred to as "sufficiency", and "consistency" as "strong consistency". Two further conditions are added: "Necessity", which postulates that B does not entail E + {\textstyle E^{+}} , does not impose a restriction on H, but forbids any generation of a hypothesis as long as the positive facts are explainable without it. "Weak consistency", which states that no contradiction can be derived from B ∧ H {\textstyle B\land H} , forbids generation of any hypothesis H that contradicts the background knowledge B. Weak consistency is implied by strong consistency; if no negative examples are given, both requirements coincide. Weak consistency is particularly important in the case of noisy data, where completeness and strong consistency cannot be guaranteed. === Learning from interpretations === In learning from interpretations, the positive and negative examples are given as a set of complete or partial Herbrand structures, each of which are themselves a finite set of ground literals. Such a structure e is said to be a model of the set of clauses B ∪ H {\textstyle B\cup H} if for any substitution θ {\textstyle \theta } and any clause h e a d ← b o d y {\textstyle \mathrm {head} \leftarrow \mathrm {body} } in B ∪ H {\textstyle B\cup H} such that b o d y θ ⊆ e {\textstyle \mathrm {body} \theta \subseteq e} , h e a d θ ⊆ e {\displaystyle \mathrm {head} \theta \subseteq e} also holds. The goal is then to output a hypothesis that is complete, meaning every positive example is a model of B ∪ H {\textstyle B\cup H} , and consistent, meaning that no negative example is a model of B ∪ H {\textstyle B\cup H} . == Approaches to ILP == An inductive logic programming system is a program that takes as an input logic theories B , E + , E − {\displaystyle B,E^{+},E^{-}} and outputs a correct hypothesis H with respect to theories B , E + , E − {\displaystyle B,E^{+},E^{-}} . A system is complete if and only if for any input logic theories B , E + , E − {\displaystyle B,E^{+},E^{-}} any correct hypothesis H with respect to these input theories can be found with its hypothesis search procedure. Inductive logic programming systems can be roughly divided into two classes, search-based and meta-interpretative systems. Search-based systems exploit that the space of possible clauses forms a complete lattice under the subsumption relation, where one clause C 1 {\textstyle C_{1}} subsumes another clause C 2 {\textstyle C_{2}} if there is a substitution θ {\textstyle \theta } such that C 1 θ {\textstyle C_{1}\theta } , the result of applying θ {\textstyle \theta } to C 1 {\textstyle C_{1}} , is a subset of C 2 {\textstyle C_{2}} . This lattice can be traversed either bottom-up or top-down. === Bottom-up search === Bottom-up methods to search the subsumption lattice have been investigated since Plotkin's first work on formalising induction in clausal logic in 1970. Techniques used include least general generalisation, based on anti-unification, and inverse resolution, based on inverting the resolution inference rule. ==== Least general generalisation ==== A least general generalisation algorithm takes as input two clauses C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} and outputs the least general generalisation of C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} , that is, a clause C {\textstyle C} that subsumes C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} , and that is subsumed by every other clause that subsumes C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} . The least general generalisation can be computed by first computing all selections from C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} , which are pairs of literals ( L , M ) ∈ ( C 1 × C 2 ) {\displaystyle (L,M)\in (C_{1}\times C_{2})} sharing the same predicate symbol and negated/unnegated status. Then, the least general generalisation is obtained as the disjunction of the least general generalisations of the indi
Synaptic transistor
A synaptic transistor is an electrical device that can learn in ways similar to a neural synapse. It optimizes its own properties for the functions it has carried out in the past. The device mimics the behavior of the property of neurons called spike-timing-dependent plasticity, or STDP. == Structure == Its structure is similar to that of a field effect transistor, where an ionic liquid takes the place of the gate insulating layer between the gate electrode and the conducting channel. That channel is composed of samarium nickelate (SmNiO3, or SNO) rather than the field effect transistor's doped silicon. == Function == A synaptic transistor has a traditional immediate response whose amount of current that passes between the source and drain contacts varies with voltage applied to the gate electrode. It also produces a much slower learned response such that the conductivity of the SNO layer varies in response to the transistor's STDP history, essentially by shuttling oxygen ions between the SNO and the ionic liquid. The analog of strengthening a synapse is to increase the SNO's conductivity, which essentially increases gain. Similarly, weakening a synapse is analogous to decreasing the SNO's conductivity, lowering the gain. The input and output of the synaptic transistor are continuous analog values, rather than digital on-off signals. While the physical structure of the device has the potential to learn from history, it contains no way to bias the transistor to control the memory effect. An external supervisory circuit converts the time delay between input and output into a voltage applied to the ionic liquid that either drives ions into the SNO or removes them. A network of such devices can learn particular responses to "sensory inputs", with those responses being learned through experience rather than explicitly programmed.
Mathematics of neural networks in machine learning
An artificial neural network (ANN) or neural network combines biological principles with advanced statistics to solve problems in domains such as pattern recognition and game-play. ANNs adopt the basic model of neuron analogues connected to each other in a variety of ways. == Structure == === Neuron === A neuron with label j {\displaystyle j} receiving an input p j ( t ) {\displaystyle p_{j}(t)} from predecessor neurons consists of the following components: an activation a j ( t ) {\displaystyle a_{j}(t)} , the neuron's state, depending on a discrete time parameter, an optional threshold θ j {\displaystyle \theta _{j}} , which stays fixed unless changed by learning, an activation function f {\displaystyle f} that computes the new activation at a given time t + 1 {\displaystyle t+1} from a j ( t ) {\displaystyle a_{j}(t)} , θ j {\displaystyle \theta _{j}} and the net input p j ( t ) {\displaystyle p_{j}(t)} giving rise to the relation a j ( t + 1 ) = f ( a j ( t ) , p j ( t ) , θ j ) , {\displaystyle a_{j}(t+1)=f(a_{j}(t),p_{j}(t),\theta _{j}),} and an output function f out {\displaystyle f_{\text{out}}} computing the output from the activation o j ( t ) = f out ( a j ( t ) ) . {\displaystyle o_{j}(t)=f_{\text{out}}(a_{j}(t)).} Often the output function is simply the identity function. An input neuron has no predecessor but serves as input interface for the whole network. Similarly an output neuron has no successor and thus serves as output interface of the whole network. === Propagation function === The propagation function computes the input p j ( t ) {\displaystyle p_{j}(t)} to the neuron j {\displaystyle j} from the outputs o i ( t ) {\displaystyle o_{i}(t)} and typically has the form p j ( t ) = ∑ i o i ( t ) w i j . {\displaystyle p_{j}(t)=\sum _{i}o_{i}(t)w_{ij}.} === Bias === A bias term can be added, changing the form to the following: p j ( t ) = ∑ i o i ( t ) w i j + w 0 j , {\displaystyle p_{j}(t)=\sum _{i}o_{i}(t)w_{ij}+w_{0j},} where w 0 j {\displaystyle w_{0j}} is a bias. == Neural networks as functions == Neural network models can be viewed as defining a function that takes an input (observation) and produces an output (decision) f : X → Y {\displaystyle \textstyle f:X\rightarrow Y} or a distribution over X {\displaystyle \textstyle X} or both X {\displaystyle \textstyle X} and Y {\displaystyle \textstyle Y} . Sometimes models are intimately associated with a particular learning rule. A common use of the phrase "ANN model" is really the definition of a class of such functions (where members of the class are obtained by varying parameters, connection weights, or specifics of the architecture such as the number of neurons, number of layers or their connectivity). Mathematically, a neuron's network function f ( x ) {\displaystyle \textstyle f(x)} is defined as a composition of other functions g i ( x ) {\displaystyle \textstyle g_{i}(x)} , that can further be decomposed into other functions. This can be conveniently represented as a network structure, with arrows depicting the dependencies between functions. A widely used type of composition is the nonlinear weighted sum, where f ( x ) = K ( ∑ i w i g i ( x ) ) {\displaystyle \textstyle f(x)=K\left(\sum _{i}w_{i}g_{i}(x)\right)} , where K {\displaystyle \textstyle K} (commonly referred to as the activation function) is some predefined function, such as the hyperbolic tangent, sigmoid function, softmax function, or rectifier function. The important characteristic of the activation function is that it provides a smooth transition as input values change, i.e. a small change in input produces a small change in output. The following refers to a collection of functions g i {\displaystyle \textstyle g_{i}} as a vector g = ( g 1 , g 2 , … , g n ) {\displaystyle \textstyle g=(g_{1},g_{2},\ldots ,g_{n})} . This figure depicts such a decomposition of f {\displaystyle \textstyle f} , with dependencies between variables indicated by arrows. These can be interpreted in two ways. The first view is the functional view: the input x {\displaystyle \textstyle x} is transformed into a 3-dimensional vector h {\displaystyle \textstyle h} , which is then transformed into a 2-dimensional vector g {\displaystyle \textstyle g} , which is finally transformed into f {\displaystyle \textstyle f} . This view is most commonly encountered in the context of optimization. The second view is the probabilistic view: the random variable F = f ( G ) {\displaystyle \textstyle F=f(G)} depends upon the random variable G = g ( H ) {\displaystyle \textstyle G=g(H)} , which depends upon H = h ( X ) {\displaystyle \textstyle H=h(X)} , which depends upon the random variable X {\displaystyle \textstyle X} . This view is most commonly encountered in the context of graphical models. The two views are largely equivalent. In either case, for this particular architecture, the components of individual layers are independent of each other (e.g., the components of g {\displaystyle \textstyle g} are independent of each other given their input h {\displaystyle \textstyle h} ). This naturally enables a degree of parallelism in the implementation. Networks such as the previous one are commonly called feedforward, because their graph is a directed acyclic graph. Networks with cycles are commonly called recurrent. Such networks are commonly depicted in the manner shown at the top of the figure, where f {\displaystyle \textstyle f} is shown as dependent upon itself. However, an implied temporal dependence is not shown. == Backpropagation == Backpropagation training algorithms fall into three categories: steepest descent (with variable learning rate and momentum, resilient backpropagation); quasi-Newton (Broyden–Fletcher–Goldfarb–Shanno, one step secant); Levenberg–Marquardt and conjugate gradient (Fletcher–Reeves update, Polak–Ribiére update, Powell–Beale restart, scaled conjugate gradient). === Algorithm === Let N {\displaystyle N} be a network with e {\displaystyle e} connections, m {\displaystyle m} inputs and n {\displaystyle n} outputs. Below, x 1 , x 2 , … {\displaystyle x_{1},x_{2},\dots } denote vectors in R m {\displaystyle \mathbb {R} ^{m}} , y 1 , y 2 , … {\displaystyle y_{1},y_{2},\dots } vectors in R n {\displaystyle \mathbb {R} ^{n}} , and w 0 , w 1 , w 2 , … {\displaystyle w_{0},w_{1},w_{2},\ldots } vectors in R e {\displaystyle \mathbb {R} ^{e}} . These are called inputs, outputs and weights, respectively. The network corresponds to a function y = f N ( w , x ) {\displaystyle y=f_{N}(w,x)} which, given a weight w {\displaystyle w} , maps an input x {\displaystyle x} to an output y {\displaystyle y} . In supervised learning, a sequence of training examples ( x 1 , y 1 ) , … , ( x p , y p ) {\displaystyle (x_{1},y_{1}),\dots ,(x_{p},y_{p})} produces a sequence of weights w 0 , w 1 , … , w p {\displaystyle w_{0},w_{1},\dots ,w_{p}} starting from some initial weight w 0 {\displaystyle w_{0}} , usually chosen at random. These weights are computed in turn: first compute w i {\displaystyle w_{i}} using only ( x i , y i , w i − 1 ) {\displaystyle (x_{i},y_{i},w_{i-1})} for i = 1 , … , p {\displaystyle i=1,\dots ,p} . The output of the algorithm is then w p {\displaystyle w_{p}} , giving a new function x ↦ f N ( w p , x ) {\displaystyle x\mapsto f_{N}(w_{p},x)} . The computation is the same in each step, hence only the case i = 1 {\displaystyle i=1} is described. w 1 {\displaystyle w_{1}} is calculated from ( x 1 , y 1 , w 0 ) {\displaystyle (x_{1},y_{1},w_{0})} by considering a variable weight w {\displaystyle w} and applying gradient descent to the function w ↦ E ( f N ( w , x 1 ) , y 1 ) {\displaystyle w\mapsto E(f_{N}(w,x_{1}),y_{1})} to find a local minimum, starting at w = w 0 {\displaystyle w=w_{0}} . This makes w 1 {\displaystyle w_{1}} the minimizing weight found by gradient descent. == Learning pseudocode == To implement the algorithm above, explicit formulas are required for the gradient of the function w ↦ E ( f N ( w , x ) , y ) {\displaystyle w\mapsto E(f_{N}(w,x),y)} where the function is E ( y , y ′ ) = | y − y ′ | 2 {\displaystyle E(y,y')=|y-y'|^{2}} . The learning algorithm can be divided into two phases: propagation and weight update. === Propagation === Propagation involves the following steps: Propagation forward through the network to generate the output value(s) Calculation of the cost (error term) Propagation of the output activations back through the network using the training pattern target to generate the deltas (the difference between the targeted and actual output values) of all output and hidden neurons. === Weight update === For each weight: Multiply the weight's output delta and input activation to find the gradient of the weight. Subtract the ratio (percentage) of the weight's gradient from the weight. The learning rate is the ratio (percentage) that influences the speed and quality of learning. The greater the ratio, the faster the neuron trains, but the lower the ratio, the more accurat
Materialized view
In computing, a materialized view is a database object that contains the results of a query. For example, it may be a local copy of data located remotely, or may be a subset of the rows and/or columns of a table or join result, or may be a summary using an aggregate function. The process of setting up a materialized view is sometimes called materialization. This is a form of caching the results of a query, similar to memoization of the value of a function in functional languages, and it is sometimes described as a form of precomputation. As with other forms of precomputation, database users typically use materialized views for performance reasons, i.e. as a form of optimization. Materialized views that store data based on remote tables were also known as snapshots (deprecated Oracle terminology). In any database management system following the relational model, a view is a virtual table representing the result of a database query. Whenever a query or an update addresses an ordinary view's virtual table, the DBMS converts these into queries or updates against the underlying base tables. A materialized view takes a different approach: the query result is cached as a concrete ("materialized") table (rather than a view as such) that may be updated from the original base tables from time to time. This enables much more efficient access, at the cost of extra storage and of some data being potentially out-of-date. Materialized views find use especially in data warehousing scenarios, where frequent queries of the actual base tables can be expensive. In a materialized view, indexes can be built on any column. In contrast, in a normal view, it's typically only possible to exploit indexes on columns that come directly from (or have a mapping to) indexed columns in the base tables; often this functionality is not offered at all. == Implementations == === Oracle === Materialized views were implemented first by the Oracle Database: the Query rewrite feature was added from version 8i. Example syntax to create a materialized view in Oracle: === PostgreSQL === In PostgreSQL, version 9.3 and newer natively support materialized views. In version 9.3, a materialized view is not auto-refreshed, and is populated only at time of creation (unless WITH NO DATA is used). It may be refreshed later manually using REFRESH MATERIALIZED VIEW. In version 9.4, the refresh may be concurrent with selects on the materialized view if CONCURRENTLY is used. Example syntax to create a materialized view in PostgreSQL: === SQL Server === Microsoft SQL Server differs from other RDBMS by the way of implementing materialized view via a concept known as "Indexed Views". The main difference is that such views do not require a refresh because they are in fact always synchronized to the original data of the tables that compound the view. To achieve this, it is necessary that the lines of origin and destination are "deterministic" in their mapping, which limits the types of possible queries to do this. This mechanism has been realised since the 2000 version of SQL Server. Example syntax to create a materialized view in SQL Server: === Stream processing frameworks === Apache Kafka (since v0.10.2), Apache Spark (since v2.0), Apache Flink, Kinetica DB, Materialize, RisingWave, and Epsio all support materialized views on streams of data. === Others === Materialized views are also supported in Sybase SQL Anywhere. In IBM Db2, they are called "materialized query tables". ClickHouse supports materialized views that automatically refresh on merges. MySQL doesn't support materialized views natively, but workarounds can be implemented by using triggers or stored procedures or by using the open-source application Flexviews. Materialized views can be implemented in Amazon DynamoDB using data modification events captured by DynamoDB Streams. Google announced in 8 April 2020 the availability of materialized views for BigQuery as a beta release.
Deterministic blockmodeling
Deterministic blockmodeling is an approach in blockmodeling that does not assume a probabilistic model, and instead relies on the exact or approximate algorithms, which are used to find blockmodel(s). This approach typically minimizes some inconsistency that can occur with the ideal block structure. Such analysis is focused on clustering (grouping) of the network (or adjacency matrix) that is obtained with minimizing an objective function, which measures discrepancy from the ideal block structure. However, some indirect approaches (or methods between direct and indirect approaches, such as CONCOR) do not explicitly minimize inconsistencies or optimize some criterion function. This approach was popularized in the 1970s, due to the presence of two computer packages (CONCOR and STRUCTURE) that were used to "find a permutation of the rows and columns in the adjacency matrix leading to an approximate block structure". The opposite approach to deterministic blockmodeling is a stochastic blockmodeling approach.