AI Detector Make It Human

AI Detector Make It Human — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • PDE surface

    PDE surface

    PDE surfaces are used in geometric modelling and computer graphics for creating smooth surfaces conforming to a given boundary configuration. A PDE surface is generated by a partial differential equation, usually as the solution of a mathematical boundary value problem. PDE surfaces were first introduced into the area of geometric modelling and computer graphics by two British mathematicians, Malcolm Bloor and Michael Wilson.
== Technical details ==
The PDE method involves generating a surface for some boundary by means of solving an elliptic partial differential equation of the form
$$\left(\frac{\partial^{2}}{\partial u^{2}}+a^{2}\frac{\partial^{2}}{\partial v^{2}}\right)^{2}X(u,v)=0.$$
Here $X(u,v)$ is a function parameterised by the two parameters $u$ and $v$ such that $X(u,v)=(x(u,v),y(u,v),z(u,v))$, where $x$, $y$ and $z$ are the usual Cartesian coordinates. Boundary conditions on the function $X(u,v)$ and its normal derivatives $\partial X/\partial n$ are imposed at the edges of the surface patch.
The elliptic partial differential operator in this PDE represents a smoothing process in which the value of the function at any point on the surface is, in some sense, a weighted average of the surrounding values. In this way, a surface is obtained as a smooth transition between the chosen set of boundary conditions. The parameter $a$ is a special design parameter which controls the relative smoothing of the surface in the $u$ and $v$ directions.
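To make the role of the design parameter concrete, the squared operator can be expanded term by term. This is a worked step rather than part of the original text, and it assumes that $a$ is constant over the patch and that $X$ is smooth enough for the mixed derivatives to commute:

$$\left(\frac{\partial^{2}}{\partial u^{2}}+a^{2}\frac{\partial^{2}}{\partial v^{2}}\right)^{2}X = X_{uuuu}+2a^{2}X_{uuvv}+a^{4}X_{vvvv}=0.$$

Under that assumption, $a$ only rescales the terms involving $v$-derivatives relative to the pure $u$-derivative term, which is how it trades off smoothing between the two parameter directions.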
When $a=1$, the PDE is the biharmonic equation $X_{uuuu}+2X_{uuvv}+X_{vvvv}=0$. The biharmonic equation is the equation produced by applying the Euler-Lagrange equation to the simplified thin plate energy functional with integrand $X_{uu}^{2}+2X_{uv}^{2}+X_{vv}^{2}$. So solving the PDE with $a=1$ is equivalent to minimizing the thin plate energy functional subject to the same boundary conditions.
== Applications ==
PDE surfaces can be used in many application areas. These include computer-aided design, interactive design, parametric design, computer animation, computer-aided physical analysis and design optimisation.
== Related publications ==
M.I.G. Bloor and M.J. Wilson, Generating Blend Surfaces using Partial Differential Equations, Computer Aided Design, 21(3), 165–171, (1989).
H. Ugail, M.I.G. Bloor, and M.J. Wilson, Techniques for Interactive Design Using the PDE Method, ACM Transactions on Graphics, 18(2), 195–212, (1999).
J. Huband, W. Li and R. Smith, An Explicit Representation of Bloor-Wilson PDE Surface Model by using Canonical Basis for Hermite Interpolation, Mathematical Engineering in Industry, 7(4), 421–33, (1999).
H. Du and H. Qin, Direct Manipulation and Interactive Sculpting of PDE surfaces, Computer Graphics Forum, 19(3), C261–C270, (2000).
H. Ugail, Spine Based Shape Parameterisations for PDE surfaces, Computing, 72, 195–204, (2004).
L. You, P. Comninos and J.J. Zhang, PDE Blending Surfaces with C2 Continuity, Computers and Graphics, 28(6), 895–906, (2004).

  • FloodAlerts

    FloodAlerts

    FloodAlerts is a software application, developed by software specialists Shoothill, that takes real-time flooding information and displays it on an interactive Bing map, updating and warning its users when they, their premises or the routes they need to travel could be at risk of flooding.
== History ==
FloodAlerts was launched in 2012, originally as the world's first Facebook flood warning app.
== Operation ==
FloodAlerts is made available free of charge to individuals. Users are able to set up their own monitored locations and receive alerts via the application or their Facebook wall if the locations they are monitoring are at imminent risk of flooding. Hosted in the cloud on the Microsoft Windows Azure platform, the FloodAlerts application processes the data received from the Environment Agency, automatically creates the required map tiles, pins and alerts, and displays them on an interactive Bing map, updating the content every 15 minutes. Users are able to see the latest information on the map without having to refresh their browser. FloodAlerts can also be provided as a customised risk management solution to businesses that require infrastructure or asset safety monitoring in areas where water levels are rising or receding.
== Awards and recognition ==
FloodAlerts has received The Guardian and Virgin Media Business's 2012 Innovation Nation Awards and was shortlisted as a finalist for a further two national awards: the UK IT Industry Awards for Innovation and Entrepreneurship and The Institution of Engineering and Technology Innovation Awards for Information Technology.
== In the press ==
The FloodAlerts application was reviewed on the BBC website. It was also reviewed on BBC Click.

  • Geometric hashing

    Geometric hashing

    In computer science, geometric hashing is a method for efficiently finding two-dimensional objects represented by discrete points that have undergone an affine transformation, though extensions exist to other object representations and transformations. In an off-line step, the objects are encoded by treating each pair of points as a geometric basis. The remaining points can be represented in an invariant fashion with respect to this basis using two parameters. For each point, its quantized transformed coordinates are stored in the hash table as a key, and the indices of the basis points as a value. Then a new pair of basis points is selected, and the process is repeated. In the on-line (recognition) step, randomly selected pairs of data points are considered as candidate bases. For each candidate basis, the remaining data points are encoded according to the basis, and possible correspondences from the object are found in the previously constructed table. The candidate basis is accepted if a sufficiently large number of the data points index a consistent object basis.
Geometric hashing was originally suggested in computer vision for object recognition in 2D and 3D, but was later applied to different problems such as structural alignment of proteins.
== Geometric hashing in computer vision ==
Geometric hashing is a method used for object recognition. Let’s say that we want to check if a model image can be seen in an input image. This can be accomplished with geometric hashing. The method can also be used to recognize one of multiple objects in a database; in that case the hash table should store not only the pose information but also the index of the object model in the database.
=== Example ===
For simplicity, this example will not use too many point features and will assume that their descriptors are given by their coordinates only (in practice local descriptors such as SIFT could be used for indexing).
==== Training Phase ====
1. Find the model's feature points. Assume that 5 feature points are found in the model image with the coordinates $(12,17)$; $(45,13)$; $(40,46)$; $(20,35)$; $(35,25)$.
2. Introduce a basis to describe the locations of the feature points. For 2D space and a similarity transformation the basis is defined by a pair of points. The point of origin is placed in the middle of the segment connecting the two points (P2, P4 in our example), the $x'$ axis is directed towards one of them, and the $y'$ axis is orthogonal and goes through the origin. The scale is selected such that the absolute value of $x'$ for both basis points is 1.
3. Describe feature locations with respect to that basis, i.e. compute the projections onto the new coordinate axes. The coordinates should be discretised to make recognition robust to noise; we take the bin size 0.25. We thus get the coordinates $(-0.75,-1.25)$; $(1.00,0.00)$; $(-0.50,1.25)$; $(-1.00,0.00)$; $(0.00,0.25)$.
4. Store the basis in a hash table indexed by the features (only transformed coordinates in this case). If there were more objects to match with, we should also store the object number along with the basis pair.
5. Repeat the process for a different basis pair (Step 2). This is needed to handle occlusions. Ideally, all the non-collinear pairs should be enumerated. We provide the hash table after two iterations; the pair (P1, P3) is selected for the second one.
Hash Table: [table not recovered]
Most hash tables cannot have identical keys mapped to different values. So in real life one won’t encode basis keys (1.0, 0.0) and (-1.0, 0.0) in a hash table.
==== Recognition Phase ====
1. Find interesting feature points in the input image.
2. Choose an arbitrary basis. If there isn't a suitable arbitrary basis, then it is likely that the input image does not contain the target object.
3. Describe the coordinates of the feature points in the new basis. Quantize the obtained coordinates as was done before.
4. Compare all the transformed point features in the input image with the hash table. If the point features are identical or similar, then increase the count for the corresponding basis (and the type of object, if any).
5. For each basis such that the count exceeds a certain threshold, verify the hypothesis that it corresponds to an image basis chosen in Step 2. Transfer the image coordinate system to the model one (for the supposed object) and try to match them. If successful, the object is found. Otherwise, go back to Step 2.
=== Finding mirrored pattern ===
At first sight this method is only capable of handling scaling, translation, and rotation. However, the input image may contain the object under a mirror transform, and geometric hashing should be able to find the object in that case too. There are two ways to detect mirrored objects.
1. For the vector graph, make the left side positive and the right side negative. Multiplying the x position by -1 will give the same result.
2. Use 3 points for the basis. This allows detecting mirror images (or objects). Using 3 points for the basis is in fact another approach to geometric hashing.
=== Geometric hashing in higher-dimensions ===
Similar to the example above, hashing applies to higher-dimensional data. For three-dimensional data points, three points are also needed for the basis. The first two points define the x-axis, and the third point defines the y-axis (with the first point). The z-axis is perpendicular to the plane of these two axes, oriented by the right-hand rule. Notice that the order of the points affects the resulting basis.
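The two-dimensional training and recognition phases described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`to_basis`, `quantize`, `build_table`, `vote`) are hypothetical, the y' axis is assumed to be the x' axis rotated 90 degrees counter-clockwise, and coordinates are assumed to be snapped to the nearest multiple of the bin size. Those two assumptions were chosen because they reproduce the coordinates listed for the basis (P2, P4).

```python
from collections import Counter, defaultdict
from itertools import permutations

BIN = 0.25  # bin size taken from the worked example


def to_basis(p, b0, b1):
    """Coordinates of p in the frame where b0 maps to (1, 0) and b1 to (-1, 0)."""
    ox, oy = (b0[0] + b1[0]) / 2, (b0[1] + b1[1]) / 2  # origin at the segment midpoint
    ax, ay = b0[0] - ox, b0[1] - oy                      # x' axis points towards b0
    n2 = ax * ax + ay * ay                               # squared half-length fixes the scale
    dx, dy = p[0] - ox, p[1] - oy
    # Assumption: y' is x' rotated 90 degrees counter-clockwise, i.e. (-ay, ax).
    return ((dx * ax + dy * ay) / n2, (dy * ax - dx * ay) / n2)


def quantize(c):
    """Snap a coordinate pair to the nearest multiple of BIN (assumed rounding rule)."""
    return (round(c[0] / BIN) * BIN, round(c[1] / BIN) * BIN)


def build_table(points):
    """Training: for every ordered basis pair, index the remaining points by their key."""
    table = defaultdict(list)
    for i, j in permutations(range(len(points)), 2):
        for k, p in enumerate(points):
            if k not in (i, j):  # the basis points themselves are not stored
                table[quantize(to_basis(p, points[i], points[j]))].append((i, j))
    return table


def vote(table, scene, i, j):
    """Recognition: count the model bases consistent with one candidate scene basis."""
    votes = Counter()
    for k, p in enumerate(scene):
        if k not in (i, j):
            votes.update(table.get(quantize(to_basis(p, scene[i], scene[j])), []))
    return votes


model = [(12, 17), (45, 13), (40, 46), (20, 35), (35, 25)]
coords = [quantize(to_basis(p, model[1], model[3])) for p in model]  # basis (P2, P4)
print(coords)
# [(-0.75, -1.25), (1.0, 0.0), (-0.5, 1.25), (-1.0, 0.0), (0.0, 0.25)]

table = build_table(model)
scene = [(2 * x + 10, 2 * y - 4) for x, y in model]  # scaled and shifted copy of the model
votes = vote(table, scene, 1, 3)
print(votes[(1, 3)])  # votes collected by the matching model basis
```

Storing a list of bases under each key sidesteps the identical-keys problem noted in the text. A fuller version would also store an object index alongside each basis and verify any candidate whose vote count exceeds a threshold.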

  • OrCam device

    OrCam device

    OrCam devices such as OrCam MyEye are portable artificial vision devices that allow visually impaired people to understand text and identify objects through audio feedback describing what they are unable to see. Reuters described an important part of how it works as "a wireless smartcamera" which, when attached outside eyeglass frames, can read and verbalize text, and also supermarket barcodes. This information is converted to spoken words and delivered "into the user’s ear." Face recognition is also part of OrCam's feature set.
== Devices ==
OrCam Technologies Ltd has created three devices: OrCam MyEye 2.0, OrCam MyEye 1, and OrCam MyReader.
OrCam MyEye 2.0: OrCam debuted the second-generation model, the OrCam MyEye 2.0, in December 2017. About the size of a finger, the MyEye 2.0 is battery-powered and has been compressed into a self-contained device. The device snaps onto any eyeglass frame magnetically. The MyEye 2.0 is small and light (22.5 grams/0.8 ounces) and is intended to restore independence to the visually impaired. It comes in two versions. The basic model can read text, and a more advanced one adds features such as face recognition and barcode reading. As of July 2023, the retail cost is between $4000 and $6000 (USD).
== Clinical Studies ==
JAMA Ophthalmology: A 2016 study in JAMA Ophthalmology involving 12 legally blind participants evaluated the usefulness of a portable artificial vision device (OrCam) for patients with low vision. The results showed that the OrCam device improved the patients' ability to perform tasks simulating those of daily living, such as reading a message on an electronic device, a newspaper article or a menu.
Wills Eye: A clinical study at Wills Eye was designed to measure the impact of the OrCam device on the quality of life of patients with end-stage glaucoma.
The conclusion was that OrCam, a novel artificial vision device using a mini-camera mounted on eyeglasses, allowed legally blind patients with end-stage glaucoma to read independently, subsequently improving their quality of life.
== Employee testing ==
The New York Times described how, in 2013, a pre-release OrCam device was used for grocery shopping by an employee of the device's developer who has coloboma. It was the small size of the prototype rather than the functionality that gave her added mobility in an Israeli store's aisles. Added life-enhancement was described: "to both recognize and speak .. bus numbers .. traffic lights."
== Social aspects ==
In contrast to an early version of Google Glass, which "failed ... because .. Glass wearers were ..mocked", early OrCam devices used designs that "clip unobtrusively on your shirt or perhaps your belt." In addition, the device does not record sounds or images, avoiding what was called "the privacy puzzle that stumped Google." One 2018 technology reviewer wrote that he wished it had a headphone jack "so it would be less disruptive in places where others are working." An attempt was made to use bone conduction.
== USA introduction ==
In 2018 a team headed by New York Assemblyman Dov Hikind introduced use of OrCam devices to ten individuals screened for what he termed "new Israeli technology that really makes a difference to the blind." Although not the first USA success, it was more focused than a publicly funded project that was authorized in 2016 by a California government agency. Also in 2016 the Chicago Lighthouse for the Blind demonstrated its use.
== Technology ==
In hardware, miniaturization has been quite important. On the software side, Assemblyman Hikind, as reported by The Times of Israel, pointed to the "AI-driven algorithms" that "reports .. how many people are in a room." In addition to reading printed text, the device can also aid in "seeing" what is on a television or computer screen.
Although OrCam can't help with handwritten information, it can reuse information, which is the basis for recognizing "US currency, and even faces."
=== Features ===
While early language support was for English, French, German, Hebrew and Spanish, others now available include Danish, Dutch, Finnish, Italian, Norwegian, Portuguese and Swedish.
== History ==
OrCam Technologies Ltd was founded in 2010 by Professor Amnon Shashua and Ziv Aviram. Before co-founding OrCam, the two co-founded Mobileye in 1999, an Israeli company that develops vision-based advanced driver-assistance systems (ADAS) providing warnings for collision prevention and mitigation, which was acquired by Intel for $15.3 billion in 2017. OrCam launched OrCam MyEye in 2013 after years of development and testing, and began selling it commercially in 2015.
In its early years, the company raised $22 million, $6 million of which came from Intel Capital. By 2014, Intel, which was also investing in Google Glass, had invested $15 million in OrCam. By March 2017, OrCam had raised $41 million in capital, valuing the company at $600 million.
=== Marketing ===
One outcome of initial marketing in the USA was that they "reached a deal with the California Department of Rehabilitation, ...qualifying blind and visually impaired state residents."
== OrCam Technologies Ltd ==
OrCam Technologies Ltd. is the Israeli-based company producing these OrCam devices, which fall within the wearable artificial intelligence space. The company develops and manufactures assistive technology devices for individuals who are visually impaired, partially sighted or blind, or who have print disabilities or other disabilities. OrCam is headquartered in Jerusalem, has over 150 employees, and has offices in New York, Toronto, and London.
== Awards ==
2018 Last Gadget Standing Winner
2018 CES Innovation Awards Honoree in Accessible Tech
2017 NAIDEX Innovation Award
2016 Louis Braille Corporate Recognition Award
2016 Silmo d'Or Award

  • Agent verification

    Agent verification

    Agent verification is the activity of gaining assurance that purposeful artificial constructs act in accordance with their specifications. While primitive forms of inorganic agents have been used in manufacturing for centuries, the study of artificial agents did not begin until the mid 20th century. Foundational work on such agents was closely bound with the emergence of artificial intelligence as an academic discipline. Early agents deployed for industrial control systems and in computing were, however, often controlled by quite simple logic that did not involve artificial intelligence as such. When deployed as part of a multi-agent system, even such simple agents could require special agent-oriented testing methods, as their collective behaviour was challenging to verify with traditional testing techniques. Difficulties in providing assurances that agents will not behave in dangerous ways became more prevalent after the introduction of LLM agents, especially after the rapid acceleration of their deployment in 2025.
The verification of agent behaviour can be conducted by formal or informal methods. Informal verification requires less mathematical skill. But when agents are part of systems where errors carry significant risks (such as danger to human life, environmental damage or major financial loss), formal verification is preferred. Both regulators and system designers themselves favour formal verification as it provides a high degree of mathematical certainty. It is not, however, always possible to formally test all aspects of an agent-based system's behaviour, especially where newer LLM-based agents are concerned, due in part to their high degree of autonomy. Accordingly, agent verification for low-impact deployments might be carried out only with informal methods, while for high-impact deployments it may be performed with a mix of formal and informal techniques.
== Terminology ==
In academia, the term agent verification is often defined to mean activity concerned with gaining assurance that the agent behaves in accordance with its specification, whether by processes such as testing or simulation. 'Verification' is typically contrasted with 'validation', the latter meaning activity concerned with checking that the specification itself meets user or real-world needs. Such definitions are not universally adhered to, however: in some workplaces and documents, the words 'verification' and 'validation' are used synonymously.
Efforts to gain confidence in agents have intensified sharply since 2025 due to the rapid rollout of LLM agents, and different terms are sometimes used in the commercial sector. Here the term 'agent verification' can be used in the same sense as it is in academia, but sometimes the same activity is covered by more ambiguous and wider-ranging terms such as 'agent governance', 'agent observability' or 'AI agent policing'.
== History ==
=== Classical agents ===
The theoretical underpinnings for artificial (inorganic) agents emerged in the mid 20th century, with the establishment of cybernetics and artificial intelligence. Oliver Selfridge's 1958 paper "Pandemonium: A Paradigm for Learning" was an important early theoretical contribution in establishing agent-oriented architecture. Practical implementations of agents for real-world applications began to become widespread in the 1990s, after the introduction of the belief–desire–intention software model (BDI) and agent-oriented programming. Purely digital agents were deployed in computer infrastructure for purposes such as monitoring, while agents connected to real-world sensors and actuators were increasingly used in industrial control systems.
While the concept of artificial agents was interwoven with early artificial intelligence studies right from the start, early agents lacked general-purpose reasoning capabilities, often having only simple if-then logic. Even a device as simple as a thermostat, which has a sensor and a means of acting, can be considered a proto-agent in this sense. Verifying the behaviour of a simple single-agent system is not generally especially difficult, but it can be a different matter when several simple agents coexist in the same system. Craig Reynolds's work on boids showed that relatively complex, "intelligent" behaviour can emerge from a number of such simple agents working together in a multi-agent system (MAS). By the 1990s, even the behaviour of a single-agent system could sometimes be quite complex; in accordance with the belief–desire–intention software model, agents could have beliefs that might evolve over time. Agents were increasingly introduced that were controlled by quite large decision tree models, which had new vulnerabilities to adversarial attack. It was becoming increasingly apparent that traditional software verification methods had limitations for testing such agents, or even for the more primitive types of agent when they were deployed as part of a MAS.
It was the use of agents for industrial control systems, sometimes associated with robotics, that lent urgency to the practice of agent verification. Informal testing might be acceptable for digital agents used, say, to monitor whether each of an organisation's computers is properly licensed. But with an increasing potential for faulty agents to cause a large fire at a chemical manufacturing plant, a botched medical operation, or even a crashed aircraft, the need to develop reliable means of verifying the behaviour of such agents was considered urgent. The Foundation for Intelligent Physical Agents was established in 1996.
From the late 1990s, a growing number of industry and university-based scientists began working on the problem, with researchers publishing papers on the verification of both single- and multi-agent systems. Much of this work showed how formal verification techniques like model checking could be used to gain a high level of assurance that agent-based systems would conform with their specification. A 2018 systematic review covering 231 studies found that model checking was the most common technique for agent verification, with theorem proving the second most commonly used formal verification method.
In the first two decades of the 21st century, agents run by AI became more common, with Siri and Alexa being well-known examples. But such agents still lacked general reasoning capabilities and did not pose new pressing problems for agent verification.
=== General purpose reasoning agents ===
The advent of LLMs created huge potential for further use of artificial agents, as agents based on them could have general-purpose cognitive abilities. Agents run by LLMs (and occasionally non-LLM foundation models) have a similar vulnerability to adversarial attack as those run by decision tree models. The wider scope of actions for LLM agents has created new challenges for their verification, over and above those present for classical agents. For example, the LLM's neural network endows it with infinite domains, a particular challenge for traditional formal verification techniques. Academics began to study the problems involved in verifying LLM agents from 2018. Deployment of such agents began to accelerate in late 2023 after OpenAI's "function-calling" API was made available, and especially after Anthropic's late 2024 introduction of Model Context Protocol (MCP), a standardised way for LLM agents to gain contextual awareness and to act on the world by calling various external tools.
The rapid rollout of LLM agents following MCP's release has seen the task of agent verification receive increased attention within academia, and also from the private sector. In 2024 and 2025 several startups focusing on LLM agent verification were founded in both Europe and the US to meet growing demand.
== Approaches ==
=== Formal verification ===
Formal verification involves proving the correctness of some or all aspects of a system using mathematical methods. Such methods can range from manual formal proof to verification assisted with automated theorem provers like Isabelle. For agent verification, model checking is by far the most frequently used formal verification method; for pre-LLM models it was often complemented with techniques using computation tree logic. Another common method is theorem proving.
Formal verification provides a higher degree of confidence than informal methods, but it is not always used, even when it is possible. Sometimes a person or organisation developing software agents won't have the necessary skills, or may not see it as worth the effort if the agent(s) cannot cause much harm even if they malfunction. When agents are deployed in systems where errors could have serious consequences, the ability of formal verification methods to provide mathematical certainty tends to be strongly preferred by both regulators and designers themselves. But even for high impact systems, formal verificatio

  • Superquadrics

    Superquadrics

    In mathematics, the superquadrics or super-quadrics (also superquadratics) are a family of geometric shapes defined by formulas that resemble those of ellipsoids and other quadrics, except that the squaring operations are replaced by arbitrary powers. They can be seen as the three-dimensional relatives of the superellipses. The term may refer to the solid object or to its surface, depending on the context. The equations below specify the surface; the solid is specified by replacing the equality signs by less-than-or-equal signs.
The superquadrics include many shapes that resemble cubes, octahedra, cylinders, lozenges and spindles, with rounded or sharp corners. Because of their flexibility and relative simplicity, they are popular geometric modeling tools, especially in computer graphics. They have become an important geometric primitive widely used in computer vision, robotics, and physical simulation.
Some authors, such as Alan Barr, define "superquadrics" as including both the superellipsoids and the supertoroids. In the modern computer vision literature, the terms superquadrics and superellipsoids are used interchangeably, since superellipsoids are the most representative and widely utilized shape among all the superquadrics. The geometrical properties of superquadrics, and methods for recovering them from range images and point clouds, are covered comprehensively in the computer vision literature.
== Formulas ==
=== Implicit equation ===
The surface of the basic superquadric is given by
$$\left|x\right|^{r}+\left|y\right|^{s}+\left|z\right|^{t}=1$$
where $r$, $s$, and $t$ are positive real numbers that determine the main features of the superquadric. Namely:
less than 1: a pointy octahedron modified to have concave faces and sharp edges.
exactly 1: a regular octahedron.
between 1 and 2: an octahedron modified to have convex faces, blunt edges and blunt corners.
exactly 2: a sphere.
greater than 2: a cube modified to have rounded edges and corners.
infinite (in the limit): a cube.
Each exponent can be varied independently to obtain combined shapes. For example, if $r=s=2$ and $t=4$, one obtains a solid of revolution which resembles an ellipsoid with round cross-section but flattened ends. This formula is a special case of the superellipsoid's formula if (and only if) $r=s$.
If any exponent is allowed to be negative, the shape extends to infinity. Such shapes are sometimes called super-hyperboloids.
The basic shape above spans from -1 to +1 along each coordinate axis. The general superquadric is the result of scaling this basic shape by different amounts $A$, $B$, $C$ along each axis. Its general equation is
$$\left|\frac{x}{A}\right|^{r}+\left|\frac{y}{B}\right|^{s}+\left|\frac{z}{C}\right|^{t}=1.$$
=== Parametric description ===
Parametric equations in terms of surface parameters $u$ and $v$ (equivalent to longitude and latitude when every exponent equals 2, so that $m=1$ in the auxiliary functions) are
$$\begin{aligned}x(u,v)&=A\,g\left(v,\frac{2}{r}\right)g\left(u,\frac{2}{r}\right)\\y(u,v)&=B\,g\left(v,\frac{2}{s}\right)f\left(u,\frac{2}{s}\right)\\z(u,v)&=C\,f\left(v,\frac{2}{t}\right)\\&-\frac{\pi}{2}\leq v\leq\frac{\pi}{2},\quad-\pi\leq u<\pi,\end{aligned}$$
where the auxiliary functions are
$$\begin{aligned}f(\omega,m)&=\operatorname{sgn}(\sin\omega)\left|\sin\omega\right|^{m}\\g(\omega,m)&=\operatorname{sgn}(\cos\omega)\left|\cos\omega\right|^{m}\end{aligned}$$
and the sign function $\operatorname{sgn}(x)$ is
$$\operatorname{sgn}(x)=\begin{cases}-1,&x<0\\0,&x=0\\+1,&x>0.\end{cases}$$
=== Spherical product ===
Barr introduces the spherical product, which, given two plane curves, produces a 3D surface. If
$$f(\mu)=\begin{pmatrix}f_{1}(\mu)\\f_{2}(\mu)\end{pmatrix},\quad g(\nu)=\begin{pmatrix}g_{1}(\nu)\\g_{2}(\nu)\end{pmatrix}$$
are two plane curves, then the spherical product is
$$h(\mu,\nu)=f(\mu)\otimes g(\nu)=\begin{pmatrix}f_{1}(\mu)\,g_{1}(\nu)\\f_{1}(\mu)\,g_{2}(\nu)\\f_{2}(\mu)\end{pmatrix}$$
This is similar to the typical parametric equation of a sphere:
$$\begin{aligned}x&=x_{0}+r\sin\theta\,\cos\varphi\\y&=y_{0}+r\sin\theta\,\sin\varphi\qquad(0\leq\theta\leq\pi,\;0\leq\varphi<2\pi)\\z&=z_{0}+r\cos\theta\end{aligned}$$
which gives rise to the name spherical product.
Barr uses the spherical product to define quadric surfaces, like ellipsoids and hyperboloids, as well as the torus, superellipsoid, superquadric hyperboloids of one and two sheets, and supertoroids.
== Plotting code ==
The following GNU Octave code generates a mesh approximation of a superquadric:
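The Octave listing itself does not appear here. As a stand-in, and not a reconstruction of that code, the following minimal Python sketch samples the parametric description given above on a regular grid in $u$ and $v$ and evaluates the implicit equation at each vertex. The function names (`superquadric_mesh`, `implicit_value`) and the grid resolution `n` are assumptions made for illustration.

```python
import math


def sgn(x):
    """Sign function: -1, 0 or +1."""
    return (x > 0) - (x < 0)


def f(w, m):
    """Auxiliary function sgn(sin w) * |sin w|^m."""
    return sgn(math.sin(w)) * abs(math.sin(w)) ** m


def g(w, m):
    """Auxiliary function sgn(cos w) * |cos w|^m."""
    return sgn(math.cos(w)) * abs(math.cos(w)) ** m


def superquadric_mesh(A, B, C, r, s, t, n=16):
    """Return (n + 1) rows of n vertices (x, y, z) sampled over v and u."""
    rows = []
    for i in range(n + 1):
        v = -math.pi / 2 + math.pi * i / n          # -pi/2 <= v <= pi/2
        row = []
        for j in range(n):
            u = -math.pi + 2 * math.pi * j / n      # -pi <= u < pi
            row.append((A * g(v, 2 / r) * g(u, 2 / r),
                        B * g(v, 2 / s) * f(u, 2 / s),
                        C * f(v, 2 / t)))
        rows.append(row)
    return rows


def implicit_value(p, A, B, C, r, s, t):
    """Left-hand side of the general implicit equation; 1 on the surface."""
    x, y, z = p
    return abs(x / A) ** r + abs(y / B) ** s + abs(z / C) ** t


# The r = s = 2, t = 4 example from the text, scaled by A, B, C = 2, 1.5, 1.
params = (2.0, 1.5, 1.0, 2.0, 2.0, 4.0)
mesh = superquadric_mesh(*params)
worst = max(abs(implicit_value(p, *params) - 1.0) for row in mesh for p in row)
print(len(mesh), len(mesh[0]), worst < 1e-9)
```

Each row of the mesh is a ring of constant $v$, so neighbouring rows and columns can be joined into quadrilaterals by a plotting library of choice. The final check is only a sanity test that the sampled vertices satisfy the implicit equation up to rounding error.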

  • Coalition for App Fairness

    Coalition for App Fairness

    The Coalition for App Fairness (CAF) is a coalition of companies that aims to reach a fairer deal for the inclusion of their apps in the Apple App Store and the Google Play Store. The organization's executive director is Meghan DiMuzio and its headquarters are located in Washington, D.C.
== Background ==
In July 2015, Spotify launched an email campaign to urge its App Store subscribers to cancel their subscriptions and start new ones through its website, bypassing the 30% transaction fee for in-app purchases required for iOS applications by technology company Apple Inc. A later update to the Spotify app on iOS was rejected by Apple, prompting Spotify's general counsel Horacio Gutierrez to write a letter to Apple's then-general counsel Bruce Sewell, stating: "This latest episode raises serious concerns under both U.S. and EU competition law. It continues a troubling pattern of behavior by Apple to exclude and diminish the competitiveness of Spotify on iOS and as a rival to Apple Music, particularly when seen against the backdrop of Apple's previous anticompetitive conduct aimed at Spotify … we cannot stand by as Apple uses the App Store approval process as a weapon to harm competitors."
In August 2020, Epic Games updated its Fortnite Battle Royale game app on both Apple's App Store and Google's Google Play to include its own storefront, which offered a 20% discount on V-Bucks, the in-game currency, if players bought through it rather than through the app stores' storefronts, both of which take a 30% revenue cut of the sale. Both Apple and Google removed the Fortnite app within hours, as this alternate storefront violated their terms of use, which required all in-app purchases to be made through their storefronts.
Epic immediately filed lawsuits against both companies, challenging their storefront policies on antitrust grounds and arguing that the non-negotiable 30% revenue cut is too high and that the restrictions on alternate storefronts are anticompetitive. Apple countersued Epic over its behavior, leading to a highly publicized 2021 bench trial. Ultimately, Epic largely lost its lawsuit against Apple, though the court did order Apple to allow developers to point users to alternative payment methods. Conversely, Epic won its antitrust lawsuit against Google in late 2023.
== Foundation ==
On 24 September 2020, Epic Games joined forces with thirteen other prominent companies, including the music streaming platform Spotify, Tinder owner Match Group, the encrypted mail service Proton Mail, and the cryptocurrency website Blockchain.com, to establish the Coalition for App Fairness. Its members also include Basecamp. The coalition criticizes the 30% fee that the app stores of both Apple and Google charge on purchases made through them. Apple and Google have defended the 30% transaction fee as an industry standard, while the Coalition for App Fairness states that no other transaction fee comes close to 30%. In October 2020, it was reported that the coalition had grown from 13 to 40 members since its foundation and had received more than 400 applications for membership. In October 2025, X (formerly Twitter) joined CAF. This was seen as part of a larger industry pushback against Apple and Google, and as a step towards passing the bipartisan Open App Markets Act.
== Aims ==
The group has broadened its demands and now also seeks better treatment for the apps available in the App Store. It claims that Apple favors its own services over competing ones and unjustifiably excludes other apps from the App Store.
The group also compares the fee with other transaction fees, such as the 5% charged by credit card companies, stating that Apple charges up to 600% more. It wants the 30% fee, which Apple introduced only in 2011, brought into line with the percentages charged by other providers of payment solutions. Its demands are directed mainly at Apple's strict control over its App Store and, to a lesser extent, at Google. Google allows apps to be downloaded via an independent web link or from another app store, such as the Epic Games Store. The organization emphasizes that no app developer should be discriminated against or denied the rights granted to the app store owner's own developers.
== Reactions ==
In October 2020, Microsoft presented a new framework governing access to its Windows 10 operating system by app stores other than its own. The framework is based on the demands of the Coalition for App Fairness. Microsoft emphasized, though, that these principles would not apply to the Xbox. In December 2020, Apple announced that it would lower its revenue cut from 30% to 15% for app developers making $1M or less, provided they apply for the reduced rate. In March 2021, Google followed suit, lowering the Play Store's revenue cut from 30% to 15% for the first million dollars in revenue a developer earns each year.
== Notable members ==
Notable companies listed as members on the group's website:
* Blockchain.com
* Deezer
* Epic Games
* European Digital SME Alliance
* Fanfix
* Life360
* Masimo
* Nium
* Proton Mail
* Spotify
* TapTap
* Threema
* Vipps

  • LTX (text-to-video model)

    LTX is a family of open source artificial intelligence video foundation models developed by Lightricks and first released in November 2024. The latest models, LTX-2, create videos based on user prompts. They were preceded by LTX Video, which was released in 2024 as the company's first text-to-video model. LTX-2 is part of the LTX family of video generation models, which, alongside LTX Studio, form the core technology of the LTX ecosystem.
== History ==
=== Origins: LTX Video (2024–2025) ===
In November 2024 Lightricks publicly released its first text-to-video model, LTX Video. It was a 2-billion-parameter model, available as open source. In May 2025 Lightricks launched LTXV-13b, a version with 13 billion parameters. Two months later, the model broke the 60-second barrier for generated video.
=== Release of LTX-2 (2025) ===
In October 2025 Lightricks announced its latest model and renamed it LTX-2. The model was described as capable of generating synchronized audio and video at native 4K resolution and up to 50 frames per second (fps), using a variety of conditions and prompts, including text-to-video and image-to-video. Google highlighted the fact that LTX-2 was trained on its infrastructure, saying it was "The first open source AI video generation model, powered by Google Cloud". Upon its release it was ranked among the top three models for image-to-video creation by Artificial Analysis, behind Kling 3.5 by Kling AI and Veo 3.1 by Google. Its text-to-video option was ranked 7th. In addition to its open-source release, Lightricks offers API access to LTX-2, allowing developers to generate videos from text and image prompts through a hosted service without running the model locally.
=== Open Source Release (2026) ===
In January 2026, Lightricks officially released the full open-source version of LTX-2, making the model’s complete codebase, weights, and associated tooling publicly available.
In March 2026 the company released LTX-2.3, which was accompanied by a desktop video editor enabling the entire model to run locally on consumer hardware.
== Technical features ==
=== Advancements over LTX Video ===
LTX-2 builds upon the LTX Video architecture with several major improvements:
* Unified audio-video generation producing synchronized dialogue, ambience, and motion
* Native 4K rendering
* 50-fps output for cinematic motion
* Three operational modes (Fast, Pro, Ultra)
* More efficient diffusion pipelines enabling high fidelity on consumer GPUs
=== Core capabilities ===
* Text-to-video generation
* Image-to-video generation
* Multimodal audiovisual synthesis
* High-resolution spatial and temporal coherence
* Configurable quality/performance settings
* Open-source distribution of weights and datasets
== Reception ==
Initial reception to LTX-2 was broadly positive, with several technology and media outlets highlighting its open-source approach and multimodal capabilities. Open Source For You described LTX-2 as “one of the first AI video systems to combine 4K output, synchronized audio, and an open model release,” noting that it positioned Lightricks as a significant competitor to proprietary systems such as OpenAI's Sora and Google's Veo. IEA Green said that the model “could rewrite the AI filmmaking game,” emphasizing that its 50-fps rendering and unified audio-video generation made it suitable for professional studios and independent creators alike. AI News characterized LTX-2 as a “major step forward in the democratization of cinematic-quality video generation,” praising its consumer-grade hardware efficiency and multi-tier generation modes, while also noting ongoing challenges in long-form temporal stability.
FinancialContent reported strong interest among creative agencies, attributing the attention to Lightricks’ decision to release model weights and datasets, which reviewers said enabled “a level of transparency not typically seen in commercial AI video models.” === Benchmarks and rankings === Upon release, LTX-2 ranked third for image-to-video creation in the Artificial Analysis benchmark, behind Kling 3.5 and Veo 3.1, while its text-to-video option ranked seventh. As of early 2026, it was the highest-ranked open-source model in the benchmark. === Limitations === Some early reviewers also pointed out quality limitations. The Ray3 technical review noted occasional inconsistencies in lip-sync and motion tracking during long scenes, though it stated these were “in line with the challenges faced by all current AI video diffusion models” and expected to improve with continued iteration. Like other diffusion-based video generators, LTX-2 can produce artifacts in complex multi-person scenes and may struggle with precise text rendering within generated video.

  • Predictive text

    Predictive text is an input technology used where one key or button represents many letters, such as on the physical numeric keypads of mobile phones and in accessibility technologies. Each key press results in a prediction rather than repeatedly sequencing through the same group of "letters" it represents, in the same, invariable order. Predictive text could allow for an entire word to be input by a single keypress. Predictive text makes efficient use of fewer device keys to input writing into a text message, an e-mail, an address book, a calendar, and the like. The most widely used, general, predictive text systems are T9, iTap, eZiText, and LetterWise/WordWise. There are many ways to build a device that predicts text, but all predictive text systems have initial linguistic settings that offer predictions that are re-prioritized to adapt to each user. This learning adapts, by way of the device memory, to a user's disambiguating feedback that results in corrective key presses, such as pressing a "next" key to get to the intention. Most predictive text systems have a user database to facilitate this process. Theoretically the number of keystrokes required per desired character in the finished writing is, on average, comparable to using a keyboard. This is approximately true provided that all words used are in its database, punctuation is ignored, and no input mistakes are made when typing or spelling. The theoretical keystrokes per character, KSPC, of a keyboard is KSPC=1.00, and of multi-tap is KSPC=2.03. Eatoni's LetterWise is a predictive multi-tap hybrid, which when operating on a standard telephone keypad achieves KSPC=1.15 for English. The choice of which predictive text system is the best to use involves matching the user's preferred interface style, the user's level of learned ability to operate predictive text software, and the user's efficiency goal. 
There are various levels of risk in predictive text systems, versus multi-tap systems, because the predicted text that is automatically written provides the speed and mechanical efficiency benefit, which, if the user is not careful to review, results in transmitting misinformation. Predictive text systems take time to learn to use well, and so generally, a device's system has user options to set up the choice of multi-tap or any one of several schools of predictive text methods. == Background == Short message service (SMS) permits a mobile phone user to send text messages (also called messages, SMSes, texts, and txts) as a short message. The most common system of SMS text input is referred to as "multi-tap". Using multi-tap, a key is pressed multiple times to access the list of letters on that key. For instance, pressing the "2" key once displays an "a", twice displays a "b" and three times displays a "c". To enter two successive letters that are on the same key, the user must either pause or hit a "next" button. A user can type by pressing an alphanumeric keypad without looking at the electronic equipment display. Thus, multi-tap is easy to understand and can be used without any visual feedback. However, multi-tap is not very efficient, requiring potentially many keystrokes to enter a single letter. In ideal predictive text entry, all words used are in the dictionary, punctuation is ignored, no spelling mistakes are made, and no typing mistakes are made. The ideal dictionary would include all slang, proper nouns, abbreviations, URLs, foreign-language words and other user-unique words. This ideal circumstance gives predictive text software a reduction in the number of key strokes a user is required to enter a word. The user presses the number corresponding to each letter. As long as the word exists in the predictive text dictionary or is correctly disambiguated by non-dictionary systems, it will appear. 
For instance, pressing "4663" will typically be interpreted as the word good, provided that a linguistic database in English is currently in use, though alternatives such as home, hood and hoof are also valid interpretations of the sequence of key strokes. The most widely used systems of predictive text are Tegic's T9, Motorola's iTap, and the Eatoni Ergonomics' LetterWise and WordWise. T9 and iTap use dictionaries, but Eatoni Ergonomics' products use a disambiguation process, a set of statistical rules to recreate words from keystroke sequences. All predictive text systems require a linguistic database for every supported input language. == Dictionary vs. non-dictionary systems == Traditional disambiguation works by referencing a dictionary of commonly used words, though Eatoni offers a dictionaryless disambiguation system. In dictionary-based systems, as the user presses the number buttons, an algorithm searches the dictionary for a list of possible words that match the keypress combination and offers up the most probable choice. The user can then confirm the selection and move on, or use a key to cycle through the possible combinations. A non-dictionary system constructs words and other sequences of letters from the statistics of word parts. To attempt predictions of the intended result of keystrokes not yet entered, disambiguation may be combined with a word completion facility. Either system (disambiguation or predictive) may include a user database, which can be further classified as a "learning" system when words or phrases are entered into the user database without direct user intervention. The user database is for storing words or phrases that are not well disambiguated by the pre-supplied database. Some disambiguation systems further attempt to correct spelling, format text or perform other automatic rewrites, with the risky effect of either enhancing or frustrating user efforts to enter text. 
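The dictionary-based lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any named product: the tiny word list and its frequency counts are invented for the example, and the function names (`word_to_keys`, `candidates`, `multitap_presses`) are hypothetical. The keypad layout is the standard telephone letter assignment.

```python
# Standard telephone keypad letter groups.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}

# Toy dictionary: word -> made-up frequency count (an assumption for this sketch).
WORDS = {"good": 50, "home": 40, "gone": 30, "hood": 5, "hoof": 2, "the": 100}

def word_to_keys(word):
    """Digit sequence typed for a word with one press per letter."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())

def candidates(keys, words=WORDS):
    """Dictionary words matching a key sequence, most frequent first."""
    matches = [w for w in words if word_to_keys(w) == keys]
    return sorted(matches, key=lambda w: -words[w])

def multitap_presses(word):
    """Key presses needed for a word with multi-tap entry (pauses not counted)."""
    return sum(KEYPAD[LETTER_TO_KEY[ch]].index(ch) + 1 for ch in word.lower())

print(candidates("4663"))        # all toy-dictionary words sharing the sequence 4663
print(multitap_presses("the"))   # presses with multi-tap
print(len(word_to_keys("the")))  # presses with one-press-per-letter prediction
```

The first candidate is what such a system would display by default; a "next" key would step through the remaining entries in order. A learning system could be approximated by incrementing a word's count each time the user selects it.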
== History ==
Predictive text and autocomplete technology was invented out of necessity by Chinese scientists and linguists in the 1950s to address the input inefficiency of the Chinese typewriter: typing involved finding and selecting among thousands of logographic characters on a tray, which drastically slowed word processing. The actuating keys of the Chinese typewriter created by Lin Yutang in the 1940s included suggestions for the characters following the one selected. In 1951, the Chinese typesetter Zhang Jiying arranged Chinese characters in associative clusters, a precursor of modern predictive text entry, and broke speed records by doing so. Predictive entry of text from a telephone keypad has been known at least since the 1970s (Smith and Goodwin, 1971). Predictive text was mainly used to look up names in directories over the phone until mobile phone text messaging came into widespread use.
== Example ==
On a typical phone keypad, users who wished to type the word "the" in a "multi-tap" keypad entry system would need to:
* Press 8 (tuv) once to select t.
* Press 4 (ghi) twice to select h.
* Press 3 (def) twice to select e.
Meanwhile, on a phone with predictive text, they need only:
* Press 8 once to select the (tuv) group for the first character.
* Press 4 once to select the (ghi) group for the second character.
* Press 3 once to select the (def) group for the third character.
The system updates the display as each keypress is entered, to show the most probable entry. In this example, prediction reduced the number of button presses from five to three. The effect is even greater with longer words and those composed of letters later in each key's sequence.
A dictionary-based predictive system is based on the hope that the desired word is in the dictionary. That hope may be misplaced if the word differs in any way from common usage, in particular if the word is not spelled or typed correctly, is slang, or is a proper noun.
In these cases, some other mechanism must be used to enter the word. Furthermore, the simple dictionary approach fails with agglutinative languages, where a single word does not necessarily represent a single semantic entity.
== Companies and products ==
Predictive text is developed and marketed in a variety of competing products, such as Nuance Communications's T9. Other products include Motorola's iTap; Eatoni Ergonomics' LetterWise (character-based rather than word-based prediction); WordWise (word-based prediction without a dictionary); EQ3 (a QWERTY-like layout compatible with regular telephone keypads); Prevalent Devices's Phraze-It; Xrgomics' TenGO (a six-key reduced QWERTY keyboard system); Adaptxt (considers language, context, grammar and semantics); Lightkey (predictive typing software for Windows); Clevertexting (statistical nature of the language, dictionaryless, dynamic key allocation); Oizea Type (temporal ambiguity); Intelab's Tauto; WordLogic's Intelligent Input Platform™ (patented, layer-based advanced text prediction, includes multi-language dictionary, spell-check, built-in Web search); and Google's Gboard.
== Textonyms ==
Words produced by the same combination of keypresses have been called "textonyms"; also "txtonyms"; or "T9o

  • Computational photography

    Computational photography refers to digital image capture and processing techniques that use digital computation instead of optical processes. Computational photography can improve the capabilities of a camera, or introduce features that were not possible at all with film-based photography, or reduce the cost or size of camera elements. Examples of computational photography include in-camera computation of digital panoramas, high-dynamic-range images, and light field cameras. Light field cameras use novel optical elements to capture three-dimensional scene information, which can then be used to produce 3D images, enhanced depth-of-field, and selective de-focusing (or "post focus"). Enhanced depth-of-field reduces the need for mechanical focusing systems. All of these features use computational imaging techniques. The definition of computational photography has evolved to cover a number of subject areas in computer graphics, computer vision, and applied optics. These areas are given below, organized according to a taxonomy proposed by Shree K. Nayar. Within each area is a list of techniques, and for each technique, one or two representative papers or books are cited. Deliberately omitted from the taxonomy are image processing (see also digital image processing) techniques applied to traditionally captured images to produce better images. Examples of such techniques are image scaling, dynamic range compression (i.e. tone mapping), color management, image completion (a.k.a. inpainting or hole filling), image compression, digital watermarking, and artistic image effects. Also omitted are techniques that produce range data, volume data, 3D models, 4D light fields, 4D, 6D, or 8D BRDFs, or other high-dimensional image-based representations. Epsilon photography is a sub-field of computational photography. 
== Effect on photography ==
Computational photography can allow amateurs to produce photographs rivalling those of professional photographers, but as of 2019 it does not outperform professional-level equipment.
== Computational illumination ==
This is the control of photographic illumination in a structured fashion, followed by processing of the captured images to create new images. Applications include image-based relighting, image enhancement, image deblurring, and geometry/material recovery. High-dynamic-range imaging uses differently exposed pictures of the same scene to extend dynamic range. Other examples include processing and merging differently illuminated images of the same subject matter ("lightspace").
== Computational optics ==
This is the capture of optically coded images, followed by computational decoding to produce new images. Coded aperture imaging was mainly applied in astronomy and X-ray imaging to boost image quality. Instead of a single pinhole, a pinhole pattern is used in imaging, and deconvolution is performed to recover the image. In coded exposure imaging, the on/off state of the shutter is coded to modify the kernel of motion blur. In this way, motion deblurring becomes a well-conditioned problem. Similarly, in a lens-based coded aperture, the aperture can be modified by inserting a broadband mask. Thus, out-of-focus deblurring becomes a well-conditioned problem. The coded aperture can also improve the quality of light field acquisition using Hadamard transform optics. Coded aperture patterns can also be designed using color filters, in order to apply different codes at different wavelengths. This allows more light to reach the camera sensor than with binary masks.
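Why a coded shutter makes motion deblurring well-conditioned can be seen in a toy one-dimensional example. The sketch below, in plain Python, compares the discrete Fourier transform of a conventional open-shutter (box) blur kernel with that of a coded on/off pattern. The 4-slot pattern `1101`, the signal length of 8, and the function names are assumptions chosen for illustration, not values taken from the literature. A kernel whose spectrum contains zeros destroys those frequencies, so inverting the blur is ill-posed. A kernel whose spectrum stays away from zero can be inverted stably.

```python
import cmath

def dft_magnitudes(kernel):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(kernel)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(kernel)))
        for k in range(n)
    ]

def pad(pattern, length):
    """Zero-pad a shutter pattern to the signal length."""
    return list(pattern) + [0] * (length - len(pattern))

N = 8
box_kernel = pad([1, 1, 1, 1], N)    # shutter open for the whole exposure
coded_kernel = pad([1, 1, 0, 1], N)  # shutter opened and closed (toy code, an assumption)

box_min = min(dft_magnitudes(box_kernel))
coded_min = min(dft_magnitudes(coded_kernel))

print(f"smallest |DFT| of box kernel:   {box_min:.3f}")
print(f"smallest |DFT| of coded kernel: {coded_min:.3f}")
```

In this toy setting the box kernel's spectrum vanishes at several frequencies, while the coded kernel's smallest magnitude stays at about 1, so dividing by the coded spectrum does not amplify noise without bound. Practical codes are much longer and are chosen by search; this sketch only shows the conditioning argument.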
== Computational imaging ==
Computational imaging is a set of imaging techniques that combine data acquisition and data processing to create the image of an object through indirect means, yielding enhanced resolution or additional information such as optical phase or 3D reconstruction. The information is often recorded without using a conventional optical microscope configuration or with limited datasets. Computational imaging makes it possible to go beyond physical limitations of optical systems, such as numerical aperture, or even eliminates the need for optical elements. For parts of the optical spectrum where imaging elements such as objectives are difficult to manufacture or image sensors cannot be miniaturized, computational imaging provides useful alternatives, in fields such as X-ray and THz radiation.
=== Common techniques ===
Common computational imaging techniques include lensless imaging, computational speckle imaging, ptychography and Fourier ptychography. Computational imaging techniques often draw on compressive sensing or phase retrieval, where the angular spectrum of the object is reconstructed. Other techniques related to the field of computational imaging include digital holography, computer vision and inverse problems such as tomography.
== Computational processing ==
This is the processing of non-optically-coded images to produce new images.
== Computational sensors ==
These are detectors that combine sensing and processing, typically in hardware, like the oversampled binary image sensor.
== Early work in computer vision ==
Although computational photography is a currently popular buzzword in computer graphics, many of its techniques first appeared in the computer vision literature, either under other names or within papers aimed at 3D shape analysis.
== Art history ==
Computational photography, as an art form, has been practiced by capturing differently exposed pictures of the same subject matter and combining them.
This was the inspiration for the development of the wearable computer in the 1970s and early 1980s. Computational photography was inspired by the work of Charles Wyckoff, and thus computational photography datasets (e.g. differently exposed pictures of the same subject matter that are taken in order to make a single composite image) are sometimes referred to as Wyckoff Sets, in his honor. Early work in this area (joint estimation of image projection and exposure value) was undertaken by Mann and Candoccia. Charles Wyckoff devoted much of his life to creating special kinds of 3-layer photographic films that captured different exposures of the same subject matter. A picture of a nuclear explosion, taken on Wyckoff's film, appeared on the cover of Life Magazine and showed the dynamic range from the dark outer areas to the inner core.

  • Inverse consistency

    In image registration, inverse consistency measures the consistency of mappings between images produced by a registration algorithm. The inverse consistency error, introduced by Christensen and Johnson in 2001, quantifies the distance between the identity function and the composition of the mappings from each image to the other produced by the registration procedure. It is used as a regularisation constraint in the loss function of many registration algorithms to enforce consistent mappings. Inverse consistency is necessary for good image registration but not sufficient, since a mapping can be perfectly consistent and yet fail to register the images at all.
== Definition ==
Image registration is the process of establishing a common coordinate system between two images. Given two images
$$\begin{aligned}I_{1}&:\Omega_{1}\to\mathbb{R}\\I_{2}&:\Omega_{2}\to\mathbb{R}\end{aligned}$$
registering a source image $I_{1}$ to a target image $I_{2}$ consists of determining a transformation $f_{1}:\Omega_{2}\to\Omega_{1}$ that maps points from the target space to the source space. An ideal registration algorithm should not be sensitive to which image in the pair is used as source or target, and the registration operator should be antisymmetric, such that the mappings
$$\begin{aligned}f_{1}&:\Omega_{2}\to\Omega_{1}\\f_{2}&:\Omega_{1}\to\Omega_{2}\end{aligned}$$
produced when registering $I_{1}$ to $I_{2}$ and $I_{2}$ to $I_{1}$ respectively are the inverse of each other, i.e. $f_{2}=f_{1}^{-1}$ and $f_{1}=f_{2}^{-1}$ or, equivalently, $f_{2}\circ f_{1}=\operatorname{id}_{\Omega_{2}}$ and $f_{1}\circ f_{2}=\operatorname{id}_{\Omega_{1}}$, where $\circ$ denotes function composition.
Real algorithms are not perfect, and when the roles of source and target image are swapped in a registration problem, the transformations obtained are generally not the inverse of each other. Inverse consistency can be enforced by adding to the loss function of the registration a symmetric regularisation term that penalises inconsistent transformations:
$$\int_{\Omega_{2}}\left\Vert f_{2}(f_{1}(x))-x\right\Vert^{2}\mathrm{d}x+\int_{\Omega_{1}}\left\Vert f_{1}(f_{2}(x))-x\right\Vert^{2}\mathrm{d}x.$$
Inverse consistency can be used as a quality metric to evaluate image registration results. The inverse consistency error ($ICE$) measures the distance between the composition of the two transforms and the identity function, and it can be formulated as either an average ($ICE_{a}$) or a maximum ($ICE_{m}$) over a region of interest $\Omega$ of the image:
$$\begin{aligned}ICE_{a}&=\frac{1}{\int_{\Omega}\mathrm{d}x}\int_{\Omega}\left\Vert f_{2}(f_{1}(x))-x\right\Vert\mathrm{d}x\\ICE_{m}&=\max_{x\in\Omega}\left\Vert f_{2}(f_{1}(x))-x\right\Vert.\end{aligned}$$
While inverse consistency is a necessary property of good registration algorithms, inverse consistency error alone is not a sufficient metric to evaluate the quality of image registration results, since a perfectly consistent mapping, with no other constraint, may not come close to correctly registering a pair of images.
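On a discrete grid the average and maximum inverse consistency errors reduce to a mean and a maximum over sample points. The following Python sketch assumes the transforms are available as plain callables on 2D points. The example transforms (`shift`, `unshift`, `unshift_biased`), the 5 x 5 grid, and the function name `inverse_consistency_error` are hypothetical and serve only to illustrate the definition.

```python
import math

def inverse_consistency_error(f1, f2, points):
    """Return (ICE_a, ICE_m): mean and max of ||f2(f1(x)) - x|| over sample points."""
    residuals = []
    for x in points:
        y = f2(f1(x))
        residuals.append(math.dist(y, x))
    return sum(residuals) / len(residuals), max(residuals)

# Hypothetical transforms: a translation and two candidate "inverses".
def shift(p):
    return (p[0] + 2.0, p[1] - 1.0)

def unshift(p):            # exact inverse of shift
    return (p[0] - 2.0, p[1] + 1.0)

def unshift_biased(p):     # inverse that is off by 0.5 along the first axis
    return (p[0] - 1.5, p[1] + 1.0)

# Region of interest sampled on a regular 5 x 5 grid.
grid = [(float(i), float(j)) for i in range(5) for j in range(5)]

ice_exact = inverse_consistency_error(shift, unshift, grid)
ice_biased = inverse_consistency_error(shift, unshift_biased, grid)
print(ice_exact, ice_biased)
```

With the exact inverse both errors are zero. With the biased inverse every point is displaced by 0.5, so the average and the maximum both equal 0.5. A pair of mutually inverse but wrong transforms would also score zero, which is why the error is not sufficient on its own as a quality measure.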

  • IT operations analytics

    In the fields of information technology (IT) and systems management, IT operations analytics (ITOA) is an approach or method to retrieve, analyze, and report data for IT operations. ITOA may apply big data analytics to large datasets to produce business insights. In 2014, Gartner predicted that its use might increase revenue or reduce costs. Gartner also predicted that by 2017, 15% of enterprises would use IT operations analytics technologies.
== Definition ==
IT operations analytics (ITOA) (also known as advanced operational analytics, or IT data analytics) technologies are primarily used to discover complex patterns in high volumes of often "noisy" IT system availability and performance data. Forrester Research defined IT analytics as "The use of mathematical algorithms and other innovations to extract meaningful information from the sea of raw data collected by management and monitoring technologies." ITOA differs from AIOps, which focuses on applying artificial intelligence and machine learning to the applications of ITOA.
== History ==
Operations research as a discipline emerged from the Second World War to improve military efficiency and decision-making on the battlefield. However, only with the emergence of machine learning technology in the early 2000s could an artificially intelligent operational analytics platform begin to engage in the high-level pattern recognition that could adequately serve business needs. A critical catalyst for ITOA development was the rise of Google, which pioneered a predictive analytics model that represented the first attempt to read into patterns of human behavior on the Internet. IT specialists then applied predictive analytics to the IT industry, producing platforms that can sift through data to generate insights without the need for human intervention.
Due to the mainstream embrace of cloud computing and the increasing desire of businesses to adopt big data practices, the ITOA industry has grown significantly since 2010. A 2016 ExtraHop survey of large and mid-size corporations indicated that 65 percent of the businesses surveyed would seek to integrate their data silos in that year or the next. The current goals of ITOA platforms are to improve the accuracy of their APM services, to facilitate better integration with the data, and to enhance their predictive analytics capabilities.
== Applications ==
ITOA systems tend to be used by IT operations teams, and Gartner describes seven applications of ITOA systems:
* Root cause analysis: The models, structures and pattern descriptions of the IT infrastructure or application stack being monitored can help users pinpoint fine-grained and previously unknown root causes of overall system behavior pathologies.
* Proactive control of service performance and availability: Predicts future system states and the impact of those states on performance.
* Problem assignment: Determines how problems may be resolved or, at least, directs the results of inferences to the most appropriate individuals or communities in the enterprise for problem resolution.
* Service impact analysis: When multiple root causes are known, the analytics system's output is used to determine and rank their relative impact, so that resources can be devoted to correcting the fault in the most timely and cost-effective way possible.
* Complement best-of-breed technology: The models, structures and pattern descriptions of the IT infrastructure or application stack being monitored are used to correct or extend the outputs of other discovery-oriented tools to improve the fidelity of information used in operational tasks (e.g., service dependency maps, application runtime architecture topologies, network topologies).
* Real-time application behavior learning: Learns and correlates application behavior based on user patterns and the underlying infrastructure across various application patterns, creates metrics of the correlated patterns, and stores them for further analysis.
* Dynamic threshold baselining: Learns the behavior of the infrastructure under various application user patterns, determines the optimal behavior of the infrastructure and technology components, benchmarks and baselines the low and high water marks for specific environments, and dynamically adjusts those baselines as infrastructure and user patterns change, without manual intervention.
== Types ==
In Data Growth Demands a Single, Architected IT Operations Analytics Platform, Gartner Research describes the following types of analytics technologies:
* Log analysis
* Unstructured text indexing, search and inference (UTISI)
* Topological analysis (TA)
* Multidimensional database search and analysis (MDSA)
* Complex operations event processing (COEP)
* Statistical pattern discovery and recognition (SPDR)
== Tools and ITOA platforms ==
A number of vendors operate in the ITOA space:

  • Application enablement

    Application enablement

    Application enablement is an approach which brings telecommunications network providers and developers together to combine their network and web abilities in creating and delivering high demand advanced services and new intelligent applications. Network providers, in addition to bandwidth, provide abilities such as billing, location, presence, and security, which have allowed them to establish long-term relationships with end-users. By offering these select abilities as application programming interfaces (APIs), providers give developers access to a set of tools to create (mashup) new applications and services to run on provider networks. Unifying the strengths of providers and developers facilitates the creation of mash-up applications, and in turn, a better end user quality of experience (QoE) for improved profit margins. Apple's iOS with App Store, and Google's Android with Android Market exemplify this approach. Both have introduced mobile platforms that are supported by a comprehensive ecosystem in order to perpetuate innovation in product design, content and service offerings, and overall consumer behavior. By the end of April 2010, downloadable applications numbered over 200,000 for iPhone and over 50,000 for Android. == Background == Historically, telecommunication providers primarily based their business models on network performance, emphasizing connectivity, availability, and quality of service (QoS) as key sources of revenue and customer value. With the increasing demand for bandwidth-intensive data and video applications, maintaining service continuity has required substantial infrastructure investments. To address rising operational costs and declining average revenue per user (ARPU), providers have increasingly adopted customer-oriented strategies and diversified business models to expand their roles within the telecommunications value chain. 
Application enablement supports providers in making this transition by providing an environment, or ecosystem, where providers and developers can collaborate to build, test, manage, and distribute applications across networks including television, broadband, Internet, and mobile. This cooperative effort produces mutually beneficial results for all parties, opening up new revenue streams while enhancing value and return on investment (ROI). The following are some examples of key network abilities which function as application enablers in the telecommunications market: billing systems; security for private transactions; network-based storage of digital content; end-to-end bandwidth for high-quality transmissions; scoring abilities to identify end-user preferences and behaviors; subscriber data to customize the end-user experience; and context information, such as location and presence, to localize services. == New business models == As network providers work toward effective collaboration with application and content developers, several new business models are emerging to help facilitate the business relationships: === Vendor-led === A type of business model driven by telecommunications vendors, who assist network providers in building relationships with application and content developers to lower the cost and complexity of managing third parties. Examples of this model include: Forum Nokia; IBM Technology Partner Ecosystem; Ng Connect; Huawei Intouch program. === Operator-led === Characterized by network providers who want to maintain a high degree of flexibility and control over applications created for their end-consumers, this model lets them create and manage their own developer program, development platform, and application store. Under this arrangement, independent developers provide their own branding, marketing communications, pricing and customer care. 
Network providers pursuing this model will often seek to partner with a large number of third parties using standardized on-boarding processes. Examples of this model include: O2 Litmus; Orange Partner; Joint Innovation Lab. === Aggregator === Network providers who choose not to create or manage their own developer relationships will partner with one or more aggregators to administer part or all of their application strategy. Examples of this model include: Ovi Operator Partnership; BlackBerry Operator Partnership; Cellmania; Buongiorno. === Mass wholesale === Select network providers also participate in wholesale models that exist primarily for applications (BT's Ribbit, an Internet Protocol (IP) based calling and messaging platform) and devices (Verizon's Open Device initiative). This business-to-business approach reduces a large portion of the potential costs of third-party application enablement (marketing, acquisition and support). Examples of this model include: BT's Ribbit; Verizon Wireless ODI; AT&T Synaptic Hosting. === The enterprise customer === Some network providers are focusing on enabling applications in the enterprise space. In this model, the network provider establishes a platform for large enterprise customers who want to blend custom software with enhanced abilities, and provides standardized processes for mobilizing enterprise applications and exposing core back-office abilities to allow for dynamic customer interaction. Examples of this model include: Vodafone Applications Service; Verizon Private Network; Sprint Solution Launchpad. === Trusted partner === In this model, the network provider builds one-on-one relationships with trusted third-party developers by exposing customized network abilities, bringing a greater variety of brands to the network provider's portfolio. Network providers using this model tend to have only a few partners (in contrast to the operator-led model). 
Under this scenario, network providers benefit from a pre-established customer base and the developer's marketing resources. Examples of this model include: 3/Skype Partnership (UK); Virgin Media and BBC iPlayer. == Network operator developer resources == Operator-led model: O2 Litmus; Orange Partner; Joint Innovation Lab. Aggregator model: Ovi Operator Partnership; Cellmania; Buongiorno. Mass wholesale model: BT Ribbit; Verizon Wireless ODI; AT&T Synaptic Hosting. Enterprise customer model: Vodafone Applications Service; Verizon Private Network; Sprint Solution Launchpad. == References ==

  • Knowledge assessment methodology

    Knowledge assessment methodology

    The knowledge assessment methodology (KAM) is "an interactive benchmarking tool created by the World Bank's Knowledge for Development Program to help countries identify the challenges and opportunities they face in making the transition to the knowledge-based economy." KAM does so by providing information on knowledge economy indicators for 146 countries. Its products include the Knowledge Economy Index and the Knowledge Index.

  • Neural radiance field

    Neural radiance field

    A neural radiance field (NeRF) is a neural field for reconstructing a three-dimensional representation of a scene from two-dimensional images. The NeRF model enables downstream applications of novel view synthesis, scene geometry reconstruction, and obtaining the reflectance properties of the scene. Additional scene properties such as camera poses may also be jointly learned. First introduced in 2020, it has since gained significant attention for its potential applications in computer graphics and content creation. == Algorithm == The NeRF algorithm represents a scene as a radiance field parametrized by a deep neural network (DNN). The network predicts a volume density and view-dependent emitted radiance given the spatial location $(x, y, z)$ and viewing direction in Euler angles $(\theta, \Phi)$ of the camera. By sampling many points along camera rays, traditional volume rendering techniques can produce an image. === Data collection === A NeRF needs to be retrained for each unique scene. The first step is to collect images of the scene from different angles, together with their respective camera poses. These images are standard 2D images and do not require a specialized camera or software. Any camera is able to generate datasets, provided the settings and capture method meet the requirements for SfM (Structure from Motion). This requires tracking of the camera position and orientation, often through some combination of SLAM, GPS, or inertial estimation. Researchers often use synthetic data to evaluate NeRF and related techniques. For such data, images (rendered through traditional non-learned methods) and respective camera poses are reproducible and error-free. === Training === For each sparse viewpoint (image and camera pose) provided, camera rays are marched through the scene, generating a set of 3D points with a given radiance direction (into the camera). 
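As a rough illustration of marching a camera ray through the scene and rendering it, the following Python sketch samples points along one ray and composites them into a pixel color with the usual numerical quadrature for volume rendering. The function names, the midpoint sampling scheme, and the `toy_field` stand-in for the trained network are assumptions made for illustration; a real NeRF would query the MLP with both position and viewing direction.

```python
import math

def sample_points_along_ray(origin, direction, near, far, n_samples):
    """Return sample depths and the 3D points origin + t * direction."""
    step = (far - near) / n_samples
    # Take the midpoint of each of the n_samples bins between near and far.
    depths = [near + (i + 0.5) * step for i in range(n_samples)]
    points = [tuple(o + t * d for o, d in zip(origin, direction)) for t in depths]
    return depths, points

def composite(densities, colors, step):
    """Accumulate color along one ray from per-sample density and color."""
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)  # opacity of this ray segment
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, rgb)]
        transmittance *= 1.0 - alpha
    return pixel, transmittance

def toy_field(point):
    """Hypothetical stand-in for the MLP: a dense red ball of radius 1."""
    inside = math.sqrt(sum(c * c for c in point)) < 1.0
    return (50.0 if inside else 0.0), (1.0, 0.0, 0.0)

# Render one pixel for a ray that passes through the centre of the ball.
depths, points = sample_points_along_ray((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), 0.0, 6.0, 64)
outputs = [toy_field(p) for p in points]
pixel, remaining = composite([s for s, _ in outputs], [c for _, c in outputs], 6.0 / 64)
print(pixel, remaining)
```

Every operation in `composite` is differentiable with respect to the densities and colors, which is what allows the rendering error to be pushed back into the network during training.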
For these points, volume density and emitted radiance are predicted using the multi-layer perceptron (MLP). An image is then generated through classical volume rendering. Because this process is fully differentiable, the error between the predicted image and the original image can be minimized with gradient descent over multiple viewpoints, encouraging the MLP to develop a coherent model of the scene. == Variations and improvements == Early versions of NeRF were slow to optimize and required that all input views were taken with the same camera in the same lighting conditions. These performed best when limited to orbiting around individual objects, such as a drum set, plants or small toys. Since the original paper in 2020, many improvements have been made to the NeRF algorithm, with variations for special use cases. === Fourier feature mapping === In 2020, shortly after the release of NeRF, the addition of Fourier Feature Mapping improved training speed and image accuracy. Deep neural networks struggle to learn high-frequency functions in low-dimensional domains, a phenomenon known as spectral bias. To overcome this shortcoming, points are mapped to a higher-dimensional feature space before being fed into the MLP: $$\gamma(\mathrm{v}) = \begin{bmatrix} a_1 \cos(2\pi \mathrm{B}_1^T \mathrm{v}) \\ a_1 \sin(2\pi \mathrm{B}_1^T \mathrm{v}) \\ \vdots \\ a_m \cos(2\pi \mathrm{B}_m^T \mathrm{v}) \\ a_m \sin(2\pi \mathrm{B}_m^T \mathrm{v}) \end{bmatrix}$$ where $\mathrm{v}$ is the input point, $\mathrm{B}_i$ are the frequency vectors, and $a_i$ are coefficients. This allows for rapid convergence to high-frequency functions, such as pixels in a detailed image. 
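The Fourier feature mapping above can be sketched in a few lines of Python. The function `fourier_features` follows the formula term by term; the helper `gaussian_frequencies`, which draws the frequency vectors from a normal distribution, is one hypothetical way of choosing the B vectors and is not prescribed by the text.

```python
import math
import random

def fourier_features(v, B, a):
    """Map point v to [a_i cos(2*pi*B_i.v), a_i sin(2*pi*B_i.v)] for each i."""
    features = []
    for a_i, b_i in zip(a, B):
        phase = 2.0 * math.pi * sum(b * x for b, x in zip(b_i, v))
        features.append(a_i * math.cos(phase))
        features.append(a_i * math.sin(phase))
    return features

def gaussian_frequencies(m, dim, scale, seed=0):
    """Assumed choice of B: m frequency vectors with N(0, scale^2) entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(m)]

# A 3D point becomes a 2m-dimensional feature vector (here m = 4).
B = gaussian_frequencies(4, 3, 10.0)
a = [1.0, 1.0, 1.0, 1.0]
print(len(fourier_features((0.1, 0.2, 0.3), B, a)))
```

The resulting feature vector, rather than the raw coordinates, is what would be passed to the MLP. Larger frequency magnitudes let the network represent finer detail.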
=== Bundle-adjusting neural radiance fields === One limitation of NeRFs is the requirement of knowing accurate camera poses to train the model. Pose estimation methods are often not completely accurate, and in some cases the camera pose cannot be known at all. These imperfections result in artifacts and suboptimal convergence. To address this, a method was developed to optimize the camera pose along with the volumetric function itself. Called Bundle-Adjusting Neural Radiance Field (BARF), the technique uses a dynamic low-pass filter (DLPF) to go from coarse to fine adjustment, minimizing error by finding the geometric transformation to the desired image. This corrects imperfect camera poses and greatly improves the quality of NeRF renders. === Multiscale representation === Conventional NeRFs struggle to represent detail at all viewing distances, producing blurry images up close and overly aliased images from distant views. In 2021, researchers introduced mip-NeRF (the name comes from mipmap), a technique to improve the sharpness of details at different viewing scales. Rather than sampling a single ray per pixel, the technique fits a Gaussian to the conical frustum cast by the camera. This improvement effectively anti-aliases across all viewing scales. mip-NeRF also reduces overall image error and is faster to converge at about half the size of ray-based NeRF. === Learned initializations === In 2021, researchers applied meta-learning to assign initial weights to the MLP. This substantially speeds up convergence by effectively giving the network a head start in gradient descent. Meta-learning also allowed the MLP to learn an underlying representation of certain scene types. For example, given a dataset of famous tourist landmarks, an initialized NeRF could partially reconstruct a scene given one image. === NeRF in the wild === Conventional NeRFs are vulnerable to slight variations in input images (objects, lighting), often resulting in ghosting and artifacts. 
As a result, NeRFs struggle to represent dynamic scenes, such as bustling city streets with changes in lighting and dynamic objects. In 2021, researchers at Google developed a new method for accounting for these variations, named NeRF in the Wild (NeRF-W). This method splits the neural network (MLP) into three separate models. The main MLP is retained to encode the static volumetric radiance. However, it operates in sequence with a separate MLP for appearance embedding (changes in lighting, camera properties) and an MLP for transient embedding (changes in scene objects). This allows the NeRF to be trained on diverse photo collections, such as those taken by mobile phones at different times of day. === Relighting === In 2021, researchers added more outputs to the MLP at the heart of NeRFs. The output now included: volume density, surface normal, material parameters, distance to the first surface intersection (in any direction), and visibility of the external environment in any direction. The inclusion of these new parameters lets the MLP learn material properties, rather than pure radiance values. This facilitates a more complex rendering pipeline, calculating direct and global illumination, specular highlights, and shadows. As a result, the NeRF can render the scene under any lighting conditions with no re-training. === Plenoctrees === Although NeRFs had reached high levels of fidelity, their costly compute time made them useless for many applications requiring real-time rendering, such as VR/AR and interactive content. Introduced in 2021, Plenoctrees (plenoptic octrees) enabled real-time rendering of pre-trained NeRFs through division of the volumetric radiance function into an octree. Rather than assigning a radiance direction into the camera, viewing direction is taken out of the network input and spherical radiance is predicted for each region. This makes rendering over 3000x faster than conventional NeRFs. 
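To make the idea of predicting spherical radiance per region more concrete, here is a minimal Python sketch of how a stored, view-independent set of coefficients can still produce a view-dependent color at render time without querying a network. The text does not specify the representation; this sketch assumes low-degree real spherical harmonic coefficients per octree leaf, one common sign convention for the basis, and a sigmoid to keep colors in range. The names `sh_basis` and `leaf_color` are hypothetical.

```python
import math

# Normalisation constants of the real spherical harmonics of degree 0 and 1.
SH_C0 = math.sqrt(1.0 / (4.0 * math.pi))
SH_C1 = math.sqrt(3.0 / (4.0 * math.pi))

def sh_basis(direction):
    """Evaluate the four degree-0 and degree-1 basis functions for a direction."""
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm  # the basis expects a unit vector
    return [SH_C0, SH_C1 * y, SH_C1 * z, SH_C1 * x]

def leaf_color(sh_coeffs, direction):
    """Turn the coefficients stored in one leaf into an RGB color for a view."""
    basis = sh_basis(direction)
    color = []
    for channel in sh_coeffs:  # one list of four coefficients per RGB channel
        raw = sum(k * b for k, b in zip(channel, basis))
        color.append(1.0 / (1.0 + math.exp(-raw)))  # sigmoid maps to (0, 1)
    return color

# Assumed leaf contents: red depends on the view along z, green and blue do not.
leaf = [[0.0, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
print(leaf_color(leaf, (0.0, 0.0, 1.0)))
print(leaf_color(leaf, (0.0, 0.0, -1.0)))
```

Because the per-leaf lookup and a handful of multiplications replace a full network evaluation at every sample, this kind of representation is well suited to real-time rendering.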
=== Sparse Neural Radiance Grid === Similar to Plenoctrees, this method enabled real-time rendering of pretrained NeRFs. To avoid querying the large MLP for each point, this method bakes NeRFs into Sparse Neural Radiance Grids (SNeRG). A SNeRG is a sparse voxel grid containing opacity and color, with learned feature vectors to encode view-dependent information. A lightweight, more efficient MLP is then used to produce view-dependent residuals to modify the color and opacity. To enable this compressive baking, small changes to the NeRF architecture were made, such as running the MLP once per pixel rather than for each point along the ray. These improvements make SNeRG extremely efficient, outperforming Plenoctrees. === Instant NeRFs === In 2022, researchers at Nvidia enabled real-time training of NeRFs through a technique known as Instant Neural Graphics Primitives. An innovative input encoding reduces computation, enabling real-time training of a NeRF, an improvement orders of magnitude above previous methods. The speedup stems from the use of spatial hash functions, which have $O(1)$ access times, and parallelized architectures which run fast on modern GPUs. == Related techniques == === Plenoxels === Plen
