AI Generator Letter

AI Generator Letter — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Per-pixel lighting

    Per-pixel lighting

    In computer graphics, per-pixel lighting refers to any technique for lighting an image or scene that calculates illumination for each pixel on a rendered image. This is in contrast to other popular methods of lighting such as vertex lighting, which calculates illumination at each vertex of a 3D model and then interpolates the resulting values over the model's faces to calculate the final per-pixel color values. Per-pixel lighting is commonly used with techniques, such as blending, alpha blending, alpha to coverage, anti-aliasing, texture filtering, clipping, hidden-surface determination, Z-buffering, stencil buffering, shading, mipmapping, normal mapping, bump mapping, displacement mapping, parallax mapping, shadow mapping, specular mapping, shadow volumes, high-dynamic-range rendering, ambient occlusion (screen space ambient occlusion, screen space directional occlusion, ray-traced ambient occlusion), ray tracing, global illumination, and tessellation. Each of these techniques provides some additional data about the surface being lit or the scene and light sources that contributes to the final look and feel of the surface. Most modern video game engines implement lighting using per-pixel techniques instead of vertex lighting to achieve increased detail and realism. The id Tech 4 engine, used to develop such games as Brink and Doom 3, was one of the first game engines to implement a completely per-pixel shading engine. All versions of the CryENGINE, Frostbite Engine, and Unreal Engine, among others, also implement per-pixel shading techniques. Deferred shading is a recent development in per-pixel lighting notable for its use in the Frostbite Engine and Battlefield 3. Deferred shading techniques are capable of rendering potentially large numbers of small lights inexpensively (other per-pixel lighting approaches require full-screen calculations for each light in a scene, regardless of size). == History == While only recently have personal computers and video hardware become powerful enough to perform full per-pixel shading in real-time applications such as games, many of the core concepts used in per-pixel lighting models have existed for decades. Frank Crow published a paper describing the theory of shadow volumes in 1977. This technique uses the stencil buffer to specify areas of the screen that correspond to surfaces that lie in a "shadow volume", or a shape representing a volume of space eclipsed from a light source by some object. These shadowed areas are typically shaded after the scene is rendered to buffers by storing shadowed areas with the stencil buffer. Jim Blinn first introduced the idea of normal mapping in a 1978 SIGGRAPH paper. Blinn pointed out that the earlier idea of unlit texture mapping proposed by Edwin Catmull was unrealistic for simulating rough surfaces. Instead of mapping a texture onto an object to simulate roughness, Blinn proposed a method of calculating the degree of lighting a point on a surface should receive based on an established "perturbation" of the normals across the surface. == Hardware rendering == Real-time applications, such as video games, usually implement per-pixel lighting through the use of pixel shaders, allowing the GPU hardware to process the effect. The scene to be rendered is first rasterized onto a number of buffers storing different types of data to be used in rendering the scene, such as depth, normal direction, and diffuse color. Then, the data is passed into a shader and used to compute the final appearance of the scene, pixel-by-pixel. Deferred shading is a per-pixel shading technique that has recently become feasible for games. With deferred shading, a "g-buffer" is used to store all terms needed to shade a final scene on the pixel level. The format of this data varies from application to application depending on the desired effect, and can include normal data, positional data, specular data, diffuse data, emissive maps and albedo, among others. Using multiple render targets, all of this data can be rendered to the g-buffer with a single pass, and a shader can calculate the final color of each pixel based on the data from the g-buffer in a final "deferred pass". Because deferred shading assumes only one visible fragment per pixel sample, transparent objects are generally handled in a separate forward pass. == Software rendering == Per-pixel lighting is also performed in software on many high-end commercial rendering applications which typically do not render at interactive framerates. This is called offline rendering or software rendering. NVidia's mental ray rendering software, which is integrated with such suites as Autodesk's Softimage is a well-known example.

    Read more →
  • Predictions of the end of Wikipedia

    Predictions of the end of Wikipedia

    Various observers have predicted the end of Wikipedia since it rose to prominence, with potential pitfalls from lack of quality-control, artificial intelligence or inconsistencies among contributors. Alternative online encyclopedias have been proposed as replacements for Wikipedia, including WolframAlpha, as well as the both now-defunct Knol (from Google) and Owl (from AOL). A 2013 review raised alarms regarding Wikipedia's shortcomings on hoaxes, on vandalism, an imbalance of material, and inadequate quality control of articles. Earlier critiques lamented the vulgar content and absence of sufficient references in articles. Others suggest that the unwarranted deletion of useful articles from Wikipedia may portend its end, which itself inspired the creation of the now inactive Deletionpedia. Contrary to such predictions, Wikipedia has constantly grown in both size and influence. Recent developments with artificial intelligence in Wikimedia projects have prompted new predictions that AI applications, which consume free and open content, will replace Wikipedia. == Personnel == Wikipedia is crowdsourced by a few million volunteer editors. Of the millions of registered editors, only tens of thousands contribute the majority of its contents, and a few thousand do quality control and maintenance work. As the encyclopedia expanded in the 2010s, the number of active editors did not grow proportionately. Various sources predicted that Wikipedia will eventually have too few editors to be functional and collapse from lack of participation. English Wikipedia has 818 volunteer administrators who perform various functions, including functions similar to those carried out by a forum moderator. Critics have described their actions as harsh, bureaucratic, biased, unfair, or capricious and predicted that the resulting outrage would lead to the site's closure. Various 2012 articles reported that a decline in English Wikipedia's recruitment of new administrators could end Wikipedia. === Decline in editors (2014–2015) === A 2014 trend analysis published in The Economist stated that "The number of editors for the English-language version has fallen by a third in seven years." The attrition rate for active editors in English Wikipedia was described by The Economist as substantially higher than in other (non-English) Wikipedias. It reported that in other languages, the number of "active editors" (those with at least five edits per month) has been relatively constant since 2008: some 42,000 editors, with narrow seasonal variances of about 2,000 editors up or down. In the English Wikipedia, the number of active editors peaked in 2007 at about 50,000 editors, and fell to 30,000 editors in 2014. Given that the trend analysis published in The Economist presented the number of active editors for non-English Wikipedias as remaining relatively constant, sustaining their numbers at approximately 42,000 active editors, the contrast pointed to the effectiveness of Wikipedia in those languages to retain their active editors on a renewable and sustained basis. Though different language versions of Wikipedia have different policies, no comment identified a particular policy difference as potentially making a difference in the rate of editor attrition for English Wikipedia. Editor count showed a slight uptick a year later, and no clear trend after that. In a 2013 article, Tom Simonite of MIT Technology Review said that for several years running, the number of Wikipedia editors had been falling, and cited the bureaucratic structure and rules as a factor. Simonite alleged that some Wikipedians use the labyrinthine rules and guidelines to dominate others and have a vested interest in keeping the status quo. A January 2016 article in Time by Chris Wilson said Wikipedia might lose many editors because a collaboration of occasional editors and smart software will take the lead. Andrew Lih and Andrew Brown both maintain editing Wikipedia with smartphones is difficult and discourages new potential contributors. Lih alleges there is serious disagreement among existing contributors on how to resolve this. In 2015, Lih feared for Wikipedia's long-term future while Brown feared problems with Wikipedia would remain and rival encyclopedias would not replace it. == Viewers and fundraisers == As of 2015, with more viewing by smartphones, there had been a marked decline in persons who viewed Wikipedia from their computers, and according to The Washington Post "[people are] far less likely to donate". At the time, the Wikimedia Foundation reported reserves equivalent to one year's budgeted expenditures. On the other hand, the number of paid staff had ballooned, so those expenses increased. In 2021, Andreas Kolbe, a former co-editor-in-chief of The Signpost, wrote that the Wikimedia Foundation was reaching its 10-year goal of a US$100 million endowment, five years earlier than planned, which may surprise donors and users around the world who regularly see Wikipedia fundraising banners. He also said accounting methods disguise the size of operating surpluses, top managers earn $300,000 – 400,000 a year, and over 40 people work exclusively on fundraising. == Artificial intelligence == Wikipedia faces a decline in human visitors, raising concerns about its long-term sustainability and community participation. The Wikimedia Foundation (WMF), when reporting this decline, attributed this in part to the lack of clicks from users of large language models and search engines that are using content from Wikipedia. Data published in August 2025 showed that after the launch of ChatGPT and the rise of other AI-powered search summaries, some types of articles on Wikipedia — especially those that closely resemble the kind of content ChatGPT produces — experienced a noticeable drop in readership. Overall human pageviews reportedly fell by about 8% between 2024 and 2025, suggesting that AI-overviews and chatbots are increasingly being used in place of direct visits to Wikipedia. According to industry web analytics data, ChatGPT's estimated monthly web traffic surpassed that of Wikipedia since May 2025, as visits to ChatGPT continued to grow while Wikipedia’s total site traffic declined. == Timeline of predictions == On the eve of the 20th anniversary of Wikipedia, associate professor of the Department of Communication Studies at Northeastern University Joseph Reagle conducted a retrospective study of numerous "predictions of the ends of Wikipedia" over two decades, divided into chronological waves: "Early growth (2001–2002)", "Nascent identity (2001–2005)", "Production model (2005–2010)", "Contributor attrition (2009–2017)" and the current period "(2020–)". Each wave brought its distinctive fatal predictions, which never came true; as a result, Reagle concluded Wikipedia was not in danger. Concern grew in 2023 that the ubiquity and proliferation of artificial intelligence (AI) may adversely affect Wikipedia. Rapid improvements and widespread application of AI may render Wikipedia obsolete or reduce its importance. A 2023 study found that AI, when applied to Wikipedia, works most efficiently for error-correction, while Wikipedia still needs to be written by humans.

    Read more →
  • PyTorch

    PyTorch

    PyTorch is an open-source deep learning library, originally developed by Meta Platforms and currently developed with support from the Linux Foundation. The successor to Torch, PyTorch provides a high-level API that builds upon optimised, low-level implementations of deep learning algorithms and architectures, such as the Transformer, or SGD. Notably, this API simplifies model training and inference to a few lines of code. PyTorch allows for automatic parallelization of training and, internally, implements CUDA bindings that speed training further by leveraging GPU resources. PyTorch utilises the tensor as a fundamental data type, similarly to NumPy. Training is facilitated by a reversed automatic differentiation system, Autograd, that constructs a directed acyclic graph of the operations (and their arguments) executed by a model during its forward pass. With a loss, backpropagation is then undertaken. As of 2025, PyTorch remains one of the most popular deep learning libraries, alongside others such as TensorFlow and Keras. It can be installed using Anaconda package managers. A number of commercial deep learning architectures are built on top of PyTorch, including ChatGPT, Tesla Autopilot, Uber's Pyro, and Hugging Face's Transformers. == History == In 2001, Torch was written and released under a GPL. It was a machine-learning library written in C++ and CUDA, supporting methods including neural networks, support vector machines (SVM), hidden Markov models, etc. Around 2010, it was rewritten by Ronan Collobert, Clement Farabet and Koray Kavuckuoglu. This was known as Torch7 or LuaTorch. This was written so that the backend was in C and the frontend was in Lua. In mid-2016, some developers refactored it to decouple the frontend and the backend, with strong influence from torch-autograd and Chainer. In turn, torch-autograd was influenced by HIPS/autograd. Development on Torch7 ceased in 2018 and was subsumed by the PyTorch project. Meta (formerly known as Facebook) operates both PyTorch and Convolutional Architecture for Fast Feature Embedding (Caffe2), but models defined by the two frameworks were mutually incompatible. The Open Neural Network Exchange (ONNX) project was created by Meta and Microsoft in September 2017 to decouple deep learning frameworks from hardware-specific runtimes, allowing models to be converted between frameworks and optimized for execution providers like NVIDIA’s TensorRT. Caffe2 was merged into PyTorch at the end of March 2018. In September 2022, Meta announced that PyTorch would be governed by the independent PyTorch Foundation, a newly created subsidiary of the Linux Foundation. PyTorch 2.0 was released on 15 March 2023, introducing TorchDynamo, a Python-level compiler that makes code run up to two times faster, along with significant improvements in training and inference performance across major cloud platforms. == PyTorch tensors == PyTorch defines a class called Tensor (torch.Tensor) to store and operate on homogeneous multidimensional rectangular arrays of numbers. PyTorch supports various sub-types of multi-dimensional arrays, or Tensors. PyTorch Tensors are similar to NumPy Arrays, but can also be operated on by a CUDA-capable NVIDIA GPU. PyTorch has also been developing support for other GPU platforms, for example, AMD's ROCm and Apple's Metal Framework. == PyTorch neural networks == PyTorch defines a module called nn (torch.nn) to describe neural networks and to support training. This module offers a comprehensive collection of building blocks for neural networks, including various layers and activation functions, enabling the construction of complex models. Networks are built by inheriting from the torch.nn module and defining the sequence of operations in the forward() function. == PyTorch Serialized File Format == Pytorch can save and load models using its own file format, which is a ZIP64 archive containing the model weights in a Python pickle file, and other information such as the byte order. The file extensions .pt and .pth are commonly used for these files. == Example == The following program shows the low-level functionality of the library with a simple example. The following code block defines a neural network with linear layers using the nn module.

    Read more →
  • OpenVX

    OpenVX

    OpenVX is an open, royalty-free standard for cross-platform acceleration of computer vision applications. It is designed by the Khronos Group to facilitate portable, optimized and power-efficient processing of methods for vision algorithms. This is aimed for embedded and real-time programs within computer vision and related scenarios. It uses a connected graph representation of operations. == Overview == OpenVX specifies a higher level of abstraction for programming computer vision use cases than compute frameworks such as OpenCL. The high level makes the programming easy and the underlying execution will be efficient on different computing architectures. This is done while having a consistent and portable vision acceleration API. OpenVX is based on a connected graph of vision nodes that can execute the preferred chain of operations. It uses an opaque memory model, allowing to move image data between the host (CPU) memory and accelerator, such as GPU memory. As a result, the OpenVX implementation can optimize the execution through various techniques, such as acceleration on various processing units or dedicated hardware. This architecture facilitates applications programmed in OpenVX on different systems with different power and performance, including battery-sensitive, vision-enabled, wearable displays. OpenVX is complementary to the open source vision library OpenCV. OpenVX in some applications offers a better optimized graph management than OpenCV. == History == OpenVX 1.0 specification was released in October 2014. OpenVX sample implementation was released in December 2014. OpenVX 1.1 specification was released on May 2, 2016. OpenVX 1.2 was released on May 1, 2017. Updated OpenVX adopters program and OpenVX 1.2 conformance test suite was released on November 21, 2017. OpenVX 1.2.1 was released on November 27, 2018. OpenVX 1.3 was released on October 22, 2019. == Implementations, frameworks and libraries == AMD MIVisionX Archived 2019-08-05 at the Wayback Machine - for AMD's CPUs and GPUs. Cadence - for Cadence Design Systems's Tensilica Vision DSPs. Imagination - for Imagination Technologies's PowerVR GPUs Synopsys - for Synopsys' DesignWare EV Vision Processors Texas Instruments’ OpenVX (TIOVX) - for Texas Instruments’ Jacinto™ ADAS SoCs. NVIDIA VisionWorks - for CUDA-capable Nvidia GPUs and SoCs. OpenVINO - for Intel's CPUs, GPUs, VPUs, and FPGAs.

    Read more →
  • Empirical dynamic modeling

    Empirical dynamic modeling

    Empirical dynamic modeling (EDM) is a framework for analysis and prediction of nonlinear dynamical systems. Applications include population dynamics, ecosystem service, medicine, neuroscience, dynamical systems, geophysics, and human-computer interaction. EDM was originally developed by Robert May and George Sugihara. It can be considered a methodology for data modeling, predictive analytics, dynamical system analysis, machine learning and time series analysis. == Description == Mathematical models have tremendous power to describe observations of real-world systems. They are routinely used to test hypothesis, explain mechanisms and predict future outcomes. However, real-world systems are often nonlinear and multidimensional, in some instances rendering explicit equation-based modeling problematic. Empirical models, which infer patterns and associations from the data instead of using hypothesized equations, represent a natural and flexible framework for modeling complex dynamics. Donald DeAngelis and Simeon Yurek illustrated that canonical statistical models are ill-posed when applied to nonlinear dynamical systems. A hallmark of nonlinear dynamics is state-dependence: system states are related to previous states governing transition from one state to another. EDM operates in this space, the multidimensional state-space of system dynamics rather than on one-dimensional observational time series. EDM does not presume relationships among states, for example, a functional dependence, but projects future states from localised, neighboring states. EDM is thus a state-space, nearest-neighbors paradigm where system dynamics are inferred from states derived from observational time series. This provides a model-free representation of the system naturally encompassing nonlinear dynamics. A cornerstone of EDM is recognition that time series observed from a dynamical system can be transformed into higher-dimensional state-spaces by time-delay embedding with Takens's theorem. The state-space models are evaluated based on in-sample fidelity to observations, conventionally with Pearson correlation between predictions and observations. == Methods == Primary EDM algorithms include Simplex projection, Sequential locally weighted global linear maps (S-Map) projection, Multivariate embedding in Simplex or S-Map, Convergent cross mapping (CCM), and Multiview Embeding, described below. Nearest neighbors are found according to: NN ( y , X , k ) = ‖ X N i E − y ‖ ≤ ‖ X N j E − y ‖ if 1 ≤ i ≤ j ≤ k {\displaystyle {\text{NN}}(y,X,k)=\|X_{N_{i}}^{E}-y\|\leq \|X_{N_{j}}^{E}-y\|{\text{ if }}1\leq i\leq j\leq k} === Simplex === Simplex projection is a nearest neighbor projection. It locates the k {\displaystyle k} nearest neighbors to the location in the state-space from which a prediction is desired. To minimize the number of free parameters k {\displaystyle k} is typically set to E + 1 {\displaystyle E+1} defining an E + 1 {\displaystyle E+1} dimensional simplex in the state-space. The prediction is computed as the average of the weighted phase-space simplex projected T p {\displaystyle Tp} points ahead. Each neighbor is weighted proportional to their distance to the projection origin vector in the state-space. Find k {\displaystyle k} nearest neighbor: N k ← NN ( y , X , k ) {\displaystyle N_{k}\gets {\text{NN}}(y,X,k)} Define the distance scale: d ← ‖ X N 1 E − y ‖ {\displaystyle d\gets \|X_{N_{1}}^{E}-y\|} Compute weights: For{ i = 1 , … , k {\displaystyle i=1,\dots ,k} } : w i ← exp ⁡ ( − ‖ X N i E − y ‖ / d ) {\displaystyle w_{i}\gets \exp(-\|X_{N_{i}}^{E}-y\|/d)} Average of state-space simplex: y ^ ← ∑ i = 1 k ( w i X N i + T p ) / ∑ i = 1 k w i {\displaystyle {\hat {y}}\gets \sum _{i=1}^{k}\left(w_{i}X_{N_{i}+T_{p}}\right)/\sum _{i=1}^{k}w_{i}} === S-Map === S-Map extends the state-space prediction in Simplex from an average of the E + 1 {\displaystyle E+1} nearest neighbors to a linear regression fit to all neighbors, but localised with an exponential decay kernel. The exponential localisation function is F ( θ ) = exp ( − θ d / D ) {\displaystyle F(\theta )={\text{exp}}(-\theta d/D)} , where d {\displaystyle d} is the neighbor distance and D {\displaystyle D} the mean distance. In this way, depending on the value of θ {\displaystyle \theta } , neighbors close to the prediction origin point have a higher weight than those further from it, such that a local linear approximation to the nonlinear system is reasonable. This localisation ability allows one to identify an optimal local scale, in-effect quantifying the degree of state dependence, and hence nonlinearity of the system. Another feature of S-Map is that for a properly fit model, the regression coefficients between variables have been shown to approximate the gradient (directional derivative) of variables along the manifold. These Jacobians represent the time-varying interaction strengths between system variables. Find k {\displaystyle k} nearest neighbor: N ← NN ( y , X , k ) {\displaystyle N\gets {\text{NN}}(y,X,k)} Sum of distances: D ← 1 k ∑ i = 1 k ‖ X N i E − y ‖ {\displaystyle D\gets {\frac {1}{k}}\sum _{i=1}^{k}\|X_{N_{i}}^{E}-y\|} Compute weights: For{ i = 1 , … , k {\displaystyle i=1,\dots ,k} } : w i ← exp ⁡ ( − θ ‖ X N i E − y ‖ / D ) {\displaystyle w_{i}\gets \exp(-\theta \|X_{N_{i}}^{E}-y\|/D)} Reweighting matrix: W ← diag ( w i ) {\displaystyle W\gets {\text{diag}}(w_{i})} Design matrix: A ← [ 1 X N 1 X N 1 − 1 … X N 1 − E + 1 1 X N 2 X N 2 − 1 … X N 2 − E + 1 ⋮ ⋮ ⋮ ⋱ ⋮ 1 X N k X N k − 1 … X N k − E + 1 ] {\displaystyle A\gets {\begin{bmatrix}1&X_{N_{1}}&X_{N_{1}-1}&\dots &X_{N_{1}-E+1}\\1&X_{N_{2}}&X_{N_{2}-1}&\dots &X_{N_{2}-E+1}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&X_{N_{k}}&X_{N_{k}-1}&\dots &X_{N_{k}-E+1}\end{bmatrix}}} Weighted design matrix: A ← W A {\displaystyle A\gets WA} Response vector at T p {\displaystyle Tp} : b ← [ X N 1 + T p X N 2 + T p ⋮ X N k + T p ] {\displaystyle b\gets {\begin{bmatrix}X_{N_{1}+T_{p}}\\X_{N_{2}+T_{p}}\\\vdots \\X_{N_{k}+T_{p}}\end{bmatrix}}} Weighted response vector: b ← W b {\displaystyle b\gets Wb} Least squares solution (SVD): c ^ ← argmin c ‖ A c − b ‖ 2 2 {\displaystyle {\hat {c}}\gets {\text{argmin}}_{c}\|Ac-b\|_{2}^{2}} Local linear model c ^ {\displaystyle {\hat {c}}} is prediction: y ^ ← c ^ 0 + ∑ i = 1 E c ^ i y i {\displaystyle {\hat {y}}\gets {\hat {c}}_{0}+\sum _{i=1}^{E}{\hat {c}}_{i}y_{i}} === Multivariate Embedding === Multivariate Embedding recognizes that time-delay embeddings are not the only valid state-space construction. In Simplex and S-Map one can generate a state-space from observational vectors, or time-delay embeddings of a single observational time series, or both. === Convergent Cross Mapping === Convergent cross mapping (CCM) leverages a corollary to the Generalized Takens Theorem that it should be possible to cross predict or cross map between variables observed from the same system. Suppose that in some dynamical system involving variables X {\displaystyle X} and Y {\displaystyle Y} , X {\displaystyle X} causes Y {\displaystyle Y} . Since X {\displaystyle X} and Y {\displaystyle Y} belong to the same dynamical system, their reconstructions (via embeddings) M x {\displaystyle M_{x}} , and M y {\displaystyle M_{y}} , also map to the same system. The causal variable X {\displaystyle X} leaves a signature on the affected variable Y {\displaystyle Y} , and consequently, the reconstructed states based on Y {\displaystyle Y} can be used to cross predict values of X {\displaystyle X} . CCM leverages this property to infer causality by predicting X {\displaystyle X} using the M y {\displaystyle M_{y}} library of points (or vice versa for the other direction of causality), while assessing improvements in cross map predictability as larger and larger random samplings of M y {\displaystyle M_{y}} are used. If the prediction skill of X {\displaystyle X} increases and saturates as the entire M y {\displaystyle M_{y}} is used, this provides evidence that X {\displaystyle X} is casually influencing Y {\displaystyle Y} . === Multiview Embedding === Multiview Embedding is a Dimensionality reduction technique where a large number of state-space time series vectors are combitorially assessed towards maximal model predictability. == Extensions == Extensions to EDM techniques include: Generalized Theorems for Nonlinear State Space Reconstruction Extended Convergent Cross Mapping Dynamic stability S-Map regularization Visual analytics with EDM Convergent Cross Sorting Expert system with EDM hybrid Sliding windows based on the extended convergent cross-mapping Empirical Mode Modeling Accounting for missing data and variable step sizes Accounting for observation noise Hierarchical Bayesian EDM via Gaussian processes Intelligent and Adaptive Control Optimal control via Empirical dynamic programming Multiview distance regularised S-map

    Read more →
  • Composite Capability/Preference Profiles

    Composite Capability/Preference Profiles

    Composite Capability/Preference Profiles (CC/PP) is a specification for defining capabilities and preferences of user agents (also known as "delivery context"). The delivery context can be used to guide the process of tailoring content for a user agent. CC/PP is a vocabulary extension of the Resource Description Framework (RDF). The CC/PP specification is maintained by the W3C's Ubiquitous Web Applications Working Group (UWAWG) Working Group. == History == Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies 1.0 became a W3C recommendation on 15 January 2004. A "Last-Call Working-Draft" of CC/PP 2.0 was issued in April 2007

    Read more →
  • Mycin

    Mycin

    MYCIN was an early backward chaining expert system that used black box to identify bacteria causing severe infections, such as bacteremia and meningitis, and to recommend antibiotics, with the dosage adjusted for patient's body weight — the name derived from the antibiotics themselves, as many antibiotics have the suffix "-mycin". The Mycin system was also used for the diagnosis of blood clotting diseases. MYCIN was developed over five or six years in the early 1970s at Stanford University. It was written in Lisp as the doctoral dissertation of Edward Shortliffe under the direction of Bruce G. Buchanan, Stanley N. Cohen and others. MYCIN emerged from the Stanford Heuristic Programming Project. MYCIN demonstrated the potential for expert systems in building high-performance medical reasoning programs. MYCIN is often viewed as a pioneer in the field of expert systems, even being referred to as the "grandaddy of them all-the one that launched the field" by Dr. Allen Newell. MYCIN led to the EMYCIN expert system shell ("essential MYCIN") for acquiring knowledge, reasoning with it, and explaining the results, without the specific medical knowledge. It can be described as "EMYCIN = Prolog + uncertainty + caching + questions + explanations + contexts - variables". An introduction is in Chapter 16 of Paradigms of Artificial Intelligence Programming (PAIP). == Method == MYCIN operated using a fairly simple inference engine and a knowledge base of ~600 rules by obtaining individual inferential facts identified by experts and encoding such facts as individual production rules. No other AI program at the time contained as much domain-specific knowledge clearly separated from its inference procedures as MYCIN. It would query the physician running the program via a long series of simple yes/no or textual questions. At the end, it provided a list of possible culprit bacteria ranked from high to low based on the probability of each diagnosis, its confidence in each diagnosis' probability, the reasoning behind each diagnosis (that is, MYCIN would also list the questions and rules which led it to rank a diagnosis a particular way), and its recommended course of drug treatment. MYCIN could additionally respond to queries by physicians related to why it asked the user a certain question, how it arrived at a conclusion, and why it did not consider certain factors. The developers performed studies showing that MYCIN's performance was minimally affected by perturbations in the uncertainty metrics associated with individual rules, suggesting that the power in the system was related more to its knowledge representation and reasoning scheme than to the details of its numerical uncertainty model. Some observers felt that it should have been possible to use classical Bayesian statistics. MYCIN's developers argued that this would require either unrealistic assumptions of probabilistic independence, or require the experts to provide estimates for an unfeasibly large number of conditional probabilities. Subsequent studies later showed that the certainty factor model could indeed be interpreted in a probabilistic sense, and highlighted problems with the implied assumptions of such a model. However the modular structure of the system would prove very successful, leading to the development of graphical models such as Bayesian networks. === Context === A context in MYCIN determines what types of objects can be reasoned about. They are similar to variables in Prolog, or environment variables in operating systems. === Evidence combination === In MYCIN it was possible that two or more rules might draw conclusions about a parameter with different weights of evidence. For example, one rule may conclude that the organism in question is E. Coli with a certainty of 0.8 whilst another concludes that it is E. Coli with a certainty of 0.5 or even −0.8. In the event the certainty is less than zero the evidence is actually against the hypothesis. In order to calculate the certainty factor MYCIN combined these weights using the formula below to yield a single certainty factor: C F ( x , y ) = { X + Y − X Y if X , Y > 0 X + Y + X Y if X , Y < 0 X + Y 1 − min ( | X | , | Y | ) otherwise {\displaystyle CF(x,y)={\begin{cases}X+Y-XY&{\text{if }}X,Y>0\\X+Y+XY&{\text{if }}X,Y<0\\{\frac {X+Y}{1-\min(|X|,|Y|)}}&{\text{otherwise}}\end{cases}}} Where X and Y are the certainty factors. This formula can be applied more than once if more than two rules draw conclusions about the same parameter. It is commutative, so it does not matter in which order the weights were combined. The combination formula was designed to have the following desirable properties: −1 can be interpreted as "false", +1 as "true", and 0 as "uncertain". Combining unknown with anything leaves it unchanged. Combining true with anything (except false) gives true. Similarly for false. Combining true and false is a division-by-zero error. Combining +x and -x gives unknown. Combining two positives (except true) gives a larger positive. Similarly for negatives. Combining a positive and a negative gives something in between. === Examples === The following examples come from Chapter 16 of PAIP, which contains an implementation in Common Lisp of a modified and simplified version of MYCIN for pedagogical purposes. A rule, and an English paraphrase generated by the system: == Results == An evaluation of MYCIN was conducted at the Stanford Medical School. The first phase of the evaluation consisted of 10 test cases of diverse origin, chosen by a physician who was not acquainted with MYCIN's methods or knowledge base. These cases were presented to 7 physicians and 1 senior medical student. 10 prescriptions were compiled for each of the cases, 1 recommended by MYCIN, 1 prescribed by the treating physician at the county hospital, and 8 by the aforementioned individuals. The second phase of the evaluation consisted of eight infectious disease specialists being provided the clinical summary and set of 10 prescriptions for each of the 10 cases and tasked to provide their own recommendations for each case and assess the 10 prescriptions. MYCIN received an acceptability rating of 65%, which was comparable to the 42.5% to 62.5% rating of five faculty members. This study is often cited as showing the potential for disagreement about therapeutic decisions, even among experts, when there is no "gold standard" for correct treatment. == Practical use == MYCIN was never actually used in practice. This wasn't because of any weakness in its performance. Some observers raised ethical and legal issues related to the use of computers in medicine, regarding the responsibility of the physicians in case the system gave wrong diagnosis. However, the greatest problem, and the reason that MYCIN was not used in routine practice, was the state of technologies for system integration, especially at the time it was developed. MYCIN was a stand-alone system that required a user to enter all relevant information about a patient by typing in responses to questions MYCIN posed. MYCIN ran on the DEC KI10 PDP-10, supporting a large time-shared system available over the early Internet (ARPANet), before personal computers were developed. MYCIN's greatest influence was accordingly its demonstration of the power of its representation and reasoning approach. Rule-based systems in many non-medical domains were developed in the years that followed MYCIN's introduction of the approach. In the 1980s, expert system "shells" were introduced (including one based on MYCIN, known as E-MYCIN (followed by Knowledge Engineering Environment - KEE)) and supported the development of expert systems in a wide variety of application areas. A difficulty that rose to prominence during the development of MYCIN and subsequent complex expert systems has been the extraction of the necessary knowledge for the inference engine to use from the human expert in the relevant fields into the rule base (the so-called "knowledge acquisition bottleneck").

    Read more →
  • KL-ONE

    KL-ONE

    KL-ONE (pronounced "kay ell won") is a knowledge representation system in the tradition of semantic networks and frames; that is, it is a frame language. The system is an attempt to overcome semantic indistinctness in semantic network representations and to explicitly represent conceptual information as a structured inheritance network. == Overview == There is a whole family of KL-ONE-like systems. One of the innovations that KL-ONE initiated was the use of a deductive classifier, an automated reasoning engine that can validate a frame ontology and deduce new information about the ontology based on the initial information provided by a domain expert. Frames in KL-ONE are called concepts. These form hierarchies using subsume-relations; in the KL-ONE terminology a super class is said to subsume its subclasses. Multiple inheritance is allowed. Actually a concept is said to be well-formed only if it inherits from more than one other concept. All concepts, except the top concept (usually THING), must have at least one super class. In KL-ONE descriptions are separated into two basic classes of concepts: primitive and defined. Primitives are domain concepts that are not fully defined. This means that given all the properties of a concept, this is not sufficient to classify it. They may also be viewed as incomplete definitions. Using the same view, defined concepts are complete definitions. Given the properties of a concept, these are necessary and sufficient conditions to classify the concept. The slot-concept is called roles and the values of the roles are role-fillers. There are several different types of roles to be used in different situations. The most common and important role type is the generic RoleSet that captures the fact that the role may be filled with more than one filler.

    Read more →
  • Progressive Graphics File

    Progressive Graphics File

    PGF (Progressive Graphics File) is a wavelet-based bitmapped image format that employs lossless and lossy data compression. PGF was created to improve upon and replace the JPEG format. It was developed at the same time as JPEG 2000 but with a focus on speed over compression ratio. PGF can operate at higher compression ratios without taking more encoding/decoding time and without generating the characteristic "blocky and blurry" artifacts of the original DCT-based JPEG standard. It also allows more sophisticated progressive downloads. == Color models == PGF supports a wide variety of color models: Grayscale with 1, 8, 16, or 31 bits per pixel Indexed color with palette size of 256 RGB color image with 12, 16 (red: 5 bits, green: 6 bits, blue: 5 bits), 24, or 48 bits per pixel ARGB color image with 32 bits per pixel Lab color image with 24 or 48 bits per pixel CMYK color image with 32 or 64 bits per pixel == Technical discussion == PGF claims to achieve an improved compression quality over JPEG adding or improving features such as scalability. Its compression performance is similar to the original JPEG standard. Very low and very high compression rates (including lossless compression) are also supported in PGF. The ability of the design to handle a very large range of effective bit rates is one of the strengths of PGF. For example, to reduce the number of bits for a picture below a certain amount, the advisable thing to do with the first JPEG standard is to reduce the resolution of the input image before encoding it — something that is ordinarily not necessary for that purpose when using PGF because of its wavelet scalability properties. The PGF process chain contains the following four steps: Color space transform (in case of color images) Discrete Wavelet Transform Quantization (in case of lossy data compression) Hierarchical bit-plane run-length encoding === Color components transformation === Initially, images have to be transformed from the RGB color space to another color space, leading to three components that are handled separately. PGF uses a fully reversible modified YUV color transform. The transformation matrices are: [ Y r U r V r ] = [ 1 4 1 2 1 4 1 − 1 0 0 − 1 1 ] [ R G B ] ; [ R G B ] = [ 1 3 4 − 1 4 1 − 1 4 − 1 4 1 − 1 4 3 4 ] [ Y r U r V r ] {\displaystyle {\begin{bmatrix}Y_{r}\\U_{r}\\V_{r}\end{bmatrix}}={\begin{bmatrix}{\frac {1}{4}}&{\frac {1}{2}}&{\frac {1}{4}}\\1&-1&0\\0&-1&1\end{bmatrix}}{\begin{bmatrix}R\\G\\B\end{bmatrix}};\qquad \qquad {\begin{bmatrix}R\\G\\B\end{bmatrix}}={\begin{bmatrix}1&{\frac {3}{4}}&-{\frac {1}{4}}\\1&-{\frac {1}{4}}&-{\frac {1}{4}}\\1&-{\frac {1}{4}}&{\frac {3}{4}}\end{bmatrix}}{\begin{bmatrix}Y_{r}\\U_{r}\\V_{r}\end{bmatrix}}} The chrominance components can be, but do not necessarily have to be, down-scaled in resolution. === Wavelet transform === The color components are then wavelet transformed to an arbitrary depth. In contrast to JPEG 1992 which uses an 8x8 block-size discrete cosine transform, PGF uses one reversible wavelet transform: a rounded version of the biorthogonal CDF 5/3 wavelet transform. This wavelet filter bank is exactly the same as the reversible wavelet used in JPEG 2000. It uses only integer coefficients, so the output does not require rounding (quantization) and so it does not introduce any quantization noise. === Quantization === After the wavelet transform, the coefficients are scalar-quantized to reduce the amount of bits to represent them, at the expense of a loss of quality. The output is a set of integer numbers which have to be encoded bit-by-bit. The parameter that can be changed to set the final quality is the quantization step: the greater the step, the greater is the compression and the loss of quality. With a quantization step that equals 1, no quantization is performed (it is used in lossless compression). In contrast to JPEG 2000, PGF uses only powers of two, therefore the parameter value i represents a quantization step of 2i. Just using powers of two makes no need of integer multiplication and division operations. === Coding === The result of the previous process is a collection of sub-bands which represent several approximation scales. A sub-band is a set of coefficients — integer numbers which represent aspects of the image associated with a certain frequency range as well as a spatial area of the image. The quantized sub-bands are split further into blocks, rectangular regions in the wavelet domain. They are typically selected in a way that the coefficients within them across the sub-bands form approximately spatial blocks in the (reconstructed) image domain and collected in a fixed size macroblock. The encoder has to encode the bits of all quantized coefficients of a macroblock, starting with the most significant bits and progressing to less significant bits. In this encoding process, each bit-plane of the macroblock gets encoded in two so-called coding passes, first encoding bits of significant coefficients, then refinement bits of significant coefficients. Clearly, in lossless mode all bit-planes have to be encoded, and no bit-planes can be dropped. Only significant coefficients are compressed with an adaptive run-length/Rice (RLR) coder, because they contain long runs of zeros. The RLR coder with parameter k (logarithmic length of a run of zeros) is also known as the elementary Golomb code of order 2k. === Comparison with other file formats === JPEG 2000 is slightly more space-efficient in handling natural images. The PSNR for the same compression ratio is on average 3% better than the PSNR of PGF. It has a small advantage in compression ratio but longer encoding and decoding times. PNG (Portable Network Graphics) is more space-efficient in handling images with many pixels of the same color. There are several self-proclaimed advantages of PGF over the ordinary JPEG standard: Superior compression performance: The image quality (measured in PSNR) for the same compression ratio is on average 3% better than the PSNR of JPEG. At lower bit rates (e.g. less than 0.25 bits/pixel for gray-scale images), PGF has a much more significant advantage over certain modes of JPEG: artifacts are less visible and there is almost no blocking. The compression gains over JPEG are attributed to the use of DWT. Multiple resolution representation: PGF provides seamless compression of multiple image components, with each component carrying from 1 to 31 bits per component sample. With this feature there is no need for separately stored preview images (thumbnails). Progressive transmission by resolution accuracy, commonly referred to as progressive decoding: PGF provides efficient code-stream organizations which are progressive by resolution. This way, after a smaller part of the whole file has been received, it is possible to see a lower quality of the final picture, the quality can be improved monotonically getting more data from the source. Lossless and lossy compression: PGF provides both lossless and lossy compression in a single compression architecture. Both lossy and lossless compression are provided by the use of a reversible (integer) wavelet transform. Side channel spatial information: Transparency and alpha planes are fully supported ROI extraction: Since version 5, PGF supports extraction of regions of interest (ROI) without decoding the whole image. == Available software == The author published libPGF via a SourceForge, under the GNU Lesser General Public License version 2.0. Xeraina offers a free Windows console encoder and decoder, and PGF viewers based on WIC for 32bit and 64bit Windows platforms. Other WIC applications including File Explorer are able to display PGF images after installing this viewer. Digikam is a popular open-source image editing and cataloging software that uses libPGF for its thumbnails. It makes use of the progressive decoding feature of PGF images to store a single version of each thumbnail, which can then be decoded to different resolutions without loss, thus allowing users to dynamically change the size of the thumbnails without having to recalculate them again.

    Read more →
  • OntoWiki

    OntoWiki

    OntoWiki was a free and open-source semantic wiki application, meant to serve as an ontology editor and a knowledge acquisition system. It is a web-based application written in PHP and using either a MySQL database or a Virtuoso triple store. OntoWiki is form-based rather than syntax-based, and thus tries to hide as much of the complexity of knowledge representation formalisms from users as possible. OntoWiki is mainly being developed by the Agile Knowledge Engineering and Semantic Web (AKSW) research group at the University of Leipzig, a group also known for the DBpedia project among others, in collaboration with volunteers around the world. In 2009 the AKSW research group got a budget of €425,000 from the Federal Ministry of Education and Research of Germany for the development of the OntoWiki. In 2010 OntoWiki became part of the technology stack supporting the LOD2 (linked open data) project. Leipzig University is one of the consortium members of the project, which is funded by a €6.5m EU grant. The development ended in 2016 due to the lack of capacity migrating from PHP 5 to 7 including the required Zend Framework from version 1 to 2.

    Read more →
  • Computational Intelligence (journal)

    Computational Intelligence (journal)

    Computational Intelligence Journal is a peer-reviewed scientific journal covering research on artificial intelligence and computer science. The journal published novel research as well as innovative applications in a broad range of AI, covering Computational Intelligence is an artificial intelligence journal publishing novel research on a broad range of experimental and theoretical topics in AI and computer science. With a broad scope, the journal covers machine learning, knowledge mining, web intelligence, AI language, and philosophical implications. The journal was established in 1985 and is published by Wiley-Blackwell. Currently, the editors-in-chief is Diane Inkpen. The quality of the journal as an academic publishing venue is evaluated according to public citation impact metrics. in 2022, the Computational Intelligence Journal CiteScore of Scopus was 5.3, while Clarivate's Web of Science gives it 0.39 in the Journal Citation Indicator and 2,8 in the Journal Impact Factor.

    Read more →
  • Andrej Karpathy

    Andrej Karpathy

    Andrej Karpathy (born 23 October 1986) is a Slovak-Canadian AI researcher, who co-founded and formerly worked at OpenAI, where he specialized in deep learning and computer vision. He also worked as the director of artificial intelligence and Autopilot Vision at Tesla, and in 2024 he founded Eureka Labs, an AI education platform. In 2026 he joined Anthropic as part of the pretraining team. == Education and early life == Karpathy was born in Bratislava, Czechoslovakia (now Slovakia), and moved with his family to Toronto when he was 15. He completed his Computer Science and Physics bachelor's degrees at University of Toronto in 2009 and his master's degree at University of British Columbia in 2011, where he worked on physically simulated figures (for example, a simulated runner or a simulated person in a crowd) with his adviser Michiel van de Panne. In 2006, Karpathy began posting videos on YouTube on his channel, badmephisto. He garnered fame by posting Rubik's cube tutorials which have been used by famous speedcubers such as Feliks Zemdegs. The channel has over 9 million views as of June 2025. Karpathy received a PhD from Stanford University in 2015 under the supervision of Fei-Fei Li, focusing on the intersection of natural language processing and computer vision, and deep learning models suited for this task. == Career and research == He authored and was the primary instructor of the first deep learning course at Stanford, CS 231n: Convolutional Neural Networks for Visual Recognition. The course became one of the largest classes at Stanford, growing from 150 students in 2015 to 750 in 2017. Karpathy is a founding member of the artificial intelligence research group OpenAI, where he was a research scientist from 2015 to 2017. In June 2017 he became Tesla's director of artificial intelligence and reported to Elon Musk. He was named one of MIT Technology Review's Innovators Under 35 for 2020. After taking a several-months-long sabbatical from Tesla, he announced he was leaving the company in July 2022. As of February 2023, he makes YouTube videos on how to create artificial neural networks. On February 9, 2023, Karpathy announced he was returning to OpenAI. A year later on February 13, 2024, an OpenAI spokesperson confirmed that Karpathy had left OpenAI. In the same year, he was named one of Time Magazine's 100 Most Influential People in AI. On July 16, 2024, Karpathy announced on his X account that he started a new AI education company called Eureka Labs. Their first product was the AI course, LLM101n. He also has a broader educational effort, the "Zero to Hero" series on LLM fundamentals. The company also advocates for AI teaching assistants, a concept which has been criticized due to data privacy concerns and the removal of personal connection between teacher and student. In February 2025, Karpathy coined the term vibe coding to describe how AI tools allow hobbyists to construct apps and websites just by typing prompts. On May 19, 2026, he announced that he joined Anthropic via a statement on X, while the company stated that he will be leading a team for research in pretraining.

    Read more →
  • Cross-entropy method

    Cross-entropy method

    The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance sampling estimator by repeating two phases: Draw a sample from a probability distribution. Minimize the cross-entropy between this distribution and a target distribution to produce a better sample in the next iteration. Reuven Rubinstein developed the method in the context of rare-event simulation, where tiny probabilities must be estimated, for example in network reliability analysis, queueing models, or performance analysis of telecommunication systems. The method has also been applied to the traveling salesman, quadratic assignment, DNA sequence alignment, max-cut and buffer allocation problems. == Estimation via importance sampling == Consider the general problem of estimating the quantity ℓ = E u [ H ( X ) ] = ∫ H ( x ) f ( x ; u ) d x {\displaystyle \ell =\mathbb {E} _{\mathbf {u} }[H(\mathbf {X} )]=\int H(\mathbf {x} )\,f(\mathbf {x} ;\mathbf {u} )\,{\textrm {d}}\mathbf {x} } , where H {\displaystyle H} is some performance function and f ( x ; u ) {\displaystyle f(\mathbf {x} ;\mathbf {u} )} is a member of some parametric family of distributions. Using importance sampling this quantity can be estimated as ℓ ^ = 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) g ( X i ) {\displaystyle {\hat {\ell }}={\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{g(\mathbf {X} _{i})}}} , where X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} is a random sample from g {\displaystyle g\,} . For positive H {\displaystyle H} , the theoretically optimal importance sampling density (PDF) is given by g ∗ ( x ) = H ( x ) f ( x ; u ) / ℓ {\displaystyle g^{}(\mathbf {x} )=H(\mathbf {x} )f(\mathbf {x} ;\mathbf {u} )/\ell } . This, however, depends on the unknown ℓ {\displaystyle \ell } . The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the Kullback–Leibler sense) to the optimal PDF g ∗ {\displaystyle g^{}} . == Generic CE algorithm == Choose initial parameter vector v ( 0 ) {\displaystyle \mathbf {v} ^{(0)}} ; set t = 1. Generate a random sample X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} from f ( ⋅ ; v ( t − 1 ) ) {\displaystyle f(\cdot ;\mathbf {v} ^{(t-1)})} Solve for v ( t ) {\displaystyle \mathbf {v} ^{(t)}} , where v ( t ) = argmax v ⁡ 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) f ( X i ; v ( t − 1 ) ) log ⁡ f ( X i ; v ) {\displaystyle \mathbf {v} ^{(t)}=\mathop {\textrm {argmax}} _{\mathbf {v} }{\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})}}\log f(\mathbf {X} _{i};\mathbf {v} )} If convergence is reached then stop; otherwise, increase t by 1 and reiterate from step 2. In several cases, the solution to step 3 can be found analytically. Situations in which this occurs are When f {\displaystyle f\,} belongs to the natural exponential family When f {\displaystyle f\,} is discrete with finite support When H ( X ) = I { x ∈ A } {\displaystyle H(\mathbf {X} )=\mathrm {I} _{\{\mathbf {x} \in A\}}} and f ( X i ; u ) = f ( X i ; v ( t − 1 ) ) {\displaystyle f(\mathbf {X} _{i};\mathbf {u} )=f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})} , then v ( t ) {\displaystyle \mathbf {v} ^{(t)}} corresponds to the maximum likelihood estimator based on those X k ∈ A {\displaystyle \mathbf {X} _{k}\in A} . == Continuous optimization—example == The same CE algorithm can be used for optimization, rather than estimation. Suppose the problem is to maximize some function S {\displaystyle S} , for example, S ( x ) = e − ( x − 2 ) 2 + 0.8 e − ( x + 2 ) 2 {\displaystyle S(x)={\textrm {e}}^{-(x-2)^{2}}+0.8\,{\textrm {e}}^{-(x+2)^{2}}} . To apply CE, one considers first the associated stochastic problem of estimating P θ ( S ( X ) ≥ γ ) {\displaystyle \mathbb {P} _{\boldsymbol {\theta }}(S(X)\geq \gamma )} for a given level γ {\displaystyle \gamma \,} , and parametric family { f ( ⋅ ; θ ) } {\displaystyle \left\{f(\cdot ;{\boldsymbol {\theta }})\right\}} , for example the 1-dimensional Gaussian distribution, parameterized by its mean μ t {\displaystyle \mu _{t}\,} and variance σ t 2 {\displaystyle \sigma _{t}^{2}} (so θ = ( μ , σ 2 ) {\displaystyle {\boldsymbol {\theta }}=(\mu ,\sigma ^{2})} here). Hence, for a given γ {\displaystyle \gamma \,} , the goal is to find θ {\displaystyle {\boldsymbol {\theta }}} so that D K L ( I { S ( x ) ≥ γ } ‖ f θ ) {\displaystyle D_{\mathrm {KL} }({\textrm {I}}_{\{S(x)\geq \gamma \}}\|f_{\boldsymbol {\theta }})} is minimized. This is done by solving the sample version (stochastic counterpart) of the KL divergence minimization problem, as in step 3 above. It turns out that parameters that minimize the stochastic counterpart for this choice of target distribution and parametric family are the sample mean and sample variance corresponding to the elite samples, which are those samples that have objective function value ≥ γ {\displaystyle \geq \gamma } . The worst of the elite samples is then used as the level parameter for the next iteration. This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an estimation of distribution algorithm. === Pseudocode === // Initialize parameters μ := −6 σ2 := 100 t := 0 maxits := 100 N := 100 Ne := 10 // While maxits not exceeded and not converged while t < maxits and σ2 > ε do // Obtain N samples from current sampling distribution X := SampleGaussian(μ, σ2, N) // Evaluate objective function at sampled points S := exp(−(X − 2) ^ 2) + 0.8 exp(−(X + 2) ^ 2) // Sort X by objective function values in descending order X := sort(X, S) // Update parameters of sampling distribution via elite samples μ := mean(X(1:Ne)) σ2 := var(X(1:Ne)) t := t + 1 // Return mean of final sampling distribution as solution return μ == Related methods == Simulated annealing Genetic algorithms Harmony search Estimation of distribution algorithm Tabu search Natural Evolution Strategy Ant colony optimization algorithms

    Read more →
  • Ebert test

    Ebert test

    The Ebert test gauges whether a computer-based synthesized voice can tell a joke with sufficient skill to cause people to laugh. It was proposed by film critic Roger Ebert at the 2011 TED conference as a challenge to software developers to have a computerized voice master the inflections, delivery, timing, and intonations of human speech. The test is similar to the Turing test proposed by Alan Turing in 1950 as a way to gauge a computer's ability to exhibit intelligent behavior by generating performance indistinguishable from a human being. If the computer can successfully tell a joke, and do the timing and delivery as well as Henny Youngman, then that's the voice I want. Ebert lost his voice in 2006 after undergoing surgery to treat thyroid cancer. He employed a Scottish company called CereProc, which custom-tailors text-to-speech software for voiceless customers who record their voices at length before losing them, and mined tapes and DVD commentaries featuring Ebert to create a voice that sounded more like his own voice. He first publicly used the voice they devised for him in his March 2, 2010, appearance on The Oprah Winfrey Show. The audience of Ebert's 2011 TED talk about joke delivery by synthesized voices erupted with laughter when a synthesized voice delivered the following joke: "A guy goes into a psychiatrist. The psychiatrist says, 'You’re crazy.' The guy says, 'I want a second opinion.' The psychiatrist says, 'All right, you’re ugly, too.'"

    Read more →
  • Ian Goodfellow

    Ian Goodfellow

    Ian J. Goodfellow (born 1987) is an American computer scientist, engineer, and executive, most noted for his work on artificial neural networks and deep learning. He is a research scientist at Google DeepMind, was previously employed as a research scientist at Google Brain and director of machine learning at Apple as well as one of the first employees at OpenAI, and has made several important contributions to the field of deep learning, including the invention of the generative adversarial network (GAN). Goodfellow co-wrote, as the first author, the textbook Deep Learning (2016) and wrote the chapter on deep learning in the authoritative textbook of the field of artificial intelligence, Artificial Intelligence: A Modern Approach (used in more than 1,500 universities in 135 countries). == Education == Goodfellow obtained his BSc and MSc in computer science from Stanford University under the supervision of Andrew Ng, and his PhD in machine learning from the Université de Montréal in February 2015, under the supervision of Yoshua Bengio and Aaron Courville. Goodfellow's thesis is titled Deep learning of representations and its application to computer vision. == Career == After graduation, Goodfellow joined Google as part of the Google Brain research team. In March 2016, he left Google to join the newly founded OpenAI research laboratory. 11 months later, in March 2017, Goodfellow returned to Google Research, but left again in 2019. In 2019, Goodfellow joined Apple as director of machine learning in the Special Projects Group. He resigned from Apple in April 2022 to protest Apple's plan to require in-person work for its employees. Shortly after, Goodfellow then joined Google DeepMind as a research scientist. In 2025, Goodfellow left Google. As of July 2026, based on information on Goodfellow's LinkedIn profile, he is co-founding a startup company. == Research == Goodfellow is best known for inventing generative adversarial networks (GANs), using deep learning to generate images. This approach uses two neural networks to competitively improve an image's quality. A “generator” network creates a synthetic image based on an initial set of images such as a collection of faces. A “discriminator” network tries to determine whether images are authentic or created by the generator. The generate-detect cycle is repeated. For each iteration, the generator and the discriminator use the other's feedback to improve or detect the generated images, until the discriminator can no longer distinguish between generated and authentic images. However, GANs have also been used to create deepfakes. At Google, Goodfellow developed a system enabling Google Maps to automatically transcribe addresses from photos taken by Street View cars and demonstrated security vulnerabilities of machine learning systems. == Recognition == In 2017, Goodfellow was cited in MIT Technology Review's 35 Innovators Under 35. In 2019, he was included in Foreign Policy's list of 100 Global Thinkers.

    Read more →