AI Essay Stock Market

AI Essay Stock Market — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Imaging

    Imaging

    Imaging is the process of creating visual representations of objects, scenes, or phenomena. The term encompasses both the formation of images through physical processes and the technologies used to capture, store, process, and display them. While traditional imaging relies on visible light, modern imaging systems can visualize information across the electromagnetic spectrum and through other physical phenomena such as sound waves, magnetic fields, and particle emissions, enabling the visualization of subjects invisible to the human eye. Imaging science is the multidisciplinary field concerned with the theoretical foundations and practical applications of image creation and analysis. The field draws on physics, mathematics, electrical engineering, computer science, computer vision, and perceptual psychology to develop systems that generate, collect, duplicate, analyze, modify, and visualize images. == Principles == === The imaging chain === The imaging chain is a conceptual framework describing the interconnected components of any imaging system. Understanding each link in this chain allows engineers and scientists to optimize system performance for specific applications. The chain begins with the subject and its observable properties, typically energy that is emitted, reflected, or transmitted. A light source or other energy source may illuminate the subject to make these properties detectable. The capture device then collects this energy using appropriate sensors: optical systems for electromagnetic radiation, transducers for acoustic waves, or antenna arrays for radio frequencies. In digital systems, a processor converts the captured signals into a format suitable for rendering, applying algorithms for noise reduction, enhancement, or reconstruction. Finally, a display renders the processed information as a visible image on media such as paper, screens, or projection surfaces. Throughout this process, the characteristics of the human visual system inform design decisions, as the ultimate purpose of most imaging systems is to convey information to human observers. === Coherent and non-coherent imaging === Imaging systems are often classified by whether they use coherent or non-coherent illumination. Coherent imaging employs an active source that produces waves with a consistent phase relationship, as in radar, synthetic aperture radar, medical ultrasound, and optical coherence tomography. These systems can capture phase information in addition to amplitude, enabling techniques such as holography and interferometry. Non-coherent imaging systems, including conventional photography, fluorescence microscopy, and telescopes, rely on illumination sources where light waves have random phase relationships. == Methods and applications == Imaging methods span a wide range of physical principles, each suited to particular applications. Optical imaging encompasses photography, cinematography, microscopy, and telescopic observation. These methods capture electromagnetic radiation in or near the visible spectrum and form the basis of most consumer and scientific imaging. Extensions include thermography, which visualizes infrared radiation to reveal temperature distributions, and multispectral imaging, which captures data across multiple wavelength bands for applications in remote sensing and materials analysis. Medical imaging comprises techniques designed to visualize the interior of the human body for diagnostic and therapeutic purposes. Radiography and computed tomography use X-rays to image dense structures such as bone. Magnetic resonance imaging exploits nuclear magnetic properties to produce detailed soft-tissue images without ionizing radiation. Ultrasound imaging uses high-frequency sound waves and is particularly valuable for real-time imaging and fetal monitoring. Nuclear medicine techniques such as positron emission tomography track radioactive tracers to reveal metabolic activity. Emerging modalities include photoacoustic imaging, which combines optical and acoustic principles, and Magneto-acousto-electrical tomography, which maps electrical conductivity in biological tissues. Acoustic imaging uses sound waves to create images. Beyond medical ultrasound, applications include sonar for underwater navigation and mapping, seismic imaging for geological exploration, and industrial non-destructive testing. Radar and microwave imaging employ radio waves to detect and image objects. Synthetic aperture radar produces high-resolution images from aircraft or satellites regardless of weather or lighting conditions, making it essential for Earth observation and reconnaissance. Ground-penetrating radar images subsurface structures for archaeological and engineering applications. Electron and particle imaging use beams of electrons or other particles to achieve resolutions far beyond the diffraction limit of visible light. Electron microscopes can image individual atoms, enabling advances in materials science and structural biology. Chemical imaging combines spectroscopy with spatial imaging to map the chemical composition of samples, with applications in pharmaceutical development, food safety, and forensics. LIDAR (Light Detection and Ranging) measures distances using laser pulses to create three-dimensional representations of surfaces and objects, widely used in autonomous vehicles, topographic mapping, and forestry. Computational and digital imaging encompasses image processing, computer graphics, three-dimensional rendering, and digital image restoration. Computer vision applies algorithmic analysis to extract information from images automatically. == History == Photography and imaging have always been intertwined. When Joseph Nicéphore Niépce created the first permanent photograph using heliography in 1826, and Louis Daguerre refined the process into the daguerreotype a decade later, they weren't just inventing a new art form, they were laying the groundwork for an entire scientific discipline built on silver halide chemistry. For most of the nineteenth century, photography remained the province of specialists. That changed with George Eastman's Kodak camera, introduced in 1888 with the slogan "You press the button, we do the rest." Suddenly, anyone could take pictures. Around the same time, Wilhelm Röntgen stumbled onto X-rays in 1895, an accident that would spawn the entire field of medical imaging. World War II proved to be a turning point. Radar technology, developed frantically on both sides of the conflict, introduced concepts that engineers would later adapt for synthetic aperture radar and medical ultrasound. Then the charge-coupled device came: Willard Boyle and George E. Smith built the first one at Bell Labs in 1969, and within a few decades it had made film nearly obsolete. Magnetic resonance imaging arrived in the 1970s, offering doctors something X-rays never could, detailed views of soft tissue without any radiation. Digital cameras took over fast. By the 2000s, film was already in decline; by the 2010s, smartphones had put a surprisingly capable camera in nearly every pocket. Features that once required real skill, proper exposure, sharp focus, accurate color, became automatic. Today, billions of photos get uploaded to social media every day. As a result, a growing issue is that generative artificial intelligence can fabricate photorealistic images from scratch. What counts as a "real" photograph is no longer necessarily obvious.

    Read more →
  • Empirical dynamic modeling

    Empirical dynamic modeling

    Empirical dynamic modeling (EDM) is a framework for analysis and prediction of nonlinear dynamical systems. Applications include population dynamics, ecosystem service, medicine, neuroscience, dynamical systems, geophysics, and human-computer interaction. EDM was originally developed by Robert May and George Sugihara. It can be considered a methodology for data modeling, predictive analytics, dynamical system analysis, machine learning and time series analysis. == Description == Mathematical models have tremendous power to describe observations of real-world systems. They are routinely used to test hypothesis, explain mechanisms and predict future outcomes. However, real-world systems are often nonlinear and multidimensional, in some instances rendering explicit equation-based modeling problematic. Empirical models, which infer patterns and associations from the data instead of using hypothesized equations, represent a natural and flexible framework for modeling complex dynamics. Donald DeAngelis and Simeon Yurek illustrated that canonical statistical models are ill-posed when applied to nonlinear dynamical systems. A hallmark of nonlinear dynamics is state-dependence: system states are related to previous states governing transition from one state to another. EDM operates in this space, the multidimensional state-space of system dynamics rather than on one-dimensional observational time series. EDM does not presume relationships among states, for example, a functional dependence, but projects future states from localised, neighboring states. EDM is thus a state-space, nearest-neighbors paradigm where system dynamics are inferred from states derived from observational time series. This provides a model-free representation of the system naturally encompassing nonlinear dynamics. A cornerstone of EDM is recognition that time series observed from a dynamical system can be transformed into higher-dimensional state-spaces by time-delay embedding with Takens's theorem. The state-space models are evaluated based on in-sample fidelity to observations, conventionally with Pearson correlation between predictions and observations. == Methods == Primary EDM algorithms include Simplex projection, Sequential locally weighted global linear maps (S-Map) projection, Multivariate embedding in Simplex or S-Map, Convergent cross mapping (CCM), and Multiview Embeding, described below. Nearest neighbors are found according to: NN ( y , X , k ) = ‖ X N i E − y ‖ ≤ ‖ X N j E − y ‖ if 1 ≤ i ≤ j ≤ k {\displaystyle {\text{NN}}(y,X,k)=\|X_{N_{i}}^{E}-y\|\leq \|X_{N_{j}}^{E}-y\|{\text{ if }}1\leq i\leq j\leq k} === Simplex === Simplex projection is a nearest neighbor projection. It locates the k {\displaystyle k} nearest neighbors to the location in the state-space from which a prediction is desired. To minimize the number of free parameters k {\displaystyle k} is typically set to E + 1 {\displaystyle E+1} defining an E + 1 {\displaystyle E+1} dimensional simplex in the state-space. The prediction is computed as the average of the weighted phase-space simplex projected T p {\displaystyle Tp} points ahead. Each neighbor is weighted proportional to their distance to the projection origin vector in the state-space. Find k {\displaystyle k} nearest neighbor: N k ← NN ( y , X , k ) {\displaystyle N_{k}\gets {\text{NN}}(y,X,k)} Define the distance scale: d ← ‖ X N 1 E − y ‖ {\displaystyle d\gets \|X_{N_{1}}^{E}-y\|} Compute weights: For{ i = 1 , … , k {\displaystyle i=1,\dots ,k} } : w i ← exp ⁡ ( − ‖ X N i E − y ‖ / d ) {\displaystyle w_{i}\gets \exp(-\|X_{N_{i}}^{E}-y\|/d)} Average of state-space simplex: y ^ ← ∑ i = 1 k ( w i X N i + T p ) / ∑ i = 1 k w i {\displaystyle {\hat {y}}\gets \sum _{i=1}^{k}\left(w_{i}X_{N_{i}+T_{p}}\right)/\sum _{i=1}^{k}w_{i}} === S-Map === S-Map extends the state-space prediction in Simplex from an average of the E + 1 {\displaystyle E+1} nearest neighbors to a linear regression fit to all neighbors, but localised with an exponential decay kernel. The exponential localisation function is F ( θ ) = exp ( − θ d / D ) {\displaystyle F(\theta )={\text{exp}}(-\theta d/D)} , where d {\displaystyle d} is the neighbor distance and D {\displaystyle D} the mean distance. In this way, depending on the value of θ {\displaystyle \theta } , neighbors close to the prediction origin point have a higher weight than those further from it, such that a local linear approximation to the nonlinear system is reasonable. This localisation ability allows one to identify an optimal local scale, in-effect quantifying the degree of state dependence, and hence nonlinearity of the system. Another feature of S-Map is that for a properly fit model, the regression coefficients between variables have been shown to approximate the gradient (directional derivative) of variables along the manifold. These Jacobians represent the time-varying interaction strengths between system variables. Find k {\displaystyle k} nearest neighbor: N ← NN ( y , X , k ) {\displaystyle N\gets {\text{NN}}(y,X,k)} Sum of distances: D ← 1 k ∑ i = 1 k ‖ X N i E − y ‖ {\displaystyle D\gets {\frac {1}{k}}\sum _{i=1}^{k}\|X_{N_{i}}^{E}-y\|} Compute weights: For{ i = 1 , … , k {\displaystyle i=1,\dots ,k} } : w i ← exp ⁡ ( − θ ‖ X N i E − y ‖ / D ) {\displaystyle w_{i}\gets \exp(-\theta \|X_{N_{i}}^{E}-y\|/D)} Reweighting matrix: W ← diag ( w i ) {\displaystyle W\gets {\text{diag}}(w_{i})} Design matrix: A ← [ 1 X N 1 X N 1 − 1 … X N 1 − E + 1 1 X N 2 X N 2 − 1 … X N 2 − E + 1 ⋮ ⋮ ⋮ ⋱ ⋮ 1 X N k X N k − 1 … X N k − E + 1 ] {\displaystyle A\gets {\begin{bmatrix}1&X_{N_{1}}&X_{N_{1}-1}&\dots &X_{N_{1}-E+1}\\1&X_{N_{2}}&X_{N_{2}-1}&\dots &X_{N_{2}-E+1}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&X_{N_{k}}&X_{N_{k}-1}&\dots &X_{N_{k}-E+1}\end{bmatrix}}} Weighted design matrix: A ← W A {\displaystyle A\gets WA} Response vector at T p {\displaystyle Tp} : b ← [ X N 1 + T p X N 2 + T p ⋮ X N k + T p ] {\displaystyle b\gets {\begin{bmatrix}X_{N_{1}+T_{p}}\\X_{N_{2}+T_{p}}\\\vdots \\X_{N_{k}+T_{p}}\end{bmatrix}}} Weighted response vector: b ← W b {\displaystyle b\gets Wb} Least squares solution (SVD): c ^ ← argmin c ‖ A c − b ‖ 2 2 {\displaystyle {\hat {c}}\gets {\text{argmin}}_{c}\|Ac-b\|_{2}^{2}} Local linear model c ^ {\displaystyle {\hat {c}}} is prediction: y ^ ← c ^ 0 + ∑ i = 1 E c ^ i y i {\displaystyle {\hat {y}}\gets {\hat {c}}_{0}+\sum _{i=1}^{E}{\hat {c}}_{i}y_{i}} === Multivariate Embedding === Multivariate Embedding recognizes that time-delay embeddings are not the only valid state-space construction. In Simplex and S-Map one can generate a state-space from observational vectors, or time-delay embeddings of a single observational time series, or both. === Convergent Cross Mapping === Convergent cross mapping (CCM) leverages a corollary to the Generalized Takens Theorem that it should be possible to cross predict or cross map between variables observed from the same system. Suppose that in some dynamical system involving variables X {\displaystyle X} and Y {\displaystyle Y} , X {\displaystyle X} causes Y {\displaystyle Y} . Since X {\displaystyle X} and Y {\displaystyle Y} belong to the same dynamical system, their reconstructions (via embeddings) M x {\displaystyle M_{x}} , and M y {\displaystyle M_{y}} , also map to the same system. The causal variable X {\displaystyle X} leaves a signature on the affected variable Y {\displaystyle Y} , and consequently, the reconstructed states based on Y {\displaystyle Y} can be used to cross predict values of X {\displaystyle X} . CCM leverages this property to infer causality by predicting X {\displaystyle X} using the M y {\displaystyle M_{y}} library of points (or vice versa for the other direction of causality), while assessing improvements in cross map predictability as larger and larger random samplings of M y {\displaystyle M_{y}} are used. If the prediction skill of X {\displaystyle X} increases and saturates as the entire M y {\displaystyle M_{y}} is used, this provides evidence that X {\displaystyle X} is casually influencing Y {\displaystyle Y} . === Multiview Embedding === Multiview Embedding is a Dimensionality reduction technique where a large number of state-space time series vectors are combitorially assessed towards maximal model predictability. == Extensions == Extensions to EDM techniques include: Generalized Theorems for Nonlinear State Space Reconstruction Extended Convergent Cross Mapping Dynamic stability S-Map regularization Visual analytics with EDM Convergent Cross Sorting Expert system with EDM hybrid Sliding windows based on the extended convergent cross-mapping Empirical Mode Modeling Accounting for missing data and variable step sizes Accounting for observation noise Hierarchical Bayesian EDM via Gaussian processes Intelligent and Adaptive Control Optimal control via Empirical dynamic programming Multiview distance regularised S-map

    Read more →
  • Aporia (company)

    Aporia (company)

    Aporia is a machine learning observability platform based in Tel Aviv, Israel. The company has a US office located in San Jose, California. Aporia has developed software for monitoring and controlling undetected defects and failures used by other companies to detect and report anomalies, and warn in the early stages of faults. == History == Aporia was founded in 2019 by Liran Hason and Alon Gubkin. In April 2021, the company raised a $5 million seed round for its monitoring platform for ML models. In February 2022, the company closed a Series A round of $25 million for its ML observability platform. Aporia was named by Forbes as the Next Billion-Dollar Company in June 2022. In November, the company partnered with ClearML, an MLOPs platform, to improve ML pipeline optimization. In January 2023, Aporia launched Direct Data Connectors, a novel technology allowing organizations to monitor their ML models in minutes (previously the process of integrating ML monitoring into a customer’s cloud environment took weeks or more.) DDC (Direct Data Connectors) enables users to connect Aporia to their preferred data source and monitor all of their data at once, without data sampling or data duplication (which is a huge security risk for major organizations. In April 2023, Aporia announced the company partnered with Amazon Web Services (AWS) to provide more reliable ML observability to AWS consumers by deploying Aporia's architecture to their AWS environment, this will allow customers to monitor their models in production regardless of platform.

    Read more →
  • EfficientNet

    EfficientNet

    EfficientNet is a family of convolutional neural networks (CNNs) for computer vision published by researchers at Google AI in 2019. Its key innovation is compound scaling, which uniformly scales all dimensions of depth, width, and resolution using a single parameter. EfficientNet models have been adopted in various computer vision tasks, including image classification, object detection, and segmentation. == Compound scaling == EfficientNet introduces compound scaling, which, instead of scaling one dimension of the network at a time, such as depth (number of layers), width (number of channels), or resolution (input image size), uses a compound coefficient ϕ {\displaystyle \phi } to scale all three dimensions simultaneously. Specifically, given a baseline network, the depth, width, and resolution are scaled according to the following equations: depth multiplier: d = α ϕ width multiplier: w = β ϕ resolution multiplier: r = γ ϕ {\displaystyle {\begin{aligned}{\text{depth multiplier: }}d&=\alpha ^{\phi }\\{\text{width multiplier: }}w&=\beta ^{\phi }\\{\text{resolution multiplier: }}r&=\gamma ^{\phi }\end{aligned}}} subject to α ⋅ β 2 ⋅ γ 2 ≈ 2 {\displaystyle \alpha \cdot \beta ^{2}\cdot \gamma ^{2}\approx 2} and α ≥ 1 , β ≥ 1 , γ ≥ 1 {\displaystyle \alpha \geq 1,\beta \geq 1,\gamma \geq 1} . The α ⋅ β 2 ⋅ γ 2 ≈ 2 {\displaystyle \alpha \cdot \beta ^{2}\cdot \gamma ^{2}\approx 2} condition is such that increasing ϕ {\displaystyle \phi } by a factor of ϕ 0 {\displaystyle \phi _{0}} would increase the total FLOPs of running the network on an image approximately 2 ϕ 0 {\displaystyle 2^{\phi _{0}}} times. The hyperparameters α {\displaystyle \alpha } , β {\displaystyle \beta } , and γ {\displaystyle \gamma } are determined by a small grid search. The original paper suggested 1.2, 1.1, and 1.15, respectively. Architecturally, they optimized the choice of modules by neural architecture search (NAS), and found that the inverted bottleneck convolution (which they called MBConv) used in MobileNet worked well. The EfficientNet family is a stack of MBConv layers, with shapes determined by the compound scaling. The original publication consisted of 8 models, from EfficientNet-B0 to EfficientNet-B7, with increasing model size and accuracy. EfficientNet-B0 is the baseline network, and subsequent models are obtained by scaling the baseline network by increasing ϕ {\displaystyle \phi } . == Variants == EfficientNet has been adapted for fast inference on edge TPUs and centralized TPU or GPU clusters by NAS. EfficientNet V2 was published in June 2021. The architecture was improved by further NAS search with more types of convolutional layers. It also introduced a training method, which progressively increases image size during training, and uses regularization techniques like dropout, RandAugment, and Mixup. The authors claim this approach mitigates accuracy drops often associated with progressive resizing.

    Read more →
  • Micro stuttering

    Micro stuttering

    Micro stuttering is a visual artifact in real-time computer graphics in which the time intervals between consecutively displayed frames are uneven, even though the average frame rate reported by benchmarking software appears adequate. Tools such as 3DMark typically compute frame rates over intervals of one second or more, which can conceal momentary drops in the instantaneous frame rate that the viewer perceives as hitching or jerking of on-screen motion. At low frame rates the effect is visible as a stutter in moving images, degrading the experience in interactive applications such as video games. In severe cases a lower but more consistent frame rate can appear smoother than a higher but more erratic one. The term gained prominence in the late 2000s in discussions of multi-GPU rendering (see History), but micro stuttering also affects single-GPU systems. Common causes on modern hardware include real-time shader compilation, asset streaming from storage, VRAM exhaustion, and driver bugs. == Causes == === Shader compilation === A common cause of micro stuttering on modern PCs is real-time shader compilation. Shaders are small programs that instruct the GPU on how to render visual effects such as lighting, shadows, and reflections. On consoles, developers can pre-compile all shaders for the known, fixed hardware. On PCs, the variety of GPU architectures means shaders must often be compiled at run time, either when the game launches or during gameplay itself. When the rendering engine encounters a shader that has not yet been compiled, the CPU must finish the compilation before the GPU can draw the affected object. This causes a spike in frame time that the player perceives as a hitch. The problem has been particularly associated with games built on Unreal Engine 4 running under DirectX 12, because DX12 shifts more shader management responsibility to the application. Several techniques exist to reduce shader compilation stutter. Pipeline State Object (PSO) pre-caching records the shader permutations used at runtime so that they can be compiled in advance on subsequent launches. Asynchronous shader compilation moves the work to background CPU threads to avoid blocking the main rendering thread. Platform-level services such as Steam's shader pre-caching distribute previously compiled shaders to users with matching GPU hardware. The Steam Deck, which contains a single fixed GPU, benefits from pre-compiled shader caches because all units share the same hardware configuration. === Other causes === Micro stuttering on single-GPU systems can have several additional causes. CPU bottlenecks or scheduling interruptions from background tasks can prevent the processor from preparing frames at regular intervals. Asset streaming during gameplay (loading textures, geometry, or audio from storage) can produce hitches sometimes called traversal stutter; the use of solid-state drives and technologies such as DirectStorage has reduced but not eliminated this. VRAM exhaustion forces data to be swapped between video memory and system memory over the PCI Express bus, which is slower. Graphics driver bugs can also introduce stutter; Nvidia released hotfix driver 551.46 in February 2024 to correct intermittent micro stuttering when V-Sync was enabled. == Measurement == Micro stuttering drew attention to the limitations of average frame rate as a performance metric. In 2013, Scott Wasson at The Tech Report published a series of articles advocating frame time analysis, in which the delivery time of every individual frame is recorded and plotted rather than collapsed into a single frames-per-second figure. This approach was adopted by other hardware review publications in the following years. GPU reviews now routinely report 1% low and 0.1% low frame rates alongside the average. The 1% low is the average frame rate of the slowest 1% of frames in a sample; it serves as an indicator of worst-case smoothness. A large gap between the average and the 1% low suggests poor frame pacing. Tools for capturing per-frame timing data include FRAPS, PresentMon, OCAT, CapFrameX, and MSI Afterburner with RivaTuner Statistics Server. == Mitigation == === Frame pacing === Frame pacing is a software technique that regulates the timing of frame delivery to produce even intervals between displayed frames. Game engines, GPU drivers, and platform libraries all implement frame pacing strategies to varying degrees. On mobile platforms, Google provides the Android Frame Pacing library (Swappy) as part of the Android Game Development Kit. In December 2025, the Khronos Group published the VK_EXT_present_timing Vulkan extension, giving developers explicit control over presentation timing in a cross-platform graphics API for the first time. === Variable refresh rate === Variable refresh rate (VRR) display technologies allow a monitor's refresh rate to change to match the GPU's frame output. Implementations include Nvidia G-Sync (2013), AMD FreeSync (2015), and the VESA Adaptive-Sync standard built into DisplayPort 1.2a and later. VRR eliminates the screen tearing that results from a mismatch between frame rate and refresh rate, and avoids the frame-holding behaviour of V-Sync that can itself cause stutter. It is effective at smoothing moderate frame rate fluctuations but cannot compensate for large sudden spikes in frame time such as those caused by shader compilation or heavy asset streaming. VRR support has become standard in gaming monitors, televisions (via HDMI 2.1), and the Xbox Series X/S and PlayStation 5 consoles. === Frame generation === Beginning with DLSS 3 on the GeForce RTX 40 series in 2022, Nvidia introduced AI-based frame generation, which uses dedicated optical flow hardware and a neural network to create new frames between traditionally rendered ones. AMD followed with FSR 3 in 2023, using an algorithmic approach, and the AI-based FSR 4 for the Radeon RX 9000 series in 2025. DLSS 4, released in January 2025 for the GeForce RTX 50 series, can generate up to three frames per rendered frame using a technique called Multi Frame Generation. Frame generation increases the displayed frame rate but introduces its own frame pacing concerns. If the underlying rendered frames are unevenly timed, the interpolated frames can make the unevenness more apparent rather than less. DLSS 4 addresses this with hardware-level flip metering on the GPU's display engine, which controls the timing of frame presentation more precisely than the CPU-based pacing used in DLSS 3. Both vendors pair frame generation with latency-reduction features (Nvidia Reflex and AMD Anti-Lag+) to offset the additional input latency that results from inserting synthetic frames into the pipeline. === Frame rate limiters === Capping the frame rate below the display's maximum refresh rate, using tools such as RivaTuner Statistics Server, in-game limiters, or driver-level settings, is a common way to improve frame pacing. Preventing the GPU from running ahead of the display reduces variability in frame delivery times and can produce a smoother result than an uncapped but more irregular frame rate. == History == === Multi-GPU configurations === Micro stuttering was first widely documented in the late 2000s as a side effect of multi-GPU configurations using Alternate Frame Rendering (AFR), in which consecutive frames are assigned to alternating GPUs. Because each GPU may take a different amount of time to complete its assigned frame — due to varying scene complexity, driver scheduling, or inter-GPU communication overhead — the resulting frame delivery is irregular even when the average frame rate is high. Both Nvidia SLI and AMD CrossFireX were affected, with dual-GPU setups exhibiting the worst frame pacing irregularities. In 2012 benchmarks using Battlefield 3, dual Radeon HD 7970 cards in CrossFire showed 85% variation in frame delivery times compared with 7% for a single card, while dual GeForce GTX 680 cards in SLI showed only 7% variation compared with 5% for a single card. Multi-GPU micro stuttering became a significant factor in the eventual decline and discontinuation of consumer multi-GPU gaming. Nvidia restricted SLI to a handful of enthusiast-class cards from the GeForce 10 series onward, then replaced it with NVLink on the GeForce RTX 20 series, which saw limited gaming adoption. AMD ceased active CrossFire development around 2017. By the mid-2020s, neither vendor's current consumer GPUs support multi-GPU rendering for games. Other factors that contributed to the decline include DirectX 12 placing multi-GPU support in the hands of game developers rather than driver authors, the incompatibility of temporal anti-aliasing and other temporal rendering techniques with AFR, and the increasing size, power draw, and cost of individual GPUs. The third-party utility RadeonPro could reduce CrossFire micro stuttering through dynamic V-Sync and frame pacing adjustments, and AMD later introduced a driver-level frame paci

    Read more →
  • Human–AI interaction

    Human–AI interaction

    Human–AI interaction is a developing field of research and a sub-field of human–computer interaction (HCI). HCI is a field of research that explores the interactions between humans and computer-based technology, focusing on design implementation, user experience, and psychological factors. With the proliferation of artificial intelligence (AI), there has developed a sub-section of HCI research dedicated specifically to artificial intelligence and how people interact with and are impacted by it. This is human–AI interaction, abbreviated either as HAX or HAII. == Introduction == Artificial intelligence (AI), in general, has fluid definitions and varied research applications, but in brief can be applied to mechanizing tasks that would require human intelligence to complete. AI are tools designed to replicate the human abilities of navigating uncertainty, active learning, and processing information in different contexts. Within the context of HCI and HAX research, artificial intelligence can be broken into two sub-fields, natural language processing (NLP) and computer vision (CV). AI technologies notably include machine-learning, deep-learning and neural networks, and large-language models (LLMs). As a new and rapidly developing technology, AI is changing how computers work and therefore changing how humans interact with computers. Unlike the traditional human-computer interaction, where a human directs a machine, human-AI interaction is characterized by a more collaborative relationship between the computer program (the AI) and the human user, as AI is perceived as an active agent rather than a tool. This changing dynamic creates new questions and necessitates new research methods that are not present in traditional HCI research. According to a scoping review on the state of the discipline, the HAX field comprises research on the "design, development, and evaluation of AI systems" and encompasses the themes of human-AI collaboration, human-AI competition, human-AI conflict, and human-AI symbiosis. == Design == Machine learning and artificial intelligence have been used for decades in targeted advertising and to recommend content in social media. Ethical Guidelines (Framework for ethical AI development) == User Experience (UX) == This section should handle research on how users interact with tools. What techniques do they use, do they develop habits, what types of programs and devices are they using to access these tools, what do they use these tools to do exactly. === Cognitive Frameworks in AI Tool Users === AI has been viewed with various expectations, attributions, and often misconceptions. Many people exclusively understand AI as the LLM chatbots they interact with, like ChatGPT or Claude, or other generative AI programs. [Insert section: discuss how people interact with these specific AI tools as a connection to the following paragraphs] Most fundamentally, humans have a mental model of understanding AI's reasoning and motivation for its decision recommendations, and building a holistic and precise mental model of AI helps people create prompts to receive more valuable responses from AI. However, these mental models are not whole because people can only gain more information about AI through their limited interaction with it; more interaction with AI builds a better mental model that a person may build to produce better prompt outcomes. Research on human-AI interaction has emphasized that users develop mental models of AI systems and revise those models through repeated use, feedback, and explanation, while design research has stressed the importance of communicating capabilities and limitations early and supporting trust calibration through explanation and correction. In a 2025 SSRN working paper, John DeVadoss proposed "Hypothetico-Deductive Interaction" (HDI), a framework that describes human-AI interaction as a mutual process of conjecture and refutation in which users test assumptions about an AI system's capabilities while the system infers and updates assumptions about user goals through its responses and clarifying questions. DeVadoss argued that this framing helps explain prompt iteration, weak capability awareness, and trust miscalibration, and suggested design responses such as clearer communication of uncertainty, easier correction, actionable explanations, and safer failure modes. == Research themes == === Human-AI collaboration === Human-AI collaboration occurs when the human and AI supervise the task on the same level and extent to achieve the same goal. Some collaboration occurs in the form of augmenting human capability. AI may help human ability in analysis and decision-making through providing and weighing a volume of information, and learning to defer to the human decision when it recognizes its unreliability. It is especially beneficial when the human can detect a task that AI can be trusted to make few errors so that there is not a lot of excessive checking process required on the human's end. Some findings show signs of human-AI augmentation, or human–AI symbiosis, in which AI enhances human ability in a way that co-working on a task with AI produces better outcomes than a human working alone. For example: the quality and speed of customer service tasks increase when a human agent collaborates with AI, training on specific models allows AI to improve diagnoses in clinical settings, and AI with human-intervention can improve creativity of artwork while fully AI-generated haikus were rated negatively. Human-AI synergy, a concept in which human-AI collaboration would produce more optimal outcomes than either human or AI working alone could explain why AI does not always help with performance. Some AI features and development may accelerate human-AI synergy, while others may stagnate it. For example, when AI updates for better performance, it sometimes worsens the team performance with human and AI by reducing the compatibility with the new model and the mental model a user has developed on the previous version. Research has found that AI often supports human capabilities in the form of human-AI augmentation and not human-AI synergy, potentially because people rely too much on AI and stop thinking on their own. Prompting people to actively engage in analysis and think when to follow AI recommendations reduces their over-reliance, especially for individuals with higher need for cognition. === Human-AI competition === Robots and computers have substituted routine tasks historically completed by humans, but agentic AI has made it possible to also replace cognitive tasks including taking phone calls for appointments and driving a car. At the point of 2016, research has estimated that 45% of paid activities could be replaced by AI by 2030. Perceived autonomy of robots is known to increase people's negative attitude toward them, and worry about the technology taking over leads people to reject it. There has been a consistent tendency of algorithm aversion in which people prefer human advice over AI advice. However, people are not always able to tell apart tasks completed by AI or other humans. See AI takeover for more information. It is also notable that this sentiment is more prominent in the Western cultures as Westerners tend to show less positive views about AI compared to East Asians. == Research on the psychological impacts of AI == === Perception on others who use AI === As much as people perceive and make judgment about AI itself, they also form impressions of themselves and others who use AI. In the workplace, employees who disclose the use of AI in their tasks are more likely to receive feedback that they are not as hardworking as those who are in the same job who receive non-AI help to complete the same tasks. AI use disclosure diminishes the perceived legitimacy in the employee's task and decision making which ultimately leads observers to distrust people who use AI. Although these negative effects of AI use disclosure are weakened by the observers who use AI frequently themselves, the effect is still not attenuated by the observers' positive attitude towards AI. === Bias, AI, and human === Although AI provides a wide range of information and suggestions to its users, AI itself is not free of biases and stereotypes, and it does not always help people reduce their cognitive errors and biases. People are prone to such errors by failing to see other potential ideas and cases that are not listed by AI responses and committing to a decision suggested by AI that directly contradicts the correct information and directions that they are already aware of. Gender bias is also reflected as the female gendering of AI technologies which conceptualizes females as a helpful assistant. == Emotional connection with AI == Human-AI interaction has been theorized in the context of interpersonal relationships mainly in social psychology, communications and media studies, and as a technology interface through the lens of hu

    Read more →
  • AI warfare

    AI warfare

    AI warfare refers to the use of artificial intelligence technologies to automate military operation and enhance or bypass human decision-making in armed conflicts. AI is used to rapidly analyze large volumes of military intelligence data, including making recommendations or decisions on who and what to target. Abdul-Rahman al-Rawi, a 20-year-old student, was the first acknowledged civilian killed by AI-assisted airstrike in a U.S. strike in Iraq in 2024. In 2026, the U.S. declared it would become an 'AI-first' warfighting force. Husain et al (2018) coined the term hyperwar to refer to warfare which is algorithmic or controlled by artificial intelligence, with little to no human decision-making. == 2026 Iran war == The 2026 Iran war has been described as the "first AI war", although the Untied States and Israel have previously used AI to identify targets during the Gaza war. The U.S. has used AI tools to attack Iran. These tools have been used for military intelligence, targeting, and damage assessment in the war in Iran. Using the Maven smart system, the U.S. attacked 1,000 targets in the first 24 hours of the war and 5,000 targets over the course of 10 days. While the U.S. had used Maven in 2022 to share targeting information with Ukraine and strike against Iraq, Syria, and against the Houthis in 2024, Iran's attacks are its biggest. Authorities are looking into whether artificial intelligence was involved in the airstrike on an Iranian girls' school that killed 170 civilians, the majority of whom were female students. The United States Central Command emphasized that humans were making final targeting decisions. Per a White House tally released on April 8, the U.S. military hit over 13,000 targets in Iran during the war's first 38 days, including more than 2,000 command-and-control sites, 1,500 air defense targets, and 1,450 industrial infrastructure targets. == Gaza war == As part of the Gaza war, the Israel Defense Forces (IDF) have used artificial intelligence to rapidly and automatically perform much of the process of determining what to bomb. IDF's Unit 8200 developed AI systems, dubbed the Gospel and Lavender, to find targets for the Israeli Air Force to bomb. The Gospel automatically provides targeting recommendations to human analysts, who decide whether to approve strikes. Lavender identified 37,000 Hamas-linked individuals early in the war, and was used alongside the Gospel, which chooses buildings or structures as targets. According to a report by +972 Magazine and Local Call, strikes assisted by Lavender were routinely permitted to kill 5–20 civilians for each suspected Hamas militant, who were often bombed at home with their families. The IDF denies these claims, maintaining that every strike is assessed to minimize collateral damage, and that there is no policy "to kill tens of thousands of people in their homes." Israel deployed AI technologies during the Gaza war for audio analysis, facial recognition, and airstrike targeting. One such system was used to help identify the location of Hamas commander Ibrahim Biari through phone call analysis, leading to strikes that killed him as well as more than 125 civilians. == 2022 Russian Ukraine war == Kyiv launched a project with Palantir called Brave1 Dataroom to build AI systems using the extensive combat data Ukraine has gathered since Russia’s full-scale invasion in 2022. The country has also created tools for in-depth airstrike analysis, introduced AI to process large volumes of intelligence, and incorporated these technologies into the planning of long-range strike operations. == Involved companies == Maven Smart System is developed by Palantir. It integrates Anthropic's Claude as its large language model, and uses Amazon's AWS servers as its cloud infrastructure. Since Anthropic's refusal to support autonomous weapons development and domestic surveillance efforts. In its place, other AI firms, including OpenAI, have been brought in to take over that role. == Involved state actors == In 2024, the United States Department of Defense had 800-plus active AI-related projects and requested $1.8 billion in AI funding, with Project Maven and Project Artemis (AI-resistant drones developed together with Ukraine) being the main ones. The technology has been used in Iran, Iraq, Syria and Yemen to identify targets. China is pursuing intelligentized warfare, integrating AI across all combat domains—land, sea, air, space, and cyber—with military AI spending exceeding $1.6 billion annually. == International regulation == Since 2014, states meeting within the framework of the Convention on Certain Conventional Weapons have discussed lethal autonomous weapon systems. In 2016, the treaty's states parties established an open-ended Group of Governmental Experts on Lethal Autonomous Weapons Systems to continue those discussions. The discussions have addressed international humanitarian law, accountability, possible prohibitions and regulations, and the extent of human control required over AI-enabled weapons.

    Read more →
  • Semantic analysis (machine learning)

    Semantic analysis (machine learning)

    In machine learning, semantic analysis of a text corpus is the task of building structures that approximate concepts from a large set of documents. It generally does not involve prior semantic understanding of the documents. Semantic analysis strategies include: Metalanguages based on first-order logic, which can analyze the speech of humans. Understanding the semantics of a text is symbol grounding: if language is grounded, it is equal to recognizing a machine-readable meaning. For the restricted domain of spatial analysis, a computer-based language understanding system was demonstrated. Latent semantic analysis (LSA), a class of techniques where documents are represented as vectors in a term space. A prominent example is probabilistic latent semantic analysis (PLSA). Latent Dirichlet allocation, which involves attributing document terms to topics. n-grams and hidden Markov models, which work by representing the term stream as a Markov chain, in which each term is derived from preceding terms. == Stochastic semantic analysis ==

    Read more →
  • Image texture

    Image texture

    An image texture is the small-scale structure perceived on an image, based on the spatial arrangement of color or intensities. It can be quantified by a set of metrics calculated in image processing. Image texture metrics give us information about the whole image or selected regions. Image textures can be artificially created or found in natural scenes captured in an image. Image textures are one way that can be used to help in segmentation or classification of images. For more accurate segmentation the most useful features are spatial frequency and an average grey level. To analyze an image texture in computer graphics, there are two ways to approach the issue: structured approach and statistical approach. == Structured approach == A structured approach sees an image texture as a set of primitive texels in some regular or repeated pattern. This works well when analyzing artificial textures. To obtain a structured description a characterization of the spatial relationship of the texels is gathered by using Voronoi tessellation of the texels. == Statistical approach == A statistical approach sees an image texture as a quantitative measure of the arrangement of intensities in a region. In general this approach is easier to compute and is more widely used, since natural textures are made of patterns of irregular subelements. === Edge detection === The use of edge detection is to determine the number of edge pixels in a specified region, helps determine a characteristic of texture complexity. After edges have been found the direction of the edges can also be applied as a characteristic of texture and can be useful in determining patterns in the texture. These directions can be represented as an average or in a histogram. Consider a region with N pixels. the gradient-based edge detector is applied to this region by producing two outputs for each pixel p: the gradient magnitude Mag(p) and the gradient direction Dir(p). The edgeness per unit area can be defined by F e d g e n e s s = | { p | M a g ( p ) > T } | N {\displaystyle F_{edgeness}={\frac {|\{p|Mag(p)>T\}|}{N}}} for some threshold T. To include orientation with edgeness histograms for both gradient magnitude and gradient direction can be used. Hmag(R) denotes the normalized histogram of gradient magnitudes of region R, and Hdir(R) denotes the normalized histogram of gradient orientations of region R. Both are normalized according to the size NR Then F m a g , d i r = ( H m a g ( R ) , H d i r ( R ) ) {\displaystyle F_{mag,dir}=(H_{mag}(R),H_{dir}(R))} is a quantitative texture description of region R. === Co-occurrence matrices === The co-occurrence matrix captures numerical features of a texture using spatial relations of similar gray tones. Numerical features computed from the co-occurrence matrix can be used to represent, compare, and classify textures. The following are a subset of standard features derivable from a normalized co-occurrence matrix: A n g u l a r 2 n d M o m e n t = ∑ i ∑ j p [ i , j ] 2 C o n t r a s t = ∑ i = 1 N g ∑ j = 1 N g n 2 p [ i , j ] , where | i − j | = n C o r r e l a t i o n = ∑ i = 1 N g ∑ j = 1 N g ( i j ) p [ i , j ] − μ x μ y σ x σ y E n t r o p y = − ∑ i ∑ j p [ i , j ] l n ( p [ i , j ] ) {\displaystyle {\begin{aligned}Angular{\text{ }}2nd{\text{ }}Moment&=\sum _{i}\sum _{j}p[i,j]^{2}\\Contrast&=\sum _{i=1}^{Ng}\sum _{j=1}^{Ng}n^{2}p[i,j]{\text{, where }}|i-j|=n\\Correlation&={\frac {\sum _{i=1}^{Ng}\sum _{j=1}^{Ng}(ij)p[i,j]-\mu _{x}\mu _{y}}{\sigma _{x}\sigma _{y}}}\\Entropy&=-\sum _{i}\sum _{j}p[i,j]ln(p[i,j])\\\end{aligned}}} where p [ i , j ] {\displaystyle p[i,j]} is the [ i , j ] {\displaystyle [i,j]} th entry in a gray-tone spatial dependence matrix, and Ng is the number of distinct gray-levels in the quantized image. One negative aspect of the co-occurrence matrix is that the extracted features do not necessarily correspond to visual perception. It is used in dentistry for the objective evaluation of lesions [DOI: 10.1155/2020/8831161], treatment efficacy [DOI: 10.3390/ma13163614; DOI: 10.11607/jomi.5686; DOI: 10.3390/ma13173854; DOI: 10.3390/ma13132935] and bone reconstruction during healing [DOI: 10.5114/aoms.2013.33557; DOI: 10.1259/dmfr/22185098; EID: 2-s2.0-81455161223; DOI: 10.3390/ma13163649]. === Laws texture energy measures === Another approach is to use local masks to detect various types of texture features. Laws originally used four vectors representing texture features to create sixteen 2D masks from the outer products of the pairs of vectors. The four vectors and relevant features were as follows: L5 = [ +1 +4 6 +4 +1 ] (Level) E5 = [ -1 -2 0 +2 +1 ] (Edge) S5 = [ -1 0 2 0 -1 ] (Spot) R5 = [ +1 -4 6 -4 +1 ] (Ripple) To these 4, a fifth is sometimes added: W5 = [ -1 +2 0 -2 +1 ] (Wave) From Laws' 4 vectors, 16 5x5 "energy maps" are then filtered down to 9 in order to remove certain symmetric pairs. For instance, L5E5 measures vertical edge content and E5L5 measures horizontal edge content. The average of these two measures is the "edginess" of the content. The resulting 9 maps used by Laws are as follows: L5E5/E5L5 L5R5/R5L5 E5S5/S5E5 S5S5 R5R5 L5S5/S5L5 E5E5 E5R5/R5E5 S5R5/R5S5 Running each of these nine maps over an image to create a new image of the value of the origin ([2,2]) results in 9 "energy maps," or conceptually an image with each pixel associated with a vector of 9 texture attributes. === Autocorrelation and power spectrum === The autocorrelation function of an image can be used to detect repetitive patterns of textures. == Texture segmentation == The use of image texture can be used as a description for regions into segments. There are two main types of segmentation based on image texture, region based and boundary based. Though image texture is not a perfect measure for segmentation it is used along with other measures, such as color, that helps solve segmenting in image. === Region based === Attempts to group or cluster pixels based on texture properties. === Boundary based === Attempts to group or cluster pixels based on edges between pixels that come from different texture properties.

    Read more →
  • Double descent

    Double descent

    Double descent in statistics and machine learning is the phenomenon where a model's error rate on the test set initially decreases with the number of parameters, then peaks, then decreases again. This phenomenon has been considered surprising, as it contradicts assumptions about overfitting in classical machine learning. The increase usually occurs near the interpolation threshold, where the number of parameters is the same as the number of training data points (the model is just large enough to fit the training data). Or, more precisely, it is the maximum number of samples on which the model/training procedure achieves approximately on average 0 training error. == History == Early observations of what would later be called double descent in specific models date back to 1989. The term "double descent" was coined by Belkin et. al. in 2019, when the phenomenon gained popularity as a broader concept exhibited by many models. The latter development was prompted by a perceived contradiction between the conventional wisdom that too many parameters in the model result in a significant overfitting error (an extrapolation of the bias–variance tradeoff), and the empirical observations in the 2010s that some modern machine learning techniques tend to perform better with larger models. == Theoretical models == Double descent occurs in linear regression with isotropic Gaussian covariates and isotropic Gaussian noise. A model of double descent at the thermodynamic limit has been analyzed using the replica trick, and the result has been confirmed numerically. A number of works have suggested that double descent can be explained using the concept of effective dimension: While a network may have a large number of parameters, in practice only a subset of those parameters are relevant for generalization performance, as measured by the local Hessian curvature. This explanation is formalized through PAC-Bayes compression-based generalization bounds, which show that less complex models are expected to generalize better under a Solomonoff prior.

    Read more →
  • Feature engineering

    Feature engineering

    Feature engineering is a preprocessing step in supervised machine learning and statistical modeling which transforms raw data into a more effective set of inputs. Each input comprises several attributes, known as features. By providing models with relevant information, feature engineering significantly enhances their predictive accuracy and decision-making capability. Beyond machine learning, the principles of feature engineering are applied in various scientific fields, including physics. For example, physicists construct dimensionless numbers such as the Reynolds number in fluid dynamics, the Nusselt number in heat transfer, and the Archimedes number in sedimentation. They also develop first approximations of solutions, such as analytical solutions for the strength of materials in mechanics. == Clustering == One of the applications of feature engineering has been clustering of feature-objects or sample-objects in a dataset. Especially, feature engineering based on matrix decomposition has been extensively used for data clustering under non-negativity constraints on the feature coefficients. These include Non-Negative Matrix Factorization (NMF), Non-Negative Matrix-Tri Factorization (NMTF), Non-Negative Tensor Decomposition/Factorization (NTF/NTD), etc. The non-negativity constraints on coefficients of the feature vectors mined by the above-stated algorithms yields a part-based representation, and different factor matrices exhibit natural clustering properties. Several extensions of the above-stated feature engineering methods have been reported in literature, including orthogonality-constrained factorization for hard clustering, and manifold learning to overcome inherent issues with these algorithms. Other classes of feature engineering algorithms include leveraging a common hidden structure across multiple inter-related datasets to obtain a consensus (common) clustering scheme. An example is Multi-view Classification based on Consensus Matrix Decomposition (MCMD), which mines a common clustering scheme across multiple datasets. MCMD is designed to output two types of class labels (scale-variant and scale-invariant clustering), and: is computationally robust to missing information, can obtain shape- and scale-based outliers, and can handle high-dimensional data effectively. Coupled matrix and tensor decompositions are popular in multi-view feature engineering. == Predictive modelling == Feature engineering in machine learning and statistical modeling involves selecting, creating, transforming, and extracting data features. Key components include feature creation from existing data, transforming and imputing missing or invalid features, reducing data dimensionality through methods like Principal Components Analysis (PCA), Independent Component Analysis (ICA), and Linear Discriminant Analysis (LDA), and selecting the most relevant features for model training based on importance scores and correlation matrices. Features vary in significance. Even relatively insignificant features may contribute to a model. Feature selection can reduce the number of features to prevent a model from becoming too specific to the training data set (overfitting). Feature explosion occurs when the number of identified features is too large for effective model estimation or optimization. Common causes include: Feature templates - implementing feature templates instead of coding new features Feature combinations - combinations that cannot be represented by a linear system Feature explosion can be limited via techniques such as regularization, kernel methods, and feature selection. == Automation == Automation of feature engineering is a research topic that dates back to the 1990s. Machine learning software that incorporates automated feature engineering has been commercially available since 2016. Related academic literature can be roughly separated into two types: Multi-relational Decision Tree Learning (MRDTL) uses a supervised algorithm that is similar to a decision tree. Deep Feature Synthesis uses simpler methods. === Multi-relational Decision Tree Learning (MRDTL) === Multi-relational Decision Tree Learning (MRDTL) extends traditional decision tree methods to relational databases, handling complex data relationships across tables. It innovatively uses selection graphs as decision nodes, refined systematically until a specific termination criterion is reached. Most MRDTL studies base implementations on relational databases, which results in many redundant operations. These redundancies can be reduced by using techniques such as tuple id propagation. === Open-source implementations === There are a number of open-source libraries and tools that automate feature engineering on relational data and time series: featuretools is a Python library for transforming time series and relational data into feature matrices for machine learning. MCMD: An open-source feature engineering algorithm for joint clustering of multiple datasets. OneBM or One-Button Machine combines feature transformations and feature selection on relational data with feature selection techniques. OneBM helps data scientists reduce data exploration time allowing them to try and error many ideas in short time. On the other hand, it enables non-experts, who are not familiar with data science, to quickly extract value from their data with a little effort, time, and cost. getML community is an open source tool for automated feature engineering on time series and relational data. It is implemented in C/C++ with a Python interface. It has been shown to be at least 60 times faster than tsflex, tsfresh, tsfel, featuretools or kats. tsfresh is a Python library for feature extraction on time series data. It evaluates the quality of the features using hypothesis testing. tsflex is an open source Python library for extracting features from time series data. Despite being 100% written in Python, it has been shown to be faster and more memory efficient than tsfresh, seglearn or tsfel. seglearn is an extension for multivariate, sequential time series data to the scikit-learn Python library. tsfel is a Python package for feature extraction on time series data. kats is a Python toolkit for analyzing time series data. === Deep feature synthesis === The deep feature synthesis (DFS) algorithm beat 615 of 906 human teams in a competition. == Feature stores == The feature store is where the features are stored and organized for the explicit purpose of being used to either train models (by data scientists) or make predictions (by applications that have a trained model). It is a central location where you can either create or update groups of features created from multiple different data sources, or create and update new datasets from those feature groups for training models or for use in applications that do not want to compute the features but just retrieve them when it needs them to make predictions. A feature store includes the ability to store code used to generate features, apply the code to raw data, and serve those features to models upon request. Useful capabilities include feature versioning and policies governing the circumstances under which features can be used. Feature stores can be standalone software tools or built into machine learning platforms. == Alternatives == Feature engineering can be a time-consuming and error-prone process, as it requires domain expertise and often involves trial and error. Deep learning algorithms may be used to process a large raw dataset without having to resort to feature engineering. However, deep learning algorithms still require careful preprocessing and cleaning of the input data. In addition, choosing the right architecture, hyperparameters, and optimization algorithm for a deep neural network can be a challenging and iterative process.

    Read more →
  • Random feature

    Random feature

    Random features (RF) are a technique used in machine learning to approximate kernel methods, introduced by Ali Rahimi and Ben Recht in their 2007 paper "Random Features for Large-Scale Kernel Machines", and extended by. RF uses a Monte Carlo approximation to kernel functions by randomly sampled feature maps. It is used for datasets that are too large for traditional kernel methods like support vector machine, kernel ridge regression, and gaussian process. == Mathematics == === Kernel method === Given a feature map ϕ : R d → V {\textstyle \phi :\mathbb {R} ^{d}\to V} , where V {\textstyle V} is a Hilbert space (more specifically, a reproducing kernel Hilbert space), the kernel trick replaces inner products in feature space ⟨ ϕ ( x i ) , ϕ ( x j ) ⟩ V {\displaystyle \langle \phi (x_{i}),\phi (x_{j})\rangle _{V}} by a kernel function k ( x i , x j ) : R d × R d → R {\displaystyle k(x_{i},x_{j}):\mathbb {R} ^{d}\times \mathbb {R} ^{d}\to \mathbb {R} } Kernel methods replaces linear operations in high-dimensional space by operations on the kernel matrix: K X := [ k ( x i , x j ) ] i , j ∈ 1 : N {\displaystyle K_{X}:=[k(x_{i},x_{j})]_{i,j\in 1:N}} where N {\textstyle N} is the number of data points. === Random kernel method === The problem with kernel methods is that the kernel matrix K X {\textstyle K_{X}} has size N × N {\textstyle N\times N} . This becomes computationally infeasible when N {\textstyle N} reaches the order of a million. The random kernel method replaces the kernel function k {\textstyle k} by an inner product in low-dimensional feature space R D {\textstyle \mathbb {R} ^{D}} : k ( x , y ) ≈ ⟨ z ( x ) , z ( y ) ⟩ {\displaystyle k(x,y)\approx \langle z(x),z(y)\rangle } where z {\textstyle z} is a randomly sampled feature map z : R d → R D {\textstyle z:\mathbb {R} ^{d}\to \mathbb {R} ^{D}} . This converts kernel linear regression into linear regression in feature space, kernel SVM into SVM in feature space, etc. Since we have K X ≈ Z X T Z X {\displaystyle K_{X}\approx Z_{X}^{T}Z_{X}} where Z X = [ z ( x 1 ) , … , z ( x N ) ] {\displaystyle Z_{X}=[z(x_{1}),\dots ,z(x_{N})]} , these methods no longer involve matrices of size O ( N 2 ) {\textstyle O(N^{2})} , but only random feature matrices of size O ( D N ) {\textstyle O(DN)} . == Random Fourier feature == === Radial basis function kernel === The radial basis function (RBF) kernel on two samples x i , x j ∈ R d {\displaystyle x_{i},x_{j}\in \mathbb {R} ^{d}} is defined as k ( x i , x j ) = exp ⁡ ( − ‖ x i − x j ‖ 2 2 σ 2 ) {\displaystyle k(x_{i},x_{j})=\exp \left(-{\frac {\|x_{i}-x_{j}\|^{2}}{2\sigma ^{2}}}\right)} where ‖ x i − x j ‖ 2 {\displaystyle \|x_{i}-x_{j}\|^{2}} is the squared Euclidean distance and σ {\displaystyle \sigma } is a free parameter defining the shape of the kernel. It can be approximated by a random Fourier feature map z : R d → R 2 D {\displaystyle z:\mathbb {R} ^{d}\to \mathbb {R} ^{2D}} : z ( x ) := 1 D [ cos ⁡ ⟨ ω 1 , x ⟩ , sin ⁡ ⟨ ω 1 , x ⟩ , … , cos ⁡ ⟨ ω D , x ⟩ , sin ⁡ ⟨ ω D , x ⟩ ] T {\displaystyle z(x):={\frac {1}{\sqrt {D}}}[\cos \langle \omega _{1},x\rangle ,\sin \langle \omega _{1},x\rangle ,\ldots ,\cos \langle \omega _{D},x\rangle ,\sin \langle \omega _{D},x\rangle ]^{T}} where ω 1 , . . . , ω D {\displaystyle \omega _{1},...,\omega _{D}} are IID samples from the multidimensional normal distribution N ( 0 , σ − 2 I ) {\displaystyle N(0,\sigma ^{-2}I)} . Since cos , sin {\displaystyle \cos ,\sin } are bounded, there is a stronger convergence guarantee by Hoeffding's inequality. === Random Fourier features === By Bochner's theorem, the above construction can be generalized to arbitrary positive definite shift-invariant kernel k ( x , y ) = k ( x − y ) {\displaystyle k(x,y)=k(x-y)} . Define its Fourier transform p ( ω ) = 1 2 π ∫ R d e − j ⟨ ω , Δ ⟩ k ( Δ ) d Δ {\displaystyle p(\omega )={\frac {1}{2\pi }}\int _{\mathbb {R} ^{d}}e^{-j\langle \omega ,\Delta \rangle }k(\Delta )d\Delta } then ω 1 , . . . , ω D {\displaystyle \omega _{1},...,\omega _{D}} are sampled IID from the probability distribution with probability density p {\displaystyle p} . This applies for other kernels like the Laplace kernel and the Cauchy kernel. === Neural network interpretation === Given a random Fourier feature map z {\displaystyle z} , training the feature on a dataset by featurized linear regression is equivalent to fitting complex parameters θ 1 , … , θ D ∈ C {\displaystyle \theta _{1},\dots ,\theta _{D}\in \mathbb {C} } such that f θ ( x ) = R e ( ∑ k θ k e i ⟨ ω k , x ⟩ ) {\displaystyle f_{\theta }(x)=\mathrm {Re} \left(\sum _{k}\theta _{k}e^{i\langle \omega _{k},x\rangle }\right)} which is a neural network with a single hidden layer, with activation function t ↦ e i t {\displaystyle t\mapsto e^{it}} , zero bias, and the parameters in the first layer frozen. In the overparameterized case, when 2 D ≥ N {\displaystyle 2D\geq N} , the network linearly interpolates the dataset { ( x i , y i ) } i ∈ 1 : N {\displaystyle \{(x_{i},y_{i})\}_{i\in 1:N}} , and the network parameters is the least-norm solution: θ ^ = arg ⁡ min θ ∈ C D , f θ ( x k ) = y k ∀ k ∈ 1 : N ‖ θ ‖ {\displaystyle {\hat {\theta }}=\arg \min _{\theta \in \mathbb {C} ^{D},f_{\theta }(x_{k})=y_{k}\forall k\in 1:N}\|\theta \|} At the limit of D → ∞ {\displaystyle D\to \infty } , the L2 norm ‖ θ ^ ‖ → ‖ f K ‖ H {\displaystyle \|{\hat {\theta }}\|\to \|f_{K}\|_{H}} where f K {\displaystyle f_{K}} is the interpolating function obtained by the kernel regression with the original kernel, and ‖ ⋅ ‖ H {\displaystyle \|\cdot \|_{H}} is the norm in the reproducing kernel Hilbert space for the kernel. == Other examples == === Random binning features === A random binning features map partitions the input space using randomly shifted grids at randomly chosen resolutions and assigns to an input point a binary bit string that corresponds to the bins in which it falls. The grids are constructed so that the probability that two points x i , x j ∈ R d {\displaystyle x_{i},x_{j}\in \mathbb {R} ^{d}} are assigned to the same bin is proportional to K ( x i , x j ) {\displaystyle K(x_{i},x_{j})} . The inner product between a pair of transformed points is proportional to the number of times the two points are binned together, and is therefore an unbiased estimate of K ( x i , x j ) {\displaystyle K(x_{i},x_{j})} . Since this mapping is not smooth and uses the proximity between input points, Random Binning Features works well for approximating kernels that depend only on the L 1 {\displaystyle L_{1}} distance between datapoints. === Orthogonal random features === Orthogonal random features uses a random orthogonal matrix instead of a random Fourier matrix. == Historical context == In NIPS 2006, deep learning had just become competitive with linear models like PCA and linear SVMs for large datasets, and people speculated about whether it could compete with kernel SVMs. However, there was no way to train kernel SVM on large datasets. The two authors developed the random feature method to train those. It was then found that the O ( 1 / D ) {\displaystyle O(1/D)} variance bound did not match practice: the variance bound predicts that approximation to within 0.01 {\displaystyle 0.01} requires D ∼ 10 4 {\displaystyle D\sim 10^{4}} , but in practice required only ∼ 10 2 {\displaystyle \sim 10^{2}} . Attempting to discover what caused this led to the subsequent two papers.

    Read more →
  • Transportation Economic Development Impact System

    Transportation Economic Development Impact System

    Transportation Economic Development Impact System (TREDIS) is an economic analysis system sold by consulting firm Economic Development Research Group that is used in planning major transportation investments in the US and Canada. The role of economic impact analysis and TREDIS in the transportation planning process is explained in guidebooks of the US Department of Transportation and the American Association of State Highway and Transportation Officials. TREDIS has been most commonly used for assessing the expected economic impacts of statewide highway programs, regional multi-modal plans and public transport investment. Its history and theoretical foundation are explained in peer reviewed journal articles. == How It Works == TREDIS has a series of modules that calculate different forms of impacts and benefits. One module is an accounting framework that calculates user benefits, including impacts on cargo transportation and commuting costs, based on transportation forecasting results. A second module calculates wider economic development benefits, including impacts on business productivity, economic development and multiplier effects from the input-output analysis. It applies an economic model to estimate impacts on jobs, income, gross regional product and business output, by sector of the economy. A third module applies cost-benefit analysis from alternative perspectives.

    Read more →
  • World model (artificial intelligence)

    World model (artificial intelligence)

    A world model in artificial intelligence is a machine learning system that builds an internal representation of an environment. The model predicts how that environment changes over time in response to actions. Researchers design world models to help agents plan, reason, and act without constant real-world trial and error. World models differ from systems that merely classify or generate outputs. They simulate dynamics such as physics, object interactions, and causality. Early ideas date to the 1990s. Modern versions power robots, autonomous driving, and interactive video generation. == History == Jürgen Schmidhuber introduced the term world model in machine learning in 1990. He proposed recurrent neural networks that predict future states from observations and use those predictions to train agents. David Ha and Schmidhuber revived the concept in a 2018 paper. Their agents learned to drive virtual cars and play video games inside self-generated simulations. Yann LeCun advanced the idea in a 2022 position paper titled "A Path Towards Autonomous Machine Intelligence". He argued that intelligence requires predictive models of the world rather than pure pattern matching. LeCun proposed the joint embedding predictive architecture (JEPA) as a practical foundation. LeCun and collaborators developed several JEPA variants. V-JEPA 2 reached state-of-the-art performance on video understanding and physical reasoning at the time. It supports zero-shot robot control in unfamiliar environments. Introduced in March 2026, LeWorldModel trains stably end-to-end from raw pixels and uses two loss terms and avoids hand-crafted heuristics. LeCun founded Advanced Machine Intelligence Labs in 2026 to further develop world models. Google DeepMind introduced Genie in 2024. The model learned interactive environments from unlabeled internet videos. Genie 2 followed in late 2024 and added three-dimensional generation. The Genie series set benchmarks for general-purpose simulation. Genie 3 was introduced in August 2025. It produces photorealistic, real-time interactive worlds from text prompts which are displayed at 24 frames per second and explored in real time with text or image prompts. The model supports persistent three-dimensional worlds and real-time interaction. Waymo adopted Genie 3 in February 2026 and used it to create a specialized world model for autonomous driving simulation, called the Waymo World Model. It produces synchronized camera and lidar outputs and creates edge cases that real robotaxis rarely encounter. The edge cases were reported to be unusual by PCMag. General Intuition announced a $133.7 million seed round. World Labs raised $1 billion. AMI raised $1.03 billion. In April 2026, Alibaba announced Happy Oyster, its world model designed for real-time and “flowy” world model. It includes a directing mode for world building based on text and image prompts and a wandering mode for exploring the resulting world. It can generate 3-minute in-world video clips. Also in April, World Labs, co-founded by Li Fei Fei, unveiled Spark 2.0, an open-source 3D Gaussian splatting rendering engine that targets smartphone-class devices. In June 2026, Nvidia released Cosmos 3, a family of open-weight models. It combines previously independent physical reasoning, world simulation, and action generation. Cosmos 3 integrates can process and generate text, image, video, audio, and action sequences. The model employs a Mixture-of-Transformers" (MoT) approach. An autoregressive (AR) transformer handles reasoning and next-token prediction, while a diffusion transformer (DT) does multimodal generation. Encoders (ViT for vision, VAE for visual/audio, and domain-specific for actions) and generate a shared representation space using 3D multi-dimensional rotary position embedding (mRoPE) for spatial and temporal information. The family includes Cosmos3-Nano (16B parameters) for workstations; Cosmos3-Super (64B parameters) for research. == Architecture == World models process raw sensory data such as video frames or lidar scans. They compress this input into compact latent representations. The system then predicts future representations rather than pixel-by-pixel reconstructions. Many modern world models use joint embedding predictive architecture (JEPA). An encoder turns observations into embeddings. A predictor estimates one or a suite of embeddings from the current one and an action. In some cases a critic chooses one embedding as the best result. A regularizer keeps embeddings well-behaved. The model trains by minimizing prediction error in embedding space. This approach avoids the high cost of generating every detail. Some architectures add explicit components. A fast reactive path handles immediate responses. A slower deliberative path performs longer-horizon planning. Video prediction accuracy or robot success rates are key metrics, but do not always predict real-world performance. Generative world models such as Genie 3 combine these with a simulator. They accept text prompts or layouts and output consistent video, lidar, or three-dimensional scenes. World models often train with self-supervised learning. They use large unlabeled datasets of video or robot interactions. Self-supervised learning can speed learning. Reinforcement learning can fine-tune a model for specific tasks. == Applications == World models support robot learning. Agents train inside simulations and transfer skills to the physical world. This reduces the need for dangerous or expensive real-world trials. Autonomous vehicles use world models to test rare events. Waymo's system simulates tornadoes or unusual pedestrian behavior. Companies train planners without putting vehicles on public roads. Interactive entertainment benefits from world models. Genie 3 lets users generate playable environments from simple descriptions. Game studios prototype levels faster. Scientific simulation gains from these models. Researchers model physical systems or biological processes at scale. Planners in logistics or urban design test strategies inside accurate digital twins. == Comparison with large language models == Both world models and large language models (LLMs) use inferencing on their inputs to make predictions. LLMs operate on textual inputs. They predict the next token in text sequences. They excel at language-oriented tasks such as translation or summarization. However, they lack understanding of physics. World models operate on sensor inputs such as pixels. They predict state changes in that data in latent space. This design supports planning and causal reasoning. LLMs generate fluent text but often fail at consistent physical predictions. Their architecture employs transformers with refinements such as mixture of experts. World models divide an inferencing task into work performed by encoders, predictors, simulators, and other pieces. They typically handle multimodal inputs such as video, lidar, radar, and audio, guided by textual prompting. LLMs power chatbots and code assistants. World models drive embodied agents that act in dynamic environments, such as autonomous driving. The two may be combined in hybrid systems. For example, a LLM handles instructions, while a world model manages low-level control. World model proponents such as LeCun claim that because LLMs are trained only on text, they have no ability to predict anything beyond text, such as real-world events. == Benchmarks == World model benchmarks test physical understanding, long-term consistency, planning, and generalization from sensor data. Meta introduced three benchmarks for V-JEPA 2. IntPhys 2 measures a model's ability to detect physics violations. It presents pairs of videos that diverge when one breaks physical rules. Humans score near 100% accuracy. V-JEPA 2 achieves little better than random chance on many conditions. Minimal Video Pairs (MVPBench) tests physical understanding through multiple-choice questions based on short video clips. It probes object interactions and causality. Something-Something tests action recognition. Epic-Kitchens-100 tests human action anticipation. DeepMind benchmark: Interactive evaluation measures consistency over minutes of interaction, memory of off-screen objects, and response to user actions or text prompts. Waymo benchmark: Output generation quality: Metrics include realism, controllability (via text prompts), and usefulness for training planners in simulated worlds. However, pixel reconstruction error rate with episodic rewards often fails. Other: Epic-Kitchens-100 (often measured with Recall@5) Ego4D 50 Salads, Breakfast, etc. Potential benchmarks: Zero-shot transfer to robots Long-horizon planning Implausible prediction rate

    Read more →
  • H (company)

    H (company)

    H Company, also known simply as H, is a French artificial intelligence startup which develops "action-oriented" artificial intelligence agents for enterprise automation and productivity. In May 2024, H Company closed a record-setting $220 million seed round, at the time the largest AI raise in Europe. In 2026, H Company released Holo 3, the latest generation of its computer-use AI models. The update marked a major advance in agentic AI, enabling agents to navigate any user interface, interpret screens, and complete complex, multi-step tasks across enterprise systems—much like a human user. This breakthrough positioned H Company at the frontier of computer-use autonomy, accelerating the integration of AI in enterprise workflows. == History == H Company was founded in 2023 in Paris by Laurent Sifre, Charles Kantor, and three DeepMind veterans: Daan Wiestra, Karl Tuyls, Julien Perollat. In May 2024, the firm secured what was then the largest European AI seed round, totaling $220 million led by US investors including Eric Schmidt (former Google CEO), Amazon, and backed by Accel, Bpifrance, UiPath, Eurazeo, Xavier Niel, Yuri Milner, Bernard Arnault, Samsung and others. In August 2024, three cofounders (Wiestra, Tuyls, Perollat) left the company over operational disagreements. In November 2024, H launched Runner H, its first agentic-API platform, which combined a large language model (LLM) and a reduced, 2-billion parameter vision-language model (VLM). In May 2025, H Company acquired Mithril Security, and in June 2025 the company widened its offering for agentic models. In June 2025, Gautier Cloix (formerly CEO Palantir France) replaced Charles Kantor as CEO of H Company, aiming to pivot the company towards a "forward deployed engineers" model. In July 2025, H Company introduced Surfer-H-CLI, an open-source, web-native Chrome agent designed for browser-based automation—able to search, scroll, click, and type on behalf of users and controllable via any visual language model (VLM). When paired with its June 2025 open-sourced 3B-parameter Holo-1 model, Surfer-H-CLI achieved 92.2% WebVoyager benchmark accuracy. == Activity == H Company creates enterprise AI models and agents (agentic AI) to automate and optimize complex workflows. H Company specifically designs AI agents called computer use capable of autonomously interfacing with any software (local or cloud-based) to detect and automate repetitive operations. H Company is based in Paris, France, with international offices in London and New York. H Company raised $220 million since its inception. Gautier Cloix is president and CEO of the company. H Company client include the French national lottery FDJ United. In March 2026, H Company released Holo3, a family of artificial intelligence models designed to operate digital systems by interacting directly with user interfaces. Holo3 enables agents ("virtual humanoids") to understand what is displayed in front-end environments—such as web pages, desktop applications, and other graphical user interfaces—and perform actions such as clicking, typing, and navigating across them to complete multi-step tasks. On the OSWorld-Verified benchmark, Holo3 reportedly achieved about 78.9%, surpassing the scores of OpenAI’s GPT‑5.4 and Anthropic’s Claude Opus 4.6 on this specific test, at roughly one-tenth of the inference cost of these proprietary systems. The release has been presented as a significant step toward automating routine digital workflows, allowing organizations to offload repetitive on-screen work, such as data entry and reconciliation across multiple tools, to AI-based agents.

    Read more →