Plinian Core is a set of vocabulary terms that can be used to describe different aspects of biological species information. Under "biological species Information" all kinds of properties or traits related to taxa—biological and non-biological—are included. Thus, for instance, terms pertaining descriptions, legal aspects, conservation, management, demographics, nomenclature, or related resources are incorporated. == Description == The Plinian Core is aimed to facilitate the exchange of information about the species and upper taxa. What is in scope? Species level catalogs of any kind of biological objects or data. Terminology associated with biological collection data. Striving for compatibility with other biodiversity-related standards. Facilitating the addition of components and attributes of biological data. What is not in scope? Data interchange protocols. Non-biodiversity-related data. Occurrence level data. This standard is named after Pliny the Elder, a very influential figure in the study of the biological species. Plinian Core design requirements includes: ease of use, to be self-contained, able to support data integration from multiple databases, and ability to handle different levels of granularity. Core terms can be grouped in its current version as follows: Metadata Base Elements Record Metadata Nomenclature and Classification Taxonomic description Natural history Invasive species Habitat and Distribution Demography and Threats Uses, Management and Conservation associatedParty, MeasurementOrFact, References, AncillaryData == Background == Plinian Core started as a collaborative project between Instituto Nacional de Biodiversidad and GBIF Spain in 2005. A series of iterations in which elements were defined and implanted in different projects resulted in a "Plinian Core Flat" [deprecated]. As a result, a new development was impulse to overcome them in 2012. New formal requirements, additional input and a will to better support the standard and its documentation, as well as to align it with the processes of TDWG, the world reference body for biodiversity information standards. A new version, Plinian Core v3.x.x was defined. This provides more flexibility to fully represent the information of a species in a variety of scenarios. New elements to deal with aspects such as IPR, related resources, referenced, etc. were introduced, and elements already included were better-defined and documented. Partner for the development of Plinian Core in this new phase incorporated the University of Granada (UG, Spain), the Alexander von Humboldt Institute (IAvH, Colombia), the National Commission for the Knowledge and Use of Biodiversity (Conabio, Mexico) and the University of São Paulo (USP, Brazil). A "Plinian Core Task Group" within TDWG "Interest Group on species Information" was constituted and currently working on its development. == Levels of the standard == Plinian Core is presented in to levels: the abstract model and the application profiles. The abstract model (AM), comprising the abstract model schema(xsd) and the terms' URIs, is the normative part. It is all comprehensive, and allows for different levels of granularity in describing species properties. The AM should be taken as a "menu" from which to choose terms and level of detail needed in any specific project. The subsets of the abstract model intended to be implemented in specific projects are the "application profiles" (APs). Besides containing part of the elements of the AM, APs can impose additional specifications on the included elements, such as controlled vocabularies. Some examples of APs in use follow: Application profile CONABIO Application profile INBIO Application profile GBIF.ES Application profile Banco de Datos de la Naturaleza.Spain Application profile SIB-COLOMBIA == Relation to other standards == Plinian incorporates a number of elements already defined by other standards. The following table summarizes these standards and the elements used in Plinian Core:
E-on Vue
Vue is a software tool for world generation by Bentley Systems, with support for many visual effects, animations, and various other features. The tool has been used in several feature-length films. In 2024, Bentley Systems announced that Vue would be discontinued, and be freely available to those that still wish to use it. == Versions == == Features == This is a list of features as of the 2023 release of Vue: === Terrains === Heightfield terrains Procedural terrains Infinite terrains Planetary terrains Real-world terrains 3D terrain sculpting Terrain export === EcoSystem Instancing Technology === Material-based EcoSystems Global EcoSystems Dynamic EcoSystems 360° EcoSystem Population Paint EcoSystem instances EcoParticles Export EcoSystem populations === Vegetation === Built-in Plant editor Compatible with PlantFactory Vegetation assets === Atmosphere, Skies and Clouds === Standard atmospheric model Spectral atmospheric model Photometric atmospheric model Atmosphere presets Procedural Volumetric 3D cloud layers Standalone 3D Metaclouds Convert meshes to Clouds Cloud morphing Import OpenVDB Export standalone and cloud layer zones to OpenVDB Export skies as HDRI === Modeling === Primitive and Feature modeling 3D Text edition tool Metablobbing Hyperblobs Export baked hyperblobs Splines Built in Road Construction toolkit Random rock generator Export rocks === Texturing and UVs === Material presets PBR Substance support Node-based procedural materials Volumetric materials and Hypertextures Stacked UVs Unwrapped UVs Ptex === Interoperability, Integration And Export === Export single assets to generic 3D formats Full scene export Integration plugins Import and Export Camera data as FBX and Nuke.chan Python API ZBrush GoZ bridge === Animation === Animate objects, materials, atmospheres, clouds, waves... Automatic wind and breeze Localized wind effects per plant / per EcoSystem population Omni and directional ventilators for local modifications of plants Time spline editor Automatic keyframe creation Automatic synchronization of cameras and lights Animation export as AfterEffects Import motion tracking information === Lighting === Global illumination, Global Radiosity, Ambient occlusion Subsurface Scattering HDRI image based lighting Point light, Quadratic point light, Spotlight, Quadratic spotlight, Directional light Use IES distribution profiles on photometric lights Area lights, light panels, light portals Physically accurate caustics computation === Rendering === Render with Ray Tracer Render with Path Tracer Stereoscopic rendering 360/180 VR Panorama Render Option Spherical panoramic rendering Tone mapping options Multipass & G-Buffer Network rendering with HyperVue / RenderCows Network rendering with RenderNodes == Users == Blue Sky Studios Digital Domain DreamWorks Animation: Kung Fu Panda Industrial Light & Magic: Indiana Jones and the Kingdom of the Crystal Skull, Pirates of the Caribbean: Dead Man's Chest Sony Pictures Imageworks Warner Bros. Interactive Entertainment Weta Digital
Hallucination (artificial intelligence)
In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting, confabulation, or delusion) is a response generated by AI that contains false or misleading information presented as fact. This term draws a loose analogy with human psychology, where a hallucination typically involves false percepts. For example, a chatbot powered by large language models (LLMs), like ChatGPT, may embed plausible-sounding random falsehoods within its generated content. Detecting and mitigating errors and hallucinations pose significant challenges for practical deployment and reliability of LLMs in high-stakes scenarios, such as chip design, supply chain logistics, and medical diagnostics. Some software engineers and statisticians have criticized the specific term "AI hallucination" for unreasonably anthropomorphizing computers. Symbolic artificial intelligence models generally do not produce hallucinations, unlike large language models. == Term == === Origin === Since the 1980s, the term "hallucination" has been used in computer vision with a positive connotation to describe the process of adding detail to an image. For example, the task of generating high-resolution face images from low-resolution inputs is called face hallucination. The first documented use of the term "hallucination" in this sense is in the PhD thesis of Eric Mjolsness in 1986. A notable work is the face hallucination algorithm by Simon Baker and Takeo Kanade published in 1999. In the 2000s, hallucinations were described in statistical machine translation as a failure mode. Since the 2010s, the term has undergone a semantic shift to signify the generation of factually incorrect or misleading outputs by AI systems in tasks like machine translation and object detection. In 2015, hallucinations were identified in visual semantic role labeling tasks by Saurabh Gupta and Jitendra Malik. In 2015, computer scientist Andrej Karpathy used the term "hallucinated" in a blog post to describe his recurrent neural network (RNN) language model generating an incorrect citation link. In 2017, Google researchers used the term to describe the responses generated by neural machine translation (NMT) models when they are not related to the source text, and in 2018, the term was used in computer vision to describe instances where non-existent objects are erroneously detected because of adversarial attacks. In July 2021, Meta warned during its release of BlenderBot 2 that the system is prone to "hallucinations", which Meta defined as "confident statements that are not true". Following OpenAI's ChatGPT release in beta version in November 2022, some users complained that such chatbots often seem to pointlessly embed plausible-sounding random falsehoods within their generated content. Many news outlets, including The New York Times, started to use the term "hallucinations" to describe these models' frequently incorrect or inconsistent responses. In 2023, the Cambridge dictionary updated its definition of hallucination to include this new sense specific to the field of AI. Some researchers have highlighted a lack of consistency in how the term is used, but also identified several alternative terms in the literature, such as confabulations, fabrications, and factual errors. === Definitions and alternatives === Uses, definitions and characterizations of the term "hallucination" in the context of LLMs include: "a tendency to invent facts in moments of uncertainty" (OpenAI, May 2023) "a model's logical mistakes" (OpenAI, May 2023) "fabricating information entirely, but behaving as if spouting facts" (CNBC, May 2023) "making up information" (The Verge, February 2023) "probability distributions" (in scientific contexts) Journalist Benj Edwards, in Ars Technica, writes that the term "hallucination" is controversial, but that some form of metaphor remains necessary; Edwards suggests "confabulation" as an analogy for processes that involve "creative gap-filling". In July 2024, a White House report on fostering public trust in AI research mentioned hallucinations only in the context of reducing them. Notably, when acknowledging David Baker's Nobel Prize-winning work with AI-generated proteins, the Nobel committee avoided the term entirely, instead referring to "imaginative protein creation". Hicks, Humphries, and Slater, in their article in Ethics and Information Technology, argue that the output of LLMs is "bullshit" under Harry Frankfurt's definition of the term, and that the models are "in an important way indifferent to the truth of their outputs", with true statements only accidentally true, and false ones accidentally false. Some researchers also use the derogatory term "botshit", often referring to uncritical use of AI. === Criticism === In the scientific community, some researchers avoid the term "hallucination", seeing it as potentially misleading. It has been criticized by Usama Fayyad, executive director of the Institute for Experimental Artificial Intelligence at Northeastern University, on the grounds that it misleadingly personifies large language models and is vague. Mary Shaw said, "The current fashion for calling generative AI's errors 'hallucinations' is appalling. It anthropomorphizes the software, and it spins actual errors as somehow being idiosyncratic quirks of the system even when they're objectively incorrect." In Salon, statistician Gary Smith argues that LLMs "do not understand what words mean" and consequently that the term "hallucination" unreasonably anthropomorphizes the machine. Murray Shanahan argues that anthropomorphic framing of LLM capabilities, including terms like "hallucination", encourages users and researchers to attribute cognitive processes to systems that operate through statistical pattern completion, and advocates for more careful linguistic practices when discussing LLM behavior. Kristina Šekrst argues that applying psychological vocabulary to LLM outputs obscures the difference between the appearance of mental properties and their genuine presence. Förster & Skop assert that tech companies use the hallucination metaphor to anthropomorphize models and deflect responsibility for non-factual outputs. Some see the AI outputs not as illusory but as prospective—that is, having some chance of being true, similar to early-stage scientific conjectures. The term has also been criticized for its association with psychedelic drug experiences. == In natural language generation == In natural language generation, there are several reasons why natural language models hallucinate: === Hallucination from data === Hallucinations can stem from incomplete, inaccurate or unrepresentative data sets. === Modeling-related causes === The pre-training of generative pretrained transformers (GPT) involves predicting the next word. It incentivizes GPT models to "give a guess" about what the next word is, even when they lack information. Some researchers take an anthropomorphic perspective and posit that hallucinations arise from a tension between novelty and usefulness. For instance, Amabile and Pratt define human creativity as the production of novel and useful ideas. By extension, a focus on novelty in machine creativity can lead to the production of original but inaccurate responses—that is, falsehoods—whereas a focus on usefulness may result in memorized content lacking originality. By 2022, newspapers such as The New York Times expressed concern that, as the adoption of bots based on large language models continued to grow, unwarranted user confidence in bot output could lead to problems. === Interpretability research === In 2025, interpretability research by Anthropic on the LLM Claude identified internal circuits that cause it to decline to answer questions unless it knows the answer. By default, the circuit is active and the LLM doesn't answer. When the LLM has sufficient information, these circuits are inhibited and the LLM answers the question. Hallucinations were found to occur when this inhibition happens incorrectly, such as when Claude recognizes a name but lacks sufficient information about that person, causing it to generate plausible but untrue responses. === Examples === On 15 November 2022, researchers from Meta AI published Galactica, designed to "store, combine and reason about scientific knowledge". Content generated by Galactica came with the warning: "Outputs may be unreliable! Language Models are prone to hallucinate text." In one case, when asked to draft a paper on creating avatars, Galactica cited a fictitious paper from a real author who works in the relevant area. Meta withdrew Galactica on 17 November due to offensiveness and inaccuracy. OpenAI's ChatGPT, released in beta version to the public on November 30, 2022, was based on the foundation model GPT-3.5 (a revision of GPT-3). Professor Ethan Mollick of Wharton called it an "omniscient, eager-to-please intern who sometimes lies to you". Data scientist Teresa Kuba
Matrix regularization
In the field of statistical learning theory, matrix regularization generalizes notions of vector regularization to cases where the object to be learned is a matrix. The purpose of regularization is to enforce conditions, for example sparsity or smoothness, that can produce stable predictive functions. For example, in the more common vector framework, Tikhonov regularization optimizes over min x ‖ A x − y ‖ 2 + λ ‖ x ‖ 2 {\displaystyle \min _{x}\left\|Ax-y\right\|^{2}+\lambda \left\|x\right\|^{2}} to find a vector x {\displaystyle x} that is a stable solution to the regression problem. When the system is described by a matrix rather than a vector, this problem can be written as min X ‖ A X − Y ‖ 2 + λ ‖ X ‖ 2 , {\displaystyle \min _{X}\left\|AX-Y\right\|^{2}+\lambda \left\|X\right\|^{2},} where the vector norm enforcing a regularization penalty on x {\displaystyle x} has been extended to a matrix norm on X {\displaystyle X} . Matrix regularization has applications in matrix completion, multivariate regression, and multi-task learning. Ideas of feature and group selection can also be extended to matrices, and these can be generalized to the nonparametric case of multiple kernel learning. == Basic definition == Consider a matrix W {\displaystyle W} to be learned from a set of examples, S = ( X i t , y i t ) {\displaystyle S=(X_{i}^{t},y_{i}^{t})} , where i {\displaystyle i} goes from 1 {\displaystyle 1} to n {\displaystyle n} , and t {\displaystyle t} goes from 1 {\displaystyle 1} to T {\displaystyle T} . Let each input matrix X i {\displaystyle X_{i}} be ∈ R D T {\displaystyle \in \mathbb {R} ^{DT}} , and let W {\displaystyle W} be of size D × T {\displaystyle D\times T} . A general model for the output y {\displaystyle y} can be posed as y i t = ⟨ W , X i t ⟩ F , {\displaystyle y_{i}^{t}=\left\langle W,X_{i}^{t}\right\rangle _{F},} where the inner product is the Frobenius inner product. For different applications the matrices X i {\displaystyle X_{i}} will have different forms, but for each of these the optimization problem to infer W {\displaystyle W} can be written as min W ∈ H E ( W ) + R ( W ) , {\displaystyle \min _{W\in {\mathcal {H}}}E(W)+R(W),} where E {\displaystyle E} defines the empirical error for a given W {\displaystyle W} , and R ( W ) {\displaystyle R(W)} is a matrix regularization penalty. The function R ( W ) {\displaystyle R(W)} is typically chosen to be convex and is often selected to enforce sparsity (using ℓ 1 {\displaystyle \ell ^{1}} -norms) and/or smoothness (using ℓ 2 {\displaystyle \ell ^{2}} -norms). Finally, W {\displaystyle W} is in the space of matrices H {\displaystyle {\mathcal {H}}} with Frobenius inner product ⟨ … ⟩ F {\displaystyle \langle \dots \rangle _{F}} . == General applications == === Matrix completion === In the problem of matrix completion, the matrix X i t {\displaystyle X_{i}^{t}} takes the form X i t = e t ⊗ e i ′ , {\displaystyle X_{i}^{t}=e_{t}\otimes e_{i}',} where ( e t ) t {\displaystyle (e_{t})_{t}} and ( e i ′ ) i {\displaystyle (e_{i}')_{i}} are the canonical basis in R T {\displaystyle \mathbb {R} ^{T}} and R D {\displaystyle \mathbb {R} ^{D}} . In this case the role of the Frobenius inner product is to select individual elements w i t {\displaystyle w_{i}^{t}} from the matrix W {\displaystyle W} . Thus, the output y {\displaystyle y} is a sampling of entries from the matrix W {\displaystyle W} . The problem of reconstructing W {\displaystyle W} from a small set of sampled entries is possible only under certain restrictions on the matrix, and these restrictions can be enforced by a regularization function. For example, it might be assumed that W {\displaystyle W} is low-rank, in which case the regularization penalty can take the form of a nuclear norm. R ( W ) = λ ‖ W ‖ ∗ = λ ∑ i | σ i | , {\displaystyle R(W)=\lambda \left\|W\right\|_{}=\lambda \sum _{i}\left|\sigma _{i}\right|,} where σ i {\displaystyle \sigma _{i}} , with i {\displaystyle i} from 1 {\displaystyle 1} to min D , T {\displaystyle \min D,T} , are the singular values of W {\displaystyle W} . === Multivariate regression === Models used in multivariate regression are parameterized by a matrix of coefficients. In the Frobenius inner product above, each matrix X {\displaystyle X} is X i t = e t ⊗ x i {\displaystyle X_{i}^{t}=e_{t}\otimes x_{i}} such that the output of the inner product is the dot product of one row of the input with one column of the coefficient matrix. The familiar form of such models is Y = X W + b {\displaystyle Y=XW+b} Many of the vector norms used in single variable regression can be extended to the multivariate case. One example is the squared Frobenius norm, which can be viewed as an ℓ 2 {\displaystyle \ell ^{2}} -norm acting either entrywise, or on the singular values of the matrix: R ( W ) = λ ‖ W ‖ F 2 = λ ∑ i ∑ j | w i j | 2 = λ Tr ( W ∗ W ) = λ ∑ i σ i 2 . {\displaystyle R(W)=\lambda \left\|W\right\|_{F}^{2}=\lambda \sum _{i}\sum _{j}\left|w_{ij}\right|^{2}=\lambda \operatorname {Tr} \left(W^{}W\right)=\lambda \sum _{i}\sigma _{i}^{2}.} In the multivariate case the effect of regularizing with the Frobenius norm is the same as the vector case; very complex models will have larger norms, and, thus, will be penalized more. === Multi-task learning === The setup for multi-task learning is almost the same as the setup for multivariate regression. The primary difference is that the input variables are also indexed by task (columns of Y {\displaystyle Y} ). The representation with the Frobenius inner product is then X i t = e t ⊗ x i t . {\displaystyle X_{i}^{t}=e_{t}\otimes x_{i}^{t}.} The role of matrix regularization in this setting can be the same as in multivariate regression, but matrix norms can also be used to couple learning problems across tasks. In particular, note that for the optimization problem min W ‖ X W − Y ‖ 2 2 + λ ‖ W ‖ 2 2 {\displaystyle \min _{W}\left\|XW-Y\right\|_{2}^{2}+\lambda \left\|W\right\|_{2}^{2}} the solutions corresponding to each column of Y {\displaystyle Y} are decoupled. That is, the same solution can be found by solving the joint problem, or by solving an isolated regression problem for each column. The problems can be coupled by adding an additional regularization penalty on the covariance of solutions min W , Ω ‖ X W − Y ‖ 2 2 + λ 1 ‖ W ‖ 2 2 + λ 2 Tr ( W T Ω − 1 W ) {\displaystyle \min _{W,\Omega }\left\|XW-Y\right\|_{2}^{2}+\lambda _{1}\left\|W\right\|_{2}^{2}+\lambda _{2}\operatorname {Tr} \left(W^{T}\Omega ^{-1}W\right)} where Ω {\displaystyle \Omega } models the relationship between tasks. This scheme can be used to both enforce similarity of solutions across tasks, and to learn the specific structure of task similarity by alternating between optimizations of W {\displaystyle W} and Ω {\displaystyle \Omega } . When the relationship between tasks is known to lie on a graph, the Laplacian matrix of the graph can be used to couple the learning problems. == Spectral regularization == Regularization by spectral filtering has been used to find stable solutions to problems such as those discussed above by addressing ill-posed matrix inversions (see for example Filter function for Tikhonov regularization). In many cases the regularization function acts on the input (or kernel) to ensure a bounded inverse by eliminating small singular values, but it can also be useful to have spectral norms that act on the matrix that is to be learned. There are a number of matrix norms that act on the singular values of the matrix. Frequently used examples include the Schatten p-norms, with p = 1 or 2. For example, matrix regularization with a Schatten 1-norm, also called the nuclear norm, can be used to enforce sparsity in the spectrum of a matrix. This has been used in the context of matrix completion when the matrix in question is believed to have a restricted rank. In this case the optimization problem becomes: min ‖ W ‖ ∗ subject to W i , j = Y i j . {\displaystyle \min \left\|W\right\|_{}~~{\text{ subject to }}~~W_{i,j}=Y_{ij}.} Spectral Regularization is also used to enforce a reduced rank coefficient matrix in multivariate regression. In this setting, a reduced rank coefficient matrix can be found by keeping just the top n {\displaystyle n} singular values, but this can be extended to keep any reduced set of singular values and vectors. == Structured sparsity == Sparse optimization has become the focus of much research interest as a way to find solutions that depend on a small number of variables (see e.g. the Lasso method). In principle, entry-wise sparsity can be enforced by penalizing the entry-wise ℓ 0 {\displaystyle \ell ^{0}} -norm of the matrix, but the ℓ 0 {\displaystyle \ell ^{0}} -norm is not convex. In practice this can be implemented by convex relaxation to the ℓ 1 {\displaystyle \ell ^{1}} -norm. While entry-wise regularization with an ℓ 1 {\displaystyle \ell ^{1}} -norm will find solutions with a small number of nonzero elements, applying an ℓ 1 {
Eager learning
In artificial intelligence, eager learning is a learning method in which the system tries to construct a general, input-independent target function during training of the system, as opposed to lazy learning, where generalization beyond the training data is delayed until a query is made to the system. The main advantage gained in employing an eager learning method, such as an artificial neural network, is that the target function will be approximated globally during training, thus requiring much less space than using a lazy learning system. Eager learning systems also deal much better with noise in the training data. Eager learning is an example of offline learning, in which post-training queries to the system have no effect on the system itself, and thus the same query to the system will always produce the same result. The main disadvantage with eager learning is that it is generally unable to provide good local approximations in the target function.
Multimodal representation learning
Multimodal representation learning is a subfield of representation learning focused on integrating and interpreting information from different modalities, such as text, images, audio, or video, by projecting them into a shared latent space. This allows for semantically similar content across modalities to be mapped to nearby points within that space, facilitating a unified understanding of diverse data types. By automatically learning meaningful features from each modality and capturing their inter-modal relationships, multimodal representation learning enables a unified representation that enhances performance in cross-media analysis tasks such as video classification, event detection, and sentiment analysis. It also supports cross-modal retrieval and translation, including image captioning, video description, and text-to-image synthesis. == Motivation == The primary motivations for multimodal representation learning arise from the inherent nature of real-world data and the limitations of unimodal approaches. Since multimodal data offers complementary and supplementary information about an object or event from different perspectives, it is more informative than relying on a single modality. A key motivation is to narrow the heterogeneity gap that exists between different modalities by projecting their features into a shared semantic subspace. This allows semantically similar content across modalities to be represented by similar vectors, facilitating the understanding of relationships and correlations between them. Multimodal representation learning aims to leverage the unique information provided by each modality to achieve a more comprehensive and accurate understanding of concepts. These unified representations are crucial for improving performance in various cross-media analysis tasks such as video classification, event detection, and sentiment analysis. They also enable cross-modal retrieval, allowing users to search and retrieve content across different modalities. Additionally, it facilitates cross-modal translation, where information can be converted from one modality to another, as seen in applications like image captioning and text-to-image synthesis. The abundance of ubiquitous multimodal data in real-world applications, including understudied areas like healthcare, finance, and human-computer interaction (HCI), further motivates the development of effective multimodal representation learning techniques. == Approaches and methods == === Canonical-correlation analysis based methods === Canonical-correlation analysis (CCA) was first introduced in 1936 by Harold Hotelling and is a fundamental approach for multimodal learning. CCA aims to find linear relationships between two sets of variables. Given two data matrices X ∈ R n × p {\displaystyle X\in \mathbb {R} ^{n\times p}} and Y ∈ R n × q {\displaystyle Y\in \mathbb {R} ^{n\times q}} representing different modalities, CCA finds projection vectors w x ∈ R p {\displaystyle w_{x}\in \mathbb {R} ^{p}} and w y ∈ R q {\displaystyle w_{y}\in \mathbb {R} ^{q}} that maximizes the correlation between the projected variables: ρ = max w x , w y w x ⊤ Σ x y w y w x ⊤ Σ x x w x w y ⊤ Σ y y w y {\displaystyle \rho =\max _{w_{x},w_{y}}{\frac {w_{x}^{\top }\Sigma _{xy}w_{y}}{{\sqrt {w_{x}^{\top }\Sigma _{xx}w_{x}}}{\sqrt {w_{y}^{\top }\Sigma _{yy}w_{y}}}}}} such that Σ x x {\displaystyle \Sigma _{xx}} and Σ y y {\displaystyle \Sigma _{yy}} are the within-modality covariance matrices, and Σ x y {\displaystyle \Sigma _{xy}} is the between-modality covariance matrix. However, standard CCA is limited by its linearity, which led to the development of nonlinear extensions, such as kernel CCA and deep CCA. ==== Kernel CCA ==== Kernel canonical correlation analysis (KCCA) extends traditional CCA to capture nonlinear relationships between modalities by implicitly mapping the data into high dimensional feature spaces using kernel functions. Given kernel functions K x {\displaystyle K_{x}} and K y {\displaystyle K_{y}} with corresponding Gram matrices K x ∈ R n × n {\displaystyle K_{x}\in \mathbb {R} ^{n\times n}} and K y ∈ R n × n {\displaystyle K_{y}\in \mathbb {R} ^{n\times n}} , KCCA seeks coefficients α {\displaystyle \alpha } and β {\displaystyle \beta } that maximize: ρ = max α , β α ⊤ K x K y β α ⊤ K x 2 α β ⊤ K y 2 β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{\top }K_{x}Ky\beta }{{\sqrt {\alpha ^{\top }K_{x}^{2}\alpha }}{\sqrt {\beta ^{\top }K_{y}^{2}\beta }}}}} To prevent overfitting, regularization terms are typically added, resulting in: ρ = max α , β α T K x K y β α T ( K x 2 + λ x K x ) α β T ( K y 2 + λ y K y ) β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{T}K_{x}K_{y}\beta }{{\sqrt {\alpha ^{T}\left(K_{x}^{2}+\lambda _{x}K_{x}\right)\alpha }}{\sqrt {\;\beta ^{T}\left(K_{y}^{2}+\lambda _{y}K_{y}\right)\beta }}}}} where λ x {\displaystyle \lambda _{x}} and λ y {\displaystyle \lambda _{y}} are regularization parameters. KCCA has proven effective for tasks such as cross-modal retrieval and semantic analysis, though it faces computational challenges with large datasets due to its O ( n 2 ) {\displaystyle O(n^{2})} memory requirement for sorting kernel matrices. KCCA was proposed independently by several researchers. ==== Deep CCA ==== Deep canonical correlation analysis (DCCA), introduced in 2013, employs neural networks to learn nonlinear transformations for maximizing the correlation between modalities. DCCA uses separate neural networks f x {\displaystyle f_{x}} and f y {\displaystyle f_{y}} for each modality to transform the original data before applying CCA: max W x , W y , θ x , θ y corr ( f x ( X ; θ x ) , f y ( Y ; θ y ) ) {\displaystyle \max _{W_{x},W_{y},\theta _{x},\theta _{y}}\operatorname {corr} \left(f_{x}(X;\theta _{x}),f_{y}(Y;\theta _{y})\right)} where θ x {\displaystyle \theta _{x}} and θ y {\displaystyle \theta _{y}} represent the parameters of the neural networks, and W x {\displaystyle W_{x}} and W y {\displaystyle W_{y}} are the CCA projection matrices. The correlation objective is computed as: corr ( H x , H y ) = tr ( T − 1 / 2 H x T H y S − 1 / 2 ) {\displaystyle \operatorname {corr} (H_{x},H_{y})=\operatorname {tr} \left(T^{-1/2}H_{x}^{T}H_{y}S^{-1/2}\right)} where H x = f x ( X ) {\displaystyle H_{x}=f_{x}(X)} and H y = f y ( Y ) {\displaystyle H_{y}=f_{y}(Y)} are the network outputs, T = H x T H x + r x I {\displaystyle T=H_{x}^{T}H_{x}+r_{x}I} , S = H y T H y + r y I {\displaystyle S=H_{y}^{T}H_{y}+r_{y}I} and r x , r y {\displaystyle r_{x},r_{y}} are the regularization parameters. DCCA overcomes the limitations of linear CCA and kernel CCA by learning complex nonlinear relationships while maintaining computational efficiency for large datasets through mini-batch optimization. === Graph-based methods === Graph-based approaches for multimodal representation learning leverage graph structure to model relationships between entities across different modalities. These methods typically represent each modality as a graph and then learn embedding that preserve cross-modal similarities, enabling more effective joint representation of heterogeneous data. One such method is cross-modal graph neural networks (CMGNNs) that extend traditional graph neural networks (GNNs) to handle data from multiple modalities by constructing graphs that capture both intra-modal and inter-modal relationships. These networks model interactions across modalities by representing them as nodes and their relationships as edges. Other graph-based methods include Probabilistic Graphical Models (PGMs) such as deep belief networks (DBN) and deep Boltzmann machines (DBM). These models can learn a joint representation across modalities, for instance, a multimodal DBN achieves this by adding a shared restricted Boltzmann Machine (RBM) hidden layer on top of modality-specific DBNs. Additionally, the structure of data in some domains like Human-Computer Interaction (HCI), such as the view hierarchy of app screens, can potentially be modeled using graph-like structures. The field of graph representation learning is also relevant, with ongoing progress in developing evaluation benchmarks. === Diffusion maps === Another set of methods relevant to multimodal representation learning are based on diffusion maps and their extensions to handle multiple modalities. ==== Multi-view diffusion maps ==== Multi-view diffusion maps address the challenge of achieving multi-view dimensionality reduction by effectively utilizing the availability of multiple views to extract a coherent low-dimensional representation of the data. The core idea is to exploit both the intrinsic relations within each view and the mutual relations between the different views, defining a cross-view model where a random walk process implicitly hops between objects in different views. A multi-view kernel matrix is constructed by combining these relations, defining a cross-view diffusion process and associ
The 2028 Global Intelligence Crisis
The 2028 Global Intelligence Crisis is a report authored by James van Geelen and Alap Shah and published by Citrini Research in February 2026, on the impact of artificial intelligence on humanity's future. Written in the form of a scenario analysis, it was viewed millions of times online and reportedly caused a fall in the stock market prices of major tech and financial firms. It also received criticism among others, for its allegedly flawed economic logic. The 'thought exercise', as the authors called it, painted a gloomy picture for the near future, where outputs keep growing while consumer's ability to spend collapses. "...driven by ai agents that don’t sleep, take sick days or require health insurance”, "outputs that are shown in national accounts increases, "but never circulates through the real economy"(which the report calls 'Ghost GDP'), the authors argued. In other words, the authors predict a scenario where the owners of the AI firms will accumulate a vast fortune but there will be scant demand from consumers as AI would cause massive unemployment. The authors caution the reader that what they make is a scenario and not a prediction. In the scenario they visualise, any service whose value proposition is “I will navigate complexity that you find tedious” is getting disrupted. The reports argues that the unique ability of human beings to analyse, decide, create, persuade, and coordinate was “the thing that could not be replicated at scale,” and call the historical scarcity of this precious entity 'friction'. When this friction becomes zero, a gamut of changes occur which then triggers a cascading of changes across the economy. ”Travel booking platforms are an early casualty; Financial advice. tax prep., and routine legal work follow suit. National unemployment rate go as high 10.2% and the S&P 500 goes for a massive 38% peak-to-trough crash. In contrast to the previous technological revolutions the high-earning professionals suffers more and get forced to take up roles in the gig economy. Labour supply becomes abundant and this cuts wages all across the economy. The dent in income for the employees then affects other sectors of the economy such as the residential mortgage market. The losses for the software companies triggers loan defaults and heralds peril for the private credit sector.