Aidan Gomez

Aidan Gomez

Aidan Gomez is a British-Canadian computer scientist working in the field of artificial intelligence, with a focus on natural language processing. He is the co-founder and CEO of the technology company Cohere. == Early life and education == Gomez grew up in Brighton, Ontario. He graduated from the University of Toronto with a bachelor's degree in computer science and mathematics. He was pursuing a PhD in computer science from the University of Oxford. He paused his studies to launch Cohere. He was granted the PhD in 2024. == Career == In 2017, as a 20 year-old intern at Google Brain, Gomez was one of eight authors of the research paper "Attention Is All You Need", which is credited with changing the AI industry and helping lead to the creation of ChatGPT. The paper proposed a novel deep learning architecture called the transformer, that enables machine learning models to analyze large amounts of data for patterns, and then use those patterns to make predictions while leveraging GPU parallelization. It has been commonly adopted for training large language models and in the development of generative AI. In the same year, Gomez founded FOR.ai, a program to help researchers learn machine learning techniques in a collaborative format. An outgrowth of this project was Cohere For AI (now Cohere Labs), which released Aya, an open-source multilingual LLM. As a PhD student, Gomez worked as a machine learning researcher at Google Brain. At that time, he co-authored the paper "One Model to Learn Them All" about multi-task learning by a single neural network. In 2019, Gomez left Google Brain to launch Cohere, an enterprise-focused company that helps businesses implement AI into chatbots, search engines, and other products. As of Sept 2025, Cohere has raised about US$1.6 billion at valuation north of $7 billion, as Gomez leads the company as its CEO. Gomez was named to the 2023 Time 100/AI list of the most influential people in the field of artificial intelligence. He and his fellow Cohere founders Ivan Zhang and Nick Frosst were named number 1 on 2023 Maclean's AI Trailblazers Power List. In April 2025, Gomez was elected to the board of Rivian. == Views on AI == Gomez has stated that warnings regarding the existential risk from artificial intelligence are overblown, and that real risks involve the automated spread of misinformation on social media. He said that the United States would win the AI arms race over China.

XLNet

The XLNet was an autoregressive Transformer designed as an improvement over BERT, with 340M parameters and trained on 33 billion words. It was released on 19 June 2019, under the Apache 2.0 license. It achieved state-of-the-art results on a variety of natural language processing tasks, including language modeling, question answering, and natural language inference. == Architecture == The main idea of XLNet is to model language autoregressively like the GPT models, but allow for all possible permutations of a sentence. Concretely, consider the following sentence:My dog is cute.In standard autoregressive language modeling, the model would be tasked with predicting the probability of each word, conditioned on the previous words as its context: We factorize the joint probability of a sequence of words x 1 , … , x T {\displaystyle x_{1},\ldots ,x_{T}} using the chain rule: Pr ( x 1 , … , x T ) = Pr ( x 1 ) Pr ( x 2 | x 1 ) Pr ( x 3 | x 1 , x 2 ) … Pr ( x T | x 1 , … , x T − 1 ) . {\displaystyle \Pr(x_{1},\ldots ,x_{T})=\Pr(x_{1})\Pr(x_{2}|x_{1})\Pr(x_{3}|x_{1},x_{2})\ldots \Pr(x_{T}|x_{1},\ldots ,x_{T-1}).} For example, the sentence "My dog is cute" is factorized as: Pr ( My , dog , is , cute ) = Pr ( My ) Pr ( dog | My ) Pr ( is | My , dog ) Pr ( cute | My , dog , is ) . {\displaystyle \Pr({\text{My}},{\text{dog}},{\text{is}},{\text{cute}})=\Pr({\text{My}})\Pr({\text{dog}}|{\text{My}})\Pr({\text{is}}|{\text{My}},{\text{dog}})\Pr({\text{cute}}|{\text{My}},{\text{dog}},{\text{is}}).} Schematically, we can write it as → My → My dog → My dog is → My dog is cute . {\displaystyle {\texttt {}}{\texttt {}}{\texttt {}}{\texttt {}}\to {\text{My }}{\texttt {}}{\texttt {}}{\texttt {}}\to {\text{My dog }}{\texttt {}}{\texttt {}}\to {\text{My dog is }}{\texttt {}}\to {\text{My dog is cute}}.} However, for XLNet, the model is required to predict the words in a randomly generated order. Suppose we have sampled a randomly generated order 3241, then schematically, the model is required to perform the following prediction task: is dog is dog is cute → My dog is cute {\displaystyle {\texttt {}}{\texttt {}}{\texttt {}}{\texttt {}}\to {\texttt {}}{\texttt {}}{\text{is }}{\texttt {}}\to {\texttt {}}{\text{dog is }}{\texttt {}}\to {\texttt {}}{\text{dog is cute}}\to {\text{My dog is cute}}} By considering all permutations, XLNet is able to capture longer-range dependencies and better model the bidirectional context of words. === Two-Stream Self-Attention === To implement permutation language modeling, XLNet uses a two-stream self-attention mechanism. The two streams are: Content stream: This stream encodes the content of each word, as in standard causally masked self-attention. Query stream: This stream encodes the content of each word in the context of what has gone before. In more detail, it is a masked cross-attention mechanism, where the queries are from the query stream, and the key-value pairs are from the content stream. The content stream uses the causal mask M causal = [ 0 − ∞ − ∞ … − ∞ 0 0 − ∞ … − ∞ 0 0 0 … − ∞ ⋮ ⋮ ⋮ ⋱ ⋮ 0 0 0 … 0 ] {\displaystyle M_{\text{causal}}={\begin{bmatrix}0&-\infty &-\infty &\dots &-\infty \\0&0&-\infty &\dots &-\infty \\0&0&0&\dots &-\infty \\\vdots &\vdots &\vdots &\ddots &\vdots \\0&0&0&\dots &0\end{bmatrix}}} permuted by a random permutation matrix to P M causal P − 1 {\displaystyle PM_{\text{causal}}P^{-1}} . The query stream uses the cross-attention mask P ( M causal − ∞ I ) P − 1 {\displaystyle P(M_{\text{causal}}-\infty I)P^{-1}} , where the diagonal is subtracted away specifically to avoid the model "cheating" by looking at the content stream for what the current masked token is. Like the causal masking for GPT models, this two-stream masked architecture allows the model to train on all tokens in one forward pass. == Training == Two models were released: XLNet-Large, cased: 110M parameters, 24-layer, 1024-hidden, 16-heads XLNet-Base, cased: 340M parameters, 12-layer, 768-hidden, 12-heads. It was trained on a dataset that amounted to 32.89 billion tokens after tokenization with SentencePiece. The dataset was composed of BooksCorpus, and English Wikipedia, Giga5, ClueWeb 2012-B, and Common Crawl. It was trained on 512 TPU v3 chips, for 5.5 days. At the end of training, it still under-fitted the data, meaning it could have achieved lower loss with more training. It took 0.5 million steps with an Adam optimizer, linear learning rate decay, and a batch size of 8192.

Apertus (LLM)

Apertus is a public large language model, developed by the Swiss AI Initiative (a collaboration between EPFL, ETH Zurich, and the Swiss National Supercomputing Centre). It was released on September 2, 2025, under the free and open-source Apache 2.0 license. Designed initially for business and research use cases around the world, Apertus was trained on over 1800 languages, and comes in 8 billion or 70 billion parameter versions and is available on Hugging Face for download. The model was developed aiming to adhere to European copyright law, and is one of the first examples of AI as a public good in the vein of AI Sovereignty. It is also the first large model to comply with the European Union's Artificial Intelligence Act. At its launch, the model creators emphasized multilinguality, transparency, and auditability as priorities in contrast to commercial frontier model. While international reception was largely positive, the first iteration was significantly behind the capabilities of frontier models and needs adaptation for many use cases with chatbots being a secondary but not a primary use case. As of late 2025, it was considered the largest and most capable fully open model. The capability of future models will depend in part on how much more funding can be secured.

Connected-component labeling

Connected-component labeling (CCL), connected-component analysis (CCA), blob extraction, region labeling, blob discovery, or region extraction is an algorithmic application of graph theory, where subsets of connected components are uniquely labeled based on a given heuristic. Connected-component labeling is not to be confused with segmentation. Connected-component labeling is used in computer vision to detect connected regions in binary digital images, although color images and data with higher dimensionality can also be processed. When integrated into an image recognition system or human-computer interaction interface, connected component labeling can operate on a variety of information. Blob extraction is generally performed on the resulting binary image from a thresholding step, but it can be applicable to gray-scale and color images as well. Blobs may be counted, filtered, and tracked. Blob extraction is related to but distinct from blob detection. == Overview == A graph, containing vertices and connecting edges, is constructed from relevant input data. The vertices contain information required by the comparison heuristic, while the edges indicate connected 'neighbors'. An algorithm traverses the graph, labeling the vertices based on the connectivity and relative values of their neighbors. Connectivity is determined by the medium; image graphs, for example, can be 4-connected neighborhood or 8-connected neighborhood. Following the labeling stage, the graph may be partitioned into subsets, after which the original information can be recovered and processed . == Definition == The usage of the term connected-component labeling (CCL) and its definition is quite consistent in the academic literature, whereas connected-component analysis (CCA) varies both in terminology and in its definition of the problem. Rosenfeld et al. define connected components labeling as the “[c]reation of a labeled image in which the positions associated with the same connected component of the binary input image have a unique label.” Shapiro et al. define CCL as an operator whose “input is a binary image and [...] output is a symbolic image in which the label assigned to each pixel is an integer uniquely identifying the connected component to which that pixel belongs.” There is no consensus on the definition of CCA in the academic literature. It is often used interchangeably with CCL. A more extensive definition is given by Shapiro et al.: “Connected component analysis consists of connected component labeling of the black pixels followed by property measurement of the component regions and decision making.” The definition for connected-component analysis presented here is more general, taking the thoughts expressed in into account. == Algorithms == The algorithms discussed can be generalised to arbitrary dimensions, albeit with increased time and space complexity. === One component at a time === This is a fast and very simple method to implement and understand. It is based on graph traversal methods in graph theory. In short, once the first pixel of a connected component is found, all the connected pixels of that connected component are labelled before going onto the next pixel in the image. This algorithm is part of Vincent and Soille's watershed segmentation algorithm, other implementations also exist. In order to do that a linked list is formed that will keep the indexes of the pixels that are connected to each other, steps (2) and (3) below. The method of defining the linked list specifies the use of a depth or a breadth first search. For this particular application, there is no difference which strategy to use. The simplest kind of a last in first out queue implemented as a singly linked list will result in a depth first search strategy. It is assumed that the input image is a binary image, with pixels being either background or foreground and that the connected components in the foreground pixels are desired. The algorithm steps can be written as: Start from the first pixel in the image. Set current label to 1. Go to (2). If this pixel is a foreground pixel and it is not already labelled, give it the current label and add it as the first element in a queue, then go to (3). If it is a background pixel or it was already labelled, then repeat (2) for the next pixel in the image. Pop out an element from the queue, and look at its neighbours (based on any type of connectivity). If a neighbour is a foreground pixel and is not already labelled, give it the current label and add it to the queue. Repeat (3) until there are no more elements in the queue. Go to (2) for the next pixel in the image and increment current label by 1. Note that the pixels are labelled before being put into the queue. The queue will only keep a pixel to check its neighbours and add them to the queue if necessary. This algorithm only needs to check the neighbours of each foreground pixel once and doesn't check the neighbours of background pixels. The pseudocode is: algorithm OneComponentAtATime(data) input : imageData[xDim][yDim] initialization : label = 0, labelArray[xDim][yDim] = 0, statusArray[xDim][yDim] = false, queue1, queue2; for i = 0 to xDim do for j = 0 to yDim do if imageData[i][j] has not been processed do if imageData[i][j] is a foreground pixel do check its four neighbors(north, south, east, west) : if neighbor is not processed do if neighbor is a foreground pixel do add it to queue1 else update its status to processed end if labelArray[i][j] = label (give label) statusArray[i][j] = true (update status) while queue1 is not empty do For each pixel in the queue do : check its four neighbors if neighbor is not processed do if neighbor is a foreground pixel do add it to queue2 else update its status to processed end if give it the current label update its status to processed remove the current element from queue1 copy queue2 into queue1 end While increase the label end if else update its status to processed end if end if end if end for end for === Two-pass === Relatively simple to implement and understand, the two-pass algorithm, (also known as the Hoshen–Kopelman algorithm) iterates through 2-dimensional binary data. The algorithm makes two passes over the image: the first pass to assign temporary labels and record equivalences, and the second pass to replace each temporary label by the smallest label of its equivalence class. The input data can be modified in situ (which carries the risk of data corruption), or labeling information can be maintained in an additional data structure. Connectivity checks are carried out by checking neighbor pixels' labels (neighbor elements whose labels are not assigned yet are ignored), or say, the north-east, the north, the north-west and the west of the current pixel (assuming 8-connectivity). 4-connectivity uses only north and west neighbors of the current pixel. The following conditions are checked to determine the value of the label to be assigned to the current pixel (4-connectivity is assumed) Conditions to check: Does the pixel to the left (west) have the same value as the current pixel? Yes – We are in the same region. Assign the same label to the current pixel No – Check next condition Do both pixels to the north and west of the current pixel have the same value as the current pixel but not the same label? Yes – We know that the north and west pixels belong to the same region and must be merged. Assign the current pixel the minimum of the north and west labels, and record their equivalence relationship No – Check next condition Does the pixel to the left (west) have a different value and the one to the north the same value as the current pixel? Yes – Assign the label of the north pixel to the current pixel No – Check next condition Do the pixel's north and west neighbors have different pixel values than current pixel? Yes – Create a new label id and assign it to the current pixel The algorithm continues this way, and creates new region labels whenever necessary. The key to a fast algorithm, however, is how this merging is done. This algorithm uses the union-find data structure which provides excellent performance for keeping track of equivalence relationships. Union-find essentially stores labels which correspond to the same blob in a disjoint-set data structure, making it easy to remember the equivalence of two labels by the use of an interface method E.g.: findSet(l). findSet(l) returns the minimum label value that is equivalent to the function argument 'l'. Once the initial labeling and equivalence recording is completed, the second pass merely replaces each pixel label with its equivalent disjoint-set representative element. A faster-scanning algorithm for connected-region extraction is presented below. On the first pass: Iterate through each element of the data by column, then by row (Raster Scanning) If the element is not the background Get the neighboring elements of the current element If there are no neighbors, uniquely

Contract management software

Contract management software constitutes software and associated data management used to support contract management, contract lifecycle management, and contractor management on projects in the procurement of goods and services. It may be used together with project management software. == History == Historically, contract management was seen as a "paper-intensive" process. Early steps from the early 2000's reported by the Aberdeen Group required extensive data conversion work to enable documents to be handled electronically. With the adoption of the European Union's General Data Protection Regulation (GDPR) in 2016, companies needed to take additional steps in regards to contract management. Each data responsible entity was obliged to sign data processing agreements (DPAs) with the various vendors, who treat personal data on behalf of the data responsible. DPAs need to be regularly controlled, adjusted and renewed, which adds an extra agreement to such vendors or at least an extra DPA addendum to each agreement. By 2018, Ardent Partner's research had found that software used for automating contract management activities was being more extensively used among major companies or businesses with "Best-in-Class" procurement teams. Contract management process automation was found to be closely linked with more effective internal business collaboration, standardization and risk management. == Advantages and key functions == Using contract management software can have multiple benefits compared to manually managing paper contracts. This software can help keep track of multiple activities and can have features for automating administration, ensuring compliance, monitoring risk, running reports and triggering alerts. In addition to these types of features, contract management software systems provide a centralized repository for employees to quickly access all contracts worldwide in one place. Contract management software is produced by many companies, working on a range of scales and offering varying degrees of customizability. Basic functions should include the ability to store contract documents, track changes to contract documents, search documents for a particular criterion, send key date alerts and to report required aspects of the contract. Other functions include managing a new contract request, capturing related data, following a document through a review and approval process, and collecting digital signatures. Contract management software may also be an aid to project portfolio management and spend analysis, and may also monitor KPIs. Leading contract management software provides contract visibility, monitoring, and compliance to automate and streamline the contract lifecycle process. Contract management software which uses artificial intelligence (AI) can identify contract types based on pattern recognition. AI contracting software trains its algorithms on a set of contract data to recognize patterns and extract variables such as clauses, dates, and parties. It also offers simple prediction capabilities, by sorting through a large volume of contracts and flagging individual contracts based on specified criteria. AI software can also read contracts in multiple formats and languages, extract contract data, and provide analytics. It can reduce the risk of human error in contract drafting and review. A centralized repository provides a critical advantage allowing for all contract documents to be stored within one location. Having contracts stored in multiple locations can delay and interrupt the contracting process. == Contract risk management software (CRMS) for capital projects == Very large enterprises, such as capital expenditure (capex) projects, involve multiple parties and high risk and uncertainty. They are unlike traditional operating contracts in that they are subject to shared deadlines in unique situations. As the complexity of these unique projects increases, the relationships between parties become more important. This requires contract management software, or contract risk management software (CRMS), to become more dynamic and responsive. The terms of these capex contracts necessarily involve assumptions at the start of the process and are likely to change over the lifetime of the project lifecycle. For this reason, CRMS must be capable of recording one single instance of agreed changes to contract terms and incorporating these changes in an auditable and legally robust way. With multiple decision makers involved, CRMS should also make accountability more transparent and enable faster decisions about variation proposals.

Belief–desire–intention model

For popular psychology, the belief–desire–intention (BDI) model of human practical reasoning was developed by Michael Bratman as a way of explaining future-directed intention. BDI is fundamentally reliant on folk psychology (the 'theory theory'), which is the notion that our mental models of the world are theories. It was used as a basis for developing the belief–desire–intention software model. == Applications == BDI was part of the inspiration behind the BDI software architecture, which Bratman was also involved in developing. Here, the notion of intention was seen as a way of limiting time spent on deliberating about what to do, by eliminating choices inconsistent with current intentions. BDI has also aroused some interest in psychology. BDI formed the basis for a computational model of childlike reasoning CRIBB.

Cognition Network Technology

Cognition Network Technology (CNT), also known as Definiens Cognition Network Technology, is an object-based image analysis method developed by Nobel laureate Gerd Binnig together with a team of researchers at Definiens AG in Munich, Germany. It serves for extracting information from images using a hierarchy of image objects (groups of pixels), as opposed to traditional pixel processing methods. To emulate the human mind's cognitive powers, Definiens used patented image segmentation and classification processes, and developed a method to render knowledge in a semantic network. CNT examines pixels not in isolation, but in context. It builds up a picture iteratively, recognizing groups of pixels as objects. It uses the color, shape, texture and size of objects as well as their context and relationships to draw conclusions and inferences, similar to human analysis. == History == In 1994 Professor Gerd Binnig founded Definiens. CNT was first available with the launch of the eCognition software in May 2000. In June 2010, Trimble Navigation Ltd (NASDAQ: TRMB) acquired Definiens business asset in earth sciences markets, including eCognition software, and also licensed Definiens' patented CNT. In 2014, Definiens was acquired by MedImmune, the global biologics research and development arm of AstraZeneca, for an initial consideration of $150 million. == Software == Definiens Tissue Studio Definiens Tissue Studio is a digital pathology image analysis software application based on CNT. The intended use of Definiens Tissue Studio is for biomarker translational research in formalin-fixed, paraffin-embedded tissue samples which have been treated with immunohistochemical staining assays, or hematoxylin and eosin (H&E). The central concept behind Definiens Tissue Studio is a user interface that facilitates machine learning from example digital histopathology images to derive an image analysis solution suitable for the measurement of biomarkers and/or histological features within pre-defined regions of interest on a cell-by-cell basis, and within sub-cellular compartments. The derived image analysis solution is then automatically applied to subsequent digital images to objectively measure defined sets of multiparametric image features. These data sets are used for further understanding the underlying biological processes that drive cancer and other diseases. Image processing and data analysis are performed either on a local desktop computer workstation, or on a server grid. eCognition The eCognition suite offers three components that can be used stand-alone or in combination to solve image analysis tasks. eCognition Developer is a development environment for object-based image analysis. It is used in earth sciences to develop rule sets (or applications) for the analysis of remote sensing data. eCognition Architect enables non-technical users to configure, calibrate and execute image analysis workflows created in eCognition Developer. eCognition Server software provides a processing environment for batch execution of image analysis jobs. eCognition software is utilized in numerous remote sensing and geospatial application scenarios and environments, using a variety of data types: Generic: Rapid Mapping, Change Detection, Object Recognition By environment: Diverse Landcover Mapping, Urban Analysis (i.e. impervious surface area analysis for taxation, property assessment for insurance, inventory of green infrastructure), Forestry (i.e. biomass measurement, species identification, firescar measurement), Agriculture (i.e. regional planning, precision farming, crisis response), Marine and Riparian (i.e. ecosystem evaluation, disaster management, harbor monitoring). Other: Defense, security, atmosphere and climate The online eCognition community was launched in July 2009 and had 2813 members as of July 9, 2010. Membership is distributed globally and user conferences are held regularly, the last having taken place in November 2009 in Munich, Germany. The bi-annual GEOBIA (Geographic Object-Based Image Analysis) conference is heavily attended by eCognition users, with the majority of presentations based on eCognition software.