Programming tool

A programming tool or software development tool is a computer program that is used to develop another computer program, usually by helping the developer manage computer files. For example, a programmer may use a tool called a source code editor to edit source code files, and then a compiler to convert the source code into machine code files. They may also use build tools that automatically package executable program and data files into shareable packages or install kits. A set of tools that are run one after another, with each tool feeding its output to the next one, is called a toolchain. An integrated development environment (IDE) integrates the function of several tools into a single program. Usually, an IDE provides a source code editor as well as other built-in or plug-in tools that help with compiling, debugging, and testing. Whether a program is considered a development tool can be subjective. Some programs, such as the GNU compiler collection, are used exclusively for software development while others, such as Notepad, are not meant specifically for development but are nevertheless often used for programming. == Categories == Notable categories of development tools: Assembler – Converts assembly language into machine code Bug tracking system – Software application that records software bugs Build automation – Building software via an unattended fashion Code review software – Activity where one or more people check a program's code Compiler – Software that translates code from one programming language to another Compiler-compiler – Program that generates parsers or compilers, a.k.a. parser generator Debugger – Software for debugging a computer program Decompiler – Program translating executable to source code Disassembler – Computer program to translate machine language into assembly language Documentation generator – Automation technology for creating software documentation Graphical user interface builder – Software development tool Linker – Program that combines intermediate build files into an executable file Loader – Loads executable files into memory and prepares them for execution by the CPU. Memory debugger – Software memory problem finder Minifier – Removal of unnecessary characters in code without changing its functionality Pretty-printer – Formatting to make code or markup easier to readPages displaying short descriptions of redirect targets Performance profiler – Measuring the time or resources used by a section of a computer program Static code analyzer – Analysis of computer programs without executing themPages displaying short descriptions of redirect targets Source code editor – Text editor specializing in software codePages displaying short descriptions of redirect targets Source code generation – Type of computer programmingPages displaying short descriptions of redirect targets Version control system – Stores and tracks versions of files

Tesla Dojo

Tesla Dojo is a series of supercomputers designed and built by Tesla for computer vision video processing and recognition. It was used for training Tesla's machine learning models to improve its Full Self-Driving (FSD) advanced driver-assistance system. It went into production in July 2023. Dojo's goal was to efficiently process millions of terabytes of video data captured from real-life driving situations from Tesla's 4+ million cars. This goal led to a considerably different architecture than conventional supercomputer designs. In August 2025, Bloomberg News reported that the Dojo project had been disbanded, though it was restarted in January 2026. == History == Tesla operates several massively parallel computing clusters for developing its Autopilot advanced driver assistance system. Its primary unnamed cluster using 5,760 Nvidia A100 graphics processing units (GPUs) was touted by Andrej Karpathy in 2021 at the fourth International Joint Conference on Computer Vision and Pattern Recognition (CCVPR 2021) to be "roughly the number five supercomputer in the world" at approximately 81.6 petaflops, based on scaling the performance of the Nvidia Selene supercomputer, which uses similar components. However, the performance of the primary Tesla GPU cluster has been disputed, as it was not clear if this was measured using single-precision or double-precision floating point numbers (FP32 or FP64). Tesla also operates a second 4,032 GPU cluster for training and a third 1,752 GPU cluster for automatic labeling of objects. The primary unnamed Tesla GPU cluster has been used for processing one million video clips, each ten seconds long, taken from Tesla Autopilot cameras operating in Tesla cars in the real world, running at 36 frames per second. Collectively, these video clips contained six billion object labels, with depth and velocity data; the total size of the data set was 1.5 petabytes. This data set was used for training a neural network intended to help Autopilot computers in Tesla cars understand roads. By August 2022, Tesla had upgraded the primary GPU cluster to 7,360 GPUs. Dojo was first mentioned by Elon Musk in April 2019 during Tesla's "Autonomy Investor Day". In August 2020, Musk stated it was "about a year away" due to power and thermal issues. Dojo was officially announced at Tesla's Artificial Intelligence (AI) Day on August 19, 2021. Tesla revealed details of the D1 chip and its plans for "Project Dojo", a datacenter that would house 3,000 D1 chips; the first "Training Tile" had been completed and delivered the week before. In October 2021, Tesla released a "Dojo Technology" whitepaper describing the Configurable Float8 (CFloat8) and Configurable Float16 (CFloat16) floating point formats and arithmetic operations as an extension of Institute of Electrical and Electronics Engineers (IEEE) standard 754. At the follow-up AI Day in September 2022, Tesla announced it had built several System Trays and one Cabinet. During a test, the company stated that Project Dojo drew 2.3 megawatts (MW) of power before tripping a local San Jose, California power substation. At the time, Tesla was assembling one Training Tile per day. In August 2023, Tesla powered on Dojo for production use as well as a new training cluster configured with 10,000 Nvidia H100 GPUs. In January 2024, Musk described Dojo as "a long shot worth taking because the payoff is potentially very high. But it's not something that is a high probability." In June 2024, Musk explained that ongoing construction work at Gigafactory Texas is for a computing cluster claiming that it is planned to comprise an even mix of "Tesla AI" and Nvidia/other hardware with a total thermal design power of at first 130 MW and eventually exceeding 500 MW. In August 2025, Bloomberg News reported that the Dojo project was disbanded, though Musk announced it would be restarted in January 2026 with a new chip iteration. == Technical architecture == The fundamental unit of the Dojo supercomputer is the D1 chip, designed by a team at Tesla led by ex-AMD CPU designer Ganesh Venkataramanan, including Emil Talpes, Debjit Das Sarma, Douglas Williams, Bill Chang, and Rajiv Kurian. The D1 chip is manufactured by the Taiwan Semiconductor Manufacturing Company (TSMC) using 7 nanometer (nm) semiconductor nodes, has 50 billion transistors and a large die size of 645 mm2 (1.0 square inch). Updating at Artificial Intelligence (AI) Day in 2022, Tesla announced that Dojo would scale by deploying multiple ExaPODs, in which there would be: 10 Cabinets per ExaPOD (1,062,000 cores, 3,000 D1 chips) 2 System Trays per Cabinet (106,200 cores, 300 D1 chips) 6 Training Tiles per System Tray (53,100 cores, along with host interface hardware) 25 D1 chips per Training Tile (8,850 cores) 354 computing cores per D1 chip According to Venkataramanan, Tesla's senior director of Autopilot hardware, Dojo will have more than an exaflop (a million teraflops) of computing power. For comparison, according to Nvidia, in August 2021, the (pre-Dojo) Tesla AI-training center used 720 nodes, each with eight Nvidia A100 Tensor Core GPUs for 5,760 GPUs in total, providing up to 1.8 exaflops of performance. === D1 chip === Each node (computing core) of the D1 processing chip is a general purpose 64-bit CPU with a superscalar core. It supports internal instruction-level parallelism, and includes simultaneous multithreading (SMT). It doesn't support virtual memory and uses limited memory protection mechanisms. Dojo software/applications manage chip resources. The D1 instruction set supports both 64-bit scalar and 64-byte single instruction, multiple data (SIMD) vector instructions. The integer unit mixes reduced instruction set computer (RISC-V) and custom instructions, supporting 8, 16, 32, or 64 bit integers. The custom vector math unit is optimized for machine learning kernels and supports multiple data formats, with a mix of precisions and numerical ranges, many of which are compiler composable. Up to 16 vector formats can be used simultaneously. ==== Node ==== Each D1 node uses a 32-byte fetch window holding up to eight instructions. These instructions are fed to an eight-wide decoder which supports two threads per cycle, followed by a four-wide, four-way SMT scalar scheduler that has two integer units, two address units, and one register file per thread. Vector instructions are passed further down the pipeline to a dedicated vector scheduler with two-way SMT, which feeds either a 64-byte SIMD unit or four 8×8×4 matrix multiplication units. The network on-chip (NOC) router links cores into a two-dimensional mesh network. It can send one packet in and one packet out in all four directions to/from each neighbor node, along with one 64-byte read and one 64-byte write to local SRAM per clock cycle. Hardware native operations transfer data, semaphores and barrier constraints across memories and CPUs. System-wide double data rate 4 (DDR4) synchronous dynamic random-access memory (SDRAM) memory works like bulk storage. ==== Memory ==== Each core has a 1.25 megabytes (MB) of SRAM main memory. Load and store speeds reach 400 gigabytes (GB) per second and 270 GB/sec, respectively. The chip has explicit core-to-core data transfer instructions. Each SRAM has a unique list parser that feeds a pair of decoders and a gather engine that feeds the vector register file, which together can directly transfer information across nodes. ==== Die ==== Twelve nodes (cores) are grouped into a local block. Nodes are arranged in an 18×20 array on a single die, of which 354 cores are available for applications. The die runs at 2 gigahertz (GHz) and totals 440 MB of SRAM (360 cores × 1.25 MB/core). It reaches 376 teraflops using 16-bit brain floating point (BF16) numbers or using configurable 8-bit floating point (CFloat8) numbers, which is a Tesla proposal, and 22 teraflops at FP32. Each die comprises 576 bi-directional serializer/deserializer (SerDes) channels along the perimeter to link to other dies, and moves 8 TB/sec across all four die edges. Each D1 chip has a thermal design power of approximately 400 watts. === Training Tile === The water-cooled Training Tile packages 25 D1 chips into a 5×5 array. Each tile supports 36 TB/sec of aggregate bandwidth via 40 input/output (I/O) chips - half the bandwidth of the chip mesh network. Each tile supports 10 TB/sec of on-tile bandwidth. Each tile has 11 GB of SRAM memory (25 D1 chips × 360 cores/D1 × 1.25 MB/core). Each tile achieves 9 petaflops at BF16/CFloat8 precision (25 D1 chips × 376 TFLOP/D1). Each tile consumes 15 kilowatts; 288 amperes at 52 volts. === System Tray === Six tiles are aggregated into a System Tray, which is integrated with a host interface. Each host interface includes 512 x86 cores, providing a Linux-based user environment. Previously, the Dojo System Tray was known as the Training Matrix, which includes six Training Tiles, 20 Dojo Interface Processor cards across four host servers, and Ethernet-l

DBOS

DBOS (Formerly Database-Oriented Operating System, now just DBOS) is an open source durable workflow execution software library written for the Python, TypeScript, Java, and Go programming languages. DBOS arose from a joint open source project from MIT and Stanford, after a discussion between Michael Stonebraker and Matei Zaharia on how to scale and improve scheduling and performance of millions of Apache Spark tasks. Today it is a commercial company that offers an open source system to add durable computing to any software, built on concepts derived from the joint research project. == History == === 2020: Academic R&D Project === DBOS originated in 2020 as a joint open source project between MIT, Stanford, and Carnegie Mellon. The project explored the idea of operating system services built atop a distributed database - a database-oriented operating system meant to simplify and improve the scalability, security and resilience of large-scale distributed applications. The basic concept was to run a multi-node multi-core, transactional, highly-available distributed database, such as VoltDB, as the only application for a microkernel, and then to implement scheduling, messaging, file systems and other operating system services on top of the database. The architectural philosophy is described by this quote from the abstract of their initial preprint: All operating system state should be represented uniformly as database tables, and operations on this state should be made via queries from otherwise stateless tasks. This design makes it easy to scale and evolve the OS without whole-system refactoring, inspect and debug system state, upgrade components without downtime, manage decisions using machine learning, and implement sophisticated security features. A prototype was built with competitive performance to existing systems. ==

Robert Abel and Associates

Robert Abel and Associates (RA&A) was an American pioneering animation production company specializing in television commercials made with computer graphics. Founded by Robert Abel and Con Pederson in 1971, RA&A was especially known for their art direction and won many Clio Awards. Abel and his team created some of the most advanced and impressive computer-animated works of their time, including full ray-traced renders and fluid character animation at a time when such things were largely unknown. A variety of high-profile television advertisements, graphics sequences for motion pictures (including The Andromeda Strain and Tron), and work on laserdisc video games such as Cube Quest, put Abel and his team on the map in the early 1980s. The company was also originally commissioned to create the visual effects for Star Trek: The Motion Picture, but were subsequently taken off the project for mishandling funds. The company was also notable on its work for The Jacksons' 1981 music video "Can You Feel It." RA&A was on the southwest corner of Highland Avenue and Romaine in the heart of Hollywood, California. RA&A closed in 1987 following an ill-fated merger with now-defunct Omnibus Computer Graphics, Inc., a company which had been based in Toronto. Many people who worked at RA&A went on to other ground-breaking projects, including the founding of Wavefront Technologies, Rhythm & Hues and other studios. Many RA&A people went on to win Academy Awards.

Comparison of user features of operating systems

Comparison of user features of operating systems refers to a comparison of the general user features of major operating systems in a narrative format. It does not encompass a full exhaustive comparison or description of all technical details of all operating systems. It is a comparison of basic roles and the most prominent features. It also includes the most important features of the operating system's origins, historical development, and role. == Overview == An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs. Time-sharing operating systems schedule tasks for efficient use of the system and may also include accounting software for cost allocation of processor time, mass storage, printing, and other resources. For hardware functions such as input and output and memory allocation, the operating system acts as an intermediary between programs and the computer hardware, although the application code is usually executed directly by the hardware and frequently makes system calls to an OS function or is interrupted by it. Operating systems are found on many devices that contain a computer – from cellular phones and video game consoles to web servers and supercomputers. As of June 2024, the dominant general-purpose desktop operating system is Microsoft Windows with a market share of around 72.91%. macOS by Apple Inc. is in second place (14.93%), and the varieties of Linux are collectively in third place (4.04%). In the mobile sector, including both smartphones and tablets, Android is dominant with a market share of 71%, followed by Apple's iOS with 28%; for smartphones alone, Android has 72% and iOS has 28%. Linux distributions are dominant in the server and supercomputing sectors. Other specialized classes of operating systems (special-purpose operating systems)), such as embedded and real-time systems, exist for many applications. Security-focused operating systems also exist. Some operating systems have low system requirements (i.e. light-weight Linux distribution). Others may have higher system requirements. Some operating systems require installation or may come pre-installed with purchased computers (OEM-installation), whereas others may run directly from media (i.e. live cd) or flash memory (i.e. USB stick). == MS-DOS == === Overview === MS-DOS (acronym for Microsoft Disk Operating System) is an operating system for x86-based personal computers mostly developed by Microsoft. Collectively, MS-DOS, its rebranding as IBM PC DOS, and some operating systems attempting to be compatible with MS-DOS, are sometimes referred to as "DOS" (which is also the generic acronym for disk operating system). MS-DOS was the main operating system for IBM PC compatible personal computers during the 1980s, from which point it was gradually superseded by operating systems offering a graphical user interface (GUI), in various generations of the graphical Microsoft Windows operating system. IBM licensed and re-released it in 1981 as PC DOS 1.0 for use in its PCs. Although MS-DOS and PC DOS were initially developed in parallel by Microsoft and IBM, the two products diverged after twelve years, in 1993, with recognizable differences in compatibility, syntax, and capabilities. During its lifetime, several competing products were released for the x86 platform, and MS-DOS went through eight versions, until development ceased in 2000. Initially, MS-DOS was targeted at Intel 8086 processors running on computer hardware using floppy disks to store and access not only the operating system, but application software and user data as well. Progressive version releases delivered support for other mass storage media in ever greater sizes and formats, along with added feature support for newer processors and rapidly evolving computer architectures. Ultimately, it was the key product in Microsoft's development from a programming language company to a diverse software development firm, providing the company with essential revenue and marketing resources. It was also the underlying basic operating system on which early versions of Windows ran as a GUI. == Microsoft Windows == === Overview === Microsoft Windows, commonly referred to as Windows, is a group of several proprietary graphical operating system families, all of which are developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. Active Microsoft Windows families include Windows NT and Windows IoT; these may encompass subfamilies, (e.g. Windows Server or Windows Embedded Compact) (Windows CE). Defunct Microsoft Windows families include Windows 9x, Windows Mobile, and Windows Phone. Microsoft announced an operating environment named Windows on 10 November 1983, as a graphical operating system shell for MS-DOS in response to the growing interest in graphical user interfaces (GUIs); Windows 1.0 first shipped on 20 November 1985. Microsoft Windows came to dominate the world's personal computer (PC) market with over 90% market share, overtaking Mac OS, which had been introduced in 1984, while Microsoft has in 2020 lost its dominance of the consumer operating system market, with Windows down to 30%, lower than Apple's 31% mobile-only share (65% for desktop operating systems only, i.e. "PCs" vs. Apple's 28% desktop share) in its home market, the US, and 32% globally (77% for desktops), where Google's Android leads. Apple came to see Windows as an unfair encroachment on their innovation in GUI development as implemented on products such as the Lisa and Macintosh (eventually settled in court in Microsoft's favor in 1993). As of January 2023, on PCs, Windows is still the most popular operating system in all countries. However, in 2014, Microsoft admitted losing the majority of the overall operating system market to Android, because of the massive growth in sales of Android smartphones. In 2014, the number of Windows devices sold was less than 25% that of Android devices sold. This comparison, however, may not be fully relevant, as the two operating systems traditionally target different platforms. Still, numbers for server use of Windows (that are comparable to competitors) show one third market share, similar to that for end user use. As of October 2020, the most recent version of Windows for PCs, tablets and embedded devices is Windows 10, version 20H2. The most recent version for server computers is Windows Server, version 20H2. A specialized version of Windows also runs on the Xbox One video game console. === Windows 95 === Windows 95 introduced a redesigned shell based around a desktop metaphor; File shortcuts (also known as shell links) were introduced and the desktop was re-purposed to hold shortcuts to applications, files and folders, reminiscent of Mac OS. In Windows 3.1 the desktop was used to display icons of running applications. In Windows 95, the currently running applications were displayed as buttons on a taskbar across the bottom of the screen. The taskbar also contained a notification area used to display icons for background applications, a volume control and the current time. The Start menu, invoked by clicking the "Start" button on the taskbar or by pressing the Windows key, was introduced as an additional means of launching applications or opening documents. While maintaining the program groups used by its predecessor Program Manager, it also displayed applications within cascading sub-menus. The previous File Manager program was replaced by Windows Explorer and the Explorer-based Control Panel and several other special folders were added such as My Computer, Dial Up Networking, Recycle Bin, Network Neighborhood, My Documents, Recent documents, Fonts, Printers, and My Briefcase among others. AutoRun was introduced for CD drives. The user interface looked dramatically different from prior versions of Windows, but its design language did not have a special name like Metro, Aqua or Material Design. Internally it was called "the new shell" and later simply "the shell". The subproject within Microsoft to develop the new shell was internally known as "Stimpy". In 1994, Microsoft designers Mark Malamud and Erik Gavriluk approached Brian Eno to compose music for the Windows 95 project. The result was the six-second start-up music-sound of the Windows 95 operating system, The Microsoft Sound and it was first released as a startup sound in May 1995 on Windows 95 May Test Release build 468. When released for Windows 95 and Windows NT 4.0, Internet Explorer 4 came with an optional Windows Desktop Update, which modified the shell to provide several additional updates to Windows Explorer, including a Quick Launch toolbar, and new features integrated with Internet Explorer, such as Active Desktop (which allowed Internet content to be displayed directly on the desktop). Some of the user interface elements introduced in Windows 95, such as the desktop, taskbar, Start menu and Windows

Latent semantic analysis

Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text (the distributional hypothesis). A matrix containing word counts per document (rows represent unique words and columns represent each document) is constructed from a large piece of text and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by cosine similarity between any two columns. Values close to 1 represent very similar documents while values close to 0 represent very dissimilar documents. An information retrieval technique using latent semantic structure was patented in 1988 by Scott Deerwester, Susan Dumais, George Furnas, Richard Harshman, Thomas Landauer, Karen Lochbaum and Lynn Streeter. In the context of its application to information retrieval, it is sometimes called latent semantic indexing (LSI). == Overview == === Occurrence matrix === LSA can use a document-term matrix which describes the occurrences of terms in documents; it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents. A typical example of the weighting of the elements of the matrix is tf-idf (term frequency–inverse document frequency): the weight of an element of the matrix is proportional to the number of times the terms appear in each document, where rare terms are upweighted to reflect their relative importance. This matrix is also common to standard semantic models, though it is not necessarily explicitly expressed as a matrix, since the mathematical properties of matrices are not always used. === Rank lowering === After the construction of the occurrence matrix, LSA finds a low-rank approximation to the term-document matrix. There could be various reasons for these approximations: The original term-document matrix is presumed too large for the computing resources; in this case, the approximated low rank matrix is interpreted as an approximation (a "least and necessary evil"). The original term-document matrix is presumed noisy: for example, anecdotal instances of terms are to be eliminated. From this point of view, the approximated matrix is interpreted as a de-noisified matrix (a better matrix than the original). The original term-document matrix is presumed overly sparse relative to the "true" term-document matrix. That is, the original matrix lists only the words actually in each document, whereas we might be interested in all words related to each document—generally a much larger set due to synonymy. The consequence of the rank lowering is that some dimensions are combined and depend on more than one term: {(car), (truck), (flower)} → {(1.3452 car + 0.2828 truck), (flower)} This mitigates the problem of identifying synonymy, as the rank lowering is expected to merge the dimensions associated with terms that have similar meanings. It also partially mitigates the problem with polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning. Conversely, components that point in other directions tend to either simply cancel out, or, at worst, to be smaller than components in the directions corresponding to the intended sense. === Derivation === Let X {\displaystyle X} be a matrix where element ( i , j ) {\displaystyle (i,j)} describes the occurrence of term i {\displaystyle i} in document j {\displaystyle j} (this can be, for example, the frequency). X {\displaystyle X} will look like this: d j ↓ t i T → [ x 1 , 1 … x 1 , j … x 1 , n ⋮ ⋱ ⋮ ⋱ ⋮ x i , 1 … x i , j … x i , n ⋮ ⋱ ⋮ ⋱ ⋮ x m , 1 … x m , j … x m , n ] {\displaystyle {\begin{matrix}&{\textbf {d}}_{j}\\&\downarrow \\{\textbf {t}}_{i}^{T}\rightarrow &{\begin{bmatrix}x_{1,1}&\dots &x_{1,j}&\dots &x_{1,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{m,1}&\dots &x_{m,j}&\dots &x_{m,n}\\\end{bmatrix}}\end{matrix}}} Now a row in this matrix will be a vector corresponding to a term, giving its relation to each document: t i T = [ x i , 1 … x i , j … x i , n ] {\displaystyle {\textbf {t}}_{i}^{T}={\begin{bmatrix}x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\end{bmatrix}}} Likewise, a column in this matrix will be a vector corresponding to a document, giving its relation to each term: d j = [ x 1 , j ⋮ x i , j ⋮ x m , j ] {\displaystyle {\textbf {d}}_{j}={\begin{bmatrix}x_{1,j}\\\vdots \\x_{i,j}\\\vdots \\x_{m,j}\\\end{bmatrix}}} Now the dot product t i T t p {\displaystyle {\textbf {t}}_{i}^{T}{\textbf {t}}_{p}} between two term vectors gives the correlation between the terms over the set of documents. The matrix product X X T {\displaystyle XX^{T}} contains all these dot products. Element ( i , p ) {\displaystyle (i,p)} (which is equal to element ( p , i ) {\displaystyle (p,i)} ) contains the dot product t i T t p {\displaystyle {\textbf {t}}_{i}^{T}{\textbf {t}}_{p}} ( = t p T t i {\displaystyle ={\textbf {t}}_{p}^{T}{\textbf {t}}_{i}} ). Likewise, the matrix X T X {\displaystyle X^{T}X} contains the dot products between all the document vectors, giving their correlation over the terms: d j T d q = d q T d j {\displaystyle {\textbf {d}}_{j}^{T}{\textbf {d}}_{q}={\textbf {d}}_{q}^{T}{\textbf {d}}_{j}} . Now, from the theory of linear algebra, there exists a decomposition of X {\displaystyle X} such that U {\displaystyle U} and V {\displaystyle V} are orthogonal matrices and Σ {\displaystyle \Sigma } is a diagonal matrix. This is called a singular value decomposition (SVD): X = U Σ V T {\displaystyle {\begin{matrix}X=U\Sigma V^{T}\end{matrix}}} The matrix products giving us the term and document correlations then become X X T = ( U Σ V T ) ( U Σ V T ) T = ( U Σ V T ) ( V T T Σ T U T ) = U Σ V T V Σ T U T = U Σ Σ T U T X T X = ( U Σ V T ) T ( U Σ V T ) = ( V T T Σ T U T ) ( U Σ V T ) = V Σ T U T U Σ V T = V Σ T Σ V T {\displaystyle {\begin{matrix}XX^{T}&=&(U\Sigma V^{T})(U\Sigma V^{T})^{T}=(U\Sigma V^{T})(V^{T^{T}}\Sigma ^{T}U^{T})=U\Sigma V^{T}V\Sigma ^{T}U^{T}=U\Sigma \Sigma ^{T}U^{T}\\X^{T}X&=&(U\Sigma V^{T})^{T}(U\Sigma V^{T})=(V^{T^{T}}\Sigma ^{T}U^{T})(U\Sigma V^{T})=V\Sigma ^{T}U^{T}U\Sigma V^{T}=V\Sigma ^{T}\Sigma V^{T}\end{matrix}}} Since Σ Σ T {\displaystyle \Sigma \Sigma ^{T}} and Σ T Σ {\displaystyle \Sigma ^{T}\Sigma } are diagonal we see that U {\displaystyle U} must contain the eigenvectors of X X T {\displaystyle XX^{T}} , while V {\displaystyle V} must be the eigenvectors of X T X {\displaystyle X^{T}X} . Both products have the same non-zero eigenvalues, given by the non-zero entries of Σ Σ T {\displaystyle \Sigma \Sigma ^{T}} , or equally, by the non-zero entries of Σ T Σ {\displaystyle \Sigma ^{T}\Sigma } . Now the decomposition looks like this: X U Σ V T ( d j ) ( d ^ j ) ↓ ↓ ( t i T ) → [ x 1 , 1 … x 1 , j … x 1 , n ⋮ ⋱ ⋮ ⋱ ⋮ x i , 1 … x i , j … x i , n ⋮ ⋱ ⋮ ⋱ ⋮ x m , 1 … x m , j … x m , n ] = ( t ^ i T ) → [ [ u 1 ] … [ u l ] ] ⋅ [ σ 1 … 0 ⋮ ⋱ ⋮ 0 … σ l ] ⋅ [ [ v 1 ] ⋮ [ v l ] ] {\displaystyle {\begin{matrix}&X&&&U&&\Sigma &&V^{T}\\&({\textbf {d}}_{j})&&&&&&&({\hat {\textbf {d}}}_{j})\\&\downarrow &&&&&&&\downarrow \\({\textbf {t}}_{i}^{T})\rightarrow &{\begin{bmatrix}x_{1,1}&\dots &x_{1,j}&\dots &x_{1,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{m,1}&\dots &x_{m,j}&\dots &x_{m,n}\\\end{bmatrix}}&=&({\hat {\textbf {t}}}_{i}^{T})\rightarrow &{\begin{bmatrix}{\begin{bmatrix}\,\\\,\\{\textbf {u}}_{1}\\\,\\\,\end{bmatrix}}\dots {\begin{bmatrix}\,\\\,\\{\textbf {u}}_{l}\\\,\\\,\end{bmatrix}}\end{bmatrix}}&\cdot &{\begin{bmatrix}\sigma _{1}&\dots &0\\\vdots &\ddots &\vdots \\0&\dots &\sigma _{l}\\\end{bmatrix}}&\cdot &{\begin{bmatrix}{\begin{bmatrix}&&{\textbf {v}}_{1}&&\end{bmatrix}}\\\vdots \\{\begin{bmatrix}&&{\textbf {v}}_{l}&&\end{bmatrix}}\end{bmatrix}}\end{matrix}}} The values σ 1 , … , σ l {\displaystyle \sigma _{1},\dots ,\sigma _{l}} are called the singular values, and u 1 , … , u l {\displaystyle u_{1},\dots ,u_{l}} and v 1 , … , v l {\displaystyle v_{1},\dots ,v_{l}} the left and right singular vectors. Notice the only part of U {\displaystyle U} that contributes to t i {\displaystyle {\textbf {t}}_{i}} is the i 'th {\displaystyle i{\textrm {'th}}} row. Let this row vector be called t ^ i T {\displaystyle {\hat {\textrm {t}}}_{i}^{T}} . Likewise, the only part of V T {\displaystyle V^{T}} that contributes to d j {\displaystyle {\textbf {d}}_{j}} is the j 'th {\displaystyle j{\textrm {'th}}} column, d ^ j {\displaystyle {\hat {\textrm {d}}}_{j}} . These are not the eigenvectors, but depend on all the eigenvectors. I

Information Age

The Information Age is a historical period that began in the mid-20th century. It is characterized by a rapid shift from traditional industries, as established during the Industrial Revolution, to an economy centered on information technology. The onset of the Information Age has been linked to the development of the transistor in 1947. Advances in computer miniaturization, internet communication, and semiconductor technology enabled the rapid expansion of digital systems and global information networks. The Information Age transformed industries such as education, healthcare, finance, entertainment, and communication through digital infrastructure and connected technologies. The rise of smartphones and cloud-based services further accelerated global internet accessibility and digital interaction. == Digital applications and mobile technology == The expansion of Android and iOS ecosystems during the 21st century contributed to the widespread use of utility applications and mobile productivity tools. Applications related to calculations, scheduling, digital organization, and educational support became increasingly common on smartphones and tablets. Mobile utility software demonstrates how modern digital platforms support accessibility and everyday online services. Independent developers have contributed to this technological ecosystem through lightweight applications focused on mobile usability and internet-based functionality. == Influence on modern society == The Information Age has reshaped the way individuals communicate, consume information, and interact with digital services. Social media platforms, artificial intelligence systems, cloud storage, and mobile computing continue to influence modern economies and online communities worldwide. Emerging technologies such as the Internet of things, machine learning, and advanced automation are often associated with the transition toward the Fourth Industrial Revolution. == History == The digital revolution converted technology from analog format to digital format. By doing this, it became possible to make copies that were identical to the original. In digital communications, for example, repeating hardware was able to amplify the digital signal and pass it on with no loss of information in the signal. Of equal importance to the revolution was the ability to easily move the digital information between media and to access or distribute it remotely. One turning point of the revolution was the change from analog to digitally recorded music. During the 1980s, the digital format of optical compact discs gradually replaced analog formats, such as vinyl records and cassette tapes, as the popular medium of choice. === Previous inventions === Humans have manufactured tools for counting and calculating since ancient times, such as the abacus, astrolabe, equatorium, and mechanical timekeeping devices. More complicated devices started appearing in the 1600s, including the slide rule and mechanical calculators. By the early 1800s, the Industrial Revolution had produced mass-market calculators like the arithmometer and the enabling technology of the punch card. Charles Babbage proposed a mechanical general-purpose computer called the Analytical Engine, but it was never successfully built, and was largely forgotten by the 20th century, and unknown to most of the inventors of modern computers. The Second Industrial Revolution, in the last quarter of the 19th century, developed useful electrical circuits and the telegraph. In the 1880s, Herman Hollerith developed electromechanical tabulating and calculating devices using punch cards and unit record equipment, which became widespread in business and government. Meanwhile, various analog computer systems used electrical, mechanical, or hydraulic systems to model problems and calculate answers. These included an 1872 tide-predicting machine, differential analysers, perpetual calendar machines, the Deltar for water management in the Netherlands, network analyzers for electrical systems, and various machines for aiming military guns and bombs. The construction of problem-specific analog computers continued in the late 1940s and beyond, with FERMIAC for neutron transport, Project Cyclone for various military applications, and the Phillips Machine for economic modeling. Building on the complexity of the Z1 and Z2, German inventor Konrad Zuse used electromechanical systems to complete in 1941 the Z3, the world's first working programmable, fully automatic digital computer. Also, during World War II, Allied engineers constructed electromechanical bombes to break the German Enigma machine encoding. The base-10 electromechanical Harvard Mark I was completed in 1944, and was to some degree improved with inspiration from Charles Babbage's designs. === 1947–1969: Origins === In 1947, the first working transistor, the germanium-based point-contact transistor, was invented by John Bardeen and Walter Houser Brattain while working under William Shockley at Bell Labs. This led the way to more advanced digital computers. From the late 1940s, universities, the military, and businesses developed computer systems to digitally replicate and automate previously manually performed mathematical calculations, with the LEO being the first commercially available general-purpose computer. Digital communication became economical for widespread adoption after the invention of the personal computer in the 1970s. Claude Shannon, a Bell Labs mathematician, is generally credited with laying the foundations of digitalization in his pioneering 1948 article, A Mathematical Theory of Communication. In 1948, Bardeen and Brattain patented an insulated-gate transistor (IGFET) with an inversion layer. Their concept forms the basis of CMOS and DRAM technology today. In 1957, at Bell Labs, Frosch and Derick were able to manufacture planar silicon dioxide transistors, later a team at Bell Labs demonstrated a working MOSFET. The first integrated circuit milestone was achieved by Jack Kilby in 1958. Other important technological developments included the invention of the monolithic integrated circuit chip by Robert Noyce at Fairchild Semiconductor in 1959, made possible by the planar process developed by Jean Hoerni. In 1963, complementary MOS (CMOS) was developed by Chih-Tang Sah and Frank Wanlass at Fairchild Semiconductor. The self-aligned gate transistor, which further facilitated mass production, was invented in 1966 by Robert Bower at Hughes Aircraft and independently by Robert Kerwin, Donald Klein, and John Sarace at Bell Labs. In 1962, AT&T deployed the T-carrier for long-haul pulse-code modulation (PCM) digital voice transmission. The T1 format carried 24 pulse-code modulated, time-division multiplexed speech signals, each encoded in 64 kbit/s streams, leaving 8 kbit/s of framing information, which facilitated the synchronization and demultiplexing at the receiver. Over the subsequent decades, the digitisation of voice became the norm for all but the last mile (where analogue continued to be the norm right into the late 1990s). Following the development of MOS integrated circuit chips in the early 1960s, MOS chips reached higher transistor density and lower manufacturing costs than bipolar integrated circuits by 1964. MOS chips further increased in complexity at a rate predicted by Moore's law, leading to large-scale integration (LSI) with hundreds of transistors on a single MOS chip by the late 1960s. The application of MOS LSI chips to computing was the basis for the first microprocessors, as engineers began recognizing that a complete computer processor could be contained on a single MOS LSI chip. In 1968, Fairchild engineer Federico Faggin improved MOS technology with his development of the silicon-gate MOS chip, which he later used to develop the Intel 4004, the first single-chip microprocessor. It was released by Intel in 1971 and laid the foundations for the microcomputer revolution that began in the 1970s. MOS technology also led to the development of semiconductor image sensors suitable for digital cameras. The first such image sensor was the charge-coupled device, developed by Willard S. Boyle and George E. Smith at Bell Labs in 1969, based on MOS capacitor technology. === 1969–1989: Invention of the internet, rise of home computers === The public was first introduced to the concepts that led to the Internet when a message was sent over the ARPANET in 1969. Packet switched networks such as ARPANET, Mark I, CYCLADES, Merit Network, Tymnet, and Telenet, were developed in the late 1960s and early 1970s using a variety of protocols. The ARPANET in particular led to the development of protocols for internetworking, in which multiple separate networks could be joined into a network of networks. The Whole Earth movement of the 1960s advocated the use of new technology. In the 1970s, the home computer was introduced, time-sharing computers, the video game console, the first coin-op vide