Principal component analysis

Principal component analysis (PCA) is a linear dimensionality reduction technique with applications in exploratory data analysis, visualization and data preprocessing. The data are linearly transformed onto a new coordinate system such that the directions (principal components) capturing the largest variation in the data can be easily identified.

The principal components of a collection of points in a real coordinate space are a sequence of $p$ unit vectors, where the $i$-th vector is the direction of a line that best fits the data while being orthogonal to the first $i-1$ vectors. Here, a best-fitting line is defined as one that minimizes the average squared perpendicular distance from the points to the line. These directions (i.e., principal components) constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. Many studies use the first two principal components in order to plot the data in two dimensions and to visually identify clusters of closely related data points.

Principal component analysis has applications in many fields such as population genetics, microbiome studies, and atmospheric science.

== Overview ==

When performing PCA, the first principal component of a set of $p$ variables is the derived variable formed as a linear combination of the original variables that explains the most variance. The second principal component explains the most variance in what is left once the effect of the first component is removed, and we may proceed through $p$ iterations until all the variance is explained. PCA is most commonly used when many of the variables are highly correlated with each other and it is desirable to reduce their number to an independent set. The first principal component can equivalently be defined as a direction that maximizes the variance of the projected data.
The $i$-th principal component can be taken as a direction orthogonal to the first $i-1$ principal components that maximizes the variance of the projected data. For either objective, it can be shown that the principal components are eigenvectors of the data's covariance matrix. Thus, the principal components are often computed by eigendecomposition of the data covariance matrix or singular value decomposition of the data matrix.

PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix. PCA is also related to canonical correlation analysis (CCA). CCA defines coordinate systems that optimally describe the cross-covariance between two datasets, while PCA defines a new orthogonal coordinate system that optimally describes variance in a single dataset. Robust and L1-norm-based variants of standard PCA have also been proposed.

== History ==

PCA was invented in 1901 by Karl Pearson, as an analogue of the principal axis theorem in mechanics; it was later independently developed and named by Harold Hotelling in the 1930s. Depending on the field of application, it is also named the discrete Karhunen–Loève transform (KLT) in signal processing, the Hotelling transform in multivariate quality control, proper orthogonal decomposition (POD) in mechanical engineering, singular value decomposition (SVD) of X (invented in the last quarter of the 19th century), eigenvalue decomposition (EVD) of $\mathbf{X}^{\mathsf{T}}\mathbf{X}$ in linear algebra, factor analysis (for a discussion of the differences between PCA and factor analysis see Ch. 7 of Jolliffe's Principal Component Analysis), the Eckart–Young theorem (Harman, 1960), empirical orthogonal functions (EOF) in meteorological science (Lorenz, 1956), empirical eigenfunction decomposition (Sirovich, 1987), quasiharmonic modes (Brooks et al., 1988), spectral decomposition in noise and vibration, and empirical modal analysis in structural dynamics.

== Intuition ==

PCA can be thought of as fitting a $p$-dimensional ellipsoid to the data, where each axis of the ellipsoid represents a principal component. If some axis of the ellipsoid is small, then the variance along that axis is also small.

To find the axes of the ellipsoid, we must first center the values of each variable in the dataset on 0 by subtracting the mean of the variable's observed values from each of those values. These transformed values are used instead of the original observed values for each of the variables. Then, we compute the covariance matrix of the data and calculate the eigenvalues and corresponding eigenvectors of this covariance matrix. Next, we normalize each of the orthogonal eigenvectors to turn them into unit vectors. Once this is done, each of the mutually orthogonal unit eigenvectors can be interpreted as an axis of the ellipsoid fitted to the data. This choice of basis transforms the covariance matrix into a diagonalized form, in which the diagonal elements represent the variance along each axis. The proportion of the variance that each eigenvector represents can be calculated by dividing the eigenvalue corresponding to that eigenvector by the sum of all eigenvalues.

Biplots and scree plots (degree of explained variance) are used to interpret findings of the PCA.
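
The procedure described above (center, form the covariance matrix, find unit eigenvectors, divide each eigenvalue by the sum of all eigenvalues) can be sketched in a few lines of dependency-free Python. This is only an illustrative sketch: the function names and the small two-variable dataset are made up for the example, and the eigenvectors are found with power iteration plus deflation, which is a simple stand-in chosen here rather than the method a numerical library would normally use.

```python
import math

def center(rows):
    """Subtract each column's mean so every variable is centered on 0."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (p x p) of already-centered rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
            for a in range(p)]

def top_eigenpair(C, iters=500):
    """Power iteration: largest eigenvalue of C and its unit eigenvector."""
    p = len(C)
    v = [1.0 / math.sqrt(p)] * p  # assumed start; must not be orthogonal to the answer
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]  # normalize to a unit vector
    # Rayleigh quotient v^T C v gives the eigenvalue for a unit vector v.
    lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(p)) for a in range(p))
    return lam, v

def pca(rows):
    """Return (eigenvalue, unit eigenvector, explained-variance ratio) per axis."""
    C = covariance(center(rows))
    p = len(C)
    total = sum(C[j][j] for j in range(p))  # trace equals the sum of all eigenvalues
    result = []
    for _ in range(p):
        lam, v = top_eigenpair(C)
        result.append((lam, v, lam / total))
        # Deflation: remove the found axis so the next pass finds the next one.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(p)] for a in range(p)]
    return result

# Hypothetical two-variable dataset with strongly correlated columns.
data = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
        [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]
components = pca(data)
for lam, axis, ratio in components:
    print(f"variance={lam:.4f} axis={[round(x, 4) for x in axis]} share={ratio:.3f}")
```

For this dataset the first axis accounts for roughly 96% of the total variance, which is the situation in which keeping only the first principal component loses little information. Power iteration converges slowly when two eigenvalues are nearly equal, so for real data a library eigensolver or an SVD routine is the safer choice.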
== Details ==

PCA is defined as an orthogonal linear transformation on a real inner product space that transforms the data to a new coordinate system such that the greatest variance by some scalar projection of the data comes to lie on the first coordinate (called the first principal component), the second greatest variance on the second coordinate, and so on.

Consider an $n\times p$ data matrix, X, with column-wise zero empirical mean (the sample mean of each column has been shifted to zero), where each of the n rows represents a different repetition of the experiment, and each of the p columns gives a particular kind of feature (say, the results from a particular sensor).

Mathematically, the transformation is defined by a set of size $l$ (where $l$ is usually selected to be strictly less than $p$ to reduce dimensionality) of $p$-dimensional vectors of weights or coefficients $\mathbf{w}_{(k)}=(w_{1},\dots ,w_{p})_{(k)}$ that map each row vector $\mathbf{x}_{(i)}=(x_{1},\dots ,x_{p})_{(i)}$ of X to a new vector of principal component scores $\mathbf{t}_{(i)}=(t_{1},\dots ,t_{l})_{(i)}$, given by

$${t_{k}}_{(i)}=\mathbf{x}_{(i)}\cdot \mathbf{w}_{(k)}\qquad \mathrm{for}\qquad i=1,\dots ,n\qquad k=1,\dots ,l$$

in such a way that the individual variables $t_{1},\dots ,t_{l}$ of t considered over the data set successively inherit the maximum possible variance from X, with each coefficient vector w constrained to be a unit vector.
The above may equivalently be written in matrix form as

$$\mathbf{T}=\mathbf{X}\mathbf{W}$$

where $\mathbf{T}_{ik}={t_{k}}_{(i)}$, $\mathbf{X}_{ij}={x_{j}}_{(i)}$, and $\mathbf{W}_{jk}={w_{j}}_{(k)}$.

=== First component ===

In order to maximize variance, the first weight vector $\mathbf{w}_{(1)}$ thus has to satisfy

$$\mathbf{w}_{(1)}=\arg\max_{\Vert \mathbf{w}\Vert =1}\,\left\{\sum_{i}(t_{1})_{(i)}^{2}\right\}=\arg\max_{\Vert \mathbf{w}\Vert =1}\,\left\{\sum_{i}\left(\mathbf{x}_{(i)}\cdot \mathbf{w}\right)^{2}\right\}$$

Equivalently, writing this in matrix form gives

$$\mathbf{w}_{(1)}=\arg\max_{\left\|\mathbf{w}\right\|=1}\left\{\left\|\mathbf{Xw}\right\|^{2}\right\}=\arg\max_{\left\|\mathbf{w}\right\|=1}\left\{\mathbf{w}^{\mathsf{T}}\mathbf{X}^{\mathsf{T}}\mathbf{Xw}\right\}$$

Since $\mathbf{w}_{(1)}$ has been defined to be a unit vector, it equivalently also satisfies

$$\mathbf{w}_{(1)}=\arg\max\left\{\frac{\mathbf{w}^{\mathsf{T}}\mathbf{X}^{\mathsf{T}}\mathbf{Xw}}{\mathbf{w}^{\mathsf{T}}\mathbf{w}}\right\}$$

The quantity to be maximised can be recognised as a Rayleigh quotient. A standard result for a positive semidefinite matrix such as $\mathbf{X}^{\mathsf{T}}\mathbf{X}$ is that the quotient's maximum possible value is the largest eigenvalue of the matrix, which occurs when w is the corresponding eigenvector.

With $\mathbf{w}_{(1)}$ found, the first principal component of a data vector

Bump (application)

Bump was an iOS and Android mobile app that enabled smartphone users to transfer contact information, photos and files between devices. In 2011, it was #8 on Apple's list of all-time most popular free iPhone apps, and by February 2013 it had been downloaded 125 million times. Its developer, Bump Technologies, shut down the service and discontinued the app on January 31, 2014, after being acquired by Google for Google Photos and Android Camera.

== Features ==

Bump sent contact information, photos and files to another device over the internet. Before activating the transfer, each user confirmed what they wanted to send to the other user. To initiate a transfer, two people physically bumped their phones together. A screen appeared on both users' smartphone displays, allowing them to confirm what they wanted to send to each other. When two users bumped their phones, software on the phones sent a variety of sensor data to an algorithm running on Bump servers, including the location of the phone, accelerometer readings, IP address, and other sensor readings. The algorithm figured out which two phones had felt the same physical bump and then transferred the information between those phones. Bump did not use Near Field Communication.

With the February 2012 release of Bump 3.0 for iOS, the company streamlined the app to focus on its most frequently used features: contact and photo sharing. Bump 3.0 for Android maintained the features eliminated from the iOS version but moved them behind swipeable layers.

In May 2012, a Bump update enabled users to transfer photos from their phone to their computer via a web service. To initiate a transfer, the user went to the Bump website on their computer and bumped the smartphone on the computer keyboard's space bar.

By December 2012, various Bump updates for iOS and Android had added the ability to share video, audio, and other files. Users swiped to access those features.
In February 2013, an update to the Bump iOS and Android apps enabled users to transfer photos, videos, contacts and other files from a computer to a smartphone and vice versa via a web service. To perform the transfer, users went to the Bump website on their computer and bumped the smartphone on the computer keyboard's space bar.

== History ==

The underlying idea of a synchronous gesture, such as bumping two devices together to transfer content or pair them, was first conceived by Ken Hinckley of Microsoft Research in 2003. The idea was presented at a user interface and technology conference that same year. The paper proposed the use of accelerometers and a bumping gesture of two devices to enable communication, screen sharing and content transfer between them.

Similar to this original concept, the idea for the Bump app was conceived by David Lieb, a former employee of Texas Instruments, while he was attending the University of Chicago Booth School of Business for his MBA. While going through the orientation and meeting process of business school, he became frustrated by constantly entering contact information into his iPhone and felt that the process could be improved. His fellow Texas Instruments employees Andy Huibers and Jake Mintz, the latter also a classmate of Lieb's in the University of Chicago's MBA program, joined Lieb to form Bump Technologies.

Bump Technologies launched in 2008 and was located in Mountain View, CA. Early funding for the project was provided by startup incubator Y Combinator, Sequoia Capital and other angel investors. It gained attention at the CTIA international wireless conference, due to its accessibility and novelty factor. In October 2009, Bump received $3.4m in Series A funding, followed in January 2011 by a $16m Series B financing round led by Andreessen Horowitz. Silicon Valley venture capitalist Marc Andreessen sat on the company's board.
The Bump app debuted in the Apple iOS App Store in March 2009 and was “one of the apps that helped to define the iPhone” (Harry McCracken, Technologizer). It soon became the billionth download on Apple's App Store. An Android version launched in November 2009. By the time Bump 3.0 for iOS was released in February 2012, the app had been installed 77 million times, with users sharing more than 2 million photos daily. As of February 2013, there had been 125 million Bump app downloads.

== Other apps created by Bump Technologies ==

Bump Technologies worked with PayPal in March 2010 to create a PayPal iPhone application. The application, which allowed two users to automatically activate an Internet transfer of money between their accounts, found widespread adoption. A similar version was released for Android in August 2010. The Bump capability in PayPal's apps was removed in March 2012. At that time, Bump Technologies released Bump Pay, an iOS app that let users transfer money via PayPal by physically bumping two smartphones together. The tool was originally created for the Bump team to use when splitting up restaurant bills. The payment feature was not added to the Bump app because the company “wanted to make it as simple as possible so people understand how this works,” Lieb told ABC News.

Bump Pay was the first app from the company's Bump Labs initiative. A goal of Bump Labs was to test new app ideas that might not fit within the main Bump app.

ING Direct added a feature to its iPhone app in 2011 that let users transfer money to each other using Bump's technology. The feature was later added to its Android app, now called Capital One 360.

In July 2012, Bump Technologies released Flock, an iPhone photo sharing app. An Android version was released in December 2012.
Using geolocation data embedded in photos and a user's Facebook connections, Flock found pictures the user took while out with friends and family and put everyone's photos from that event into a single shared album. Users received a push notification after the event, asking if they wanted to share their photos with the friends who had been there. The app also scanned previous photos in the iPhone camera roll and uncovered photos that had yet to be shared. If location services had been enabled at the time a photo was taken, Flock allowed users to create an album of past photos with the friends who had been there with them.

== Acquisition by Google ==

On September 16, 2013, Bump Technologies announced that it had been acquired by Google. On December 31, 2013, the company announced that both Bump and Flock would be discontinued so that the team could focus on new projects at Google. The apps were removed from the App Store and Google Play on January 31, 2014. The company subsequently deleted all user data and shut down its servers, rendering existing installations of the apps inoperable.

Liang Wenfeng

Liang Wenfeng (Chinese: 梁文锋; pinyin: Liáng Wénfēng; born 1985) is a Chinese entrepreneur and businessman who is the co-founder of the quantitative hedge fund High-Flyer, as well as the founder and CEO of its artificial intelligence company DeepSeek.

Liang attended Zhejiang University, and began his career by applying machine learning methods to quantitative finance. Through High-Flyer, he built large-scale computing infrastructure that was later used to support artificial intelligence research, leading to the creation of DeepSeek in 2023. DeepSeek gained international attention following the release of DeepSeek-R1, which analysts described as demonstrating high-level performance with comparatively limited compute resources. In 2025, Liang was named to Time magazine's list of 100 Most Influential People in AI and Fortune's list of the Most Powerful People in Business.

== Early life ==

Liang was born in 1985 in the village of Mililing (米历岭村), Qinba town (覃巴镇), Wuchuan city (吴川市), Guangdong. His parents were both primary school teachers. Liang was routinely praised by locals and teachers alike. Those who knew him in middle school recall that he was known for reading comic books and for being very proficient in mathematics.

== Education ==

After elementary school, Liang attended Wuchuan No. 1 Middle School. There, he quickly excelled in class and ranked highly amongst his peers. He taught himself high school and university-level mathematics courses.

Liang then attended Wuchuan No. 1 High School. In these years, he developed hobbies of mathematical modeling and conducting research projects. He consistently ranked highly among his peers and placed within the top three on every mathematics exam. He was also the top scorer in the Zhanjiang region of Guangdong for the college entrance exam. In 2002, at the age of 17, Liang left high school early to continue his education at the university level.
At Zhejiang University, which he entered at the age of 17, Liang earned a Bachelor of Engineering in Electronic Information Engineering in 2007 and a Master of Engineering in Information & Communication Engineering in 2010. His master's dissertation was titled "Study on Object Tracking Algorithm Based on Low-Cost PTZ camera" (基于低成本PTZ摄像机的目标跟踪算法研究).

During Liang's college years, DJI founder Wang Tao asked him to join as a co-founder. Liang declined the invitation in order to pursue artificial intelligence methodologies in financial markets. While he states that those around him had entrepreneurial mindsets, he himself valued academics.

== Career ==

=== Early career (2008–2016) ===

During the 2008 financial crisis, Liang formed a team with his classmates to accumulate data related to financial markets. He also led the team to explore quantitative trading using machine learning and other technologies. After his graduation, Liang moved to a cheap flat in Chengdu, Sichuan, where he experimented with ways to apply AI to various fields. These ventures failed, until he tried applying AI to finance.

In 2013, Liang attempted to integrate artificial intelligence with quantitative trading and founded Hangzhou Yakebi Investment Management Co Ltd with Xu Jin, an alumnus of Zhejiang University. In 2015, they co-founded Hangzhou Huanfang Technology Co Ltd, which is today's Zhejiang Jiuzhang Asset Management Co Ltd.

=== High-Flyer (2016–2023) ===

In February 2016, Liang and two other engineering classmates co-founded Ningbo High-Flyer Quantitative Investment Management Partnership (Limited Partnership). The team relied on mathematics and AI to make investments. Much of the early startup culture was described by former employees as "geeky" and "quirky," and was often seen as contrary to the existing culture in large Chinese tech companies.

In 2019, Liang founded High-Flyer AI, which was dedicated to research on AI algorithms and their basic applications.
By this time, High-Flyer had over 10 billion yuan in assets under management.

On 30 August 2019, Liang Wenfeng delivered a keynote speech entitled "The Future of Quantitative Investment in China from a Programmer's Perspective" at the Private Equity Golden Bull Award ceremony held by China Securities Journal, which sparked heated discussion. Liang stated that the criterion for determining what is quantitative or non-quantitative is whether the investment decision is made by quantitative methods or by people. Quantitative funds have no portfolio managers making the decisions; the decisions are instead made by servers. He also stated that High-Flyer's mission is to improve the effectiveness of China's secondary market.

In February 2021, Gregory Zuckerman's book The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution was published. Liang wrote the preface for the Chinese edition of the book, in which he stated that whenever he encountered difficulties at work, he would think of Simons' words "There must be a way to model prices". In January 2025, Zuckerman wrote in The Wall Street Journal, acknowledging this fact and stating that he had been trying to get in touch with Liang but that, much like Simons, Liang is very secretive and difficult to contact.

During 2021, Liang started buying thousands of Nvidia GPUs for his AI side project while running High-Flyer. Liang wanted to build something game-changing, which his business partners thought was only possible for giants such as ByteDance and Alibaba Group.

=== DeepSeek (since 2023) ===

==== DeepSeek begins ====

In May 2023, Liang announced that High-Flyer would pursue the development of artificial general intelligence and launched DeepSeek. During that month, in an interview with 36Kr, Liang stated that High-Flyer had acquired 10,000 Nvidia A100 GPUs before the US government imposed AI chip restrictions on China. That laid the foundation for DeepSeek to operate as an LLM developer.
Liang also stated that DeepSeek gets its funding from High-Flyer. This was because, when DeepSeek was founded, venture capital firms were reluctant to provide funding, as the company was unlikely to generate an exit in a short period of time. Liang personally holds only 1% of the company, with the remaining 99% held by Ningbo High-Flyer Quantitative Investment Management Partnership (Limited Partnership). Under this funding model, DeepSeek lacks commercial pressure and rigid key performance indicators, enabling the company to deviate from previously established model architectures.

==== Early development ====

In July 2024, Liang was interviewed again by 36Kr. He stated that when DeepSeek-V2 was released and triggered an AI price war in China, it came as a huge surprise, as the team did not expect pricing to be so sensitive. Liang's aggressive pricing of the language model forced domestic tech giants including Alibaba and Baidu to cut their own rates by over 95%. He also stated that as China's economy develops, it should gradually become a contributor instead of freeriding. In his view, what China's innovation lacks is not capital but confidence and the knowledge of how to organize talent around it. He said that DeepSeek has not hired anyone particularly special and that its employees tend to be locally educated. He added that, when it comes to disruptive technologies, closed-source approaches can only temporarily delay others in catching up.

As the goal was long-term, DeepSeek sought employees who had ability and passion rather than experience. To retain a high talent density relative to larger firms like ByteDance or Baidu, DeepSeek aimed to maintain a low-hierarchy corporate culture, with members working in project-based groups, and to offer competitive compensation. Liang emphasized his vision for DeepSeek employees to bring their "unique experience and ideas" instead of needing to be explicitly directed, with an overall bottom-up approach to division of labor.
Liang noted that a significant outcome of this approach was the multi-head latent attention training architecture, which was attributed directly to a young DeepSeek researcher's personal interest. This advancement played a core role in reducing the cost of training the DeepSeek-V3 model, released in December 2024.

==== Release of DeepSeek-R1 ====

On 20 January 2025, DeepSeek, the company Liang founded and leads as CEO, released DeepSeek-R1, a 671-billion-parameter open-source reasoning AI model, alongside the publication of a detailed technical paper explaining its architecture and training methodology. The model was built using just 2,048 Nvidia H800 GPUs at a cost of $5.6 million, showcasing a resource-efficient approach that contrasted sharply with the billion-dollar budgets of Western competitors. The development of DeepSeek-R1 occurred amidst U.S. sanctions where Trump limited sales of Nvidia chips to China.

By 27 January, DeepSeek surpassed ChatGPT to become the #1 free app on the United States iOS App Store. U.S. stocks plummeted, as more than $1 trillion was erased in market capitalization amid panic over DeepSeek. Technology journ

Sarah Guo

Sarah Guo is an American tech investor. She is the founder of the venture capital firm Conviction and formerly a general partner at Greylock Partners.

== Early life and education ==

Guo grew up in Wisconsin. Her parents worked for Bell Labs. After attending Phillips Academy, she graduated from the University of Pennsylvania and its Wharton School. She received a Bachelor of Arts, a Bachelor of Science, a Master of Business Administration (M.B.A.), and a Master of Arts from the University of Pennsylvania.

== Career ==

As a teenager, Guo worked at Casa Systems, a cloud networking company founded by her parents that launched in 2003 and went public in 2017. She then worked at Goldman Sachs.

In 2013, Guo joined Greylock Partners. While still in her twenties, she became the firm's youngest General Partner.

Guo left Greylock in July 2022, and in October of that year launched Conviction, a new early-stage venture capital firm focused on AI, with $101 million. Conviction raised a second fund with Mike Vernal in late 2024.

Conviction's early investments include Baseten, Cognition AI, OpenEvidence, Harvey, HeyGen, Mistral AI, Sierra Platform, Sunday Robotics, and Thinking Machines Lab.

Guo appears in media outlets as an expert in AI, infrastructure, business software, cybersecurity, technology policy and software engineering.

Guo is on the Midas List and the Midas Seed List of top investors. She co-hosts the podcast No Priors with tech founder and super angel Elad Gil.

== Personal life ==

Guo is married to Pat Grady of Sequoia Capital.

Stockfish (chess)

Stockfish is a free and open-source chess engine, available for various desktop and mobile platforms. It can be used in chess software through the Universal Chess Interface.

Stockfish has been one of the strongest chess engines in the world for several years. It has won all main events of the Top Chess Engine Championship (TCEC) and the Chess.com Computer Chess Championship (CCC) since 2020 and, as of May 2026, is the strongest CPU chess engine in the world with an estimated Elo rating of 3653 in a time control of 40/15 (15 minutes to make 40 moves), according to CCRL.

The Stockfish engine was developed by Tord Romstad, Marco Costalba, and Joona Kiiski, and was derived from Glaurung, an open-source engine by Tord Romstad released in 2004. It is now being developed and maintained by the Stockfish community.

Stockfish historically used only a classical hand-crafted function to evaluate board positions, but with the introduction of the efficiently updatable neural network (NNUE) in August 2020, Stockfish 12 adopted a hybrid evaluation system that primarily used the neural network and occasionally relied on the hand-crafted evaluation. In July 2023, Stockfish removed the hand-crafted evaluation and transitioned to a fully neural network-based approach.

== Features ==

Stockfish uses a tree-search algorithm based on alpha–beta search with several hand-designed heuristics. Stockfish represents positions using bitboards. Stockfish supports Chess960, a feature it inherited from Glaurung. Support for Syzygy tablebases, previously available in a fork maintained by Ronald de Man, was integrated into Stockfish in 2014. In 2018, support for the 7-man Syzygy was added, shortly after the tablebase was made available. Stockfish supports an unlimited number of CPU threads in multiprocessor systems, with a maximum transposition table size of 32 TB.

Stockfish has been a very popular engine on various platforms.
On desktop, it is the default chess engine bundled with the Internet Chess Club interface programs BlitzIn and Dasher. On mobile, it has been bundled with the Stockfish app, SmallFish and Droidfish. Other Stockfish-compatible graphical user interfaces (GUIs) include Fritz, Arena, Stockfish for Mac, and PyChess. Stockfish can be compiled to WebAssembly or JavaScript, allowing it to run in the browser. Both Chess.com and Lichess provide Stockfish in this form in addition to a server-side program. Release versions and development versions are available as C++ source code and as precompiled versions for Microsoft Windows, macOS, Linux 32-bit/64-bit and Android.

== History ==

The program originated from Glaurung, an open-source chess engine created by Tord Romstad and first released in 2004. Four years later, Marco Costalba forked the project, naming it Stockfish because it was "produced in Norway and cooked in Italy" (Romstad is Norwegian and Costalba is Italian). The first version, Stockfish 1.0, was released in November 2008. For a while, new ideas and code changes were transferred between the two programs in both directions, until Romstad decided to discontinue Glaurung in favor of Stockfish, which was the stronger engine at the time. The last Glaurung version (2.2) was released in December 2008.

Around 2011, Romstad decided to abandon his involvement with Stockfish in order to spend more time on his new iOS chess app. On 18 June 2014 Marco Costalba announced that he had "decided to step down as Stockfish maintainer" and asked that the community create a fork of the current version and continue its development. An official repository, managed by a volunteer group of core Stockfish developers, was created soon after and currently manages the development of the project.

=== Fishtest ===

Since 2013, Stockfish has been developed using a distributed testing framework named Fishtest, where volunteers can donate CPU time for testing improvements to the program.
Changes to game-playing code are accepted or rejected based on the results of tens of thousands of games played on the framework against an older "reference" version of the program, using sequential probability ratio testing. Tests on the framework are verified using the chi-squared test, and only if the results are statistically significant are they deemed reliable and used to revise the software code.

After the inception of Fishtest, Stockfish gained 120 Elo points in 12 months, propelling it to the top of all major rating lists. As of May 2026, the framework has used a total of more than 20,100 years of CPU time to play over 10 billion chess games.

=== NNUE ===

In June 2020, Stockfish introduced the efficiently updatable neural network (NNUE) approach, based on earlier work by computer shogi programmers. Instead of using manually designed heuristics to evaluate the board, this approach introduced a neural network trained on millions of positions which could be evaluated quickly on CPU. On 2 September 2020, the twelfth version of Stockfish was released, incorporating NNUE, and reportedly winning ten times more game pairs than it lost when matched against version eleven. In July 2023, the classical evaluation was completely removed in favor of the NNUE evaluation.

== Competition results ==

=== Top Chess Engine Championship ===

Stockfish is a TCEC multiple-time champion and the current leader in trophy count. Ever since TCEC restarted in 2013, Stockfish has finished first or second in every season except one. Stockfish finished second in TCEC Season 4 and 5, with scores of 23–25 first against Houdini 3 and later against Komodo 1142 in the Superfinal event. Season 5 was notable for the winning Komodo team as they accepted the award posthumously for the program's creator Don Dailey, who succumbed to an illness during the final stage of the event. In his honor, the version of Stockfish that was released shortly after that season was named "Stockfish DD".
On 30 May 2014, Stockfish 170514 (a development version of Stockfish 5 with tablebase support) convincingly won TCEC Season 6, scoring 35.5–28.5 against Komodo 7x in the Superfinal. Stockfish 5 was released the following day. In TCEC Season 7, Stockfish again made the Superfinal, but lost to Komodo with a score of 30.5–33.5. In TCEC Season 8, despite losses on time caused by buggy code, Stockfish nevertheless qualified once more for the Superfinal, but lost 46.5–53.5 to Komodo. In Season 9, Stockfish defeated Houdini 5 with a score of 54.5–45.5. Stockfish finished third during season 10 of TCEC, the only season since 2013 in which Stockfish had failed to qualify for the superfinal. It did not lose a game but was still eliminated because it was unable to score enough wins against lower-rated engines. After this technical elimination, Stockfish went on a long winning streak, winning seasons 11 (59–41 against Houdini 6.03), 12 (60–40 against Komodo 12.1.1), and 13 (55–45 against Komodo 2155.00) convincingly. In Season 14, Stockfish faced a new challenger in Leela Chess Zero, eking out a win by one point (50.5–49.5). Its winning streak was finally ended in Season 15, when Leela qualified again and won 53.5–46.5, but Stockfish promptly won Season 16, defeating AllieStein 54.5–45.5, after Leela failed to qualify for the Superfinal. In Season 17, Stockfish faced Leela again in the superfinal, losing 52.5–47.5. However, Stockfish has won every Superfinal since: beating Leela 53.5–46.5 in Season 18, 54.5–45.5 in Season 19, 53–47 in Season 20, and 56–44 in Season 21. In Season 22, Komodo Dragon beat out Leela to qualify for the Superfinal, losing to Stockfish by a large margin 59.5–40.5. Stockfish did not lose an opening pair in this match. Leela made the Superfinal in Seasons 23 and 24, but was crushed by Stockfish both times (58.5–41.5 and 58–42). In Season 25, Stockfish once again defeated Leela, but this time by a narrower margin of 52–48. 
Stockfish also took part in the TCEC cup, winning the first edition, but was surprisingly upset by Houdini in the semifinals of the second edition. Stockfish recovered to beat Komodo in the third-place playoff. In the third edition, Stockfish made it to the finals, but was defeated by Leela Chess Zero after blundering in a 7-man endgame tablebase draw. It turned this result around in the fourth edition, defeating Leela in the final 4.5–3.5. In TCEC Cup 6, Stockfish finished third after losing to AllieStein in the semifinals, the first time it had failed to make the finals. Since then, Stockfish has consistently won the tournament, with the exception of the 11th edition which Leela won 8.5–7.5. === Chess.com Computer Chess Championship === Ever since Chess.com hosted its first Chess.com Computer Chess Championship in 2018, Stockfish has been the most successful engine. It dominated the earlier championships, winning six consecutive titles before finishing second in CCC7. Since then, its dominance has come under threat from the neural-network engines Leelenstein and Leela Chess Zero, but it has continued to perform w

Color histogram

In image processing and photography, a color histogram is a representation of the distribution of colors in an image. For digital images, a color histogram represents the number of pixels that have colors in each of a fixed list of color ranges that span the image's color space (the set of all possible colors). A color histogram can be built for any kind of color space, although the term is more often used for three-dimensional spaces such as RGB or HSV. For monochromatic images, the term intensity histogram may be used instead. For multi-spectral images, where each pixel is represented by an arbitrary number of measurements (for example, beyond the three measurements in RGB), a color histogram is N-dimensional, with N being the number of measurements taken. Each measurement has its own wavelength range of the light spectrum, some of which may be outside the visible spectrum. If the set of possible color values is sufficiently small, each of those colors may be placed on a range by itself; then the histogram is merely the count of pixels that have each possible color. Most often, the space is divided into an appropriate number of ranges, often arranged as a regular grid, each containing many similar color values. A color histogram may also be represented and displayed as a smooth function defined over the color space that approximates the pixel counts. Like other kinds of histograms, a color histogram is a statistic that can be viewed as an approximation of an underlying continuous distribution of color values. == Overview == Color histograms are flexible constructs that can be built from images in various color spaces, whether RGB, rg chromaticity or any other color space of any dimension. A histogram of an image is produced first by discretization of the colors in the image into a number of bins, and counting the number of image pixels in each bin. 
For example, a red–blue chromaticity histogram can be formed by first normalizing color pixel values by dividing RGB values by R+G+B, then quantizing the normalized R and B coordinates into N bins each. A two-dimensional histogram of red–blue chromaticity divided into four bins per axis (N=4) can be laid out as a 4 × 4 table of pixel counts, one cell per pair of red and blue bins. A histogram can be N-dimensional. Although harder to display, a three-dimensional color histogram for the above example could be thought of as four separate red–blue histograms, where each of the four histograms contains the red–blue values for a bin of green (0–63, 64–127, 128–191, and 192–255). The histogram provides a compact summarization of the distribution of data in an image. A color histogram of an image is relatively invariant to translation and rotation about the viewing axis, and varies only slowly with the angle of view. Because the histogram signatures of two images can be compared to match the color content of one image with the other, a color histogram is particularly well suited to the problem of recognizing an object of unknown position and rotation within a scene. Importantly, conversion of an RGB image into the illumination-invariant rg-chromaticity space allows the histogram to operate well in varying light levels. 1. What is a histogram? A histogram is a graphical representation of the number of pixels in an image. Put more simply, a histogram is a bar graph whose x-axis represents the tonal scale (black at the left and white at the right) and whose y-axis represents the number of pixels in the image that fall in a given part of the tonal scale. For example, the graph of a luminance histogram shows the number of pixels at each brightness level (from black to white), and the more pixels there are at a given luminance level, the higher the peak at that level. 2. What is a color histogram? A color histogram of an image represents the distribution of the composition of colors in the image. 
It shows the different types of colors that appear and the number of pixels of each type. The relation between a color histogram and a luminance histogram is that a color histogram can also be expressed as “three luminance histograms”, each of which shows the brightness distribution of an individual red, green, or blue color channel. == Characteristics of a color histogram == A color histogram focuses only on the proportion of the number of different types of colors, regardless of the spatial location of the colors. The values of a color histogram are statistical: they show the statistical distribution of colors and the essential tone of an image. In general, as the color distributions of the foreground and background in an image are different, there might be a bimodal distribution in the histogram. For the luminance histogram alone, there is no perfect histogram. In general, the histogram can indicate whether an image is overexposed, but at times the histogram may suggest overexposure when in reality the image is not overexposed. == Principles of the formation of a color histogram == The formation of a color histogram is rather simple. From the definition above, we can simply count the number of pixels at each of the 256 levels in each of the 3 RGB channels, and plot the counts on 3 individual bar graphs. In general, a color histogram is based on a certain color space, such as RGB or HSV. When we count the pixels of different colors in an image, if the color space is large, we can first divide the color space into a certain number of small intervals. Each of the intervals is called a bin. This process is called color quantization. Then, by counting the number of pixels in each of the bins, we get a color histogram of the image. The concrete steps can be seen in Example 1. 
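The counting and quantization steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`quantize`, `channel_histograms`, `color_histogram`) and the four-pixel sample image are hypothetical, and the image is assumed to be a plain list of 8-bit (R, G, B) tuples.

```python
from collections import Counter


def quantize(value, bins=4, levels=256):
    """Map an intensity in [0, levels) to a bin index in [0, bins)."""
    return value * bins // levels


def channel_histograms(pixels, levels=256):
    """Return three count lists (R, G, B), one entry per intensity level."""
    hists = [[0] * levels for _ in range(3)]
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hists[channel][value] += 1
    return hists


def color_histogram(pixels, bins=4):
    """Count pixels per (r_bin, g_bin, b_bin) cell after color quantization."""
    return Counter(tuple(quantize(v, bins) for v in pixel) for pixel in pixels)


# Hypothetical 4-pixel image, given as 8-bit (R, G, B) tuples.
pixels = [(10, 200, 30), (12, 210, 40), (250, 130, 64), (255, 255, 255)]

hist = color_histogram(pixels)
# With 4 bins per channel, bin 0 covers 0-63, bin 1 covers 64-127,
# bin 2 covers 128-191 and bin 3 covers 192-255.
print(hist[(0, 3, 0)])  # the two greenish pixels share one cell → 2
```

With four bins per channel the joint histogram has at most 4 × 4 × 4 = 64 cells, whereas the three per-channel histograms returned by `channel_histograms` have 256 entries each.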
== Examples == === Example 1 === Given the following image of a cat (an original version and a version that has been reduced to 256 colors for easy histogram purposes), the following data represents a color histogram in the RGB color space, using four bins. Bin 0 corresponds to intensities 0–63, Bin 1 to 64–127, Bin 2 to 128–191, and Bin 3 to 192–255. === Example 2 === Application in cameras: Some cameras can display the 3 color histograms when a photo is taken. We can look for clipping (spikes on either the black or white side of the scale) in each of the 3 RGB color histograms. If one or more of the 3 RGB channels is clipped, detail is lost for that color. To illustrate this, consider this example: Each of the three R, G, B channels has a range of values from 0 to 255 (8 bits). So consider a photo that has a luminance range of 0–255. Assume the photo is made of 4 adjacent blocks, and we set the luminance of the 4 blocks of the original photo to 10, 100, 205, and 245. Thus, the image looks like the topmost figure on the right. Then, we overexpose the photo a little, say, by increasing the luminance of each block by 10. Thus, the luminance of the 4 blocks of the new photo is 20, 110, 215, and 255. Then, the image looks like the second figure on the right. There is not much difference between the two figures; all we can see is that the whole image becomes brighter (the contrast between the blocks remains the same). Now, we overexpose the original photo again, this time increasing the luminance of each block by 50. Thus, the luminance of the 4 blocks of the new photo is 60, 150, 255, and 255. The new image now looks like the third figure on the right. Note that the value for the last block is 255 instead of 295, because 255 is the top of the scale, and thus the last block has clipped. 
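The arithmetic of this example can be checked with a short Python sketch. The helper names `expose` and `undo` are hypothetical, and overexposure is modeled, as in the text, as adding a constant offset and clipping at 255.

```python
def expose(levels, offset, top=255):
    """Add a brightness offset to each block, clipping at the top of the scale."""
    return [min(level + offset, top) for level in levels]


def undo(levels, offset):
    """Try to reverse an exposure change by subtracting the same offset."""
    return [level - offset for level in levels]


original = [10, 100, 205, 245]

slightly_over = expose(original, 10)  # [20, 110, 215, 255]
heavily_over = expose(original, 50)   # [60, 150, 255, 255]

# The +10 exposure is reversible: no block was pushed past 255.
print(undo(slightly_over, 10) == original)  # True

# The +50 exposure is not: the last two blocks both come back as 205.
print(undo(heavily_over, 50))  # [10, 100, 205, 205]
```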
When this happens, we lose the contrast between the last 2 blocks, and thus we cannot recover the image no matter how we adjust it. To conclude, when taking photos with a camera that displays histograms, always keep the brightest tone in the image below the maximum value of 255 on the histogram in order to avoid losing detail. == Drawbacks and other approaches == The main drawback of histograms for classification is that the representation is dependent on the color of the object being studied, ignoring its shape and texture. Color histograms can potentially be identical for two images with different object content that happen to share color information. Conversely, without spatial or shape information, similar objects of different color may be indistinguishable based solely on color histogram comparisons. There is no way to distinguish a red and white cup from a red and white plate. Put another way: histogram-based algorithms have no concept of a generic 'cup', and a model of a red and white cup is no use when given an otherwise identical blue and white cup. Another problem is that color histograms have high sensitivity to noisy interference such as lighting intensity changes and quantization errors. The high dimensionality (number of bins) of color histograms is another issue. Some color histogram feature spaces often occupy more than one hundred di

China brain

In the philosophy of mind, the China brain thought experiment (also known as the Chinese Nation, Chinese Gym, or China-body) considers what would happen if each person in the entire population of China were asked to simulate the action of one neuron in the brain, using telephones or walkie-talkies to simulate the axons and dendrites that connect neurons. The question this thought experiment attempts to answer is whether this arrangement would have a mind or consciousness in the same way that a human brain does. Early versions of this scenario were put forward in 1961 by Anatoly Dneprov, in 1974 by Lawrence Davis, and again in 1978 by Ned Block. Block argues that the China brain would not have a mind, whereas Daniel Dennett argues that it would. The China brain problem is a special case of the more general problem of whether minds could exist within other, larger minds. The Chinese room scenario analyzed by John Searle is a similar thought experiment in philosophy of mind that relates to artificial intelligence. Instead of people who each model a single neuron of the brain, in the Chinese room, clerks who do not speak Chinese accept notes in Chinese and return an answer in Chinese according to a set of rules, without the people in the room ever understanding what those notes mean. In fact, the original short story The Game (1961) by Dneprov contains both the China brain and the Chinese room scenarios. == Background == Many theories of mental states are materialist, that is, they describe the mind as the behavior of a physical object like the brain. One formerly prominent example is the identity theory, which says that mental states are brain states. One criticism of the identity theory is the problem of multiple realizability. The physicalist theory that responds to this is functionalism, which states that a mental state can be whatever functions as a mental state. 
That is, the mind can be composed of neurons, or it could be composed of wood, rocks or toilet paper, as long as it provides mental functionality. == Description == Suppose that the whole nation of China were reordered to simulate the workings of a single brain (that is, to act as a mind according to functionalism). Each Chinese person acts as (say) a neuron, and communicates by special two-way radio in a corresponding way with the other people. The current mental state of the China brain is displayed on satellites that may be seen from anywhere in China. The China brain would then be connected via radio to a body, one that provides the sensory inputs and behavioral outputs of the China brain. Thus, the China brain possesses all the elements of a functional description of mind: sensory inputs, behavioral outputs, and internal mental states causally connected to other mental states. If the nation of China can be made to act in this way, then, according to functionalism, this system would have a mind. Block's goal is to show how unintuitive it is to think that such an arrangement could create a mind capable of thoughts and feelings. == Consciousness == The China brain argument holds that consciousness is a problem for functionalism. Block's Chinese nation presents a version of what is known as the absent qualia objection to functionalism because it purports to show that it is possible for something to be functionally equivalent to a human being and yet have no conscious experience. A creature that functions like a human being but does not feel anything is known as a "philosophical zombie". So the absent qualia objection to functionalism could also be called the "zombie objection". == Criticisms == Some philosophers, like Daniel Dennett, have concluded that the China brain does create a mental state. 
Functionalist philosophers of mind endorse the idea that something like the China brain can realise a mind, and that neurons are, in principle, not the only material that can create a mental state.