Bootstrap aggregating, also called bagging or bootstrapping, is a machine learning (ML) ensemble meta-algorithm designed to improve the stability and accuracy of ML classification and regression algorithms. It also reduces variance and overfitting. Although it is usually applied to decision tree methods, it can be used with any type of method. Bagging is a special case of the ensemble averaging approach. == Description of the technique == Given a standard training set D of size n, bagging generates m new training sets D_i, each of size n', by sampling from D uniformly and with replacement. Because sampling is done with replacement, some observations may be repeated in each D_i. If n' = n, then for large n the set D_i is expected to contain the fraction (1 - 1/e) (~63.2%) of the unique samples of D, the rest being duplicates. This kind of sample is known as a bootstrap sample. Sampling with replacement ensures each bootstrap is independent from its peers, as it does not depend on previously chosen samples when sampling. Then, m models are fitted using the above bootstrap samples and combined by averaging the output (for regression) or voting (for classification). Bagging leads to "improvements for unstable procedures", which include, for example, artificial neural networks, classification and regression trees, and subset selection in linear regression. Bagging was shown to improve preimage learning. On the other hand, it can mildly degrade the performance of stable methods such as k-nearest neighbors. == Process of the algorithm == === Key Terms === There are three types of datasets in bootstrap aggregating. These are the original, bootstrap, and out-of-bag datasets. 
Each section below explains how each dataset is made, except for the original dataset, which is simply the data that is given. === Creating the bootstrap dataset === The bootstrap dataset is made by randomly picking objects from the original dataset, and it must be the same size as the original dataset. The difference is that the bootstrap dataset can contain duplicate objects. Here is a simple example to demonstrate how it works: Suppose the original dataset is a group of 12 people. Their names are Emily, Jessie, George, Constantine, Lexi, Theodore, John, James, Rachel, Anthony, Ellie, and Jamal. By randomly picking a group of names, let us say our bootstrap dataset had James, Ellie, Constantine, Lexi, John, Constantine, Theodore, Constantine, Anthony, Lexi, Constantine, and Theodore. In this case, the bootstrap sample contains Constantine four times, and Lexi and Theodore twice each. === Creating the out-of-bag dataset === The out-of-bag dataset represents the remaining people who were not in the bootstrap dataset. It can be calculated by taking the difference between the original and the bootstrap datasets. In this case, the remaining samples who were not selected are Emily, Jessie, George, Rachel, and Jamal. Keep in mind that the difference is taken between sets, so duplicate names in the bootstrap dataset are ignored. === Application === Creating the bootstrap and out-of-bag datasets is crucial, since they are used to test the accuracy of ensemble learning algorithms like random forest. For example, a model that produces 50 trees using the bootstrap/out-of-bag datasets will typically have better accuracy than one that produces 10 trees. Since the algorithm generates multiple trees, and therefore multiple datasets, the chance that an object is left out of every bootstrap dataset is low. 
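The bootstrap and out-of-bag construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names (`bootstrap_sample`, `out_of_bag`, `unique_fraction`) are hypothetical and chosen only for this example, and the bootstrap dataset is the one from the worked example rather than a fresh random draw.

```python
import math
import random
from collections import Counter

# Original dataset and the bootstrap dataset from the worked example above.
ORIGINAL = ["Emily", "Jessie", "George", "Constantine", "Lexi", "Theodore",
            "John", "James", "Rachel", "Anthony", "Ellie", "Jamal"]
EXAMPLE_BOOTSTRAP = ["James", "Ellie", "Constantine", "Lexi", "John",
                     "Constantine", "Theodore", "Constantine", "Anthony",
                     "Lexi", "Constantine", "Theodore"]


def bootstrap_sample(data, rng):
    """Draw len(data) items uniformly at random, with replacement."""
    return [rng.choice(data) for _ in range(len(data))]


def out_of_bag(data, sample):
    """Items of `data` that never appear in `sample` (duplicates are ignored)."""
    drawn = set(sample)
    return [item for item in data if item not in drawn]


def unique_fraction(n, rng):
    """Fraction of distinct items in one bootstrap sample of size n."""
    sample = bootstrap_sample(range(n), rng)
    return len(set(sample)) / n


# How often each name was drawn in the example bootstrap dataset.
print(Counter(EXAMPLE_BOOTSTRAP))
# Out-of-bag names: ['Emily', 'Jessie', 'George', 'Rachel', 'Jamal']
print(out_of_bag(ORIGINAL, EXAMPLE_BOOTSTRAP))
# For large n the unique fraction approaches 1 - 1/e (about 0.632).
print(unique_fraction(20000, random.Random(0)), 1 - math.exp(-1))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible; a real bagging pipeline would typically draw index arrays instead of copying the items themselves.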
The next few sections talk about how the random forest algorithm works in more detail. === Creation of Decision Trees === The next step of the algorithm involves the generation of decision trees from the bootstrapped dataset. To achieve this, the process examines each gene or feature and determines for how many samples the feature's presence or absence yields a positive or negative result. This information is then used to compute a confusion matrix, which lists the true positives, false positives, true negatives, and false negatives of the feature when used as a classifier. These features are then ranked according to various classification metrics based on their confusion matrices. Some common metrics include estimate of positive correctness (calculated by subtracting false positives from true positives), measure of "goodness", and information gain. The top-ranked feature is then used to partition the samples into two sets: those that possess the feature, and those that do not. In a decision tree of depth two, for example, a data point that exhibits Feature 1 but not Feature 2 might be given a "No", while another point that does not exhibit Feature 1 but does exhibit Feature 3 might be given a "Yes". This process is repeated recursively for successive levels of the tree until the desired depth is reached. At the very bottom of the tree, samples that test positive for the final feature are generally classified as positive, while those that lack the feature are classified as negative. These trees are then used as predictors to classify new data. === Random Forests === The next part of the algorithm involves introducing yet another element of variability amongst the bootstrapped trees. In addition to each tree only examining a bootstrapped set of samples, only a small but consistent number of unique features are considered when ranking them as classifiers. 
This means that each tree only knows about the data pertaining to a small constant number of features, and a variable number of samples that is less than or equal to that of the original dataset. Consequently, the trees are more likely to return a wider array of answers, derived from more diverse knowledge. This results in a random forest, which possesses numerous benefits over a single decision tree generated without randomness. In a random forest, each tree "votes" on whether or not to classify a sample as positive based on its features. The sample is then classified based on majority vote. For example, suppose the four trees in a random forest vote on whether or not a patient with mutations A, B, F, and G has cancer. If three out of four trees vote yes, the patient is classified as cancer positive. Because of their properties, random forests are considered one of the most accurate data mining algorithms, are less likely to overfit their data, and run quickly and efficiently even for large datasets. They are primarily useful for classification as opposed to regression, which attempts to draw observed connections between statistical variables in a dataset. This makes random forests particularly useful in such fields as banking, healthcare, the stock market, and e-commerce, where it is important to be able to predict future results based on past data. One of their applications would be as a useful tool for predicting cancer based on genetic factors, as in the above example. There are several important factors to consider when designing a random forest. If the trees in the random forest are too deep, overfitting can still occur due to over-specificity. If the forest is too large, the algorithm may become less efficient due to an increased runtime. Random forests also do not generally perform well when given sparse data with little variability. 
However, they still have numerous advantages over similar data classification algorithms such as neural networks, as they are much easier to interpret and generally require less data for training. As an integral component of random forests, bootstrap aggregating is very important to classification algorithms, and provides a critical element of variability that allows for increased accuracy when analyzing new data, as discussed below. == Improving Random Forests and Bagging == While the techniques described above utilize random forests and bagging (otherwise known as bootstrapping), there are certain techniques that can be used in order to improve their execution and voting time, their prediction accuracy, and their overall performance. The following are key steps in creating an efficient random forest: Specify the maximum depth of trees: Instead of allowing the random forest to continue until all nodes are pure, it is better to cut it off at a certain point in order to further decrease chances of overfitting. Prune the dataset: Using an extremely large dataset may create results that are less indicative of the data provided than a smaller set that more accurately represents what is being focused on. Continue pruning the data at each
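
The fit-on-bootstrap-samples and majority-vote steps described in this article can be sketched as follows. This is a toy illustration under stated assumptions, not the algorithm of any particular library: the base learner is a hypothetical one-feature decision stump (`fit_stump`), the data is a made-up one-dimensional set, and an odd number of models is used so that votes cannot tie.

```python
import random
from collections import Counter


def majority_vote(votes):
    """Return the most common vote (ties go to the vote seen first)."""
    return Counter(votes).most_common(1)[0][0]


def fit_stump(xs, ys):
    """Pick the threshold t (among the sample's x values) that minimises
    training errors for the rule 'predict 1 if x >= t, else 0'."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x >= t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def bagged_stumps(xs, ys, m, rng):
    """Fit m stumps, each on its own bootstrap sample of (xs, ys)."""
    n = len(xs)
    thresholds = []
    for _ in range(m):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        thresholds.append(fit_stump([xs[i] for i in idx],
                                    [ys[i] for i in idx]))
    return thresholds


def predict(thresholds, x):
    """Classify x by majority vote over all fitted stumps."""
    return majority_vote([int(x >= t) for t in thresholds])


# Toy data (an assumption for this sketch): label is 1 exactly when x >= 10.
xs = list(range(20))
ys = [0] * 10 + [1] * 10
model = bagged_stumps(xs, ys, m=25, rng=random.Random(0))

print(predict(model, 2), predict(model, 17))
# The four-tree vote from the cancer example: three "yes" against one "no".
print(majority_vote(["yes", "yes", "yes", "no"]))
```

For regression, the same structure applies with the vote replaced by an average of the individual model outputs.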
Native cloud application
A native cloud application (NCA) is a type of computer software that natively utilizes services and infrastructure from cloud computing providers such as Amazon EC2, Force.com, or Microsoft Azure. NCAs exhibit a combined usage of the three fundamental technologies:
- Computational grids (loosely, e.g. MapReduce)
- Data grids (e.g. distributed in-memory data caches)
- Auto-scaling on any managed infrastructure
Oren Etzioni
Oren Etzioni (born 1964) is Professor Emeritus of Computer Science at the University of Washington, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). Etzioni is a co-founder of Vercept, an AI startup, and founder and CEO of TrueMedia.org, a non-profit dedicated to fighting political deepfakes, which launched in April 2024. He is also the Founder and Technical Director of the AI2 Incubator and a venture partner at the Madrona Venture Group. == Early life and education == Etzioni is the son of Israeli-American intellectual Amitai Etzioni. He was the first student to major in computer science at Harvard University, where he earned a bachelor's degree in 1986. He earned a PhD from Carnegie Mellon University in January, 1991, supervised by Tom M. Mitchell. == University of Washington career == Etzioni joined the University of Washington faculty in 1991, immediately after receiving his PhD. He rose through the ranks to become the Washington Research Foundation Entrepreneurship Professor in Computer Science & Engineering. Etzioni's research has been focused on basic problems in the study of intelligence, machine reading, machine learning and web search. Past projects include Internet Softbots—the study of intelligent agents in the context of real-world software testbeds. In 2003, he started the KnowItAll project for acquiring massive amounts of information from the web. In 2005, he founded and became the director of the university's Turing Center. The center investigated problems in data mining, natural language processing, the Semantic Web and other web search topics. Etzioni coined the term machine reading and helped to create the first commercial comparison shopping agent. He has published over 200 technical papers, and his H-index exceeds 100. 
== Entrepreneurship == As a faculty member Etzioni was also an active entrepreneur, founding multiple companies and pioneering multiple technologies including MetaCrawler (bought by Infospace), Netbot (bought by Excite in 1997 for $35 million), and ClearForest (bought by Reuters). He founded Farecast, a travel metasearch and price prediction site, which was acquired by Microsoft in 2008 for $115 million. Before founding Farecast, he developed a program, originally called Hamlet, that used data-mining techniques to identify patterns in airfare data. He also co-founded Decide.com, a website to help consumers make buying decisions using previous price history and recommendations from other users. Decide.com was bought by eBay in September 2013. Etzioni is also a venture partner at the Madrona Venture Group. He is founder and CEO of TrueMedia.org, a non-profit dedicated to fighting political deepfakes, which launched in April 2024. Etzioni is a co-founder of Vercept, an AI startup formed in 2025. == Founding CEO of AI2 == In September 2013 Etzioni was selected as the Founding CEO of the Allen Institute for Artificial Intelligence by philanthropist Paul G. Allen, and in January 2014 he took a leave of absence from the University of Washington to serve in that role. Etzioni's technical contributions continued at AI2; for example, in 2015, he helped to create the Semantic Scholar search engine. Under Etzioni’s leadership, AI2 grew from zero to over two hundred team members including notable researchers and engineers across several domains of AI. By 2021, AI2 researchers had published nearly 700 papers in venues such as AAAI, ACL, CVPR, NeurIPS, and ICLR. Twenty-four of these papers had garnered special-recognition awards. AI2 also offered several key resources and tools to the AI community including the AllenNLP library, Semantic Scholar, and the conservation platforms EarthRanger and Skylight. 
Ed Lazowska, AI2 Board Member, has stated about Etzioni that he "took the collegial, collaborative culture that he absorbed in his 20+ years as a professor in UW's Allen School and mixed it with the singular focus that drives startups to create an elixir that AI2 folks have been drinking over the last eight years. The result is an exceptional organization of scientists, engineers, and entrepreneurs that's pursuing Paul Allen’s vision of ‘AI for the Common Good’ with extraordinary success.” == Popular press == In addition to his scientific publications, Etzioni has written commentary on AI for The New York Times, Wired, Nature, and other publications. After reading the idea in a book about AI by Brad Smith and Harry Shum, Etzioni has attempted to create an oath for AI practitioners. In 2018, he published what he called a "Hippocratic Oath for artificial intelligence practitioners" in TechCrunch. == Awards and recognition == In 1993, Etzioni received a National Young Investigator Award. In 2003, Etzioni was elected as AAAI Fellow. In 2005, Etzioni received an IJCAI Distinguished Paper Award for "A Probabilistic Model of Redundancy in Information Extraction". In 2007, he received the Robert S. Engelmore Memorial Award. In 2012 Etzioni was featured as GeekWire's "Geek of the Week". In 2013 Etzioni was voted "Geek of the Year" through GeekWire. In 2022, Etzioni received the 2012 ACL Test-of-Time Paper Award. In 2022, Etzioni, along with Ana-Maria Popescu and Henry Kautz, received the ACM Intelligent User Interfaces Most Impact Award for their 2003 paper, "Towards a Theory of Natural Language Interfaces to Databases". == Personal life == Etzioni has three children, and has said in interviews that family is his number one priority. He is married to Ivone Etzioni, and was previously married to Dr. Ruth Etzioni, a biostatistician at the Fred Hutchinson Cancer Center. Outside of his professional career, Etzioni has a wide range of personal interests. 
He has attended the Burning Man festival, which he described as a valuable way to step outside his comfort zone. His first computer was a TRS-80, and he has described his car’s GPS as his favorite gadget, joking that he has “no sense of direction.” == Selected publications == === Scholarly publications === Etzioni, Oren (July 1994). "A Softbot-based Interface to the Internet" (PDF). Communications of the ACM. Retrieved March 29, 2018. Etzioni, Oren (December 2008). "Open Information Extraction from the Web" (PDF). Communications of the ACM. Retrieved March 29, 2018. Zamir, Oren; Etzioni, Oren (1998). "Web document clustering". Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval. ACM. pp. 46–54. doi:10.1145/290941.290956. ISBN 978-1-58113-015-7. S2CID 244069. Zamir, Oren; Etzioni, Oren (May 1999). "Grouper: a dynamic clustering interface to Web search results". Computer Networks. 31 (11–16): 1361–1374. CiteSeerX 10.1.1.31.8216. doi:10.1016/S1389-1286(99)00054-7. S2CID 206134308. Popescu, Ana-Maria; Etzioni, Oren (2005). "Extracting product features and opinions from reviews". Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing - HLT '05. pp. 339–346. doi:10.3115/1220575.1220618. Etzioni, Oren; Cafarella, Michael; Downey, Doug; Popescu, Ana-Maria; Shaked, Tal; Sonderland, Stephen; Weld, Daniel; Yates, Alexander (June 2005). "Unsupervised named-entity extraction from the Web: An experimental study". Artificial Intelligence. 165 (1): 91–134. doi:10.1016/j.artint.2005.03.001. Downey, Doug; Etzioni, Oren; Sonderland, Stephen (July 2010). "Grouper: Analysis of a probabilistic model of redundancy in unsupervised information extraction". Artificial Intelligence. 174 (11): 726–748. CiteSeerX 10.1.1.174.2441. doi:10.1016/j.artint.2010.04.024. === Popular articles === Etzioni, Oren (August 4, 2011). "Web Search Needs a Shakeup" (PDF). Nature. 
Retrieved November 21, 2019. Etzioni, Oren (December 9, 2014). "AI Won't Exterminate Us – It Will Empower Us". Backchannel. Retrieved March 29, 2018. Etzioni, Oren (February 4, 2016). "To Keep AI Safe -- Use AI". Vox. Retrieved November 21, 2019. Etzioni, Oren (April 8, 2016). "Quora Session with Oren Etzioni". Quora. Retrieved March 29, 2018. Etzioni, Oren (June 15, 2016). "Deep Learning Isn't a Dangerous Magic Genie. It's Just Math". Wired. Retrieved March 29, 2018. Etzioni, Oren (September 20, 2016). "No, the Experts Don't Think Superintelligent AI is a Threat to Humanity". MIT Technology Review. Retrieved November 21, 2019. Etzioni, Oren (July 6, 2017). "Artificial intelligence: AI Zooms in on highly influential citations". Nature. Retrieved March 29, 2018. Etzioni, Oren (September 1, 2017). "How to Regulate Artificial Intelligence". The New York Times. Retrieved March 29, 2018. Etzioni, Oren (November 2, 2017). "Workers Displaced by Automation Should Try A New Job: Caregiver". Wired. Retrieved March 29, 2018. Etzioni, Oren (March 14, 2018). "A Hippocratic Oath for artificial intelligence practitioners". Tech Crunch. Retrieved March 29, 2018. Etzioni, Oren (March 7, 2018). "A 'Manhattan Project' for science research". The Hill. Retrieved November 21, 2019. Etzioni, Ore
Ernst Dickmanns
Ernst Dieter Dickmanns is a German pioneer of dynamic computer vision and of driverless cars. Dickmanns has been a professor at the University of the Bundeswehr Munich (1975–2001), and visiting professor to Caltech and to MIT, teaching courses on "dynamic vision". == Biography == Dickmanns was born in 1936. He studied aerospace and aeronautics at RWTH Aachen (1956–1961), and control engineering at Princeton University (1964/65); from 1961 to 1975 he was associated with the German Aero-Space Research Establishment (now DLR) Oberpfaffenhofen, working in the fields of flight dynamics and trajectory optimization. In 1971/72 he held a postdoctoral research associateship at the NASA Marshall Space Flight Center, Huntsville (orbiter re-entry). From 1975 to 2001 he was with UniBw Munich, where he initiated the 'Institut fuer Flugmechanik und Systemdynamik' (IFS), the Institut fuer die 'Technik Autonomer Systeme' (TAS), and the research activities in machine vision for vehicle guidance. == Pioneering work in autonomous driving == In the early 1980s his team equipped a Mercedes-Benz van with cameras and other sensors. The 5-ton van was re-engineered so that the steering wheel, throttle, and brakes could be controlled through computer commands based on real-time evaluation of image sequences. Software was written that translated the sensory data into appropriate driving commands. For safety reasons, initial experiments in Bavaria took place on streets without traffic. In 1986 the robot car "VaMoRs" managed to drive all by itself, and by 1987 it was capable of driving itself at speeds up to 96 kilometres per hour (60 mph). One of the greatest challenges in high-speed autonomous driving arises from the rapidly changing visual street scenes. Back then, computers were much slower than they are today (~1% of 1%); therefore, sophisticated computer vision strategies were necessary to react in real time. 
The team of Dickmanns solved the problem through an innovative approach to dynamic vision. Spatiotemporal models, dubbed the '4-D approach', were used right from the beginning; they did not require storing previous images but were nonetheless able to yield estimates of all 3-D position and velocity components. Attention control, including artificial saccadic movements of the platform carrying the cameras, allowed the system to focus its attention on the most relevant details of the visual input. Kalman filters were extended to perspective imaging and used to achieve robust autonomous driving even in the presence of noise and uncertainty. Feedback of prediction errors allowed bypassing the (ill-conditioned) inversion of perspective projection by least-squares parameter fits. When in 1986/87 the EUREKA-project 'PROgraMme for a European Traffic of Highest Efficiency and Unprecedented Safety' (PROMETHEUS) was initiated by the European car manufacturing industry (funding in the range of several hundred million Euros), the initially planned autonomous lateral guidance by buried cables was dropped and replaced by the much more flexible machine vision approach proposed by Dickmanns, a change partially encouraged by his successes. Most of the major car companies participated; so did Dickmanns and his team in cooperation with the Daimler-Benz AG. Substantial progress was made in the following 7 years. In particular, Dickmanns' robot cars learned to drive in traffic under various conditions. An accompanying human driver with a "red button" made sure the robot vehicle could not get out of control and become a danger to the public. From 1992 on, driving in public traffic was the standard final step in real-world testing. Several dozen Transputers, a special breed of parallel computers, were used to deal with the (by 1990s standards) enormous computational demands. 
Two culmination points were achieved in 1994/95, when Dickmanns' re-engineered autonomous S-Class Mercedes-Benz performed international demonstrations. The first was the final presentation of the PROMETHEUS project in October 1994 on Autoroute 1 near the airport Charles-de-Gaulle in Paris. With guests on board, the twin vehicles of Daimler-Benz (VITA-2) and UniBwM (VaMP) drove more than 1,000 kilometres (620 mi) on the three-lane highway in standard heavy traffic at speeds up to 130 kilometres per hour (81 mph). Driving in free lanes, convoy driving with distance keeping depending on speed, and lane changes left and right with autonomous passing were demonstrated; the latter required interpreting the road scene also in the rear hemisphere. Two cameras with different focal lengths for each hemisphere were used in parallel for this purpose. The second culmination point was a 1,758 kilometres (1,092 mi) trip in the fall of 1995 from Munich in Bavaria to Odense in Denmark to a project meeting and back. Both longitudinal and lateral guidance were performed autonomously by vision. On highways, the robot achieved speeds exceeding 175 kilometres per hour (109 mph) (there is no general speed limit on the Autobahn). Publications from Dickmanns' research group indicate a mean autonomously driven distance without resets of ~9 kilometres (5.6 mi); the longest autonomously driven stretch reached 158 kilometres (98 mi). More than half of the resets required were achieved autonomously (no human intervention). This is particularly impressive considering that the system used black-and-white video cameras and did not model situations like road construction sites with yellow lane markings; lane changes at over 140 kilometres per hour (87 mph), and other traffic with more than 40 kilometres per hour (25 mph) relative speed were handled. In total, 95% autonomous driving (by distance) was achieved. 
In the years 1994 to 2004 the older 5-ton van 'VaMoRs' was used to develop the capabilities needed for driving on networks of minor (also unsealed) roads and for cross-country driving, including avoidance of negative obstacles like ditches. Turning off onto crossroads of unknown width and intersection angles required a big effort, but was achieved with "Expectation-based, Multi-focal, Saccadic vision" (EMS-vision). This vertebrate-type vision uses animation capabilities based on knowledge about subject classes (including the autonomous vehicle itself) and their potential behaviour in certain situations. This rich background is used for control of gaze and attention as well as for locomotion. Besides ground vehicle guidance, applications of the 4-D approach to dynamic vision for unmanned air vehicles (conventional aircraft and helicopters) were also investigated. Autonomous visual landing approaches and landings were demonstrated in hardware-in-the-loop simulations with visual/inertial data fusion. Real-world autonomous visual landing approaches until shortly before touchdown were performed in 1992 with the twin-propeller aircraft Dornier 128 of the University of Brunswick at the airport there. Another success of this machine vision technology was the first ever visually controlled grasping experiment of a free-floating object in weightlessness on board the Space Shuttle Columbia D2-mission in 1993 as part of the 'Rotex'-experiment of DLR.
Trevor Hastie
Trevor John Hastie (born 27 June 1953) is an American statistician and computer scientist. He is currently serving as the John A. Overdeck Professor of Mathematical Sciences and Professor of Statistics at Stanford University. Hastie is known for his contributions to applied statistics, especially in the fields of machine learning, data mining, and bioinformatics. He has authored several popular books in statistical learning, including The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Hastie has been listed as an ISI Highly Cited Author in Mathematics by the ISI Web of Knowledge. He also contributed to the development of S. == Education and career == Hastie was born on 27 June 1953 in South Africa. He received his B.S. in statistics from Rhodes University in 1976 and a master's degree from the University of Cape Town in 1979. Hastie joined the doctoral program at Stanford University in 1980 and received his Ph.D. in 1984 under the supervision of Werner Stuetzle. His dissertation was "Principal Curves and Surfaces". Hastie began his professional career in 1977 with the South African Medical Research Council. After receiving his master's degree in 1979, he spent a year interning at the London School of Hygiene & Tropical Medicine, the Johnson Space Center in Houston, and the Biomath department at Oxford University. After receiving his doctoral degree from Stanford, Hastie returned to South Africa to work with his former employer, the South African Medical Research Council. He returned to the United States in 1986 and joined AT&T Bell Laboratories in Murray Hill, New Jersey, where he remained for nine years. Working with John Chambers, he co-directed the development of the S programming language. He joined Stanford University in 1994 as Associate Professor in Statistics and Biostatistics. He was promoted to full Professor in 1999. During the period 2006–2009, he was the chair of the Department of Statistics at Stanford University. 
In 2013 he was named the John A. Overdeck Professor of Mathematical Sciences. == Awards and honors == Hastie has been a Fellow of the Royal Statistical Society since 1979. He is also an elected Fellow of several professional and scholarly societies, including the Institute of Mathematical Statistics, the American Statistical Association, and the South African Statistical Society. He is a recipient of the 'Myrto Lefkopoulou Distinguished Lectureship' award of the Biostatistics Department at the Harvard School of Public Health. In 2018, he was elected a member of the National Academy of Sciences. In 2019 Hastie became a foreign member of the Royal Netherlands Academy of Arts and Sciences. Hastie was named for the C.R. and Bhargavi Rao Prize in 2025. Hastie and Hui Zou received the 2025 Founders of Statistics prize for their elastic net paper. == Publications == Hastie is a prolific author of scientific works on numerous topics in applied statistics, including statistical learning, data mining, statistical computing, and bioinformatics. He and his collaborators have authored about 125 scientific articles. Many of Hastie's scientific articles were coauthored by his longtime collaborator, Robert Tibshirani. He has coauthored the following books: T. Hastie and R. Tibshirani, Generalized Additive Models, Chapman and Hall, 1990. J. Chambers and T. Hastie, Statistical Models in S, Wadsworth/Brooks Cole, 1991. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer Verlag, 2009 (available for free from the author's website). G. James, D. Witten, T. Hastie, R. Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer Verlag, 2013 (available for free from the co-author's website). T. Hastie, R. Tibshirani, M. 
Wainwright, Statistical Learning with Sparsity: the Lasso and Generalizations, CRC Press, 2015 (available for free from the author's website). Bradley Efron; Trevor Hastie (2016). Computer Age Statistical Inference. Cambridge University Press. ISBN 9781107149892.
Stencil buffer
A stencil buffer is an extra data buffer, in addition to the color buffer and Z-buffer, found on modern graphics hardware. The buffer is per pixel and works on integer values, usually with a depth of one byte per pixel. The Z-buffer and stencil buffer often share the same area in the RAM of the graphics hardware. In the simplest case, the stencil buffer is used to limit the area of rendering (stenciling). More advanced usage of the stencil buffer makes use of the strong connection between the Z-buffer and the stencil buffer in the rendering pipeline. For example, stencil values can be automatically increased or decreased for every pixel that fails or passes the depth test. The simple combination of depth test and stencil modifiers makes a vast number of effects possible (such as stencil shadow volumes, two-sided stencil, compositing, decaling, dissolves, fades, swipes, silhouettes, outline drawing, or highlighting of intersections between complex primitives), though these often require several rendering passes and can therefore put a heavy load on the graphics hardware. The most typical application is still to add shadows to 3D applications. It is also used for planar reflections. Other rendering techniques, such as portal rendering, use the stencil buffer in other ways; for example, it can be used to find the area of the screen obscured by a portal and re-render those pixels correctly. The stencil buffer and its modifiers can be accessed in computer graphics by using APIs like OpenGL, Direct3D, Vulkan or Metal. == Architecture == The stencil buffer typically shares the same memory space as the Z-buffer, and typically the ratio is 24 bits for Z-buffer + 8 bits for stencil buffer or, in the past, 15 bits for Z-buffer + 1 bit for stencil buffer. Another variant is 4 + 24, where 28 of the 32 bits are used and 4 ignored. Stencil and Z-buffers are part of the frame buffer, coupled to the color buffer. 
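As a rough illustration of the 24 + 8 layout mentioned above, the following Python sketch packs a depth value and a stencil value into one 32-bit word and splits them again. The bit order (depth in the upper 24 bits, stencil in the lower 8) and the helper names `pack` and `unpack` are assumptions made for this example; real hardware formats differ in their exact layout.

```python
DEPTH_BITS = 24
STENCIL_BITS = 8
DEPTH_MAX = (1 << DEPTH_BITS) - 1      # largest 24-bit depth value
STENCIL_MAX = (1 << STENCIL_BITS) - 1  # largest 8-bit stencil value (255)


def pack(depth, stencil):
    """Combine a 24-bit depth and an 8-bit stencil into one 32-bit word.

    Assumed layout for this sketch: depth in the high bits, stencil in the low bits.
    """
    if not 0 <= depth <= DEPTH_MAX:
        raise ValueError("depth does not fit in 24 bits")
    if not 0 <= stencil <= STENCIL_MAX:
        raise ValueError("stencil does not fit in 8 bits")
    return (depth << STENCIL_BITS) | stencil


def unpack(word):
    """Split a packed 32-bit word back into (depth, stencil)."""
    return word >> STENCIL_BITS, word & STENCIL_MAX


word = pack(123456, 200)
print(hex(word), unpack(word))
```

Sharing one word this way is also why giving more bits to depth leaves fewer for the stencil value, and vice versa.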
The first chip available to a wider market was 3Dlabs' Permedia II, which supported a one-bit stencil buffer. The bits allocated to the stencil buffer can be used to represent numerical values in the range [0, 2^n - 1] (where n is the number of allocated bits), or as a Boolean matrix, each bit of which may be used to control a particular part of the scene. Any combination of these two ways of using the available memory is also possible. == Stencil test == The stencil test, or stenciling, is one of the per-pixel (per-fragment) operations, located after the alpha test and before the depth test. The stencil test ensures that undesired pixels do not reach the depth test, which saves processing time for the scene. Similarly, the alpha test can prevent pixels from reaching the stencil test. The test is carried out against a value in the stencil buffer, which it may also alter, and is controlled through the so-called stencil function and stencil operations. The stencil function is a function by which the stencil value of a certain pixel is compared to a given reference value. If this comparison is logically true, the stencil test passes; otherwise it fails. Combined with the depth test, the result falls into one of three cases:
- Stencil test is not passed
- Stencil test is passed but not the depth test
- Both tests are passed (or stencil test is passed, and the depth test is not enabled)
For each of these cases, a different operation can be set for the examined pixel. In OpenGL, the stencil function, the reference value, and the mask are set together by the function glStencilFunc. In Direct3D, each of these components is set individually using the SetRenderState method of the device currently in use. This method expects two parameters, the first of which is the state to be set and the other its value. 
In Direct3D, the render states for the stencil function, reference value, and mask are called D3DRS_STENCILFUNC, D3DRS_STENCILREF, and D3DRS_STENCILMASK, respectively. In OpenGL, the stencil operations are set with the glStencilOp function, which expects three values, one for each outcome of the stencil and depth tests. In Direct3D, again, each operation is set with a separate call to SetRenderState. The three render states that can be assigned an operation are called D3DRS_STENCILFAIL, D3DRS_STENCILZFAIL, and D3DRS_STENCILPASS.

== Z-fighting ==
Due to the limited precision of the Z-buffer, coplanar polygons that are very close together or overlapping can be rendered as a single plane with a multitude of irregular cross-sections. These sections vary with the camera position and other parameters and change rapidly. This is called Z-fighting. There are multiple solutions to this issue:
- Bring the far plane closer to restrict the scene's depth, which increases the accuracy of the Z-buffer at the cost of reducing the distance at which objects are visible in the scene.
- Increase the number of bits allocated to the Z-buffer, which is possible at the expense of memory for the stencil buffer.
- Move polygons farther apart from one another, which restricts the possibilities for the artist to create an elaborate scene.

All of these approaches can only reduce the likelihood that polygons will experience Z-fighting; they do not guarantee a definitive solution in the general case.

A solution that uses the stencil buffer relies on knowing which polygon should be in front of the others. The silhouette of the front polygon is drawn into the stencil buffer. After that, the rest of the scene is rendered only where the silhouette is absent, so it cannot clash with the front polygon.

== Shadow volume ==
Shadow volume is a technique used in 3D computer graphics to add shadows to a rendered scene. Shadow volumes were first proposed by Frank Crow in 1977 as the geometry describing the 3D shape of the region occluded from a light source.
A shadow volume divides the virtual world in two: areas that are in shadow and areas that are not. The stencil buffer implementation of shadow volumes is generally considered among the most practical general-purpose real-time shadowing techniques for use on modern 3D graphics hardware. It was popularised by the video game Doom 3, and a particular variation of the technique used in this game has become known as Carmack's Reverse.

== Reflections ==
The reflection of a scene is drawn as the scene itself, transformed and reflected relative to the "mirror" plane. This requires multiple render passes and uses the stencil buffer to restrict the areas in which the current render pass works:
1. Draw the scene, excluding the mirror areas.
2. For each mirror, lock the Z-buffer and color buffer.
3. Render the visible part of the mirror.
4. Set up the depth test so that the maximum value is entered for each pixel and the test always passes.
5. For each mirror:
   1. Set the depth test so that it passes only if the distance of a pixel is less than the current one (the default behavior).
   2. Change the transformation matrix to reflect the scene relative to the mirror plane.
   3. Unlock the Z-buffer and color buffer.
   4. Draw the scene, but only the part of it that lies between the mirror plane and the camera. In other words, the mirror plane is also a clipping plane.
   5. Lock the color buffer again, set the depth test so that it always passes, and reset the stencil for the next mirror.

== Planar shadows ==
When drawing shadows onto a plane, there are two dominant problems, plus a third that may or may not appear depending on the technique:
- The first is Z-fighting, which occurs when the plane's geometry is not split into the part covered by the shadow and the part outside it. See the Z-fighting section above.
- The second is that the shadow can extend beyond the area where the plane actually exists.
- The third is that several polygons may be drawn over the same part of the shadow, resulting in darker and lighter parts of the same shadow.
All three problems can be solved geometrically, but because hardware acceleration can be used directly, an implementation using the stencil buffer is far more elegant:
1. Enable lighting and the lights.
2. Draw the scene without any of the polygons onto which shadows should be projected.
3. Draw all polygons onto which shadows should be projected, but without lighting. While doing so, write to the stencil buffer so that the pixels of each polygon are assigned a specific value for the plane to which they belong. These values should differ by at least two, because each plane uses two values for its two states: in shadow and lit.
4. Disable any global illumination (to ensure that the next steps affect only the individually selected light).

For each plane, and for each light:
1. Modify the stencil buffer only at the pixels that carry the specific value for the selected plane. Increment the value of all pixels onto which objects between the given plane and the light are projected.
2. Enable only the selected light and draw the plane in the part where its specific value was not changed.

== Spatial shadows ==
Stencil buffer implementation of spatial drawing shadows is any shadow of a geometric body that its volume includes part of the scene that is