Viola–Jones object detection framework

The Viola–Jones object detection framework is a machine learning object detection framework proposed in 2001 by Paul Viola and Michael Jones. It was motivated primarily by the problem of face detection, although it can be adapted to the detection of other object classes. In short, it consists of a sequence of classifiers. Each classifier is a single perceptron with several binary masks (Haar features). To detect faces in an image, a sliding window is computed over the image. For each image, the classifiers are applied. If at any point, a classifier outputs "no face detected", then the window is considered to contain no face. Otherwise, if all classifiers output "face detected", then the window is considered to contain a face. The algorithm is efficient for its time, able to detect faces in 384 by 288 pixel images at 15 frames per second on a conventional 700 MHz Intel Pentium III. It is also robust, achieving high precision and recall. While it has lower accuracy than more modern methods such as convolutional neural network, its efficiency and compact size (only around 50k parameters, compared to millions of parameters for typical CNN like DeepFace) means it is still used in cases with limited computational power. For example, in the original paper, they reported that this face detector could run on the Compaq iPAQ at 2 fps (this device has a low power StrongARM without floating point hardware). == Problem description == Face detection is a binary classification problem combined with a localization problem: given a picture, decide whether it contains faces, and construct bounding boxes for the faces. To make the task more manageable, the Viola–Jones algorithm only detects full view (no occlusion), frontal (no head-turning), upright (no rotation), well-lit, full-sized (occupying most of the frame) faces in fixed-resolution images. The restrictions are not as severe as they appear, as one can normalize the picture to bring it closer to the requirements for Viola-Jones. any image can be scaled to a fixed resolution for a general picture with a face of unknown size and orientation, one can perform blob detection to discover potential faces, then scale and rotate them into the upright, full-sized position. the brightness of the image can be corrected by white balancing. the bounding boxes can be found by sliding a window across the entire picture, and marking down every window that contains a face. This would generally detect the same face multiple times, for which duplication removal methods, such as non-maximal suppression, can be used. The "frontal" requirement is non-negotiable, as there is no simple transformation on the image that can turn a face from a side view to a frontal view. However, one can train multiple Viola-Jones classifiers, one for each angle: one for frontal view, one for 3/4 view, one for profile view, a few more for the angles in-between them. Then one can at run time execute all these classifiers in parallel to detect faces at different view angles. The "full-view" requirement is also non-negotiable, and cannot be simply dealt with by training more Viola-Jones classifiers, since there are too many possible ways to occlude a face. == Components of the framework == A full presentation of the algorithm is in. Consider an image I ( x , y ) {\displaystyle I(x,y)} of fixed resolution ( M , N ) {\displaystyle (M,N)} . Our task is to make a binary decision: whether it is a photo of a standardized face (frontal, well-lit, etc) or not. Viola–Jones is essentially a boosted feature learning algorithm, trained by running a modified AdaBoost algorithm on Haar feature classifiers to find a sequence of classifiers f 1 , f 2 , . . . , f k {\displaystyle f_{1},f_{2},...,f_{k}} . Haar feature classifiers are crude, but allows very fast computation, and the modified AdaBoost constructs a strong classifier out of many weak ones. At run time, a given image I {\displaystyle I} is tested on f 1 ( I ) , f 2 ( I ) , . . . f k ( I ) {\displaystyle f_{1}(I),f_{2}(I),...f_{k}(I)} sequentially. If at any point, f i ( I ) = 0 {\displaystyle f_{i}(I)=0} , the algorithm immediately returns "no face detected". If all classifiers return 1, then the algorithm returns "face detected". For this reason, the Viola-Jones classifier is also called "Haar cascade classifier". === Haar feature classifiers === Consider a perceptron f w , b {\displaystyle f_{w,b}} defined by two variables w ( x , y ) , b {\displaystyle w(x,y),b} . It takes in an image I ( x , y ) {\displaystyle I(x,y)} of fixed resolution, and returns f w , b ( I ) = { 1 , if ∑ x , y w ( x , y ) I ( x , y ) + b > 0 0 , else {\displaystyle f_{w,b}(I)={\begin{cases}1,\quad {\text{if }}\sum _{x,y}w(x,y)I(x,y)+b>0\\0,\quad {\text{else}}\end{cases}}} A Haar feature classifier is a perceptron f w , b {\displaystyle f_{w,b}} with a very special kind of w {\displaystyle w} that makes it extremely cheap to calculate. Namely, if we write out the matrix w ( x , y ) {\displaystyle w(x,y)} , we find that it takes only three possible values { + 1 , − 1 , 0 } {\displaystyle \{+1,-1,0\}} , and if we color the matrix with white on + 1 {\displaystyle +1} , black on − 1 {\displaystyle -1} , and transparent on 0 {\displaystyle 0} , the matrix is in one of the 5 possible patterns shown on the right. Each pattern must also be symmetric to x-reflection and y-reflection (ignoring the color change), so for example, for the horizontal white-black feature, the two rectangles must be of the same width. For the vertical white-black-white feature, the white rectangles must be of the same height, but there is no restriction on the black rectangle's height. ==== Rationale for Haar features ==== The Haar features used in the Viola-Jones algorithm are a subset of the more general Haar basis functions, which have been used previously in the realm of image-based object detection. While crude compared to alternatives such as steerable filters, Haar features are sufficiently complex to match features of typical human faces. For example: The eye region is darker than the upper-cheeks. The nose bridge region is brighter than the eyes. Composition of properties forming matchable facial features: Location and size: eyes, mouth, bridge of nose Value: oriented gradients of pixel intensities Further, the design of Haar features allows for efficient computation of f w , b ( I ) {\displaystyle f_{w,b}(I)} using only constant number of additions and subtractions, regardless of the size of the rectangular features, using the summed-area table. === Learning and using a Viola–Jones classifier === Choose a resolution ( M , N ) {\displaystyle (M,N)} for the images to be classified. In the original paper, they recommended ( M , N ) = ( 24 , 24 ) {\displaystyle (M,N)=(24,24)} . ==== Learning ==== Collect a training set, with some containing faces, and others not containing faces. Perform a certain modified AdaBoost training on the set of all Haar feature classifiers of dimension ( M , N ) {\displaystyle (M,N)} , until a desired level of precision and recall is reached. The modified AdaBoost algorithm would output a sequence of Haar feature classifiers f 1 , f 2 , . . . , f k {\displaystyle f_{1},f_{2},...,f_{k}} . The details of the modified AdaBoost algorithm is detailed below. ==== Using ==== To use a Viola-Jones classifier with f 1 , f 2 , . . . , f k {\displaystyle f_{1},f_{2},...,f_{k}} on an image I {\displaystyle I} , compute f 1 ( I ) , f 2 ( I ) , . . . f k ( I ) {\displaystyle f_{1}(I),f_{2}(I),...f_{k}(I)} sequentially. If at any point, f i ( I ) = 0 {\displaystyle f_{i}(I)=0} , the algorithm immediately returns "no face detected". If all classifiers return 1, then the algorithm returns "face detected". === Learning algorithm === The speed with which features may be evaluated does not adequately compensate for their number, however. For example, in a standard 24x24 pixel sub-window, there are a total of M = 162336 possible features, and it would be prohibitively expensive to evaluate them all when testing an image. Thus, the object detection framework employs a variant of the learning algorithm AdaBoost to both select the best features and to train classifiers that use them. This algorithm constructs a "strong" classifier as a linear combination of weighted simple “weak” classifiers. h ( x ) = sgn ⁡ ( ∑ j = 1 M α j h j ( x ) ) {\displaystyle h(\mathbf {x} )=\operatorname {sgn} \left(\sum _{j=1}^{M}\alpha _{j}h_{j}(\mathbf {x} )\right)} Each weak classifier is a threshold function based on the feature f j {\displaystyle f_{j}} . h j ( x ) = { − s j if f j < θ j s j otherwise {\displaystyle h_{j}(\mathbf {x} )={\begin{cases}-s_{j}&{\text{if }}f_{j}<\theta _{j}\\s_{j}&{\text{otherwise}}\end{cases}}} The threshold value θ j {\displaystyle \theta _{j}} and the polarity s j ∈ ± 1 {\displaystyle s_{j}\in \pm 1} are determined in the training, as well as the coefficients α j {\displaystyle \alpha _{j}} . Here a simplified version of the lea

Audio-visual speech recognition

Audio visual speech recognition (AVSR) is a technique that uses image processing capabilities in lip reading to aid speech recognition systems in recognizing indeterministic phones or giving preponderance among near probability decisions. Each system of lip reading and speech recognition works separately, then their results are mixed at the stage of feature fusion. As the name suggests, it has two parts. First one is the audio part and second one is the visual part. In audio part we use features like log mel spectrogram, mfcc etc. from the raw audio samples and we build a model to get feature vector out of it . For visual part generally we use some variant of convolutional neural network to compress the image to a feature vector after that we concatenate these two vectors (audio and visual ) and try to predict the target object.

FutureMedia

FutureMedia is a program that analyzes the state and future of digital, social, and mobile media. It functions as a collaborative initiative at Georgia Tech and the Georgia Tech Research Institute. FutureMedia consults approximately 500 faculty members working in those fields. == History == In 2019, Future Media expanded into the Direct-To-Consumer market by acquiring Australian watchmaker Oak & Jackal. == Programs == === FutureMedia Fest === The organization most recently hosted FutureMedia Fest 2010, a four-day conference (Oct 4–7, 2010) with a keynote addresses from Michael Jones, the chief technology advocate at Google. The event featured panels, workshops, and technology demonstrations. === FutureMedia Outlook === Contemporaneous with FutureMedia Fest 2010, the organization released the FutureMedia Outlook, an analysis of the future of media, concentrating on six major trends in those fields, including information overload, personalization, data integrity, an expectation of multimedia, augmented reality, and collaborative software.

Cultural technology

Cultural technology (Korean: 문화기술; Hanja: 文化技術; RR: munhwagisul) is a system used by South Korean talent agencies to promote K-pop culture throughout the world as part of the Korean Wave. The system was developed by Lee Soo-man, founder of talent agency and record company SM Entertainment. == History == === Coinage === During a speech at the Stanford Graduate School of Business in 2011, Lee said he coined the term "cultural technology" as a system about fourteen years prior, when S.M. Entertainment decided to promote its K-pop artists to all of Asia. In the late 1990s, Lee and his colleagues created a manual on cultural technology, which specified the steps needed to popularize K-pop artists outside South Korea. "The manual, which all S.M. employees are instructed to learn, explains when to bring in foreign composers, producers, and choreographers; what chord progressions to use in what country; the precise color of eyeshadow a performer should wear in a particular country; the exact hand gestures he or she should make; and the camera angles to be used in the videos (a three-hundred-and-sixty-degree group shot to open the video, followed by a montage of individual closeups)," according to The New Yorker. The term "cultural technology," apart from Lee's systemized definition, can be traced back to the lectures of Michael White, an Australian social worker, educator, and therapeutic theorist and his works Narrative Means to Therapeutic Ends (1990) and Maps of Narrative Practice (2007). Its usage may also date further back to French philosopher Michel Foucault (1977). South Korean computer scientist Kwangyun Wohn said he coined the term "culture technology" in 1994. Cultural technology has also been one of six technology initiatives of the South Korean government since 2001. In regards to cultural technology, the Korean Wave is considered one of the most successful outcomes of government support of exporting Korean entertainment products. === The Four Core Stages === The cultural technology system originally employed by SM Entertainment since the 1990s existed in four stages: Casting, Training, Producing, and Marketing/Managing. Each of these four stages were curated to help spread the Hallyu wave through the development of its artists, and are present in the strategies of many other South Korean talent agencies when creating, debuting, and marketing groups. ==== Casting ==== While the majority of K-pop idols are from South Korea, some are from Japan, China, or Thailand. Many of Korea's entertainment companies, such as SM's Global Auditions, Bighit's Hit It auditions, and YG's Next Generation, host worldwide auditions. Scouting and streetcasting are also common, with members like BTS's Jin recruited for their looks or other surface reasons. Sometimes, casting agents go to dance schools to recruit the top dancers to be trained further at the entertainment company. ==== Training ==== Idols train extensively before debut. They receive training in dance, vocal activities, presentation, and other areas that will benefit them in the industry. Oftentimes, this training will last for years at a time, and trainees are in the proverbial dungeon. Before debut, idols and groups attempt to gain fans through pre-debut activities. SM Entertainment has a system in place called SM Rookies, which is a pre-debut team that hosts concerts and releases videos that strengthen the fanbase of the group even before their first single is released. Other forms of pre-debut activities include featuring in other, more seasoned idols' videos—like Nu'est in Orange Caramel or Exo in Girls' Generation-TTS Twinkle or BTS in Jo Kwon. One particular method of pre-debut training is coupled with casting in production shows, like Sixteen and Produce 101, in which members for a final group are selected and trained. ==== Producing ==== The production of music is integral in culture technology. For cultural technology, production of music helps create differentiated content to set trends in the K-pop world—trends that vary from music to also costume, choreography, and music videos. SM in particular focuses heavily on the expansion globally. Some companies also outsource production to more internationally famed parties, like Cube Entertainment's partnership with Skrillex for 4minute's Act. 7. ==== Marketing/Managing ==== In the marketing and management stage, talent agencies seek to broaden their reach. Often, idols have potential for being actors and actresses in dramas, or perhaps hosts/permanent members of variety shows like Kim Hee-chul in Knowing Bros. This so-called omnidirectional marketing lineup ranges over lifestyle and seeks to reach many aspects of living, like music, TV, drama, entertainment, sports, and fashion. This is also where older groups find new life, like Super Junior. Companies are not complacent but experiment constantly to develop the best marketing for the best management system. Marketing also aspires to branch out to international audiences, sometimes via the implementation of variety shows. Despite being primarily in Korean, these variety shows are accessible to all due to the simplistic, easily understood nature of shows—game-oriented shows like Run BTS! or consistently subbed shows like Weekly Idol are popular in showing the fun-loving side of idols. == Evolution into New Culture Technology == In February 2016, SM hosted a press conference discussing the future of SM and its cultural technology. Lee Soo-man announced the implementation of New Culture Technology, an SM-specific system. While SM's cultural technology in the past relied on local, Korean artists like Rain and BoA, the updated model tries to embed more and more foreign singers from strategic markets into larger girl or boy bands. These imported singers are then used to promote their acts back in their respective home countries. New Culture Technology is five projects—SM Station, EDM, Digital Platforms, Rookies Entertainment, and MCN—and one experimental group, NCT. It is a convergence and expansion of SM's four core culture technologies developed and deals heavily with interaction and the desire to innovate through communication. === SM Station === SM announced their intention of creating a new song every week for 52 weeks. Through this constant output of music, they intend to stray away from conventional forms of music and show active movement in digital music market and physical album market through freely and continuously releasing music. Additionally, this SM Station will feature collaborations between artists, producers, composers, and company brands outside the SM label. The name of SM Station is both derived from the radio station and the metaphorical train station. === NCT === Neo Culture Technology (NCT) introduced the idea of "Interactive". SM company tried to connect the targeting market, customers and artist, in order to lead the K-pop culture. NCT (Neo Culture Technology) is the new artist group formed by SM that embodies the concepts of cultural technology. With the seemingly limitless combinations and groups, SM aspires to make the whole world a stage for NCT. Since 2023, there are six NCT groups, who debuted on the digital song sales: NCT U, NCT 127, NCT Dream, WayV, NCT DoJaeJung, and NCT Wish. As of October 2023, the group consists of 25 members: Johnny, Taeyong, Yuta, Kun, Doyoung, Ten, Jaehyun, Winwin, Jungwoo, Mark, Xiaojun, Hendery, Renjun, Jeno, Haechan, Jaemin, Yangyang, Chenle, Jisung, Sion, Riku, Yushi, Daeyoung, Ryo, and Sakuya. ScreaM Records ScreaM Records has been released by SM Entertainment as an EDM label since 2016 for "SM TOWN: New Culture Technology". ScreaM Records is made for "performances made to be enjoyed". It collaborates with inside and outside Korean well-known EDM DJs. ScreaM Records has first launched collaborated song "Wave" E-Mart's home electronics store, Electro Mart. "Our goal is to provide opportunities to producers who have yet to be discovered and produce world famous DJs from the Asian scene." a ScreaM Records representative said. == Three stages of globalization == According to Lee, there are three stages necessary to popularize Korean culture outside South Korea: exporting the product, collaborating with international companies to expand the product's presence abroad, and finally creating a joint venture with international companies. As part of their joint ventures with international companies, South Korean talent agencies may hire foreign composers, producers, and choreographers to ensure K-pop songs feel "local" to foreign countries.

Attention inequality

Attention inequality is the inequality of distribution of attention across users on social networks, people in general, and for scientific papers. Yun Family Foundation introduced "Attention Inequality Coefficient" as a measure of inequality in attention and arguments it by the close interconnection with wealth inequality. == Relationship to economic inequality == Attention inequality is related to economic inequality since attention is an economically scarce good. The same measures and concepts as in classical economy can be applied for attention economy. The relationship develops also beyond the conceptual level—considering the AIDA process, attention is the prerequisite for real monetary income on the Internet. On data of 2018, a significant relationship between likes and comments on Facebook to donations is proven for non-profit organizations. == Attention economy == The attention economy refers to the practice of maximizing the attention users give to a product for advertising-related reasons. Attention economy remains one of the most common forms of advertising, and has been steadily increasing thanks to new technologies such as television, internet and social media. It is one of the most widely-used approaches to economy for its effectiveness for maximising the noticeability of a certain product. == Attention inequality in social media == In social media, attention inequality refers to the unequal distribution of users' attention on social media platforms. This means that instead of an equal distribution of attention, fewer sources receive a disproportionate share of attention, leaving many unnoticed. This phenomenon is possibly the result of social media algorithms, which are commonly designed to drive maximum engagement. This phenomenon is a large factor in the polarization and creation of echo-chambers. Social media algorithms tend to note content that is already performing well and display it to more users, while content that is equally engaging or well-made is not recommended to users. Posts that trigger strong emotions usually out-perform more "uncontroversial" content. When many users interact with the post, it signals the algorithm that the specific post drives engagement. The algorithm then tends to recommend that type of content to an exponential number of people, potentially outperforming "un-emotional" content. These factors, when combined, tend to create an unequal social media environment. == Attention inequality in science == According to a recent 2025 study about research inequality among scientists published in Information Processing and Management, scientific discourse is restricted to a small group of connected scientists, and is frequently not an accurate representation of the whole scientific community. Using citation-network analysis in the fields of nanoscience and chemical physics, the study claims that a group of connected scientists has a significant notability in the scientific community. The calculated connection strength between these scientists is estimated to be about 4.5, the study also says that these authors cite each other four times more often than would be predicted in a random network, whereas ordinary scientists that exist outside of this group only reach an estimated connection strength of 0.9. The study findings suggest that that scientific attention is not distributed by merit, but rather by the connectedness of the scientists involved in the research. == Extent == As data of 2008 shows, 50% of the attention is concentrated on approximately 0.2% of all hostnames, and 80% on 5% of hostnames. The Gini coefficient of attention distribution lay in 2008 at over 0.921 for such commercial domains names as ac.jp and at 0.985 for .org-domains. The Gini coefficient was measured on Twitter in 2016 for the number of followers as 0.9412, for the number of mentions as 0.9133, and for the number of retweets as 0.9034. For comparison, the world's income Gini coefficient was 0.68 in 2005 and 0.904 in 2018. More than 96% of all followers, 93% of the retweets, and 93% of all mentions are owned by 20% of Twitter. == Causes == At least for scientific papers, today's consensus states that inequality is unexplainable by variations of quality and individual talent. The Matthew effect plays a significant role in the emergence of attention inequality—those who already enjoy large amounts of attention get even more attention, and those who do not lose even more. Ranking algorithms based on relevance to the user have been found to alleviate the inequality of the number of posts across topics.

DeepRoute.ai

DeepRoute.ai (Chinese: 元戎启行) is a Chinese autonomous driving company founded in 2019 and headquartered in Shenzhen, China. The company develops full-stack self-driving solutions including perception, decision-making, and control systems. == History == DeepRoute.ai was founded in February 2019 in Shenzhen, China, by Zhou Guang (周光), who serves as the company's CEO. In September 2019, the company collaborated with Dongfeng for a live-streamed autonomous driving demonstration. In October 2019, during the 7th Military World Games, DeepRoute.ai conducted Robotaxi demonstration operations. In November 2019, it obtained an intelligent connected vehicle road test permit for public roads in Shenzhen. In October 2020, DeepRoute.ai signed an "Autonomous Driving Leadership Project" with Dongfeng to build one of China's largest autonomous fleets. In August 2020, DeepRoute.ai announced its partnership with Cao Cao Mobility, a Geely-backed ride-hailing company, to test Robotaxis in Hangzhou for daily operations, planning to provide Robotaxis during the 2022 Asian Games. In September 2021, DeepRoute.ai secured US$300 million in a Series B funding round led by Alibaba. In December 2021, the company unveiled its DeepRoute-Driver 2.0, an L4-level autonomous driving solution comprising five solid-state lidar sensors, eight cameras, a proprietary computing system and an optional millimeter-wave radar. with a production cost of under US$10,000. In June 2022, it partnered with Deppon Express to provide autonomous light truck freight transfer services. In March 2023, the company launched its high-precision map-free intelligent driving solution, DeepRoute-Driver 3.0. In November 2024, Great Wall Motor announced a $100 million Series C funding round for Deeproute. With this, Deeproute has completed five rounds of financing, raising a cumulative total of over $500 million. Its shareholders include Fosun RZ Capital, Yunqi Partners, Alibaba, Vision Plus Capital, and Dongfeng, among others. In the same month, Deeproute.ai emphasised that they were in "deep cooperation" with Nvidia and spoke on being part of the first batch of companies in China to get a hold of Nvidia's newer Thor chip for cars which will be used in a new system released next year. This new system will help manage more complex driving scenarios through visual cues. == Products == === VLA Model === VLA Model is a Vision–language–action model designed for autonomous driving systems. It integrates visual perception, semantic understanding, and action decision-making into a unified framework, aiming to enhance the safety and adaptability of advanced driver-assistance systems (ADAS) in complex road environments. The model was officially launched on August 26, 2025, as the core of DeepRoute.ai's DeepRoute IO 2.0 platform. The VLA model is characterized by its "visual-language-action" architecture, which incorporates a chain-of-thought (CoT) reasoning capability inspired by large language models. This design is intended to address the "black box" limitations of traditional end-to-end autonomous driving systems by enabling the model to analyze information, infer causality, and make decisions in a more transparent and interpretable manner. === Appliance === The company has partnered with several automakers including Dongfeng Motor Corporation and Geely to develop and test autonomous vehicles.

HTTP compression

HTTP compression is a capability that can be built into web servers and web clients to improve transfer speed and bandwidth utilization. HTTP data is compressed before it is sent from the server: compliant browsers will announce what methods are supported to the server before downloading the correct format; browsers that do not support compliant compression method will download uncompressed data. The most common compression schemes include gzip and Brotli; a full list of available schemes is maintained by the IANA. There are two different ways compression can be done in HTTP. At a lower level, a Transfer-Encoding header field may indicate the payload of an HTTP message is compressed. At a higher level, a Content-Encoding header field may indicate that a resource being transferred, cached, or otherwise referenced is compressed. Compression using Content-Encoding is more widely supported than Transfer-Encoding, and some browsers do not advertise support for Transfer-Encoding compression to avoid triggering bugs in servers. == Compression scheme negotiation == The negotiation is done in two steps, described in RFC 2616 and RFC 9110: 1. The web client advertises which compression schemes it supports by including a list of tokens in the HTTP request. For Content-Encoding, the list is in a field called Accept-Encoding; for Transfer-Encoding, the field is called TE. 2. If the server supports one or more compression schemes, the outgoing data may be compressed by one or more methods supported by both parties. If this is the case, the server will add a Content-Encoding or Transfer-Encoding field in the HTTP response with the used schemes, separated by commas. The web server is by no means obligated to use any compression method – this depends on the internal settings of the web server and also may depend on the internal architecture of the website in question. == Content-Encoding tokens == The official list of tokens available to servers and client is maintained by IANA, and it includes: br – Brotli, a compression algorithm specifically designed for HTTP content encoding, defined in RFC 7932 and implemented in all modern major browsers. compress – UNIX "compress" program method (historic; deprecated in most applications and replaced by gzip or deflate) deflate – compression based on the deflate algorithm (described in RFC 1951), a combination of the LZ77 algorithm and Huffman coding, wrapped inside the zlib data format (RFC 1950); exi – W3C Efficient XML Interchange gzip – GNU zip format (described in RFC 1952). Uses the deflate algorithm for compression, but the data format and the checksum algorithm differ from the "deflate" content-encoding. This method is the most broadly supported as of March 2011. identity – No transformation is used. This is the default value for content coding. pack200-gzip – Network Transfer Format for Java Archives zstd – Zstandard compression, defined in RFC 8478 In addition to these, a number of unofficial or non-standardized tokens are used in the wild by either servers or clients: bzip2 – compression based on the free bzip2 format, supported by lighttpd lzip – compression based on the free lzip format, supported by wget and Links lzma – compression based on (raw) LZMA is available in Opera 20, and in elinks via a compile-time option peerdist – Microsoft Peer Content Caching and Retrieval rsync – delta encoding in HTTP, implemented by a pair of rproxy proxies. xpress – Microsoft compression protocol used by Windows 8 and later for Windows Store application updates. LZ77-based compression optionally using a Huffman encoding. xz – LZMA2-based content compression, supported by a non-official Firefox patch; and fully implemented in mget since 2013-12-31. == Servers that support HTTP compression == SAP NetWeaver Microsoft IIS: built-in or using third-party module Apache HTTP Server, via mod_deflate (despite its name, only supporting gzip), and mod_brotli Hiawatha HTTP server: serves pre-compressed files Cherokee HTTP server, On the fly gzip and deflate compressions Oracle iPlanet Web Server Zeus Web Server lighttpd nginx – built-in Applications based on Tornado, if "compress_response" is set to True in the application settings (for versions prior to 4.0, set "gzip" to True) Jetty Server – built-into default static content serving and available via servlet filter configurations GeoServer Apache Tomcat IBM Websphere AOLserver Ruby Rack, via the Rack::Deflater middleware HAProxy Varnish – built-in. Works also with ESI Armeria – Serving pre-compressed files NaviServer – built-in, dynamic and static compression Caddy – built-in via encode Many content delivery networks also implement HTTP compression to improve speedy delivery of resources to end users. The compression in HTTP can also be achieved by using the functionality of server-side scripting languages like PHP, or programming languages like Java. Various online tools exist to verify a working implementation of HTTP compression. These online tools usually request multiple variants of a URL, each with different request headers (with varying Accept-Encoding content). HTTP compression is considered to be implemented correctly when the server returns a document in a compressed format. By comparing the sizes of the returned documents, the effective compression ratio can be calculated (even between different compression algorithms). == Problems preventing the use of HTTP compression == A 2009 article by Google engineers Arvind Jain and Jason Glasgow states that more than 99 person-years are wasted daily due to increase in page load time when users do not receive compressed content. This occurs when anti-virus software interferes with connections to force them to be uncompressed, where proxies are used (with overcautious web browsers), where servers are misconfigured, and where browser bugs stop compression being used. Internet Explorer 6, which drops to HTTP 1.0 (without features like compression or pipelining) when behind a proxy – a common configuration in corporate environments – was the mainstream browser most prone to failing back to uncompressed HTTP. Another problem found while deploying HTTP compression on large scale is due to the deflate encoding definition: while HTTP 1.1 defines the deflate encoding as data compressed with deflate (RFC 1951) inside a zlib formatted stream (RFC 1950), Microsoft server and client products historically implemented it as a "raw" deflated stream, making its deployment unreliable. For this reason, some software, including the Apache HTTP Server, only implements gzip encoding. == Security implications == Compression allows a form of chosen plaintext attack to be performed: if an attacker can inject any chosen content into the page, they can know whether the page contains their given content by observing the size increase of the encrypted stream. If the increase is smaller than expected for random injections, it means that the compressor has found a repeat in the text, i.e. the injected content overlaps the secret information. This is the idea behind CRIME. In 2012, a general attack against the use of data compression, called CRIME, was announced. While the CRIME attack could work effectively against a large number of protocols, including but not limited to TLS, and application-layer protocols such as SPDY or HTTP, only exploits against TLS and SPDY were demonstrated and largely mitigated in browsers and servers. The CRIME exploit against HTTP compression has not been mitigated at all, even though the authors of CRIME have warned that this vulnerability might be even more widespread than SPDY and TLS compression combined. In 2013, a new instance of the CRIME attack against HTTP compression, dubbed BREACH, was published. A BREACH attack can extract login tokens, email addresses or other sensitive information from TLS encrypted web traffic in as little as 30 seconds (depending on the number of bytes to be extracted), provided the attacker tricks the victim into visiting a malicious web link. All versions of TLS and SSL are at risk from BREACH regardless of the encryption algorithm or cipher used. Unlike previous instances of CRIME, which can be successfully defended against by turning off TLS compression or SPDY header compression, BREACH exploits HTTP compression which cannot realistically be turned off, as virtually all web servers rely upon it to improve data transmission speeds for users. As of 2016, the TIME attack and the HEIST attack are now public knowledge.