AI Data Analyst Zalando

AI Data Analyst Zalando — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Spatiotemporal reservoir resampling

    Spatiotemporal reservoir resampling

    Spatiotemporal reservoir resampling, commonly known as ReSTIR (from "Reservoir-based SpatioTemporal Importance Resampling"), is a collection of computer graphics techniques for reusing samples during rendering. It was developed primarily to allow more realistic lighting in real-time rendering, because relatively few rays can be traced per pixel while maintaining an acceptable frame rate. It can also be used to speed up off-line path tracing. The first ReSTIR paper, published in 2020, provided algorithms for direct lighting, allowing scenes containing thousands of lights to be rendered in real time on a high-end GPU. Researchers later proposed versions for rendering indirect lighting (and more recently, motion blur and depth of field) and built up a framework of mathematical concepts and notation conventions that help analyze such algorithms. A major focus of this work is removing or reducing the bias that could be introduced when samples from other pixels or frames are reused—or selectively allowing some bias in order to speed up rendering and reduce variance (visible as "noise" in the image). Versions for path tracing apply transformations called shift mappings to samples, typically reusing parts of paths closer to the light and modifying the portion closer to the camera. ReSTIR-related papers and talks have been presented every year at the SIGGRAPH conference since 2020. One of the first games to incorporate ReSTIR into its rendering was Cyberpunk 2077. == Overview and motivation == According to Chris Wyman, one of the co-authors of the original paper, although developers commonly thought that bias was acceptable for real-time rendering, end users (e.g. gamers) are well-aware of the artifacts caused by bias and many have a negative opinion of common sample-reuse techniques such as temporal anti-aliasing (TAA), which may cause "ghosting" when the camera moves, and denoising, which causes blurring and other artifacts. ReSTIR techniques can reduce or avoid these types of bias by reusing samples of the set of possible paths taken by light to reach the camera, instead of reusing rendered pixel color values (which are typically the average of multiple samples, discarding information such as the direction of the light). While other techniques reuse samples in a generic post-processing step, ReSTIR passes can test for shadowing, and reused samples are converted into pixel color values by rendering code that takes the characteristics of different materials into account (e.g. by implementing BRDFs). However the output of ReSTIR is noisy, and a denoising pass is typically still used. Stochastic ray tracing techniques such as path tracing need to average multiple samples (produced by tracing individual rays) in order to render a visually acceptable image. When using a simple unbiased renderer based on Monte Carlo integration, halving the deviation of the result (apparent as "noise" in the image) requires multiplying the number of samples by four, meaning that a rapidly increasingly number of samples is needed to improve quality, Standard ways to mitigate this problem include importance sampling (which requires finding improved sampling distributions for specific situations), and quasi-Monte Carlo integration (which usually still requires tracing a large number of rays). ReSTIR offers a solution that multiplies the effective number of samples while tracing a fixed number of additional rays per frame. Temporal reuse multiplies the effective sample count by the number of frames rendered. Spatial reuse multiplies the effective count by the number of neighboring pixels examined. These two types of reuse can be combined, allowing spatial reuse to be applied recursively, which appears to offer an exponentially increasing effective sample count, however this is quickly limited by the size of the neighborhood used for spatial reuse. Spatial reuse is also potentially less effective near shadow and object edges, especially for objects with fine geometric detail, and temporal reuse is limited by movement of the camera and scene elements. == Variations == Many variations of ReSTIR have been proposed that generalize or improve the original technique (which builds on an earlier method called RIS), specialize it for particular types of illumination or other visual effects, or allow incorporation into rendering algorithms other than standard path tracing. Some published versions are listed below. == Algorithms == === Basic algorithm === ReSTIR uses a combination of resampled importance sampling (RIS) and weighted reservoir sampling (WRS) which the authors call streaming RIS. RIS processes samples from an initial probability distribution (e.g. a probability distribution for which a cheap sampling method exists) and generates samples in a new probability distribution (e.g. a sampling distribution that is optimal for rendering but is impractical to draw samples from directly). WRS allows this to be done while storing only a small number of samples in memory, which is especially helpful on a GPU. Information about the samples is stored in a data structure called a reservoir. WRS also allows samples from multiple reservoirs to be combined ("merged") into a single reservoir; this is crucial for sample reuse. Each pixel has a reservoir, typically containing only a single sample when ReSTIR is used for real-time rendering (some implementations use a larger number, e.g. four samples). The reservoir is typically initialized to a sample drawn using a simple method and is then updated by RIS steps and by reservoir merging, so that the pixel value produced by shading using the sample(s) currently in the reservoir, times the weight for the sample, is always an unbiased estimate of the correct pixel value. If appropriate resampling steps are used, the variance of this estimate (or some function of it, typically the luminance of the RGB color value) decreases with each step. A possible sequence of steps performed for each frame, suitable for computing unbiased direct illumination (DI) is: Perform reservoir resampling by drawing multiple light samples and using streaming RIS to choose one, using probabilities based on a target function, e.g. the luminance of the sample's contribution to the pixel. A weight is also computed for the sample. Typically, a single visibility check is performed here, after choosing a sample, setting the weight to 0 if the light is shadowed. Resampling (combined with the visibility check) ensures that the expected value of the weight times the sample brightness is the correct (unbiased) value for the pixel. (temporal reuse) For each pixel, merge the sample(s) from the previous frame into the current reservoir. Multiple importance sampling (MIS) weights are used to avoid bias due to the fact that the samples in the previous frame's reservoirs may have a different target probability distribution if the objects, lights, or camera have moved. (spatial reuse) For each pixel, choose one or more neighboring pixels and merge their samples into the current pixel's reservoir. Multiple importance sampling (MIS) weights are used to avoid bias due to the fact that the samples in each pixel's reservoir have a different target probability distribution. Because computing unbiased MIS weights requires tracing additional rays (along with other work such as evaluating BRDFs), real-time rendering often uses only a single neighboring pixel. Use the sample in each pixel's reservoir, along with its weight, to determine the color of the pixel for the current frame. Alternatively, multiple samples examined during the preceding steps may be averaged and used to shade the pixel instead (decoupled shading and sampling). For direct lighting, the initial samples used in step 1 are typically drawn by importance sampling from the set of lights in a scene. The algorithm above (from the original ReSTIR paper) draws many lower-quality light samples (e.g. 32) using a fast method, without considering visibility, and chooses one using streaming RIS. Visibility is then tested for the final chosen sample. Considering visibility for each sample drawn would require tracing 32 rays, which would make it much more expensive. The intent is to reduce the number of rays traced, relying on the sample reuse in steps 2 and 3 to make up for the loss of quality caused by rejecting many of the rays due to shadowing. A large part of the initial efforts to optimize ReSTIR (to make it run in real-time on available hardware) went into reducing the cost of randomly sampling the lights. Glossy surfaces may require a larger number of samples, and combining light sampling with BRDF sampling (using MIS) may increase quality. Step 2 (temporal reuse) is sometimes skipped for off-line rendering, and the output of multiple repetitions of initial sampling and spatial reuse is averaged instead; this helps avoids artifacts due to correlations. Step 3 (spatial reuse) may be repeated multiple times in a single frame.

    Read more →
  • Content adaptation

    Content adaptation

    Content adaptation is the action of transforming content to adapt to device capabilities. Content adaptation is usually related to mobile devices, which require special handling because of their limited computational power, small screen size, and constrained keyboard functionality. Content adaptation could roughly be divided to two fields: Media content adaptation that adapts media files. Browsing content adaptation that adapts a website to mobile devices. == Browsing content adaptation == Advances in the capabilities of small, mobile devices such as mobile phones (cell phones) and Personal Digital Assistants have led to an explosion in the number of types of device that can now access the Web. Some commentators refer to the Web that can be accessed from mobile devices as the Mobile Web. The sheer number and variety of Web-enabled devices poses significant challenges for authors of websites who want to support access from mobile devices. The W3C Device Independence Working Group described many of the issues in its report Authoring Challenges for Device Independence. Content adaptation is one approach to a solution. Rather than requiring authors to create pages explicitly for each type of device that might request them, content adaptation transforms an author's materials automatically. For example, content might be converted from a device-independent markup language, such as XDIME, an implementation of the W3C's DIAL specification, into a form suitable for the device, such as XHTML Basic, C-HTML, or WML. Similarly, a suitable device-specific CSS style sheet or a set of in-line styles might be generated from abstract style definitions. Likewise, a device specific layout might be generated from abstract layout definitions. Once created, the device-specific materials form the response returned to the device from which the request was made. Another way is to use the latest trend responsive design based on CSS, covered in this article (RWD). Content adaptation requires a processor that performs the selection, modification, and generation of materials to form the device-specific result. IBM's Websphere Everyplace Mobile Portal (WEMP), BEA Systems' WebLogic Mobility Server, Morfeo's MyMobileWeb, and Apache Cocoon are examples of such processors. Wurfl and WALL are popular open source tools for content adaptation. WURFL is an XML-based Device Description Repository with APIs to access the data in Java and PHP (and other popular programming languages). WALL (Wireless Abstraction Library) lets a developer author mobile pages which look like plain HTML, but converts them to WML, C-HTML, or XHTML Mobile Profile, depending on the capabilities of the device from which the HTTP request originates. GreasySpoon lets the developer build plugins for content editing, in JavaScript, Ruby (programming language), and more, just like the Firefox application GreaseMonkey. Alembik (Media Transcoding Server) is a Java (J2EE) application providing transcoding services for variety of clients and for different media types (image, audio, video, etc.). It is fully compliant with OMA's Standard Transcoder Interface specification and is distributed under the LGPL open source license. In 2007, the first large scale carrier-grade deployments of content transformation, on existing mass-market handsets, with no software download required, were deployed by Vodafone in the UK and globally for Yahoo! oneSearch, using the Novarra Vision solution. Novarra's content adaptation solution had been used in enterprise intranet deployments as early as 2003 (at that time, the platform was named "Engines for Wireless Data"). InfoGin, the 9-year-old content-adaptation company with customers like Vodafone, Orange, Telefónica and PCCW. The patented "Web to Mobile adaptation", Mobile Matrix Transcoder, Multimedia and Documents transcoders, Video adaptation supporte. Launched in 2007, Bytemobile's Web Fidelity Service was another carrier-grade, commercial infrastructure solution, which provided wireless content adaptation to mobile subscribers on their existing mass-market handsets, with no client download required.

    Read more →
  • The Big Book of Social Media

    The Big Book of Social Media

    The Big Book of Social Media: Case Studies, Stories, Perspectives, released in November 2010 by Yorkshire Publishing, is a compilation of non-fiction articles and chapters written by social media experts in their respective fields and edited by Robert Fine, organizer of the Cool Social Conferences World Tour and founder of Cool Blue Press, with a foreword by Sam Feist, political director for CNN. == Synopsis == The publisher, on its site, summed up the book as, "Not business. Not marketing. This is an idea book." And an article in Business Insider described the book as bringing "the social back into social media." == Contributors == Contributing authors include: Alan Rosenblatt, Alane Anderson, Alecia Dantico, Alex Priest, Alfred Naranjo, Becky Carroll, Carri Bugbee, Cathy Scott, Colleen Crinklaw, Constantine Markides, Cordelia Mendoza, Craig Kanalley, Dave Ingland, Eric Andersen, Eric Brown, Gary Zukowski, Haja Rasambainarivo, Jennifer Kaplan, Kari Quaas, Lauri Stevens, Lev Ekster, Mark Stelzner, Matthew Felling, Matt Stewart, Melani Gordon, Michael Bourne, Michele Mattia, Mirna Bard, Neal Schaffer, Nic Evans, Noaf Ereiqat, Pek Pongpaet, Perri Gorman, Phil Baumann, Regina Holliday, Rory Cooper, Sam Feist, Shashi Bellamkonda, Shrinath Navghane, Steve Pratt, Ted Nguyen, Todd Schnick, Tonia Ries, Wayne Burke, as well as Robert Fine. In December 2011, some of the contributing authors organized "Tweet It Forward," a holiday charity fundraiser, with net proceeds benefitting the Food Bank for New York City. == Reception == Reviewer Mike Brown wrote on the Brainzooming blog that the book goes "beyond the valueless chatter out there; it provides solid discussions of real-life social media strategy implementations that have truly integrated organizational objectives and delivered real metrics." And Tech Cocktail wrote it in its review, "Through a collection of entertaining anecdotes and insightful marketing agendas, one sees what social media is truly all about and how it is revolutionizing the communications industry." In 2011, at the SXSW social media festival in Austin, Texas, Fine launched Cool Blue Press and reintroduced The Big Book of Social Media, with plans, he told a reporter from the Washington Examiner, for other new media books and publishing projects, including The Social Media Monthly magazine. The book was reviewed in 2012 by SAGE Publications for its Journalism and Mass Communication Educator magazine. It is also cited in several books and journals. === Awards === The book was a winner in the 4th Annual Reader's Choice "Small Business Book Awards" for 2011. Windmill Networking named it the Top 15 recommended social media books of 2010.

    Read more →
  • HTTP cookie

    HTTP cookie

    An HTTP cookie (also called web cookie, Internet cookie, browser cookie, or simply cookie) is a small block of data created by a web server while a user is browsing a website and placed on the user's computer or other device by the user's web browser. Cookies are placed on the device used to access a website, and more than one cookie may be placed on a user's device during a session. Cookies serve useful and sometimes essential functions on the web. They enable web servers to store stateful information (such as items added in the shopping cart in an online store) on the user's device or to track the user's browsing activity (including clicking particular buttons, logging in, or recording which pages were visited in the past). They can also be used to save information that the user previously entered into form fields, such as names, addresses, passwords, and payment card numbers for subsequent use. Authentication cookies are commonly used by web servers to authenticate that a user is logged in, and with which account they are logged in. Without the cookie, users would need to authenticate themselves by logging in on each page containing sensitive information that they wish to access. The security of an authentication cookie generally depends on the security of the issuing website and the user's web browser, and on whether the cookie data is encrypted. Security vulnerabilities may allow a cookie's data to be read by an attacker, used to gain access to user data, or used to gain access (with the user's credentials) to the website to which the cookie belongs (see cross-site scripting and cross-site request forgery for examples). Tracking cookies, and especially third-party tracking cookies, are commonly used as ways to compile long-term records of individuals' browsing histories — a potential privacy concern that prompted European and U.S. lawmakers to take action in 2011. European law requires that all websites targeting European Union member states gain "informed consent" from users before storing non-essential cookies on their device. == Background == === Origin of the name === The term cookie was coined by web-browser programmer Lou Montulli. It was derived from the term magic cookie, which is a packet of data a program receives and sends back unchanged, used by Unix programmers. === History === Magic cookies were already used in computing when computer programmer Lou Montulli had the idea of using them in web communications in June 1994. At the time, he was an employee of Netscape Communications, which was developing an e-commerce application for MCI. Vint Cerf and John Klensin represented MCI in technical discussions with Netscape Communications. MCI did not want its servers to have to retain partial transaction states, which led them to ask Netscape to find a way to store that state in each user's computer instead. Cookies provided a solution to the problem of reliably implementing a virtual shopping cart. Together with John Giannandrea, Montulli wrote the initial Netscape cookie specification the same year. Version 0.9beta of Mosaic Netscape, released on 13 October 1994, supported cookies. The first use of cookies (out of the labs) was checking whether visitors to the Netscape website had already visited the site. Montulli applied for a patent for the cookie technology in 1995, which was granted in 1998. Support for cookies was integrated with Internet Explorer in version 2, released in October 1995. The introduction of cookies was not widely known to the public at the time. In particular, cookies were accepted by default, and users were not notified of their presence. The public learned about cookies after the Financial Times published an article about them on 12 February 1996. In the same year, cookies received a lot of media attention, especially because of potential privacy implications. Cookies were discussed in two U.S. Federal Trade Commission hearings in 1996 and 1997. The development of the formal cookie specifications was already ongoing. In particular, the first discussions about a formal specification started in April 1995 on the www-talk mailing list. A special working group within the Internet Engineering Task Force (IETF) was formed. Two alternative proposals for introducing state in HTTP transactions had been proposed by Brian Behlendorf and David Kristol respectively. But the group, headed by Kristol himself and Lou Montulli, soon decided to use the Netscape specification as a starting point. In February 1996, the working group identified third-party cookies as a considerable privacy threat. The specification produced by the group was eventually published as RFC 2109 in February 1997. It specifies that third-party cookies were either not allowed at all, or at least not enabled by default. At this time, advertising companies were already using third-party cookies. The recommendation about third-party cookies of RFC 2109 was not followed by Netscape and Internet Explorer. RFC 2109 was superseded by RFC 2965 in October 2000. RFC 2965 added a Set-Cookie2 header field, which informally came to be called "RFC 2965-style cookies" as opposed to the original Set-Cookie header field which was called "Netscape-style cookies". Set-Cookie2 was seldom used, however, and was deprecated in RFC 6265 in April 2011 which was written as a definitive specification for cookies as used in the real world. No modern browser recognizes the Set-Cookie2 header field. == Terminology == === Session cookie === A session cookie (also known as an in-memory cookie, transient cookie or non-persistent cookie) exists only in temporary memory while the user navigates a website. Session cookies expire or are deleted when the user closes the web browser. Session cookies are identified by the browser by the absence of an expiration date assigned to them. === Persistent cookie === A persistent cookie expires at a specific date or after a specific length of time. For the persistent cookie's lifespan set by its creator, its information will be transmitted to the server every time the user visits the website that it belongs to, or every time the user views a resource belonging to that website from another website (such as an advertisement). For this reason, persistent cookies are sometimes referred to as tracking cookies because they can be used by advertisers to record information about a user's web browsing habits over an extended period of time. Persistent cookies are also used for reasons such as keeping users logged into their accounts on websites, to avoid re-entering login credentials at every visit. (See § Uses, below.) === Secure cookie === A secure cookie can only be transmitted over an encrypted connection (i.e. HTTPS). They cannot be transmitted over unencrypted connections (i.e. HTTP). This makes the cookie less likely to be exposed to cookie theft via eavesdropping. A cookie is made secure by adding the Secure flag to the cookie. === Http-only cookie === An http-only cookie cannot be accessed by client-side APIs, such as JavaScript. This restriction eliminates the threat of cookie theft via cross-site scripting (XSS). However, the cookie remains vulnerable to cross-site tracing (XST) and cross-site request forgery (CSRF) attacks. A cookie is given this characteristic by adding the HttpOnly flag to the cookie. === Same-site cookie === In 2016 Google Chrome version 51 introduced a new kind of cookie with attribute SameSite with possible values of Strict, Lax or None. With attribute SameSite=Strict, the browsers would only send cookies to a target domain that is the same as the origin domain. This would effectively mitigate cross-site request forgery (CSRF) attacks. With SameSite=Lax, browsers would send cookies with requests to a target domain even it is different from the origin domain, but only for safe requests such as GET (POST is unsafe) and not third-party cookies (inside iframe). Attribute SameSite=None would allow third-party (cross-site) cookies, however, most browsers require secure attribute on SameSite=None cookies. The Same-site cookie is incorporated into a new RFC draft for "Cookies: HTTP State Management Mechanism" to update RFC 6265 (if approved). Chrome, Firefox, and Edge started to support Same-site cookies. The key of rollout is the treatment of existing cookies without the SameSite attribute defined, Chrome has been treating those existing cookies as if SameSite=None, this would let all website/applications run as before. Google intended to change that default to SameSite=Lax in Chrome 80 planned to be released in February 2020, but due to potential for breakage of those applications/websites that rely on third-party/cross-site cookies and COVID-19 circumstances, Google postponed this change to Chrome 84. === Supercookie === A supercookie is a cookie with an origin of a top-level domain (such as .com) or a public suffix (such as .co.uk). Ordinary cookies, by contrast, have an origin of a specific domain name, such as ex

    Read more →
  • Uniform convergence in probability

    Uniform convergence in probability

    Uniform convergence in probability is a form of convergence in probability in statistical asymptotic theory and probability theory. It means that, under certain conditions, the empirical frequencies of all events in a certain event-family uniformly converge to their theoretical probabilities. Uniform convergence in probability has applications to statistics as well as machine learning as part of statistical learning theory. Specifically, the Glivenko-Cantelli theorem and the homonymous classes of functions are fundamentally related to uniform convergence. The law of large numbers says that, for each single event A {\displaystyle A} , its empirical frequency in a sequence of independent trials converges (with high probability) to its theoretical probability. In many application however, the need arises to judge simultaneously the probabilities of events of an entire class S {\displaystyle S} from one and the same sample. Moreover, it, is required that the relative frequency of the events converge to the probability uniformly over the entire class of events S {\displaystyle S} . The Uniform Convergence Theorem gives a sufficient condition for this convergence to hold. Roughly, if the event-family is sufficiently simple (its VC dimension is sufficiently small) then uniform convergence holds. == Definitions == For a class of predicates H {\displaystyle H} defined on a set X {\displaystyle X} and a set of samples x = ( x 1 , x 2 , … , x m ) {\displaystyle x=(x_{1},x_{2},\dots ,x_{m})} , where x i ∈ X {\displaystyle x_{i}\in X} , the empirical frequency of h ∈ H {\displaystyle h\in H} on x {\displaystyle x} is Q ^ x ( h ) = 1 m | { i : 1 ≤ i ≤ m , h ( x i ) = 1 } | . {\displaystyle {\widehat {Q}}_{x}(h)={\frac {1}{m}}|\{i:1\leq i\leq m,h(x_{i})=1\}|.} The theoretical probability of h ∈ H {\displaystyle h\in H} is defined as Q P ( h ) = P { y ∈ X : h ( y ) = 1 } . {\displaystyle Q_{P}(h)=P\{y\in X:h(y)=1\}.} The Uniform Convergence Theorem states, roughly, that if H {\displaystyle H} is "simple" and we draw samples independently (with replacement) from X {\displaystyle X} according to any distribution P {\displaystyle P} , then with high probability, the empirical frequency will be close to its expected value, which is the theoretical probability. Here "simple" means that the Vapnik–Chervonenkis dimension of the class H {\displaystyle H} is small relative to the size of the sample. In other words, a sufficiently simple collection of functions behaves roughly the same on a small random sample as it does on the distribution as a whole. The Uniform Convergence Theorem was first proved by Vapnik and Chervonenkis using the concept of growth function. == Uniform Convergence Theorem == The statement of the Uniform Convergence Theorem is as follows: If H {\displaystyle H} is a set of { 0 , 1 } {\displaystyle \{0,1\}} -valued functions defined on a set X {\displaystyle X} and P {\displaystyle P} is a probability distribution on X {\displaystyle X} then for ε > 0 {\displaystyle \varepsilon >0} and m {\displaystyle m} a positive integer, we have: P m { | Q P ( h ) − Q x ^ ( h ) | ≥ ε for some h ∈ H } ≤ 4 Π H ( 2 m ) e − ε 2 m / 8 . {\displaystyle P^{m}\{|Q_{P}(h)-{\widehat {Q_{x}}}(h)|\geq \varepsilon {\text{ for some }}h\in H\}\leq 4\Pi _{H}(2m)e^{-\varepsilon ^{2}m/8}.} In the above, for any x ∈ X m , {\displaystyle x\in X^{m},} Q P ( h ) = P { ( y ∈ X : h ( y ) = 1 } , {\displaystyle Q_{P}(h)=P\{(y\in X:h(y)=1\},} Q ^ x ( h ) = 1 m | { i : 1 ≤ i ≤ m , h ( x i ) = 1 } | {\displaystyle {\widehat {Q}}_{x}(h)={\frac {1}{m}}|\{i:1\leq i\leq m,h(x_{i})=1\}|} and | x | = m . {\displaystyle |x|=m.} P m {\displaystyle P^{m}} indicates that the probability is taken over x {\displaystyle x} consisting of m {\displaystyle m} i.i.d. draws from the distribution P . {\displaystyle P.} Finally, the growth function Π H {\displaystyle \Pi _{H}} is defined in the following way, for any { 0 , 1 } {\displaystyle \{0,1\}} -valued functions H {\displaystyle H} over X {\displaystyle X} and for any natural number m {\displaystyle m} : Π H ( m ) = max | { h ∩ D : D ⊆ X , | D | = m , h ∈ H } | . {\displaystyle \Pi _{H}(m)=\max |\{h\cap D:D\subseteq X,|D|=m,h\in H\}|.} From the point of view of Learning Theory one can consider H {\displaystyle H} to be the Concept/Hypothesis class defined over the instance set X {\displaystyle X} . Crucially, the Sauer–Shelah lemma implies that Π H ( m ) ≤ m d {\displaystyle \Pi _{H}(m)\leq m^{d}} , where d {\displaystyle d} is the VC dimension of H {\displaystyle H} . == Proof of the Uniform Convergence Theorem == and are the sources of the proof below. Before we get into the details of the proof of the Uniform Convergence Theorem we will present a high level overview of the proof. Symmetrization: We transform the problem of analyzing | Q P ( h ) − Q ^ x ( h ) | ≥ ε {\displaystyle |Q_{P}(h)-{\widehat {Q}}_{x}(h)|\geq \varepsilon } into the problem of analyzing | Q ^ r ( h ) − Q ^ s ( h ) | ≥ ε / 2 {\displaystyle |{\widehat {Q}}_{r}(h)-{\widehat {Q}}_{s}(h)|\geq \varepsilon /2} , where r {\displaystyle r} and s {\displaystyle s} are i.i.d samples of size m {\displaystyle m} drawn according to the distribution P {\displaystyle P} . One can view r {\displaystyle r} as the original randomly drawn sample of length m {\displaystyle m} , while s {\displaystyle s} may be thought as the testing sample which is used to estimate Q P ( h ) {\displaystyle Q_{P}(h)} . Permutation: Since r {\displaystyle r} and s {\displaystyle s} are picked identically and independently, so swapping elements between them will not change the probability distribution on r {\displaystyle r} and s {\displaystyle s} . So, we will try to bound the probability of | Q ^ r ( h ) − Q ^ s ( h ) | ≥ ε / 2 {\displaystyle |{\widehat {Q}}_{r}(h)-{\widehat {Q}}_{s}(h)|\geq \varepsilon /2} for some h ∈ H {\displaystyle h\in H} by considering the effect of a specific collection of permutations of the joint sample x = r | | s {\displaystyle x=r||s} . Specifically, we consider permutations σ ( x ) {\displaystyle \sigma (x)} which swap x i {\displaystyle x_{i}} and x m + i {\displaystyle x_{m+i}} in some subset of 1 , 2 , . . . , m {\displaystyle {1,2,...,m}} . The symbol r | | s {\displaystyle r||s} means the concatenation of r {\displaystyle r} and s {\displaystyle s} . Reduction to a finite class: We can now restrict the function class H {\displaystyle H} to a fixed joint sample and hence, if H {\displaystyle H} has finite VC Dimension, it reduces to the problem to one involving a finite function class. We present the technical details of the proof. It should be stressed that this proof glosses over details like the measurability of the events V {\displaystyle V} and R {\displaystyle R} ; measurability is granted in the case of H {\displaystyle H} being finite or countable, but this is not normally the case in standard applications of the theorem (e.g. for statistical learning theory or to prove the Glivenko-Cantelli theorem). To get measurability, one needs to use a notion of separability of the underlying space, possibly related to H {\displaystyle H} . === Symmetrization === Lemma: Let V = { x ∈ X m : | Q P ( h ) − Q ^ x ( h ) | ≥ ε for some h ∈ H } {\displaystyle V=\{x\in X^{m}:|Q_{P}(h)-{\widehat {Q}}_{x}(h)|\geq \varepsilon {\text{ for some }}h\in H\}} and R = { ( r , s ) ∈ X m × X m : | Q r ^ ( h ) − Q ^ s ( h ) | ≥ ε / 2 for some h ∈ H } . {\displaystyle R=\{(r,s)\in X^{m}\times X^{m}:|{\widehat {Q_{r}}}(h)-{\widehat {Q}}_{s}(h)|\geq \varepsilon /2{\text{ for some }}h\in H\}.} Then for m ≥ 2 ε 2 {\displaystyle m\geq {\frac {2}{\varepsilon ^{2}}}} , P m ( V ) ≤ 2 P 2 m ( R ) {\displaystyle P^{m}(V)\leq 2P^{2m}(R)} . Proof: By the triangle inequality, if | Q P ( h ) − Q ^ r ( h ) | ≥ ε {\displaystyle |Q_{P}(h)-{\widehat {Q}}_{r}(h)|\geq \varepsilon } and | Q P ( h ) − Q ^ s ( h ) | ≤ ε / 2 {\displaystyle |Q_{P}(h)-{\widehat {Q}}_{s}(h)|\leq \varepsilon /2} then | Q ^ r ( h ) − Q ^ s ( h ) | ≥ ε / 2 {\displaystyle |{\widehat {Q}}_{r}(h)-{\widehat {Q}}_{s}(h)|\geq \varepsilon /2} . Therefore, P 2 m ( R ) ≥ P 2 m { ∃ h ∈ H , | Q P ( h ) − Q ^ r ( h ) | ≥ ε and | Q P ( h ) − Q ^ s ( h ) | ≤ ε / 2 } = ∫ V P m { s : ∃ h ∈ H , | Q P ( h ) − Q ^ r ( h ) | ≥ ε and | Q P ( h ) − Q ^ s ( h ) | ≤ ε / 2 } d P m ( r ) = A {\displaystyle {\begin{aligned}&P^{2m}(R)\\[5pt]\geq {}&P^{2m}\{\exists h\in H,|Q_{P}(h)-{\widehat {Q}}_{r}(h)|\geq \varepsilon {\text{ and }}|Q_{P}(h)-{\widehat {Q}}_{s}(h)|\leq \varepsilon /2\}\\[5pt]={}&\int _{V}P^{m}\{s:\exists h\in H,|Q_{P}(h)-{\widehat {Q}}_{r}(h)|\geq \varepsilon {\text{ and }}|Q_{P}(h)-{\widehat {Q}}_{s}(h)|\leq \varepsilon /2\}\,dP^{m}(r)\\[5pt]={}&A\end{aligned}}} since r {\displaystyle r} and s {\displaystyle s} are independent. Now for r ∈ V {\displaystyle r\in V} fix an h ∈ H {\displaystyle h\in H} such that | Q P ( h ) − Q ^ r ( h ) | ≥ ε {\displaystyle |Q_{P}(h)-{\widehat {Q}}_{r}(h)|\geq \varepsilon } . For this h {\displaystyle h} , we shall

    Read more →
  • International Webmasters Association

    International Webmasters Association

    The International Webmasters Association (IWA) is a non-profit association for education and certification of web professionals founded in 1996. It provides a Certified Web Professional certification. One of its objectives is to build a World Wide Web that is a true global community. According to the IWA, as of 2025 it has more than 100 official chapters with over 300,000 individual members in 106 countries. In 2001, the IWA merged with the HTML Writers Guild (HWG) and joined the World Wide Web Consortium (W3C). IWA's accomplishments include the publishing of the industry's first guidelines for ethical and professional standards, web certification and education programs, specialized employment resources, and technical assistance to individuals and businesses. IWA members participate to the activities of W3C WCAG Working Group, ATAG Working Group, and the XHTML Working Group. They have also participated in other initiatives such as the Multimodal Interaction Working Group which developed EMMA, the Extensible MultiModal Annotation markup language.

    Read more →
  • Feature detection (web development)

    Feature detection (web development)

    Feature detection (also feature testing) is a technique used in web development for handling differences between runtime environments (typically web browsers or user agents), by programmatically testing for clues that the environment may or may not offer certain functionality. This information is then used to make the application adapt in some way to suit the environment: to make use of certain APIs, or tailor for a better user experience. Its proponents claim it is more reliable and future-proof than other techniques like user agent sniffing and browser-specific CSS hacks. == Techniques == A feature test can take many forms. It is essentially any snippet of code which gives some level of confidence that a required feature is indeed supported. However, in contrast to other techniques, feature detection usually focuses on performing actions which directly relate to the feature to be detected, rather than heuristics. === JavaScript === JavaScript feature detection can inspect the DOM and the local JavaScript environment to test whether browser features or APIs are supported. The simplest technique is to check for the existence of a relevant object or property. For example, the Geolocation API (used for accessing the device's knowledge of its geographical location, possibly obtained from a GPS navigation device) exposes a geolocation property on the navigator object in the DOM; the presence of which implies the Geolocation API is supported: if ('geolocation' in navigator) { // Geolocation API is supported } For a higher level of confidence, some feature tests will attempt to invoke the feature then look for clues that it behaved properly. For example, a test for support for cookies might attempt to set a value as a cookie and then verify it can be read back. === CSS === In CSS, the at-rule @supports introduced in 2015 allows to test if a given feature is supported. For instance the following code activates the declarations only if the user agent supports display: flex: == Undetectables == Some browser features are considered undetectable, because no clues are known to give sufficient confidence that a feature is supported. These are often because of limited information available to the JavaScript environment in the browser; generally features must be exposed via the DOM in some way in order to be detectable using JavaScript. When undetectables are encountered, it is common to turn to user agent sniffing as an alternative mechanism, or to employ defensive coding to minimise the impact if the feature turns out not to be supported. The Modernizr project maintains a record of known undetectables on their wiki.

    Read more →
  • MADI

    MADI

    Multichannel Audio Digital Interface (MADI) standardized as AES10 by the Audio Engineering Society (AES) defines the data format and electrical characteristics of an interface that carries multiple channels of digital audio. The AES first documented the MADI standard in AES10-1991 and updated it in AES10-2003 and AES10-2008. The MADI standard includes a bit-level description and has features in common with the two-channel AES3 interface. MADI supports serial digital transmission over coaxial cable or fibre-optic lines of 28, 56, 32, or 64 channels; and sampling rates to 96 kHz and beyond with an audio bit depth of up to 24 bits per channel. Like AES3 and ADAT Lightpipe, it is a unidirectional interface from one sender to one receiver. == Development and applications == MADI was developed by AMS Neve, Solid State Logic, Sony and Mitsubishi and is widely used in the audio industry, especially in the professional audio sector. It provides advantages over other audio digital interface protocols and standards such as AES3, ADAT Lightpipe, TDIF (Tascam Digital Interface), and S/PDIF (Sony/Philips Digital Interface). These advantages include: Support for a greater number of channels per line Use of coaxial and optical fiber media that support transmission of audio signals over 100 meters, up to 3000 meters over multi-mode and 40,000 meters over single-mode optical fiber The original specification (AES10-1991) defined the MADI link as a 56-channel transport for linking large-format mixing consoles to digital multitrack recording devices. Large broadcast studios also adopted it for routing multi-channel audio throughout their facilities. The 2003 revision (AES10-2003) adds a 64-channel capability by removing varispeed operation and supports 96 kHz sampling frequency with reduced channel capacity. The latest AES10-2008 standard includes minor clarifications and updates to correspond to the current AES3 standard. Audio over Ethernet of various types is the primary alternative to MADI for transport of many channels of professional digital audio. == Transmission format == MADI links use a transmission format similar to Fiber Distributed Data Interface (FDDI) networking. Since MADI is most often transmitted on copper links via 75-ohm coaxial cables, it more closely compares to the FDDI specification for copper-based links, called CDDI. AES10-2003 recommends using BNC connectors with coaxial cables and SC connectors with optic fibers. MADI over fibre can support a range of up to 2 km. The basic data rate is 100 Mbit/s of data using 4B5B encoding to produce a 125 MHz physical baud rate. Unlike AES3, this clock is not synchronized to the audio sample rate, and the audio data payload is padded using JK sync symbols. Sync symbols may be inserted at any subframe boundary, and must occur at least once per frame. Though the standard disassociates the transmission clock from the audio sample rate, and thus requires a separate word clock connection to maintain synchronization, some vendors do give the option of locking to parts of the transmission timing information for purposes of deriving a word clock. The audio data is almost identical to the AES3 payload, though with more channels. Rather than letters, MADI assigns channel numbers from 0–63. Frame synchronization is provided by sync symbols outside the data itself, rather than an embedded preamble sequence, and the first four time slots of each sub-channel are encoded as normal data, used for sub-channel identification: Bit 0: Set to 1 to mark channel 0, the first channel in each frame Bit 1: Set to 1 to indicate that this channel is active (contains interesting data) Bit 2: notA/B channel marker, used to mark left (0) and right (1) channels. Generally, even channels are A and odd channels are B. Bit 3: Set to 1 to mark the beginning of a 192-sample data block == Sampling frequency == The original AES10-1991 specification allowed 56 channels at sample rates from 32 to 48 kHz with an additional vari-speed range of ± 12.5%. This leads to a total range of 28 to 54 kHz. At the highest frequency, this produces a total of 56 × 32 × 54 = 96768 kbit/s, leaving 3.232% of the channel for synchronization marks and transmit clock error. The 2003 revision specifies different relations between sampling frequency and number of channels. 32 kHz to 48 kHz ± 12.5%, 56 channels; 32 kHz to 48 kHz nominal, 64 channels; 64 kHz to 96 kHz ± 12.5%, 28 channels. With a 48 kHz sampling frequency, 64 channels take 64 × 32 × 48000 = 98.304 Mbit/s. Adding the minimum 8 × 58 kbit/s of framing produces 98688 bit/s, leaving 1.312% free for timing variation and other overhead. Both versions of the standard accommodate higher sampling frequencies (for example, 96 kHz or 192 kHz) by using two or more channels per audio sample on the link.

    Read more →
  • Content Security Policy

    Content Security Policy

    Content Security Policy (CSP) is a computer security standard introduced to prevent cross-site scripting (XSS), clickjacking and other code injection attacks resulting from execution of malicious content in the trusted web page context. It is a Candidate Recommendation of the W3C working group on Web Application Security, widely supported by modern web browsers. CSP provides a standard method for website owners to declare approved origins of content that browsers should be allowed to load on that website—covered types are JavaScript, CSS, HTML frames, web workers, fonts, images, embeddable objects such as Java applets, ActiveX, audio and video files, and other HTML5 features. == Status == The standard, originally named Content Restrictions, was proposed by Robert Hansen in 2004, first implemented in Firefox 4 and quickly picked up by other browsers. Version 1 of the standard was published in 2012 as W3C candidate recommendation and quickly with further versions (Level 2) published in 2014. As of 2023, the draft of Level 3 is being developed with the new features being quickly adopted by the web browsers. The following header names are in use as part of experimental CSP implementations: Content-Security-Policy – standard header name proposed by the W3C document. Google Chrome supports this as of version 25. Firefox supports this as of version 23, released on 6 August 2013. WebKit supports this as of version 528 (nightly build). Chromium-based Microsoft Edge support is similar to Chrome's. X-WebKit-CSP – deprecated, experimental header introduced into Google Chrome, Safari and other WebKit-based web browsers in 2011. X-Content-Security-Policy – deprecated, experimental header introduced in Gecko 2 based browsers (Firefox 4 to Firefox 22, Thunderbird 3.3, SeaMonkey 2.1). A website can declare multiple CSP headers, also mixing enforcement and report-only ones. Each header will be processed separately by the browser. CSP can also be delivered within the HTML code using a meta tag, although in this case its effectiveness will be limited. Internet Explorer 10 and Internet Explorer 11 also support CSP, but only sandbox directive, using the experimental X-Content-Security-Policy header. A number of web application frameworks support CSP, for example AngularJS (natively) and Django (middleware). Instructions for Ruby on Rails have been posted by GitHub. Web framework support is however only required if the CSP contents somehow depend on the web application's state—such as usage of the nonce origin. Otherwise, the CSP is rather static and can be delivered from web application tiers above the application, for example on load balancer or web server. === Bypasses === In December 2015 and December 2016, a few methods of bypassing 'nonce' allowlisting origins were published. In January 2016, another method was published, which leverages server-wide CSP allowlisting to exploit old and vulnerable versions of JavaScript libraries hosted at the same server (frequent case with CDN servers). In May 2017 one more method was published to bypass CSP using web application frameworks code. == Mode of operation == If the Content-Security-Policy header is present in the server response, a compliant client enforces the declarative allowlist policy. One example goal of a policy is a stricter execution mode for JavaScript in order to prevent certain cross-site scripting attacks. In practice this means that a number of features are disabled by default: Inline JavaScript code