AI For Business Microsoft

AI For Business Microsoft — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Single-page application

    Single-page application

    A single-page application (SPA) is a web application or website that interacts with the user by dynamically rewriting the current web page with new data from the web server, instead of the default method of loading entire new pages. The goal is faster transitions that make the website feel more like a native app. In a SPA, a page refresh never occurs; instead, all necessary HTML, JavaScript, and CSS code is either retrieved by the browser with a single page load, or the appropriate resources are dynamically loaded and added to the page as necessary, usually in response to user actions. == History == The origins of the term single-page application are unclear, though the concept was discussed at least as early as 2003 by technology evangelists from Netscape. Stuart Morris, a programming student at Cardiff University, Wales, wrote the self-contained website at slashdotslash.com with the same goals and functions in April 2002, and later the same year Lucas Birdeau, Kevin Hakman, Michael Peachey and Clifford Yeh described a single-page application implementation in US patent 8,136,109. Earlier forms were called rich web applications. JavaScript can be used in a web browser to display the user interface (UI), run application logic, and communicate with a web server. Mature free libraries are available that support the building of a SPA, reducing the amount of JavaScript code developers have to write. == Technical approaches == There are various techniques available that enable the browser to retain a single page even when the application requires server communication. === Document hashes === HTML authors can leverage element IDs to show or hide different sections of the HTML document. Then, using CSS, authors can use the :target pseudo-class selector to only show the section of the page which the browser navigated to. === JavaScript frameworks === Web browser JavaScript frameworks and libraries, such as Angular, Ember.js, ExtJS, Knockout.js, Meteor.js, React, Vue.js, and Svelte have adopted SPA principles. Aside from ExtJS, all of these are free. AngularJS is a discontinued fully client-side framework. AngularJS's templating is based on bidirectional UI data binding. Data-binding is an automatic way of updating the view whenever the model changes, as well as updating the model whenever the view changes. The HTML template is compiled in the browser. The compilation step creates pure HTML, which the browser re-renders into the live view. The step is repeated for subsequent page views. In traditional server-side HTML programming, concepts such as controller and model interact within a server process to produce new HTML views. In the AngularJS framework, the controller and model states are maintained within the client browser. Therefore, new pages are capable of being generated without any interaction with a server. Angular 2+ is a SPA Framework developed by Google after AngularJS. There is a strong community of developers using this framework. The framework is updated twice every year. New features and fixes are frequently added in this framework. Ember.js is a client-side JavaScript web application framework based on the model–view–controller (MVC) software architectural pattern. It allows developers to create scalable single-page applications by incorporating common idioms and best practices into a framework that provides a rich object model, declarative two-way data binding, computed properties, automatically updating templates powered by Handlebars.js, and a router for managing application state. ExtJS is also a client side framework that allows creating MVC applications. It has its own event system, window and layout management, state management (stores) and various UI components (grids, dialog windows, form elements etc.). It has its own class system with either dynamic or static loader. The application built with ExtJS can either exist on its own (with state in the browser) or with the server (e.g. with REST API that is used to fill its internal stores). ExtJS has only built in capabilities to use localStorage so larger applications need a server to store state. Knockout.js is a client side framework which uses templates based on the Model-View-ViewModel pattern. Meteor.js is a full-stack (client-server) JavaScript framework designed exclusively for SPAs. It features simpler data binding than Angular, Ember or ReactJS, and uses the Distributed Data Protocol and a publish–subscribe pattern to automatically propagate data changes to clients in real-time without requiring the developer to write any synchronization code. Full stack reactivity ensures that all layers, from the database to the templates, update themselves automatically when necessary. Ecosystem packages such as Server Side Rendering address the problem of search engine optimization. React is a JavaScript library for building user interfaces. It is maintained by Facebook, Instagram and a community of individual developers and corporations. React uses a syntax extension for JavaScript, named JSX, which is a mix of JS and HTML (a subset of HTML). Several companies use React with Redux (JavaScript library) which adds state management capabilities, which (with several other libraries) lets developers create complex applications. Vue.js is a JavaScript framework for building user interfaces. Vue developers also provide Pinia for state management. Svelte is a framework for building user interfaces that compiles Svelte code to JavaScript DOM (Document Object Model) manipulations, avoiding the need to bundle a framework to the client, and allowing for simpler application development syntax. ==== Capabilities and trade-offs in modern frameworks ==== JavaScript-based web application frameworks, such as React and Vue, provide extensive capabilities but come with associated trade-offs. These frameworks often extend or enhance features available through native web technologies, such as routing, component-based development, and state management. While native web standards, including Web Components, modern JavaScript APIs like Fetch and ES Modules, and browser capabilities like Shadow DOM, have advanced significantly, frameworks remain widely used for their ability to enhance developer productivity, offer structured patterns for large-scale applications, simplify handling edge cases, and provide tools for performance optimization. Frameworks can introduce abstraction layers that may contribute to performance overhead, larger bundle sizes, and increased complexity. Modern frameworks, such as React 18 and Vue 3, address these challenges with features like concurrent rendering, tree-shaking, and selective hydration. While these advancements improve rendering efficiency and resource management, their benefits depend on the specific application and implementation context. Lightweight frameworks, such as Svelte and Preact, take different architectural approaches, with Svelte eliminating the virtual DOM entirely in favor of compiling components to efficient JavaScript code, and Preact offering a minimal, compatible alternative to React. Framework choice depends on an application’s requirements, including the team’s expertise, performance goals, and development priorities. A newer category of web frameworks, including enhance.dev, Astro, and Fresh, leverages native web standards while minimizing abstractions and development tooling. These solutions emphasize progressive enhancement, server-side rendering, and optimizing performance. Astro renders static HTML by default while hydrating only interactive parts. Fresh focuses on server-side rendering with zero runtime overhead. Enhance.dev prioritizes progressive enhancement patterns using Web Components. While these tools reduce reliance on client-side JavaScript by shifting logic to build-time or server-side execution, they still use JavaScript where necessary for interactivity. This approach makes them particularly suitable for performance-critical and content-focused applications. === WebAssembly-based frameworks === The following frameworks utilize WebAssembly or can build single-page applications (SPAs) with WebAssembly as a core technology or support mechanism. These frameworks enable high-performance and interactive client-side development, extending the SPA paradigm across languages and ecosystems. Avalonia is primarily a cross-platform desktop UI framework, but experimental support for WebAssembly allows it to be used for SPA development. It has an XAML-based UI design and native-style application features. Blazor WebAssembly is a .NET-based framework that allows developers to build SPAs using C# and Razor syntax. It runs .NET code in the browser via WebAssembly, enabling a full-stack .NET development experience without relying on JavaScript. Flutter on the Web extends Flutter’s cross-platform development capabilities to web-based SPAs. Using Dart and its Skia graphics engine, Flutter allows developers to create visually rich SPAs that

    Read more →
  • Digital cinematography

    Digital cinematography

    Digital cinematography is the process of capturing (recording) a motion picture using digital image sensors rather than through film stock. As digital technology has improved in recent years, this practice has become dominant. Since the 2000s, most movies across the world have been captured as well as distributed digitally. Many vendors have brought products to market, including traditional film camera vendors like Arri and Panavision, as well as new vendors like Red, Blackmagic, Silicon Imaging, Vision Research and companies which have traditionally focused on consumer and broadcast video equipment, like Sony, GoPro, and Panasonic. As of 2023, professional 4K digital cameras were approximately equal to 35mm film in their resolution and dynamic range capacity. Some filmmakers still prefer to use film picture formats to achieve the desired results. == History == The basis for digital cameras are metal–oxide–semiconductor (MOS) image sensors. The first practical semiconductor image sensor was the charge-coupled device (CCD), based on MOS capacitor technology. Following the commercialization of CCD sensors during the late 1970s to early 1980s, the entertainment industry slowly began transitioning to digital imaging and digital video over the next two decades. The CCD was followed by the CMOS active-pixel sensor (CMOS sensor), developed in the 1990s. Beginning in the late 1980s, Sony began marketing the concept of "electronic cinematography," utilizing its analog Sony HDVS professional video cameras. The effort met with very little success. However, this led to one of the earliest high definition video shot feature movies, Julia and Julia (1987). Rainbow (1996) was the world's first film to utilize extensive digital post production techniques. Shot entirely with Sony's first Solid State Electronic Cinematography cameras and featuring over 35 minutes of digital image processing and visual effects, all post production, sound effects, editing and scoring were completed digitally. The Digital High Definition image was transferred to a 35mm negative via an electron beam recorder for theatrical release. The first digitally videoed and post produced feature was Windhorse, shot in Tibet and Nepal in 1996 on the Sony DVW-700WS Digital Betacam and the prosumer Sony DCR-VX1000. The offline editing (Avid) and the online post and color work (Roland House / da Vinci) were also all digital. The film, transferred to 35mm negative for theatrical release, won Best U.S. Feature at the Santa Barbara Film Festival in 1998. In 1997, with the introduction of HDCAM recorders and 1920 × 1080 pixel digital professional video cameras based on CCD technology, the idea, now re-branded as "digital cinematography," began to gain traction in the market. Shot and released in 1998, The Last Broadcast is believed by some to be the first feature-length video shot and edited entirely on consumer-level digital equipment. In May 1999, George Lucas challenged the supremacy of the movie-making medium of film for the first time by including footage filmed with high-definition digital cameras in Star Wars: Episode I – The Phantom Menace. The digital footage blended seamlessly with the footage shot on film and he announced later that year he would film its sequels entirely on hi-def digital video. Also in 1999, digital projectors were installed in four theaters for the showing of The Phantom Menace. In May 2000, Vidocq, which was directed by Pitof, began principal photography shot entirely using a Sony HDW-F900 camera, with the video being released in September the next year. According to the Guinness World Records, Vidocq is the first full length feature filmed in digital high resolution. In June 2000, Star Wars: Episode II – Attack of the Clones began principal photography shot entirely using a Sony HDW-F900 camera as Lucas had previously stated. The film was released in May 2002. In May 2001 Once Upon a Time in Mexico was also shot in 24 frame-per-second high-definition digital video, partially developed by George Lucas using a Sony HDW-F900 camera, following Robert Rodriguez's introduction to the camera at Lucas' Skywalker Ranch facility whilst editing the sound for Spy Kids. A lesser-known movie, Russian Ark (2002), was also shot with the same camera and was the first tapeless digital movie, recorded on HDD instead of tape. In 2009, Slumdog Millionaire became the first movie shot mainly in digital to be awarded the Academy Award for Best Cinematography. The highest-grossing movie in the history of cinema, Avatar (2009), not only was shot on digital cameras as well, but also made the main revenues at the box office no longer by film, but digital projection. Major movies shot on digital video overtook those shot on film in 2013. Since 2016 over 90% of major films were shot on digital video. As of 2017, 92% of films are shot on digital. Only 24 major films released in 2018 were shot on 35mm. Since the 2000s, most movies across the world have been captured as well as distributed digitally. Today, cameras from companies like Sony, Panasonic, JVC and Canon offer a variety of choices for shooting high-definition video. At the high-end of the market, there has been an emergence of cameras aimed specifically at the digital cinema market. These cameras from Sony, Vision Research, Arri, Blackmagic Design, Panavision, Grass Valley and Red offer resolution and dynamic range that exceeds that of traditional video cameras, which are designed for the limited needs of broadcast television. == Technology == Digital cinematography captures motion pictures digitally in a process analogous to digital photography. While there is a clear technical distinction that separates the images captured in digital cinematography from video, the term "digital cinematography" is usually applied only in cases where digital acquisition is substituted for film acquisition, such as when shooting a feature film. The term is seldom applied when digital acquisition is substituted for video acquisition, as with live broadcast television programs. === Recording === ==== Cameras ==== Professional cameras include the Sony CineAlta (F) Series, Blackmagic Cinema Camera, Red One, Arri D-20, D-21 and Alexa, Panavision Genesis, Silicon Imaging SI-2K, Thomson Viper, Vision Research Phantom, IMAX 3D camera based on two Vision Research Phantom cores, Weisscam HS-1 and HS-2, GS Vitec noX, and the Fusion Camera System. Independent micro-budget filmmakers have also pressed low-cost consumer and prosumer cameras into service for digital filmmaking. Flagship smartphones like the Apple iPhone have been used to shoot movies like Unsane (shot on the iPhone 7 Plus) and Tangerine (shot on three iPhone 5S phones) and in January 2018, Unsane's director and Oscar winner Steven Soderbergh expressed an interest in filming other productions solely with iPhones going forward. ==== Sensors ==== Digital cinematography cameras capture digital images using image sensors, either charge-coupled device (CCD) sensors or CMOS active-pixel sensors, usually in one of two arrangements. Single chip cameras designed specifically for the digital cinematography market often use a single sensor (much like digital photo cameras), with dimensions similar in size to a 16 or 35 mm film frame or even (as with the Vision 65) a 65 mm film frame. An image can be projected onto a single large sensor exactly the same way it can be projected onto a film frame, so cameras with this design can be made with PL, PV and similar mounts, in order to use the wide range of existing high-end cinematography lenses available. Their large sensors also let these cameras achieve the same shallow depth of field as 35 or 65 mm motion picture film cameras, which many cinematographers consider an essential visual tool. Codecs Professional raw video recording codecs include Blackmagic Raw, Red Raw, Arri Raw and Canon Raw. ==== Video formats ==== Unlike other video formats, which are specified in terms of vertical resolution (for example, 1080p, which is 1920×1080 pixels), digital cinema formats are usually specified in terms of horizontal resolution. As a shorthand, these resolutions are often given in "nK" notation, where n is the multiplier of 1024 such that the horizontal resolution of a corresponding full-aperture, digitized film frame is exactly 1024 n {\displaystyle 1024n} pixels. Here the "K" has a customary meaning corresponding to the binary prefix "kibi" (ki). For instance, a 2K image is 2048 pixels wide, and a 4K image is 4096 pixels wide. Vertical resolutions vary with aspect ratios though; so a 2K image with an HDTV (16:9) aspect ratio is 2048×1152 pixels, while a 2K image with a SDTV or Academy ratio (4:3) is 2048×1536 pixels, and one with a Panavision ratio (2.39:1) would be 2048×856 pixels, and so on. Due to the "nK" notation not corresponding to specific horizontal resolutions per format a 2K image lacking, for example, the typical 35mm film soundtrack space, is only 182

    Read more →
  • Attention inequality

    Attention inequality

    Attention inequality is the inequality of distribution of attention across users on social networks, people in general, and for scientific papers. Yun Family Foundation introduced "Attention Inequality Coefficient" as a measure of inequality in attention and arguments it by the close interconnection with wealth inequality. == Relationship to economic inequality == Attention inequality is related to economic inequality since attention is an economically scarce good. The same measures and concepts as in classical economy can be applied for attention economy. The relationship develops also beyond the conceptual level—considering the AIDA process, attention is the prerequisite for real monetary income on the Internet. On data of 2018, a significant relationship between likes and comments on Facebook to donations is proven for non-profit organizations. == Attention economy == The attention economy refers to the practice of maximizing the attention users give to a product for advertising-related reasons. Attention economy remains one of the most common forms of advertising, and has been steadily increasing thanks to new technologies such as television, internet and social media. It is one of the most widely-used approaches to economy for its effectiveness for maximising the noticeability of a certain product. == Attention inequality in social media == In social media, attention inequality refers to the unequal distribution of users' attention on social media platforms. This means that instead of an equal distribution of attention, fewer sources receive a disproportionate share of attention, leaving many unnoticed. This phenomenon is possibly the result of social media algorithms, which are commonly designed to drive maximum engagement. This phenomenon is a large factor in the polarization and creation of echo-chambers. Social media algorithms tend to note content that is already performing well and display it to more users, while content that is equally engaging or well-made is not recommended to users. Posts that trigger strong emotions usually out-perform more "uncontroversial" content. When many users interact with the post, it signals the algorithm that the specific post drives engagement. The algorithm then tends to recommend that type of content to an exponential number of people, potentially outperforming "un-emotional" content. These factors, when combined, tend to create an unequal social media environment. == Attention inequality in science == According to a recent 2025 study about research inequality among scientists published in Information Processing and Management, scientific discourse is restricted to a small group of connected scientists, and is frequently not an accurate representation of the whole scientific community. Using citation-network analysis in the fields of nanoscience and chemical physics, the study claims that a group of connected scientists has a significant notability in the scientific community. The calculated connection strength between these scientists is estimated to be about 4.5, the study also says that these authors cite each other four times more often than would be predicted in a random network, whereas ordinary scientists that exist outside of this group only reach an estimated connection strength of 0.9. The study findings suggest that that scientific attention is not distributed by merit, but rather by the connectedness of the scientists involved in the research. == Extent == As data of 2008 shows, 50% of the attention is concentrated on approximately 0.2% of all hostnames, and 80% on 5% of hostnames. The Gini coefficient of attention distribution lay in 2008 at over 0.921 for such commercial domains names as ac.jp and at 0.985 for .org-domains. The Gini coefficient was measured on Twitter in 2016 for the number of followers as 0.9412, for the number of mentions as 0.9133, and for the number of retweets as 0.9034. For comparison, the world's income Gini coefficient was 0.68 in 2005 and 0.904 in 2018. More than 96% of all followers, 93% of the retweets, and 93% of all mentions are owned by 20% of Twitter. == Causes == At least for scientific papers, today's consensus states that inequality is unexplainable by variations of quality and individual talent. The Matthew effect plays a significant role in the emergence of attention inequality—those who already enjoy large amounts of attention get even more attention, and those who do not lose even more. Ranking algorithms based on relevance to the user have been found to alleviate the inequality of the number of posts across topics.

    Read more →
  • New media

    New media

    New media are communication technologies that enable or enhance interaction between users, as well as interaction between users and content. In the middle of the 1990s, the phrase "new media" became widely used as part of a sales pitch for the influx of interactive CD-ROMs for entertainment and education. The new media technologies, sometimes known as Web 2.0, include a wide range of web-related communication tools such as blogs, wikis, online social networking, virtual worlds, and other social media platforms. The phrase "new media" refers to computational media that share material online and through computers. New media inspire new ways of thinking about older media. Media do not replace one another in a clear, linear succession, instead evolving in a more complicated network of interconnected feedback loops . What is different about new media is how they specifically refashion traditional media and how older media refashion themselves to meet the challenges of new media. Unless they contain technologies that enable digital generative or interactive processes, broadcast television programs, non-interactive news websites, feature films, magazines, and books are not considered to be new media. The term "new media" stands in contrast to old media, which dominated the media landscape as a form of mass media for many years. == History == In the 1950s, connections between computing and radical art began to grow stronger. It was not until the 1980s that Alan Kay and his co-workers at Xerox PARC began to give the computability of a personal computer to the individual, rather than have a big organization be in charge of this. In the late 1980s and early 1990s, however, we seem to witness a different kind of parallel relationship between social changes and computer design. Although causally unrelated, conceptually, it makes sense that the Cold War and the design of the Web took place at exactly the same time. Writers and philosophers such as Marshall McLuhan were instrumental in the development of media theory during this period which is now famous declaration in Understanding Media: The Extensions of Man, that "the medium is the message" drew attention to the too often ignored influence media and technology themselves, rather than their "content," have on humans' experience of the world and on society broadly. Until the 1980s, media relied primarily upon print and analog broadcast models such as television and radio. The last twenty-five years have seen the rapid transformation into media which are predicated upon the use of digital technologies such as the Internet and video games. However, these examples are only a small representation of new media. The use of digital computers has transformed the remaining 'old' media, as suggested by the advent of digital television and online publications. Even traditional media forms such as the printing press have been transformed through the application of technologies by using of image manipulation software like Adobe Photoshop and desktop publishing tools. Andrew L. Shapiro argues that the "emergence of new, digital technologies signals a potentially radical shift of who is in control of information, experience and resources". W. Russell Neuman suggests that whilst the "new media" have technical capabilities to pull in one direction, economic and social forces pull back in the opposite direction. According to Neuman, "We are witnessing the evolution of a universal interconnected network of audio, video, and electronic text communications that will blur the distinction between interpersonal and mass communication; and between public and private communication". Neuman argues that new media will: Alter the meaning of geographic distance. Allow for a huge increase in the volume of communication. Provide the possibility of increasing the speed of communication. Provide opportunities for interactive communication. Allow forms of communication that were previously separate to overlap and interconnect. Consequently, it has been the contention of scholars such as Douglas Kellner and James Bohman that new media and particularly the Internet will provide the potential for a democratic postmodern public sphere, in which citizens can participate in well informed, non-hierarchical debate pertaining to their social structures. Contradicting these positive appraisals of the potential social impacts of new media are scholars such as Edward S. Herman and Robert McChesney who have suggested that the transition to new media has seen a handful of powerful transnational telecommunications corporations who achieve a level of global influence which was hitherto unimaginable. Scholars have highlighted both the positive and negative potential and actual implications of new media technologies, suggesting that some of the early work in new media studies was guilty of technologicaldeterminism – whereby the effects of media were determined by the technologies themselves, rather than by tracing the complex social networks that governed the development, funding, implementation, and future evolution of any technology. Based on the argument that people have a limited amount of time to spend on the consumption of different media, displacement theory argue that the viewership or readership of one particular outlet leads to the reduction in the amount of time spent by the individual on another. The introduction of new media, such as the internet, therefore reduces the amount of time individuals would spend on existing "old" media, which could ultimately lead to the end of such traditional media. == Definition == Although, there are several ways that new media may be described, Lev Manovich, in an introduction to The New Media Reader, defines new media by using eight propositions: New media versus cyberculture – Cyberculture is the various social phenomena that are associated with the Internet and network communications (blogs, online multi-player gaming), whereas new media is concerned more with cultural objects and paradigms (digital to analog television, smartphones). New media as computer technology used as a distribution platform – New media are the cultural objects which use digital computer technology for distribution and exhibition. e.g. (at least for now) Internet, Web sites, computer multimedia, Blu-ray disks etc. The problem with this is that the definition must be revised every few years. The term "new media" will not be "new" anymore, as most forms of culture will be distributed through computers. New media as digital data controlled by software – The language of new media is based on the assumption that, in fact, all cultural objects that rely on digital representation and computer-based delivery do share a number of common qualities. New media is reduced to digital data that can be manipulated by software as any other data. Now media operations can create several versions of the same object. An example is an image stored as matrix data which can be manipulated and altered according to the additional algorithms implemented, such as color inversion, gray-scaling, sharpening, rasterizing, etc. New media as the mix between existing cultural conventions and the conventions of software – New media today can be understood as the mix between older cultural conventions for data representation, access, and manipulation and newer conventions of data representation, access, and manipulation. The "old" data are representations of visual reality and human experience, and the "new" data is numerical data. The computer is kept out of the key "creative" decisions, and is delegated to the position of a technician. e.g. In film, software is used in some areas of production, in others are created using computer animation. New media as the aesthetics that accompanies the early stage of every new modern media and communication technology – While ideological tropes indeed seem to be reappearing rather regularly, many aesthetic strategies may reappear two or three times ... In order for this approach to be truly useful it would be insufficient to simply name the strategies and tropes and to record the moments of their appearance; instead, we would have to develop a much more comprehensive analysis which would correlate the history of technology with social, political, and economical histories or the modern period. New media as faster execution of algorithms previously executed manually or through other technologies – Computers are a huge speed-up of what were previously manual techniques. e.g. calculators. Dramatically speeding up the execution makes possible previously non-existent representational technique. This also makes possible of many new forms of media art such as interactive multimedia and video games. On one level, a modern digital computer is just a faster calculator, we should not ignore its other identity: that of a cybernetic control device. New media as the encoding of modernist avant-garde; new media as metamedia – Manovi

    Read more →
  • Geometric hashing

    Geometric hashing

    In computer science, geometric hashing is a method for efficiently finding two-dimensional objects represented by discrete points that have undergone an affine transformation, though extensions exist to other object representations and transformations. In an off-line step, the objects are encoded by treating each pair of points as a geometric basis. The remaining points can be represented in an invariant fashion with respect to this basis using two parameters. For each point, its quantized transformed coordinates are stored in the hash table as a key, and indices of the basis points as a value. Then a new pair of basis points is selected, and the process is repeated. In the on-line (recognition) step, randomly selected pairs of data points are considered as candidate bases. For each candidate basis, the remaining data points are encoded according to the basis and possible correspondences from the object are found in the previously constructed table. The candidate basis is accepted if a sufficiently large number of the data points index a consistent object basis. Geometric hashing was originally suggested in computer vision for object recognition in 2D and 3D, but later was applied to different problems such as structural alignment of proteins. == Geometric hashing in computer vision == Geometric hashing is a method used for object recognition. Let’s say that we want to check if a model image can be seen in an input image. This can be accomplished with geometric hashing. The method could be used to recognize one of the multiple objects in a base, in this case the hash table should store not only the pose information but also the index of object model in the base. === Example === For simplicity, this example will not use too many point features and assume that their descriptors are given by their coordinates only (in practice local descriptors such as SIFT could be used for indexing). ==== Training Phase ==== Find the model's feature points. Assume that 5 feature points are found in the model image with the coordinates ( 12 , 17 ) ; {\displaystyle (12,17);} ( 45 , 13 ) ; {\displaystyle (45,13);} ( 40 , 46 ) ; {\displaystyle (40,46);} ( 20 , 35 ) ; {\displaystyle (20,35);} ( 35 , 25 ) {\displaystyle (35,25)} , see the picture. Introduce a basis to describe the locations of the feature points. For 2D space and similarity transformation the basis is defined by a pair of points. The point of origin is placed in the middle of the segment connecting the two points (P2, P4 in our example), the x ′ {\displaystyle x'} axis is directed towards one of them, the y ′ {\displaystyle y'} is orthogonal and goes through the origin. The scale is selected such that absolute value of x ′ {\displaystyle x'} for both basis points is 1. Describe feature locations with respect to that basis, i.e. compute the projections to the new coordinate axes. The coordinates should be discretised to make recognition robust to noise, we take the bin size 0.25. We thus get the coordinates ( − 0.75 , − 1.25 ) ; {\displaystyle (-0.75,-1.25);} ( 1.00 , 0.00 ) ; {\displaystyle (1.00,0.00);} ( − 0.50 , 1.25 ) ; {\displaystyle (-0.50,1.25);} ( − 1.00 , 0.00 ) ; {\displaystyle (-1.00,0.00);} ( 0.00 , 0.25 ) {\displaystyle (0.00,0.25)} Store the basis in a hash table indexed by the features (only transformed coordinates in this case). If there were more objects to match with, we should also store the object number along with the basis pair. Repeat the process for a different basis pair (Step 2). It is needed to handle occlusions. Ideally, all the non-colinear pairs should be enumerated. We provide the hash table after two iterations, the pair (P1, P3) is selected for the second one. Hash Table: Most hash tables cannot have identical keys mapped to different values. So in real life one won’t encode basis keys (1.0, 0.0) and (-1.0, 0.0) in a hash table. ==== Recognition Phase ==== Find interesting feature points in the input image. Choose an arbitrary basis. If there isn't a suitable arbitrary basis, then it is likely that the input image does not contain the target object. Describe coordinates of the feature points in the new basis. Quantize obtained coordinates as it was done before. Compare all the transformed point features in the input image with the hash table. If the point features are identical or similar, then increase the count for the corresponding basis (and the type of object, if any). For each basis such that the count exceeds a certain threshold, verify the hypothesis that it corresponds to an image basis chosen in Step 2. Transfer the image coordinate system to the model one (for the supposed object) and try to match them. If successful, the object is found. Otherwise, go back to Step 2. === Finding mirrored pattern === It seems that this method is only capable of handling scaling, translation, and rotation. However, the input image may contain the object in mirror transform. Therefore, geometric hashing should be able to find the object, too. There are two ways to detect mirrored objects. For the vector graph, make the left side positive, and the right side negative. Multiplying the x position by -1 will give the same result. Use 3 points for the basis. This allows detecting mirror images (or objects). Actually, using 3 points for the basis is another approach for geometric hashing. === Geometric hashing in higher-dimensions === Similar to the example above, hashing applies to higher-dimensional data. For three-dimensional data points, three points are also needed for the basis. The first two points define the x-axis, and the third point defines the y-axis (with the first point). The z-axis is perpendicular to the created axis using the right-hand rule. Notice that the order of the points affects the resulting basis

    Read more →
  • Verge3D

    Verge3D

    Verge3D is a real-time renderer and a toolkit used for creating interactive 3D experiences running on websites. == Overview == Verge3D enables users to convert content from 3D modelling tools (Blender, 3ds Max, and Maya are currently supported) to view in a web browser. Verge3D was created by the same core group of software engineers that previously created the Blend4Web framework. == Features == Verge3D uses WebGL for rendering. It incorporates components of the Three.js library and exposes its API to application developers. Puzzles Application functionality can be added via JavaScript, either by writing code directly or by using Puzzles, Verge3D’s visual programming environment based on Google Blockly. Puzzles is aimed primarily at non-programmers allowing quick creation of interactive scenarios in a drag-and-drop fashion. App Manager and web publishing App Manager is a lightweight web-based tool for creating, managing and publishing Verge3D projects, running on top of the local development server. Verge3D Network service integrated in the App Manager allows for publishing Verge3D applications via Amazon S3 and EC2 cloud services. PBR For purposes of authoring materials, a glTF 2.0-compliant physically based rendering pipeline is offered alongside the standard shader-based approach. PBR textures can be authored using external texturing software such as Substance Painter for which Verge3D offers the corresponding export preset. Besides the glTF 2.0 model, Verge3D supports physical materials of 3ds Max and Maya (with Autodesk Arnold as reference), and Blender's real-time Eevee materials. glTF and DCC software integration Verge3D integrates directly with Blender, 3ds Max, and Maya, enabling users to create 3D geometry, materials, and animations inside the software, then export them in the JSON-based glTF format. The Sneak Peek feature allows for exporting and viewing scenes from the DCC tool environment. Facebook 3D posts For Facebook publishing, Verge3D offers a specific GLB export option. The exported GLB files are displayed and can be opened in the App Manager. Asset compression Exported files can optionally use LZMA compression, resulting in a reduction in file size of up to 6x. UI and website layouts Interface layouts, created using external WYSIWYG editors, can be linked with Puzzles to trigger changes to a 3D scene being rendered in the browser and vice versa. Animation Verge3D supports skeletal animation, including animation of bipeds and character rigs, and allows for animation of material parameters. Model parts can also be set up to be dragged by the user. Physics The physics module can be linked separately to enable collision detection, dynamically moving objects, support for characters and vehicles, springs, ropes and cloth simulation. As of version 2.11, simple physics simulations can be created and controlled without coding via Puzzles, the visual programming system used by Verge3D. AR/VR The 2.10 update added support for WebXR, an in-development open technology designed to enable virtual reality and augmented reality experiences to be displayed in web browsers. It works with both headsets with controllers, like the HTC Vive and Oculus Rift, and those without, like Google Cardboard. AR/VR experiences can enabled via Puzzles or JavaScript. == Workflow == Verge3D's workflow differs substantially from other mainstream WebGL frameworks. Development of a new Verge3D application is usually started from modeling, texturing and animating 3D objects. The models are assembled in the 3D authoring tool. The scene file is then used as a basis for a Verge3D project initialized from the App Manager. An interactive scenario is optionally added using the Puzzles editor. A Verge3D application can be previewed in the web browser at any development stage using the App Manager. The finished web application can be deployed on the Verge3D Network, on Facebook or on the user's website. == Notable uses == NASA's Jet Propulsion Laboratory used Verge3D to create an interactive 3D visualization of the Mars InSight lander. The web application allows for exploring and interacting with the real-time model of the spacecraft, with the possibility to move different parts and unfurl the solar panels. NASA's older interactive web application Experience Curiosity was ported to Verge3D from Blend4Web. The application makes it possible to operate the rover, control its cameras and the robotic arm and reproduces some of the prominent events of the Mars Science Laboratory mission. Route 66 Digital's Escape Room used Verge3D and Blender. This interactive short explores how users can navigate 3D spaces and interact with objects without the need for instruction.

    Read more →
  • UCSD Pascal

    UCSD Pascal

    UCSD Pascal is a Pascal programming language system that runs on the UCSD p-System, a portable, highly machine-independent operating system. UCSD Pascal was first released in 1977. It was developed at the University of California, San Diego (UCSD). == The p-System == In 1977, the University of California, San Diego (UCSD) Institute for Information Systems developed UCSD Pascal to provide students with a common environment that could run on any of the then available microcomputers as well as campus DEC PDP-11 minicomputers. The operating system became known as UCSD p-System. There were three operating systems that IBM offered for its original IBM PC: the UCSD p-System, CP/M-86, and IBM PC DOS. Vendor SofTech Microsystems emphasized p-System's application portability, with virtual machines for 20 CPUs as of the IBM PC's release. It predicted that users would be able to use applications they purchased on future computers running p-System; advertisements called it "the Universal Operating System". PC Magazine denounced UCSD p-System on the IBM PC, stating in a review of Context MBA, written in the language, that it "simply does not produce good code". The p-System did not sell very well for the IBM PC, because of a lack of applications and because it was more expensive than the other choices. Previously, IBM had offered the UCSD p-System as an option for IBM Displaywriter, an 8086-based dedicated word processing machine. (The Displaywriter's native operating system had been developed completely internally and was not opened for end-user programming.) Notable extensions to standard Pascal include separately compilable Units and a String type. Some intrinsics were provided to accelerate string processing (e.g. scanning in an array for a particular search pattern); other language extensions were provided to allow the UCSD p-System to be self-compiling and self-hosted. UCSD Pascal was based on a p-code machine architecture. Its contribution to these early virtual machines was to extend p-code away from its roots as a compiler intermediate language into a full execution environment. The UCSD Pascal p-Machine was optimized for the new small microcomputers with addressing restricted to 16-bit (only 64 KB of memory). James Gosling cites UCSD Pascal as a key influence (along with the Smalltalk virtual machine) on the design of the Java virtual machine. UCSD p-System achieved machine independence by defining a virtual machine, called the p-Machine (or pseudo-machine, which many users began to call the "Pascal-machine" like the OS—although UCSD documentation always used "pseudo-machine") with its own instruction set called p-code (or pseudo-code). Urs Ammann, a student of Niklaus Wirth, originally presented a p-code in his PhD thesis, from which the UCSD implementation was derived, the Zurich Pascal-P implementation. The UCSD implementation changed the Zurich implementation to be "byte oriented". The UCSD p-code was optimized for execution of the Pascal programming language. Each hardware platform then only needed a p-code interpreter program written for it to port the entire p-System and all the tools to run on it. Later versions also included additional languages that compiled to the p-code base. For example, Apple Computer offered a Fortran Compiler (written by Silicon Valley Software, Sunnyvale California) producing p-code that ran on the Apple version of the p-system. Later, TeleSoft (also located in San Diego) offered an early Ada development environment that used p-code and was therefore able to run on a number of hardware platforms including the Motorola 68000, the System/370, and the Pascal MicroEngine. UCSD p-System shares some concepts with the later Java platform. Both use a virtual machine to hide operating system and hardware differences, and both use programs written to that virtual machine to provide cross-platform support. Likewise both systems allow the virtual machine to be used either as the complete operating system of the target computer or to run in a "box" under another operating system. The UCSD Pascal compiler was distributed as part of a portable operating system, the p-System. == History == UCSD p-System began around 1974 as the idea of UCSD's Kenneth Bowles, who believed that the number of new computing platforms coming out at the time would make it difficult for new programming languages to gain acceptance. He based UCSD Pascal on the Pascal-P2 release of the portable compiler from Zurich. He was particularly interested in Pascal as a language to teach programming. UCSD introduced two features that were important improvements on the original Pascal: variable length strings, and "units" of independently compiled code (an idea included into the then-evolving Ada (programming language)). Niklaus Wirth credits the p-System, and UCSD Pascal in particular, with popularizing Pascal. It was not until the release of Turbo Pascal that UCSD's version started to slip from first place among Pascal users. The Pascal dialect of UCSD Pascal came from the subset of Pascal implemented in Pascal-P2, which was not designed to be a full implementation of the language, but rather "the minimum subset that would self-compile", to fit its function as a bootstrap kit for Pascal compilers. UCSD added strings from BASIC, and several other implementation dependent features. Although UCSD Pascal later obtained many of the other features of the full Pascal language, the Pascal-P2 subset persisted in other dialects, notably Borland Pascal, which copied much of the UCSD dialect. == Versions == There were four versions of UCSD p-code engine, each with several revisions of the p-System and UCSD Pascal. A revision of the p-code engine (i.e., the p-Machine) meant a change to the p-code language, and therefore compiled code is not portable between different p-Machine versions. Each revision was represented with a leading Roman Numeral, while operating system revisions were enumerated as the "dot" number following the p-code Roman Numeral. For example, II.3 represented the third revision of the p-System running on the second revision of the p-Machine. === Version I === Original version, never officially distributed outside of the University of California, San Diego. However, the Pascal sources for both Versions I.3 and I.5 were freely exchanged between interested users. Specifically, the patch revision I.5a was known to be one of the most stable. === Version II === Widely distributed, available on many early microcomputers. Numerous versions included Apple II ultimately Apple Pascal, DEC PDP-11, Intel 8080, Zilog Z80, and MOS 6502 based machines, Motorola 68000 and the IBM PC (Version II on the PC was restricted to one 64K code segment and one 64K stack/heap data segment; Version IV removed the code segment limit but cost a lot more). Project members from this era include Dr Kenneth L Bowles, Mark Allen, Richard Gleaves, Richard Kaufmann, Pete Lawrence, Joel McCormack, Mark Overgaard, Keith Shillington, Roger Sumner, and John Van Zandt. === Version III === Custom version written for Western Digital to run on their Pascal MicroEngine microcomputer. Included support for parallel processes for the first time. === Version IV === Commercial version, developed and sold by SofTech. Based on Version II; did not include changes from Version III. Did not sell well due to combination of their pricing structure, performance problems due to p-code interpreter, and competition with native operating systems (on top of which it often ran). After SofTech dropped the product, it was picked up by Pecan Systems, a relatively small company formed of p-System users and fans. Sales revived somewhat, due mostly to Pecan's reasonable pricing structure, but the p-System and UCSD Pascal gradually lost the market to native operating systems and compilers. Available for the TI-99/4A equipped with p-code card, Commodore CBM 8096, Sage II/IV, HP 9000, and BBC Micro with 6502 second processor. == Further use == The Corvus Systems computer used UCSD Pascal for all its user software. The "innovative concept" of the Constellation OS was to run Pascal (interpretively or compiled) and include all common software in the manual, so users could modify as needed.

    Read more →
  • Digital cinema

    Digital cinema

    Digital cinema is the digital technology used within the film industry to distribute or project motion pictures as opposed to the historical use of reels of motion picture film, such as 35 mm film. Whereas film reels have to be shipped to movie theaters, a digital movie can be distributed to cinemas in a number of ways: over the Internet or dedicated satellite links, or by sending hard drives or optical discs such as Blu-ray discs, then projected using a digital video projector instead of a film projector. Typically, digital movies are shot using digital movie cameras or in animation transferred from a file and are edited using a non-linear editing system (NLE). The NLE is often a video editing application installed in one or more computers that may be networked to access the original footage from a remote server, share or gain access to computing resources for rendering the final video, and allow several editors to work on the same timeline or project. Alternatively a digital movie could be a film reel that has been digitized using a motion picture film scanner and then restored, or, a digital movie could be recorded using a film recorder onto film stock for projection using a traditional film projector. Digital cinema is distinct from high-definition television and does not necessarily use traditional television or other traditional high-definition video standards, aspect ratios, or frame rates. In digital cinema, resolutions are represented by the horizontal pixel count, usually 2K (2048×1080 or 2.2 megapixels) or 4K (4096×2160 or 8.8 megapixels). The 2K and 4K resolutions used in digital cinema projection are often referred to as DCI 2K and DCI 4K. DCI stands for Digital Cinema Initiatives. As digital cinema technology improved in the early 2010s, most theaters across the world converted to digital video projection. Digital cinema technology has continued to develop over the years with RealD 3D, IMAX, RPX, 4DX, Dolby Cinema, and ScreenX, allowing moviegoers more immersive experiences. == History == The transition from film to digital video was preceded by cinema's transition from analog to digital audio, with the release of the Dolby Digital (AC-3) audio coding standard in 1991. Its main basis is the modified discrete cosine transform (MDCT), a lossy audio compression algorithm. It is a modification of the discrete cosine transform (DCT) algorithm, which was first proposed by Nasir Ahmed in 1972 and was originally intended for image compression. The DCT was adapted into the MDCT by J.P. Princen, A.W. Johnson and Alan B. Bradley at the University of Surrey in 1987, and then Dolby Laboratories adapted the MDCT algorithm along with perceptual coding principles to develop the AC-3 audio format for cinema needs. Cinema in the 1990s typically combined analog photochemical images with digital audio. Digital media playback of high-resolution 2K files has at least a 20-year history. Early video data storage units (RAIDs) fed custom frame buffer systems with large memories. In early digital video units, the content was usually restricted to several minutes of material. Transfer of content between remote locations was slow and had limited capacity. It was not until the late 1990s that feature-length films could be sent over the "wire" (Internet or dedicated fiber links). On October 23, 1998, Digital light processing (DLP) projector technology was publicly demonstrated with the release of The Last Broadcast, the first feature-length movie, shot, edited and distributed digitally. In conjunction with Texas Instruments, the movie was publicly demonstrated in five theaters across the United States (Philadelphia, Portland (Oregon), Minneapolis, Providence, and Orlando). === Foundations === In the United States, on June 18, 1999, Texas Instruments' DLP Cinema projector technology was publicly demonstrated on two screens in Los Angeles and New York for the release of Lucasfilm's Star Wars Episode I: The Phantom Menace. In Europe, on February 2, 2000, Texas Instruments' DLP Cinema projector technology was publicly demonstrated, by Philippe Binant, on one screen in Paris for the release of Toy Story 2. From 1997 to 2000, the JPEG 2000 image compression standard was developed by a Joint Photographic Experts Group (JPEG) committee chaired by Touradj Ebrahimi (later the JPEG president). In contrast to the original 1992 JPEG standard, which is a DCT-based lossy compression format for static digital images, JPEG 2000 is a discrete wavelet transform (DWT) based compression standard that could be adapted for motion imaging video compression with the Motion JPEG 2000 extension. JPEG 2000 technology was later selected as the video coding standard for digital cinema in 2004. In 1992, Hughes-JVC was founded by JVC and Hughes Electronics to develop ILA (Image Light Amplifer) digital video projectors for commercial movie theaters using liquid crystal on silicon (LCOS) technology. In 1997, JVC introduced D-ILA (Direct-Drive ILA) technology with a 2K resolution digital video projector. In 2000, JVC introduced a 4K resolution video projector using D-ILA technology. === Initiatives === On January 19, 2000, the Society of Motion Picture and Television Engineers, in the United States, initiated the first standards group dedicated to developing digital cinema. By December 2000, there were 15 digital cinema screens in the United States and Canada, 11 in Western Europe, 4 in Asia, and 1 in South America. Digital Cinema Initiatives (DCI) was formed in March 2002 as a joint project of many motion picture studios (Disney, Fox, MGM, Paramount, Sony Pictures, Universal and Warner Bros.) to develop a system specification for digital cinema. The same month it was reported that the number of cinemas equipped with digital projectors had increased to about 50 in the US and 30 more in the rest of the world. In April 2004, in collaboration with the American Society of Cinematographers, DCI created standard evaluation material (the ASC/DCI StEM material) for testing of 2K and 4K playback and compression technologies. DCI selected JPEG 2000 as the basis for the compression in the system the same year. Initial tests with JPEG 2000 produced bit rates of around 75–125 Mbit/s for 2K resolution and 100–200 Mbit/s for 4K resolution. === Worldwide deployment === In China, in June 2005, an e-cinema system called "dMs" was established and was used in over 15,000 screens spread across China's 30 provinces. DMs estimated that the system would expand to 40,000 screens in 2009. In 2005, the UK Film Council Digital Screen Network launched in the UK by Arts Alliance Media creating a chain of 250 2K digital cinema systems. The roll-out was completed in 2006. This was the first mass roll-out in Europe. AccessIT/Christie Digital also started a roll-out in the United States and Canada. By mid-2006, about 400 theaters were equipped with 2K digital projectors with the number increasing every month. In August 2006, the Malayalam digital movie Moonnamathoral, produced by Benzy Martin, was distributed via satellite to cinemas, thus becoming the first Indian digital cinema. This was done by Emil and Eric Digital Films, a company based at Thrissur using the end-to-end digital cinema system developed by Singapore-based DG2L Technologies. In January 2007, Guru became the first Indian film mastered in the DCI-compliant JPEG 2000 Interop format and also the first Indian film to be previewed digitally, internationally, at the Elgin Winter Garden in Toronto. This film was digitally mastered at Real Image Media Technologies in India. In 2007, the UK became home to Europe's first DCI-compliant fully digital multiplex cinemas; Odeon Hatfield and Odeon Surrey Quays (in London), with a total of 18 digital screens, were launched on 9 February 2007. By March 2007, with the release of Disney's Meet the Robinsons, about 600 screens had been equipped with digital projectors. In June 2007, Arts Alliance Media announced the first European commercial digital cinema Virtual Print Fee (VPF) agreements (with 20th Century Fox and Universal Pictures). In March 2009, AMC Theatres announced that it closed a $315 million deal with Sony to replace all of its movie projectors with 4K HDR digital projectors starting in the second quarter of 2009; it was anticipated that this replacement would be finished by 2012. As digital cinema technology improved in the early 2010s, most theaters across the world converted to digital video projection. In January 2011, the total number of digital screens worldwide was 36,242, up from 16,339 at end 2009 or a growth rate of 121.8 percent during the year. There were 10,083 d-screens in Europe as a whole (28.2 percent of global figure), 16,522 in the United States and Canada (46.2 percent of global figure) and 7,703 in Asia (21.6 percent of global figure). Worldwide progress was slower as in some territories, particularly Latin America and Africa. As of 31 March 2015, 38,719 screens (out of a total of 3

    Read more →
  • List of artificial intelligence journals

    List of artificial intelligence journals

    This is a list of notable peer-reviewed academic journals that publish research in the field of artificial intelligence (AI), including areas such as machine learning, computer vision, natural language processing, robotics, and intelligent systems. == General artificial intelligence == Artificial Intelligence (journal) – Elsevier Journal of Artificial Intelligence Research (JAIR) – AI Access Foundation Knowledge-Based Systems – Elsevier == Machine learning == Data Mining and Knowledge Discovery – Springer Machine Learning (journal) – Springer Journal of Machine Learning Research – Microtome Pattern Recognition (journal) – Elsevier Neural Networks (journal) – Elsevier Neural Computation (journal) – MIT Press Neurocomputing (journal) - Elsevier == Deep learning and neural computation == IEEE Transactions on Evolutionary Computation – IEEE IEEE Transactions on Neural Networks and Learning Systems – IEEE Nature Machine Intelligence – Springer Nature == Computer vision == International Journal of Computer Vision – Springer IEEE Transactions on Pattern Analysis and Machine Intelligence – IEEE Machine Vision and Applications – Springer == Natural language processing == Computational Linguistics (journal) – MIT Press Natural Language Processing Transactions of the Association for Computational Linguistics – ACL == Robotics and intelligent systems == IEEE Transactions on Robotics – IEEE Autonomous Robots – Springer Journal of Intelligent & Robotic Systems – Springer == Interdisciplinary and ethics in AI == AI & Society – Springer Artificial Life – MIT Press Philosophy & Technology – Springer Minds and Machines – Springer

    Read more →
  • Microformat

    Microformat

    Microformats (μF) are predefined HTML markup (like HTML classes) created to serve as descriptive and consistent metadata about elements, designating them as representing a certain type of data (such as contact information, geographic coordinates, events, products, recipes, etc.). They allow software to process the information reliably by having set classes refer to a specific type of data rather than being arbitrary. Microformats emerged around 2005 and were predominantly designed for use by search engines, web syndication and aggregators such as RSS. Google confirmed in 2020 that it still parses microformats for use in content indexing. Microformats are referenced in several W3C social web specifications, including IndieAuth and Webmention. Although the content of web pages has been capable of some "automated processing" since the inception of the web, such processing is difficult because the markup elements used to display information on the web do not describe what the information means. Microformats can bridge this gap by attaching semantics, and thereby obviating other, more complicated, methods of automated processing, such as natural language processing or screen scraping. The use, adoption and processing of microformats enables data items to be indexed, searched for, saved or cross-referenced, so that information can be reused or combined. As of 2013, microformats allow the encoding and extraction of event details, contact information, social relationships and similar information. Microformats2, abbreviated as mf2, is the updated version of microformats. Mf2 provides an easier way of interpreting HTML structured syntax and vocabularies than the earlier ways that made use of RDFa and microdata. == Background == Microformats emerged around 2005 as part of a grassroots movement to make recognizable data items (such as events, contact details or geographical locations) capable of automated processing by software, as well as directly readable by end-users. Link-based microformats emerged first. These include vote links that express opinions of the linked page, which search engines can tally into instant polls. CommerceNet, a nonprofit organization that promotes e-commerce on the Internet, has helped sponsor and promote the technology and support the microformats community in various ways. CommerceNet also helped co-found the Microformats.org community site. Neither CommerceNet nor Microformats.org operates as a standards body. The microformats community functions through an open wiki, a mailing list, and an Internet relay chat (IRC) channel. Most of the existing microformats originated at the Microformats.org wiki and the associated mailing list by a process of gathering examples of web-publishing behaviour, then codifying it. Some other microformats (such as rel=nofollow and unAPI) have been proposed, or developed, elsewhere. == Technical overview == XHTML and HTML standards allow for the embedding and encoding of semantics within the attributes of markup elements. Microformats take advantage of these standards by indicating the presence of metadata using the following attributes: class Classname rel relationship, description of the target address in an anchor-element (...) rev reverse relationship, description of the referenced document (in one case, otherwise deprecated in microformats) For example, in the text "The birds roosted at 52.48, -1.89" is a pair of numbers which may be understood, from their context, to be a set of geographic coordinates. With wrapping in spans (or other HTML elements) with specific class names (in this case geo, latitude and longitude, all part of the geo microformat specification): Software agents can recognize exactly what each value represents and can then perform a variety of tasks such as indexing, locating it on a map and exporting it to a GPS device. === Examples === In this example, the contact information is presented as follows: With hCard microformat markup, that becomes: Here, the formatted name (fn), organisation (org), telephone number (tel) and web address (url) have been identified using specific class names and the whole thing is wrapped in class="vcard", which indicates that the other classes form an hCard (short for "HTML vCard") and are not merely coincidentally named. Other, optional, hCard classes also exist. Software, such as browser plug-ins, can now extract the information, and transfer it to other applications, such as an address book. == Specific microformats == Several microformats have been developed to enable semantic markup of particular types of information. However, only hCard and hCalendar have been ratified, the others remaining as drafts: hAtom (superseded by h-entry and h-feed) – for marking up Atom feeds from within standard HTML hCalendar – for events hCard – for contact information; includes: adr – for postal addresses geo – for geographical coordinates (latitude, longitude) hMedia – for audio/video content hAudio – for audio content hNews – for news content hProduct – for products hRecipe – for recipes and foodstuffs. hReview – for reviews rel-directory – for distributed directory creation and inclusion rel-enclosure – for multimedia attachments to web pages rel-license – specification of copyright license rel-nofollow, an attempt to discourage third-party content spam (e.g. spam in blogs) rel-tag – for decentralized tagging (Folksonomy) XHTML Friends Network (XFN) – for social relationships XOXO – for lists and outlines == Uses == Using microformats within HTML code provides additional formatting and semantic data that applications can use. For example, applications such as web crawlers can collect data about online resources, or desktop applications such as e-mail clients or scheduling software can compile details. The use of microformats can also facilitate "mash ups" such as exporting all of the geographical locations on a web page into (for example) Google Maps to visualize them spatially. Several browser extensions, such as Operator for Firefox and Oomph for Internet Explorer, provide the ability to detect microformats within an HTML document. When hCard or hCalendar are involved, such browser extensions allow microformats to be exported into formats compatible with contact management and calendar utilities, such as Microsoft Outlook. When dealing with geographical coordinates, they allow the location to be sent to applications such as Google Maps. Yahoo! Query Language can be used to extract microformats from web pages. On 12 May 2009 Google announced that they would be parsing the hCard, hReview and hProduct microformats, and using them to populate search result pages. They subsequently extended this in 2010 to use hCalendar for events and hRecipe for cookery recipes. Similarly, microformats are also processed by Bing and Yahoo!. As of late 2010, these are the world's top three search engines. Microsoft said in 2006 that they needed to incorporate microformats into upcoming projects, as did other software companies. Alex Faaborg summarizes the arguments for putting the responsibility for microformat user interfaces in the web browser rather than making more complicated HTML: Only the web browser knows what applications are accessible to the user and what the user's preferences are It lowers the barrier to entry for web site developers if they only need to do the markup and not handle "appearance" or "action" issues Retains backwards compatibility with web browsers that do not support microformats The web browser presents a single point of entry from the web to the user's computer, which simplifies security issues == Evaluation == Various commentators have offered review and discussion on the design principles and practical aspects of microformats. Microformats have been compared to other approaches that seek to serve the same or similar purpose. As of 2007, there had been some criticism of one, or all, microformats. The spread and use of microformats was being advocated as of 2007. Opera Software CTO and CSS creator Håkon Wium Lie said in 2005 "We will also see a bunch of microformats being developed, and that’s how the semantic web will be built, I believe." However, in August 2008 Toby Inkster, author of the "Swignition" (formerly "Cognition") microformat parsing service, pointed out that no new microformat specifications had been published since 2005. === Design principles === Computer scientist and entrepreneur, Rohit Khare stated that reduce, reuse, and recycle is "shorthand for several design principles" that motivated the development and practices behind microformats. These aspects can be summarized as follows: Reduce: favor the simplest solutions and focus attention on specific problems; Reuse: work from experience and favor examples of current practice; Recycle: encourage modularity and the ability to embed, valid XHTML can be reused in blog posts, RSS feeds, and anywhere else you can access the web. === Accessibi

    Read more →
  • Rider Spoke

    Rider Spoke

    Rider Spoke developed by Blast Theory in collaboration with the Mixed Reality Lab was first staged at the Barbican, London in October 2007. Created for cyclists, it combines elements of theatre, performance, game play and state of the art technology. Rider Spoke was built in the IPerG project on the EQUIP architecture. Rider Spoke has since been presented in Athens (2008), Brighton (2008), Budapest (2008), Sydney (2009, Adelaide (2009) and Liverpool (2010).

    Read more →
  • Modulation error ratio

    Modulation error ratio

    The modulation error ratio (MER) is a measure used to quantify the performance of a digital radio (or digital TV) transmitter or receiver in a communications system using digital modulation (such as QAM). A signal sent by an ideal transmitter or received by a receiver would have all constellation points precisely at the ideal locations, however various imperfections in the implementation (such as noise, low image rejection ratio, phase noise, carrier suppression, distortion, etc.) or signal path cause the actual constellation points to deviate from the ideal locations. Transmitter MER can be measured by specialized equipment, which demodulates the received signal in a similar way to how a real radio demodulator does it. Demodulated and detected signal can be used as a reasonably reliable estimate for the ideal transmitted signal in MER calculation. == Definition == An error vector is a vector in the I-Q plane between the ideal constellation point and the point received by the receiver. The Euclidean distance between the two points is its magnitude. The modulation error ratio is equal to the ratio of the root mean square (RMS) power (in Watts) of the reference vector to the power (in Watts) of the error. It is defined in dB as: M E R ( d B ) = 10 log 10 ⁡ ( P s i g n a l P e r r o r ) {\displaystyle \mathrm {MER(dB)} =10\log _{10}\left({P_{\mathrm {signal} } \over P_{\mathrm {error} }}\right)} where Perror is the RMS power of the error vector, and Psignal is the RMS power of ideal transmitted signal. MER is defined as a percentage in a compatible (but reciprocal) way: M E R ( % ) = P e r r o r P s i g n a l × 100 % {\displaystyle \mathrm {MER(\%)} ={\sqrt {P_{\mathrm {error} } \over P_{\mathrm {signal} }}}\times 100\%} with the same definitions. MER is closely related to error vector magnitude (EVM), but MER is calculated from the average power of the signal. MER is also closely related to signal-to-noise ratio. MER includes all imperfections including deterministic amplitude imbalance, quadrature error and distortion, while noise is random by nature.

    Read more →
  • Pronunciation assessment

    Pronunciation assessment

    Automatic pronunciation assessment uses computer speech recognition to determine how accurately speech has been pronounced, instead of relying on a human instructor or proctor. It is also called speech verification, pronunciation evaluation, and pronunciation scoring. This technology is used to grade speech quality, for language testing, for computer-aided pronunciation teaching (CAPT) in computer-assisted language learning (CALL), for speaking skill remediation, and for accent reduction. Pronunciation assessment is different from dictation or automatic transcription, because instead of determining unknown speech, it verifies learners' pronunciation of known word(s), often from prior transcription of the same utterance; ideally scoring the intelligibility of the learners' speech. Sometimes pronunciation assessment evaluates the prosody of the learners' speech, such as intonation, pitch, tempo, rhythm, and syllable and word stress, although those are usually not essential for being understood in most languages. Pronunciation assessment is also used in reading tutoring, for example in products from Google, Microsoft, and Amira Learning. Automatic pronunciation assessment can also be used to help diagnose and treat speech disorders such as apraxia. == Intelligibility == Intelligibility refers to how well a learner's utterance is understood by a listener, rather than how much it sounds like a native speaker. This is separate from measures of fluency, such as so-called "Goodness of Pronunciation" (GoP) scores, which estimate how closely an utterance aligns with those of native speakers. Intelligibility is widely regarded as the most important communicative goal in pronunciation teaching and assessment. For example, in the Common European Framework of Reference for Languages (CEFR) assessment criteria for "overall phonological control", intelligibility outweighs formally correct pronunciation at all levels. Studies in applied linguistics have shown that accent reduction does not always increase intelligibility because listeners can often comprehend heavily accented speech without difficulty. Pronunciation assessment systems often rely on acoustic methods such as GoP which compare learner speech to reference models to produce phoneme-level scores, which are in turn aggregated to produce word and phrase scores. While these methods are effective for identifying deviations from native speakers' utterances, they do not effectively measure how understandable speech is to human listeners. Intelligibility is influenced by broader linguistic and contextual factors such as stress placement, speech rate, and coarticulation, which are not represented in purely segmental scores. The earliest work on pronunciation assessment avoided measuring genuine listener intelligibility, a shortcoming corrected in 2011 at the Toyohashi University of Technology, and included in the Versant high-stakes English fluency assessment from Pearson and mobile apps from 17zuoye Education & Technology, but still missing in 2023 products from Google Search, Microsoft, Educational Testing Service, Speechace, and ELSA. Assessing authentic listener intelligibility is essential for avoiding inaccuracies from accent bias, especially in high-stakes assessments; from words with multiple correct pronunciations; and from phoneme coding errors in machine-readable pronunciation dictionaries. In 2022, researchers found that some newer speech-to-text systems, based on end-to-end reinforcement learning to map audio signals directly into words, produce word and phrase confidence scores (from 10-25ms audio frame logit aggregation) closely correlated with genuine listener intelligibility. Others have been able to assess intelligibility using Levenshtein or dynamic time warping distance measures from Wav2Vec2 representation of good speech. Further work through 2025 has focused specifically on measuring intelligibility. A 2025 study of 42 pronunciation and speech coaching apps (32 mobile and 10 web) found that none offered intelligibility assessment. Instead, most provided only segmental and accent-focused scoring. About two-thirds of the apps provided some form of specific pronunciation feedback, usually with phonetic transcriptions, but accompanied by visual cues (such as animations of the vocal tract or the lips and tongue from the front) in only about 5% of the apps. Less than a third provided feedback on learner perception of exemplar speech. == Evaluation == Although there are as yet no industry-standard benchmarks for evaluating pronunciation assessment accuracy, researchers occasionally release evaluation speech corpuses for others to use for improving assessment quality. Such evaluation databases often emphasize formally unaccented pronunciation to the exclusion of genuine intelligibility evident from blinded listener transcriptions. As of mid-2025, state of the art approaches for automatically transcribing phonemes typically achieve an error rate of about 10% from known good speech. The International Speech Communication Association (ISCA) 2025 Workshop on Speech and Language Technology in Education (SLaTE) administered a Speak & Improve Challenge: Spoken Language Assessment and Feedback, introducing benchmarks for evaluating pronunciation assessment and remediation systems across languages, accents, and learner populations. The challenge emphasized cross-lingual generalization and alignment with human intelligibility judgments, for more robust and interpretable assessment systems. Ethical issues in pronunciation assessment are present in both human and automatic methods. Authentic validity, fairness, and mitigating bias in evaluation are all crucial. Diverse speech data should be included in automatic pronunciation assessment models. Combining human judgments, especially blinded transcriptions from a wide diversity of listeners, with automated feedback can improve accuracy and fairness. Second language learners benefit substantially from their use of widely available speech recognition systems for dictation, virtual assistants, and AI chatbots. In such systems, users naturally try to correct their own errors evident in speech recognition results that they notice. Such use improves their grammar and vocabulary development along with their pronunciation skills. The extent to which explicit pronunciation assessment and remediation approaches improve on such self-directed interactions remains an open question. Similarly, automatic dictation results have been shown to reflect intelligibility about as well as human scorers. == Recent developments == During 2021–22, a smartphone-based CAPT system was used to sense articulation through both audible and inaudible signals, providing feedback at the phoneme level. Some promising areas for improvement which were being developed in 2024 include articulatory feature extraction and transfer learning to suppress unnecessary corrections. Other interesting advances under development include "augmented reality" interfaces for mobile devices using optical character recognition to provide pronunciation training on text found in user environments. In 2024, audio multimodal large language models were first described as assessing pronunciation. That work has been carried forward by other researchers in 2025 who report positive results. Subsequently, researchers demonstrated pronunciation scoring by providing a language model with textual descriptions of speech, including the speech-to-text transcript, phoneme sequences, pauses, and phoneme sequence matching; this approach can achieve performance similar to multimodal LLMs that analyze raw audio while avoiding their higher computational cost. In 2025, the Duolingo English Test authors published a description of their pronunciation assessment method, purportedly built to measure intelligibility rather than accent imitation. While achieving a correlation of 0.82 with expert human ratings, very close to inter-rater agreement and outperforming alternative methods, the method is nonetheless based on experts' scores along the six-point CEFR common reference levels scale, instead of actual blinded listener transcriptions. Further promising work in 2025 includes assessment feedback aligning learner speech to synthetic utterances using interpretable features, identifying continuous spans of words for remediation feedback; synthesizing corrected speech matching learners' self-perceived voices, which they prefer and imitate more accurately as corrections; and streaming such interactions. On January 21, 2026, Educational Testing Service's TOEFL iBT high-stakes English language test, required by US university admissions and employers from English as a foreign language applicants more often than all other internet-based tests combined, changed its speaking assessments. While official rubrics claim that the new scoring will be based primarily on intelligibility, the new test's technical description indicates that it ju

    Read more →
  • HTTP compression

    HTTP compression

    HTTP compression is a capability that can be built into web servers and web clients to improve transfer speed and bandwidth utilization. HTTP data is compressed before it is sent from the server: compliant browsers will announce what methods are supported to the server before downloading the correct format; browsers that do not support compliant compression method will download uncompressed data. The most common compression schemes include gzip and Brotli; a full list of available schemes is maintained by the IANA. There are two different ways compression can be done in HTTP. At a lower level, a Transfer-Encoding header field may indicate the payload of an HTTP message is compressed. At a higher level, a Content-Encoding header field may indicate that a resource being transferred, cached, or otherwise referenced is compressed. Compression using Content-Encoding is more widely supported than Transfer-Encoding, and some browsers do not advertise support for Transfer-Encoding compression to avoid triggering bugs in servers. == Compression scheme negotiation == The negotiation is done in two steps, described in RFC 2616 and RFC 9110: 1. The web client advertises which compression schemes it supports by including a list of tokens in the HTTP request. For Content-Encoding, the list is in a field called Accept-Encoding; for Transfer-Encoding, the field is called TE. 2. If the server supports one or more compression schemes, the outgoing data may be compressed by one or more methods supported by both parties. If this is the case, the server will add a Content-Encoding or Transfer-Encoding field in the HTTP response with the used schemes, separated by commas. The web server is by no means obligated to use any compression method – this depends on the internal settings of the web server and also may depend on the internal architecture of the website in question. == Content-Encoding tokens == The official list of tokens available to servers and client is maintained by IANA, and it includes: br – Brotli, a compression algorithm specifically designed for HTTP content encoding, defined in RFC 7932 and implemented in all modern major browsers. compress – UNIX "compress" program method (historic; deprecated in most applications and replaced by gzip or deflate) deflate – compression based on the deflate algorithm (described in RFC 1951), a combination of the LZ77 algorithm and Huffman coding, wrapped inside the zlib data format (RFC 1950); exi – W3C Efficient XML Interchange gzip – GNU zip format (described in RFC 1952). Uses the deflate algorithm for compression, but the data format and the checksum algorithm differ from the "deflate" content-encoding. This method is the most broadly supported as of March 2011. identity – No transformation is used. This is the default value for content coding. pack200-gzip – Network Transfer Format for Java Archives zstd – Zstandard compression, defined in RFC 8478 In addition to these, a number of unofficial or non-standardized tokens are used in the wild by either servers or clients: bzip2 – compression based on the free bzip2 format, supported by lighttpd lzip – compression based on the free lzip format, supported by wget and Links lzma – compression based on (raw) LZMA is available in Opera 20, and in elinks via a compile-time option peerdist – Microsoft Peer Content Caching and Retrieval rsync – delta encoding in HTTP, implemented by a pair of rproxy proxies. xpress – Microsoft compression protocol used by Windows 8 and later for Windows Store application updates. LZ77-based compression optionally using a Huffman encoding. xz – LZMA2-based content compression, supported by a non-official Firefox patch; and fully implemented in mget since 2013-12-31. == Servers that support HTTP compression == SAP NetWeaver Microsoft IIS: built-in or using third-party module Apache HTTP Server, via mod_deflate (despite its name, only supporting gzip), and mod_brotli Hiawatha HTTP server: serves pre-compressed files Cherokee HTTP server, On the fly gzip and deflate compressions Oracle iPlanet Web Server Zeus Web Server lighttpd nginx – built-in Applications based on Tornado, if "compress_response" is set to True in the application settings (for versions prior to 4.0, set "gzip" to True) Jetty Server – built-into default static content serving and available via servlet filter configurations GeoServer Apache Tomcat IBM Websphere AOLserver Ruby Rack, via the Rack::Deflater middleware HAProxy Varnish – built-in. Works also with ESI Armeria – Serving pre-compressed files NaviServer – built-in, dynamic and static compression Caddy – built-in via encode Many content delivery networks also implement HTTP compression to improve speedy delivery of resources to end users. The compression in HTTP can also be achieved by using the functionality of server-side scripting languages like PHP, or programming languages like Java. Various online tools exist to verify a working implementation of HTTP compression. These online tools usually request multiple variants of a URL, each with different request headers (with varying Accept-Encoding content). HTTP compression is considered to be implemented correctly when the server returns a document in a compressed format. By comparing the sizes of the returned documents, the effective compression ratio can be calculated (even between different compression algorithms). == Problems preventing the use of HTTP compression == A 2009 article by Google engineers Arvind Jain and Jason Glasgow states that more than 99 person-years are wasted daily due to increase in page load time when users do not receive compressed content. This occurs when anti-virus software interferes with connections to force them to be uncompressed, where proxies are used (with overcautious web browsers), where servers are misconfigured, and where browser bugs stop compression being used. Internet Explorer 6, which drops to HTTP 1.0 (without features like compression or pipelining) when behind a proxy – a common configuration in corporate environments – was the mainstream browser most prone to failing back to uncompressed HTTP. Another problem found while deploying HTTP compression on large scale is due to the deflate encoding definition: while HTTP 1.1 defines the deflate encoding as data compressed with deflate (RFC 1951) inside a zlib formatted stream (RFC 1950), Microsoft server and client products historically implemented it as a "raw" deflated stream, making its deployment unreliable. For this reason, some software, including the Apache HTTP Server, only implements gzip encoding. == Security implications == Compression allows a form of chosen plaintext attack to be performed: if an attacker can inject any chosen content into the page, they can know whether the page contains their given content by observing the size increase of the encrypted stream. If the increase is smaller than expected for random injections, it means that the compressor has found a repeat in the text, i.e. the injected content overlaps the secret information. This is the idea behind CRIME. In 2012, a general attack against the use of data compression, called CRIME, was announced. While the CRIME attack could work effectively against a large number of protocols, including but not limited to TLS, and application-layer protocols such as SPDY or HTTP, only exploits against TLS and SPDY were demonstrated and largely mitigated in browsers and servers. The CRIME exploit against HTTP compression has not been mitigated at all, even though the authors of CRIME have warned that this vulnerability might be even more widespread than SPDY and TLS compression combined. In 2013, a new instance of the CRIME attack against HTTP compression, dubbed BREACH, was published. A BREACH attack can extract login tokens, email addresses or other sensitive information from TLS encrypted web traffic in as little as 30 seconds (depending on the number of bytes to be extracted), provided the attacker tricks the victim into visiting a malicious web link. All versions of TLS and SSL are at risk from BREACH regardless of the encryption algorithm or cipher used. Unlike previous instances of CRIME, which can be successfully defended against by turning off TLS compression or SPDY header compression, BREACH exploits HTTP compression which cannot realistically be turned off, as virtually all web servers rely upon it to improve data transmission speeds for users. As of 2016, the TIME attack and the HEIST attack are now public knowledge.

    Read more →
  • Extremely online

    Extremely online

    An extremely online (often capitalized), terminally online, or chronically online person is someone who is closely engaged with Internet culture. People said to be extremely online often believe that online posts are very important. Events and phenomena can themselves be extremely online; while often used as a descriptive term, the phenomenon of extreme online usage has been described as "both a reformation of the delivery of ideas – shared through words and videos and memes and GIFs and copypasta – and the ideas themselves". Here, "online" is used to describe "a way of doing things, not [simply] the place they are done". == Criteria == While the term was in use as early as 2014, it gained popularity over the latter half of the 2010s in conjunction with the increasing prevalence and notability of Internet phenomena in all areas of life. Extremely online people, according to The Daily Dot, are interested in topics "no normal, healthy person could possibly care about", and have been analogized to "pop culture fandoms, just without the pop". Extremely online phenomena such as fan culture and reaction GIFs have been described as "swallowing democracy" by journalists such as Amanda Hess in The New York Times, who claimed that a "great convergence between politics and culture, values and aesthetics, citizenship and commercialism" had become "a dominant mode of experiencing politics". Vulture – formerly the pop culture section of New York magazine, now a stand-alone website – has a section for articles tagged "extremely online". == Historical background == In the 2010s, many categories and labels came into wide use from media outlets to describe Internet-mediated cultural trends, such as the alt-right, the dirtbag left, and doomerism. These ideological categories are often defined by their close association with online discourse. For example, the term "alt-right" was added to the Associated Press' stylebook in 2016 to describe the "digital presence" of far-right ideologies, the dirtbag left refers to a group of "underemployed and overly online millennials" who "have no time for the pieties of traditional political discourse", and the doomer's "blackpilled despair" is combined with spending "too much time on message boards in high school" to produce an eclectic "anti-socialism". Extreme onlineness transcends ideological boundaries. For example, right-wing figures like Alex Jones and Laura Loomer have been described as "extremely online", but so have those on the left like Alexandria Ocasio-Cortez and fans of the Chapo Trap House podcast. Extremely online phenomena can range from acts of offline violence (such as the 2019 Christchurch shootings) to "[going] on NPR to explain the anti-capitalist irony inherent in kids eating Tide Pods". United States President Donald Trump's posts on social media have been frequently cited as extremely online, during both his presidency and his 2020 presidential campaign; Vox claimed his approach to re-election veered into being "Too Online", and Reason questioned whether the final presidential debate was "incomprehensible to normies". While individual people are often given the description, being extremely online has also been posited as an overall cultural phenomenon, applying to trends like lifestyle movements suffixed with "-wave" and "-core" based heavily on Internet media, as well as an increasing expectation for digital social researchers to have an "online presence" to advance in their careers. == Participants and media coverage == One example of a phenomenon considered to be extremely online is the "wife guy" (a guy who posts about his wife); despite being a "stupid online thing" which spent several years as a piece of Internet slang, in 2019 it became the subject of five articles in leading U.S. media outlets. Like many extremely online phrases and phenomena, the "wife guy" has been attributed in part to the in-character Twitter account dril. The account frequently parodies how people behave on the Internet, and has been widely cited as influential on online culture. In one tweet, his character refuses to stop using the Internet, even when someone shouts outside his house that he should log off. Many of dril's other coinages have become ubiquitous parts of Internet slang. Throughout the 2010s, posters such as dril inspired commonly used terms like "corncobbing" (referring to someone losing an argument and failing to admit it); while originally a piece of obscure Internet slang used on sites like Twitter, use of the term (and controversy over its misinterpretation) became a subject of reporting from traditional publications, with some noting that keeping up with the rapid turnover of inside jokes, memes, and quotes online required daily attention to avoid embarrassment. Twitch has been described as "talk radio for the extremely online". Another example of an event cited as extremely online is No Nut November. Increasingly, researchers are expected to have more of an online presence, to advance in their careers, as networking and portfolios continue to transition to the digital world. In November 2020, an article in The Washington Post criticized the filter bubble theory of online discourse on the basis that it "overgeneralized" based on a "small subset of extremely online people". The 2021 storming of the United States Capitol was described as extremely online, with "pro-Trump internet personalities", such as Baked Alaska, and fans livestreaming and taking selfies. People who have been described as extremely online include Chrissy Teigen, Jon Ossoff, and Andrew Yang. In contrast, Joe Biden has been cited as the antithesis of extremely online—The New York Times wrote in 2019 that he had "zero meme energy".

    Read more →