WebAR

WebAR, previously known as the Augmented Web, is a web technology that allows for augmented reality functionality within a web browser. It is a combination of HTML, Web Audio, WebGL, and WebRTC. From 2020s more known as web-based Augmented Reality or WebAR, which is about the use of augmented reality elements in browsers. It was the focus of a Birds of a Feather meeting at ISMAR2012 and is now the focus of the W3C Augmented Web Community Group. == Features == Browser augmented reality for smartphones has a number of features that distinguish it from similar content in special apps. No special applications are needed for Web AR. A regular browser is enough. And it can run to a certain extent on most browsers. It is easy to set up marketing analytics. By connecting the website to services that collect statistics, it is convenient to receive geographic coordinates, demographic characteristics and other information about users. Ability to add a CTA button. It is extremely important for marketing websites to place it so that the user can add contact information or place an order after considering the offer. Rich content. Browser augmented reality for tablets and smartphones supports 2D and 3D graphics, animation and other formats. Image marker tracking. If a QR code is selected as an activator for an AR element or just a picture on a flat surface, the device can easily read it. Various activation ways. Web AR can be marker and markerless, attached to geolocation, it can also be hidden in a direct link. Game content. Even simple games with simple mechanics, transferred into augmented reality, can delight the website visitor. Cross-platform. You can view content that complements our usual reality using any modern smartphone model. == Limitations == Performance is simply better on an app, where there's capacity for more memory and programs are executed in native code therefore it provides better visuals, better animations and better interactivity than in WebAR experience. A web page can only have access to certain parts of the device you're using, whereas a native app can access all of a device's capabilities. Meaning if you want the convenience of WebAR, you need to be thinking of simple but effective experiences instead. Compatibility. Not every mobile device has the required HW for AR performance. == Implementation == Browser support is evolving quickly and can best be monitored using services like Can I Use. Since this is a web application, there are platforms that support the creation of WebAR that are similar to normal web development platforms. Something which enables the creation of 3D assets and environments using a web framework that looks similar to HTML. Applications (like for example – A-Frame) are supported by 8th Wall, which is by the end of 2021 the leading SLAM tracking SDK for WebAR on the market. WebAR is currently limited mostly by the browser – so how much the technology will develop rather depends on what the big players like Google and Apple develop. For iOS device users, Apple developed AR Quick Look, an extension that enables users to use ARKit on the web. For Android devices your browser should support WebXR, an API that allows users to view AR/VR content without installing extra plugins or software, and have ARCore installed. There are many tools and frameworks that help developers in expanding the immersive web with WebAR. For example, AR.js is an open-source library for Augmented Reality on the Web for improved WebAR performance on smartphones that includes marker-based technology (simplified QR-codes) and location-based AR. Apple at the WWDC Conference 2018, announced that it has developed a new file format, working together with Pixar, called USDZ Universal. This file will allow developers to create 3d models for augmented reality. USDZ format was created by Apple together with Pixar Animation Studio and allowed developers to create 3D models for AR. == Industries == Where WebAR can be used from virtual guides, which can help students navigate through campus to virtual film posters: E-commerce and Advertising. Education. Entertainment. Business. Fashion. == Examples == Promotion of Spider-Man: Into the Spider-Verse for which 8th Wall developed the AR platform that made this interactive WebAR promoting the Sony animated smash hit. Everyone can invite teenage Spiderman/Miles Morales into their homes for some one-on-one interaction, take pictures and share the experience with friends. Sony Pictures included the QR code to launch this WebAR site in print promotions for the movie. Also in 2017 the advertising of Jumanji: The Next Level gave us the world's first WebAR activation with usage of Amazon Lex to power voice interaction (the same tool that powers Amazon Alexa), the experience sends users on a wild 3D adventure into the world of Jumanji! This was a collaboration between Sony Pictures and Trigger - The Mixed Reality Agency. The WebAR technology is powered by 8th Wall. And you can check it via the link to the official YouTube recording of the experience. RPR & Microsoft's Holographic Retail Platform, where Web AR brings a new twist to online shopping by allowing users to interact with 3D holographic images of models right from their smartphones' browsers. This experience is designed to increase buyer confidence and reduce clothing returns, which are two of the greatest challenges to purchasing clothing online. Digital Porsche Brand Academy was developed by the Team of svarmony Technologies GmbH and it is the first-to-market training tool that uses augmented reality to provide Porsche employees an immersive experience learning about the company's history and values. The star of this WebAR experience is an animated avatar that serves as a tour guide for Porsche's past, present, and future. Employees can explore realistically animated Porsche-locations, take a ride in a virtual Porsche, help assemble a car, and test Porsche knowledge via a quiz. The Digital Porsche Brand Academy is a great starter kit for employees to establish a relationship with the brand and align with the company's plans. == Future == By freeing smartphone users from having to install numerous apps, WebAR can make Augmented Reality far more accessible for them and more beneficial for business. The further development of the WebAR can be accelerated by the widespread social acceptance of the headsets that can give the whole other level of AR experience. This means instant access to the information when the contextually relevant content is appearing as the person's real background is changing.

Blackmagic Design

Blackmagic Design Pty Ltd is an Australian company that develops digital cinema technology and manufactures professional video production hardware and software. Headquartered in South Melbourne, it is known for producing high-end digital movie cameras and a range of broadcast and post-production equipment. The company also develops software applications, including the DaVinci Resolve application for non-linear video editing, color correction, color grading, visual effects, and audio post-production. == History == Blackmagic Design Pty Ltd was founded on 7 September 2001 by Grant Petty. Its first product, DeckLink, introduced in 2002, was a video capture card for macOS that supported uncompressed 10-bit video, marking a shift toward professional-grade yet affordable video workflows. Subsequent versions—including the DeckLink 2, Pro SDI, HD Plus, and Multibridge—added capabilities such as color correction, Windows support, and compatibility with major editing software like Adobe Premiere Pro, to broaden the product's appeal. At the 2012 NAB Show, Blackmagic announced its first Cinema Camera, a digital movie camera. Blackmagic made several acquisitions over the next decade. In 2009, it acquired da Vinci Systems, known for its color-grading tools. In 2010, it acquired Echolab's ATEM switcher line, in 2014, it added eyeon Software (developer of the Blackmagic Fusion compositing software) and London's Cintel (film scanning and restoration), and in 2016, it acquired Fairlight, an audio technology company known for its CMI synthesizers as well as mixing consoles. == Products == List of all products developed by the company. Editing, Color Correction and Audio Post Production DaVinci Resolve (free version) and DaVinci Resolve Studio (paid version), computer software for non-linear video editing, color correction, color grading, visual effects, and audio post-production. Audio/Video Controller Consoles: Editor Keyboard, Speed Editor, DaVinci Resolve Replay Editor, Micro Panel, Mini Panel, DaVinci Resolve Micro Color Panel, Advanced Panel, Fairlight Console Channel Fader, Fairlight Console Channel Control, Fairlight Console LCD Monitor, Fairlight Console Audio Editor, Fairlight Desktop Audio Editor, Fairlight Desktop Console, Fairlight Audio Interface Cintel Film Scanner (Generations 1-3) Live Production Home Streaming: ATEM Mini, ATEM Mini Pro/ISO, ATEM Mini Extreme, ATEM Mini Extreme ISO (The ATEM Mini series has both HDMI and SDI variants) Production Switchers: ATEM 1,2 & 4 M/E Constellation HD, ATEM 1,2 & 4 M/E Constellation 4K, ATEM Constellation 8K, ATEM 1,2 & 4 M/E Production Studio 4K, ATEM Television Studio HD8 & HD8 ISO Switcher & Camera Controllers: ATEM Camera Control Panel, ATEM 1 M/E Advanced Panel, ATEM 2 M/E Advanced Panel, ATEM 4 M/E Advanced Panel Chroma Keyers: Ultimatte 12 HD Mini, Ultimatte 12 HD, Ultimatte 12 4K, Ultimatte 12 8K Recording and Storage: HyperDeck Studio HD Mini, HyperDeck Studio HD Plus, HyperDeck Studio HD Plus, HyperDeck Studio 4K Pro, HyperDeck Extreme 8K HDR, HyperDeck Extreme 4K HDR, HyperDeck Extreme Control, HyperDeck Shuttle HD, Duplicator 4K, MultiDock 10G, Video Assist 7" 12G HDR, Video Assist 5" 12G HDR Capture and Playback UltraStudio: 3G, HD Mini, 4K Mini, 4K Extreme 3 DeckLink (PCIe cards): Mini Recorder, Mini Monitor, Mini Monitor 4K, Mini Recorder 4K, Duo 2 Mini, Duo 2, Quad 2, SDI 4K, Studio 4K, 4K Extreme 12G, 8K Pro, Quad HDMI Recorder Network Storage Cloud Store Cloud Pod Broadcast Converters Micro Converter: BiDirectional SDI/HDMI 3G wPSU, HDMI to SDI 3G wPSU, SDI to HDMI 3G wPSU, BiDirectional SDI/HDMI 3G, HDMI to SDI 3G, SDI to HDMI 3G Mini Converters: Audio to SDI, Optical Fiber 12G, SDI Multiplex 4K, Quad SDI to HDMI 4K, SDI Distribution 4K, SDI to Analog 4K, Audio to SDI 4K, SDI to Audio 4K, HDMI to SDI 6G, SDI to HDMI 6G Teranex Mini: SDI Distribution 12G, SDI to HDMI 12G, Audio to SDI 12G, SDI to Analog 12G, SDI to HDMI 8K HDR, SDI to DisplayPort 8K HDR 2110 IP Converters Routing and Distribution Videohub

CSS HTML Validator

CSS HTML Validator (previously named CSE HTML Validator) is an HTML editor and CSS editor for Microsoft Windows (and Linux and other Unix-like operating systems when used with Wine) that helps web developers create syntactically correct and accessible HTML/HTML5, XHTML, and CSS documents by locating errors, potential problems like browser compatibility issues, and common mistakes. It is also able to check links, check spelling, suggest improvements, alert developers to deprecated, obsolete, or proprietary tags, attributes, and CSS properties, and find issues that can affect search engine optimization. CSS HTML Validator is developed, marketed, and sold by AI Internet Solutions LLC located in the United States. The first version of CSS HTML Validator was released in 1997 for Windows 95. The current version is 2026/v26.02 (as of January 9, 2026) and is for Windows 10 and above, including Windows 11. A native macOS and Linux command-line console tool (called htmlval) became available with version 23. There are currently three main editions of CSS HTML Validator — Pro/Professional, Home/Standard, and Lite. The Enterprise edition was discontinued in 2025/v25. While the application is generally a commercial product (except for the Lite edition), a free version of the Home edition is available for personal/educational, non-commercial use. A free limited version of the htmlval command-line console tool for macOS and Linux is also available. == Features == CSS HTML Validator includes an HTML editor, validator for HTML, XHTML, htmx, polyglot markup, CSS, PHP and JavaScript (using JSLint or JSHint), link checker (to find dead and broken links), spell checker, accessibility checker, and search engine optimization (SEO) checker. An integrated web browser allows developers to browse the web while the pages are automatically validated. Because documents are checked locally and not uploaded over the Internet to a server in order to be checked, validations are performed relatively quickly, and security and privacy are increased. A custom scripting language called TNPL, included in the Pro and Enterprise editions, can be used to customize validations by adding, eliminating, or changing validator messages. TNPL can also be used to integrate customized validation checks to meet the unique requirements of an individual or entity. A Batch Wizard tool, included in the Pro and Enterprise editions, can check entire Web sites, parts of Web sites, or a list of local web documents. The Batch Wizard generates reports in standard HTML or XML format. The reports can be viewed using a normal web browser. The accessibility checker includes support for Section 508 Amendment to the Rehabilitation Act of 1973 and Web Content Accessibility Guidelines (both WCAG 1.0 and WCAG 2.0/2.1/2.2). Using a version of HTML Tidy with HTML5 support and the Pretty Print & Fix Tool, CSS HTML Validator can automatically fix some common problems with HTML and XHTML documents. However, some problems cannot be fixed (or fixed correctly) with automated tools and require manual review and repair. == Version history == Validation of polyglot markup was added in version 12, and mobile development support (for HTML and CSS) was added in version 14 and improved in version 15. Version 15 added built-in syntax checking for JSON and HTML5 cache manifest files. Version 16 added JavaScript linting using JSHint, a static code analysis tool for checking JavaScript, but also continues to support JSLint. Version 17 added support for the Accelerated Mobile Pages Project, which is a type of HTML optimized for mobile web browsing, and support for live DOM validation using Google Chrome CSS HTML Validator 2018/v18 renames the software from CSE HTML Validator to CSS HTML Validator and includes updated HTML5 and CSS support. Version 18 also added a new "By Message" report in the Batch Wizard and dropped support for Windows Vista and below. CSS HTML Validator 2019/v19 includes updated HTML and CSS support, adds WCAG 2.1 support, improves support when running under Wine (software), and is a native 64-bit application (previously releases were 32-bit). CSS HTML Validator 2020/v20, first released in January 2020, includes HTML, CSS, accessibility, and other updates, including improved support for the Accelerated Mobile Pages Project. Also, beginning with version 20, the Standard edition was renamed to the Home edition. CSS HTML Validator 2021/v21, first released in January 2021, includes further HTML, CSS, accessibility, and other updates. CSS HTML Validator 2022/v22, released in January 2022, includes improvements and updates to keep the program up-to-date, a new Microsoft Edge WebView2 rendering engine for the integrated web browser, and three new dark themes. Later updates to version 22 added support for checking JSON Lines and NDJSON documents. CSS HTML Validator 2023/v23, released in January 2023, includes more improvements and updates to keep the program up-to-date. The new release also introduced new command-line macOS and Linux ports of the core validation engine, called htmlval for Mac and Linux. Official support for Windows 7, 8, and 8.1 was dropped in the 2023/v23 version. CSS HTML Validator 2024/v24, released in January 2024, includes updates and improvements. It also adds support for htmx. CSS HTML Validator 2025/v25, released in December 2024, includes further updates and improvements for 2025. Version 25 discontinues the Enterprise edition, moving Enterprise functionality to the Pro edition. CSS HTML Validator 2026/v26, released in January 2026, includes updated support for HTML and CSS. An online edition based on CSS HTML Validator Pro that can check documents via file upload, URL, or snippets (direct text input) was discontinued May 2017 in favor of the desktop version for Microsoft Windows. == Purpose of validation == The purpose of validation and computerized checking of HTML, XHTML, and CSS documents is to help make sure that the documents are syntactically correct and problem-free. Checked HTML, XHTML, and CSS documents are more likely to: be more accessible for people with disabilities (such as blindness), as well as all users in general render faster (user agents don't have to "figure out" and decipher bad syntax) render as intended and with fewer problems on a variety of user agents, including mobile devices cause browsers and user agents to build a more consistent Document Object Model, which is important for CSS and JavaScript be forward-compatible with future versions of user agents and browsers ("future-proof") be compatible with current and future HTML, XHTML, and CSS specifications cause fewer problems for visitors and web indexing not contain dead, broken, or rotting links While automated checking tools are helpful for website development and continued maintenance, they cannot guarantee that a document will display (render) and behave as intended in all browsers. Developers should always test documents in a variety of browsers (including mobile browsers) to locate problems that cannot be detected with a computerized checking tool. == Differences from other HTML validators == CSS HTML Validator is an offline desktop app for Microsoft Windows and a native macOS and Linux command-line console tool that does not require an Internet connection. The offline nature of CSS HTML Validator is in contrast to online web-based services. CSS HTML Validator primarily works offline (except for link checking when it must go online), which has speed and privacy benefits compared to web-based solutions and services like the W3C Markup Validation Service. However, the user must keep the software updated unlike web-based solutions which are typically kept updated by the solution provider. CSS HTML Validator checks HTML/XHTML syntax, CSS, links, spelling, accessibility, JavaScript, SEO, and PHP with one pass, while DTD-based validators are more limited and cannot check HTML5. CSS HTML Validator includes a built-in scripting language (called TNPL) which allows for a high degree of customization via scripting and "user functions". This allows developers to add custom (specialized) validation checks and messages. CSS HTML Validator includes a DTD-based validator which can optionally be used for checking DTD-based versions of HTML (versions prior to HTML5), however one of CSS HTML Validator's primary differences is that its custom validation engine can perform more checks on a document than a DTD-based validator can. This is because DTD-based validators are limited to checking only what can be specified in a Document Type Definition. == Integration == CSS HTML Validator integrates with other third-party software like those listed below. This allows validation using CSS HTML Validator from within the third-party program. EmEditor - includes a special Lite edition build of CSS HTML Validator for built-in checking of HTML and CSS Blumentals Software - several Blumentals software products integrate with CSS H

Bookmarklet

A bookmarklet is a bookmark stored in a web browser that contains JavaScript commands that add new features to the browser. They are stored as the URL of a bookmark in a web browser or as a hyperlink on a web page. Bookmarklets are usually small snippets of JavaScript executed when a user clicks on them. When clicked, bookmarklets can perform a wide variety of operations, such as running a search query from selected text or extracting data from a table. Another name for bookmarklet is favelet or favlet, derived from favorites (synonym of bookmark). == History == Steve Kangas of bookmarklets.com coined the word bookmarklet when he started to create short scripts based on a suggestion in Netscape's JavaScript guide. Before that, Tantek Çelik called these scripts favelets and used that word as early as on 6 September 2001 (personal email). Brendan Eich, who developed JavaScript at Netscape, gave this account of the origin of bookmarklets: They were a deliberate feature in this sense: I invented the javascript: URL along with JavaScript in 1995, and intended that javascript: URLs could be used as any other kind of URL, including being bookmark-able. In particular, I made it possible to generate a new document by loading, e.g. javascript:'hello, world', but also (key for bookmarklets) to run arbitrary script against the DOM of the current document, e.g. javascript:alert(document.links[0].href). The difference is that the latter kind of URL uses an expression that evaluates to the undefined type in JS. I added the void operator to JS before Netscape 2 shipped to make it easy to discard any non-undefined value in a javascript: URL. The increased implementation of Content Security Policy (CSP) in websites has caused problems with bookmarklet execution and usage (2013–2015), with some suggesting that this hails the end or death of bookmarklets. William Donnelly created a work-around solution for this problem (in the specific instance of loading, referencing and using JavaScript library code) in early 2015 using a Greasemonkey userscript (Firefox / Pale Moon browser add-on extension) and a simple bookmarklet-userscript communication protocol. It allows (library-based) bookmarklets to be executed on any and all websites, including those using CSP and having an https:// URI scheme. However, if/when browsers support disabling/disallowing inline script execution using CSP, and if/when websites begin to implement that feature, it will "break" this "fix". == Concept == Web browsers use URIs for the href attribute of the tag and for bookmarks. The URI scheme, such as http or ftp, and which generally specifies the protocol, determines the format of the rest of the string. Browsers also implement javascript: URIs that to a parser is just like any other URI. The browser recognizes the specified javascript scheme and treats the rest of the string as a JavaScript program which is then executed. The expression result, if any, is treated as the HTML source code for a new page displayed in place of the original. The executing script has access to the current page, which it may inspect and change. If the script returns an undefined type (rather than, for example, a string), the browser will not load a new page, with the result that the script simply runs against the current page content. This permits changes such as in-place font size and color changes without a page reload. An immediately invoked function that returns no value or an expression preceded by the void operator will prevent the browser from attempting to parse the result of the evaluation as a snippet of HTML markup: == Usage == Bookmarklets are saved and used as normal bookmarks. As such, they are simple "one-click" tools which add functionality to the browser. For example, they can: Modify the appearance of a web page within the browser (e.g., change font size, background color, etc.) Extract data from a web page (e.g., hyperlinks, images, text, etc.) Remove redirects from (e.g. Google) search results, to show the actual target URL Submit the current page to a blogging service such as Posterous, link-shortening service such as bit.ly, or bookmarking service such as Delicious Query a search engine or online encyclopedia with highlighted text or by a dialog box Submit the current page to a link validation service or translation service Set commonly chosen configuration options when the page itself provides no way to do this Control HTML5 audio and video playback parameters such as speed, position, toggling looping, and showing/hiding playback controls, the first of which can be adjusted beyond HTML5 players' typical range setting. Installing a bookmarklet follows the same process as adding a normal bookmark; the only difference is that in place of the URL destination field is JavaScript code preceded by javascript:. Once created, bookmarklets can be run by clicking on them.

Ambient awareness

Ambient awareness (AmA) is a term used by social scientists to describe a form of peripheral social awareness through social media. This awareness is propagated from relatively constant contact with one's friends and colleagues via social networking platforms on the Internet. The term essentially defines the sort of omnipresent knowledge one experiences by being a regular user of these media outlets that allow a constant connection with one's social circle. According to Clive Thompson of The New York Times, ambient awareness is "very much like being physically near someone and picking up on mood through the little things; body language, sighs, stray comments". Academic Andreas Kaplan defines ambient awareness as "awareness created through regular and constant reception, and/or exchange of information fragments through social media". Two friends who regularly follow one another's digital information can already be aware of each other's lives without actually being physically present to have had a conversation. == Social == Socially speaking, ambient awareness and social media are products of the new generations who are being born or growing up in the digital age, starting circa 1998 and running to current times. Social media is personal media (what you're doing in the moment, how you feel, a picture of where you are) combined with social communication. Social media is the lattice work for ambient awareness. Without social media the state of ambient awareness cannot exist. Artificial Social Networking Intelligence (ASNI) refers to the application of artificial intelligence within social networking services and social media platforms. It encompasses various technologies and techniques used to automate, personalize, enhance, improve, and synchronize user's interactions and experiences within social networks. ASNI is expected to evolve rapidly, influencing how we interact online and shaping their digital experiences. Transparency, ethical considerations, media influence bias, and user control over data will be crucial to ensure responsible development and positive impact. A significant feature of social media is that it is created by those who also consume it. Mostly, those participating in this phenomenon are adolescents, college age, or young adult professionals. According to Dr. Mimi Ito, a cultural anthropologist and Professor in Residence at the University of California at Irvine, the mobile device is the greatest proxy device used to create and distribute Social Media. She reportedly states that "teenagers capture and produce their own media, and stay in constant ambient contact with each other..." using mobile devices. Usually while doing this they are consuming other forms of media such as music or video content via their smart phones, tablets, or other similar devices. Effectively this has led social scientists to believe that learning and multitasking will have a new face as the products of the digital generation enter the work force and begin to integrate their learning methods into the standard preexisting business models of today. Professors Kaplan and Haenlein see ambient awareness as one of the major reasons for the success of such microblogging sites as Twitter. == Origins == The earliest available technology that could be used for constant social contact is the cell phone. For the first time, people could be contacted readily and at will beyond the confines of their work or homes. Then later, with the additional service of texting, one can see the somewhat primitive form of the status update. Since the text message only allows for 160 characters to transmit pertinent information it paved the way for the status update as we know it today. The transition from only having a few points of regular long distance contact, to being constantly available via cell phone, is what primed society for social networking websites. Perhaps the first instance where these websites created the possibility of larger scale ambient awareness was when Facebook installed the news feed. The news feed automatically sends compiled information on all of a users contacts activities directly to them so that they can access all of the happenings in their world from one location. For the first time, becoming someone's Facebook friend was the equivalent of subscribing to a feed of their daily minutiae. Since this innovation, a new wave of micro-blogging services have emerged, such as Twitter or Tumblr. Although these services have often been criticized as containing seemingly meaningless snippets of information, when a follower gathers a certain amount of information, they begin to obtain an ambient understanding of who they are following. This has led to the mass usage of social media as not only a social tool but also as a marketing and business tool. == Uses in marketing == Websites such as Twitter, YouTube, Facebook, and Myspace, among many others, have been used by people in all forms of business to create a closer digital/ambient bond with their clientele base. This is most notably seen in the music industry where social media networking has become the mainstay of all advertising for independent and major artists. The effect of this type of ambient marketing is that the consumer begins to get a sense of the artist's life style and personality. In this way social media outlets and ambient awareness have managed to tighten the gap between consumers and producers in all areas of business. == Uses in business processes == As web-based collaboration tools and social project management suites proliferate, the addition of activity streams to those products help to create business context-specific ambient awareness, and produce a new class of products, such as social project management platforms.

Text-to-image personalization

Text-to-Image personalization is a task in deep learning for computer graphics that augments pre-trained text-to-image generative models. In this task, a generative model that was trained on large-scale data (usually a foundation model), is adapted such that it can generate images of novel, user-provided concepts. These concepts are typically unseen during training, and may represent specific objects (such as the user's pet) or more abstract categories (new artistic style or object relations). Text-to-Image personalization methods typically bind the novel (personal) concept to new words in the vocabulary of the model. These words can then be used in future prompts to invoke the concept for subject-driven generation, inpainting, style transfer and even to correct biases in the model. To do so, models either optimize word-embeddings, fine-tune the generative model itself, or employ a mixture of both approaches. == Technology == Text-to-Image personalization was first proposed during August 2022 by two concurrent works, Textual Inversion and DreamBooth. In both cases, a user provides a few images (typically 3–5) of a concept, like their own dog, together with a coarse descriptor of the concept class (like the word "dog"). The model then learns to represent the subject through a reconstruction based objective, where prompts referring to the subject are expected to reconstruct images from the training set. In Textual Inversion, the personalized concepts are introduced into the text-to-image model by adding new words to the vocabulary of the model. Typical text-to-image models represent words (and sometimes parts-of-words) as tokens, or indices in a predefined dictionary. During generation, an input prompt is converted into such tokens, each of which is converted into a ‘word-embedding’: a continuous vector representation which is learned for each token as part of the model's training. Textual Inversion proposes to optimize a new word-embedding vector for representing the novel concept. This new embedding vector can then be assigned to a user-chosen string, and invoked whenever the user's prompt contains this string. In DreamBooth, rather than optimizing a new word vector, the full generative model itself is fine-tuned. The user first selects an existing token, typically one which rarely appears in prompts. The subject itself is then represented by a string containing this token, followed by a coarse descriptor of the subject's class. A prompt describing the subject will then take the form: "A photo of " (e.g. "a photo of sks cat" when learning to represent a specific cat). The text-to-image model is then tuned so that prompts of this form will generate images of the subject. == Textual Inversion == The key idea in Textual Inversion is to add a new term to the vocabulary of the diffusion model that corresponds to the new (personalized) concept. Textual Inversion operates by inverting the concepts into new pseudo-words within the textual embedding space of a pre-trained text-to-image model. These pseudo-words can be injected into new scenes using simple natural language descriptions, allowing for simple and intuitive modifications. The method allows a user to leverage multi-modal information — using a text-driven interface for ease of editing, but providing visual cues when approaching the limits of natural language. The resulting model is extremely light-weight per concept: only 1K long, but succeeds to encode detailed visual properties of the concept. == Extensions == Several approaches were proposed to refine and improve over the original methods. These include the following. Low-rank Adaptation (LoRA) - an adapter-based technique for efficient finetuning of models. In the case of text-to-image models, LoRA is typically used to modify the cross-attention layers of a diffusion model. Perfusion - a low rank update method that also locks the activations of the key matrix in the diffusion model's cross attention layers to the concept's coarse class. Extended Textual Inversion - a technique that learns an individual word embedding for each layer in the diffusion model's denoising network. Encoder-based methods that use another neural network to quickly personalize a model == Challenges and limitations == Text-to-image personalization methods must contend with several challenges. At their core is the goal of achieving high-fidelity to the personal concept while maintaining high alignment between novel prompts containing the subject, and the generated images (typically referred to as ‘editability’). Another challenge that personalization methods must contend with is memory requirements. Initial implementations of personalization methods required more than 20 Gigabytes of GPU memory, and more recent approaches have reported requirements of more than 40 Gigabytes. However, optimizations such as Flash Attention have since reduced this requirement considerably. Approaches that tune the entire generative model may also create checkpoints that are several gigabytes in size, making it difficult to share or store many models. Embedding based approaches require only a few kilobytes, but typically struggle to preserve identity while maintaining editability. More recent approaches have proposed hybrid tuning goals which optimize both an embedding and a subset of network weights. These can reduce storage requirements to as little as 100 Kilobytes while achieving quality comparable to full tuning methods. Finally, optimization processes can be lengthy, requiring several minutes of tuning for each novel concept. Encoder and quick-tuning methods aim to reduce this to seconds or less.

Virtual Print Fee

Virtual Print Fee (VPF) is a subsidy paid by a film distributor towards the purchase of digital cinema projection equipment for use by a film exhibitor in the presentation of first release motion pictures. The subsidy is paid in the form of a fee per booking of a movie, intended to match the savings that occurs by not shipping a film print. The model is designed to help redistribute the savings realized by studios when using digital distribution instead of film print distribution and is intended to vanish when the transition phase is over when the vast majority of cinemas screens are equipped. == History == The first public demonstration of digital projection for cinema took place at ShoWest in 1999, and it was readily apparent that the technology was further ahead than the business model. Early technology presentations attempted to claim that the technology would pay for itself through new revenues generated by new forms of content. But exhibitors knew their audience, and could see that digital projection was only a replacement technology, creating new financial liabilities, and not new revenue. It wasn’t until the rollout of digital 3-D years later in 2005 that digital projection demonstrated that it could be used to generate additional revenue. The economics were challenging. Film projectors and platters cost in the neighborhood of US$30,000, while early digital projectors cost up to US$150,000. Further, film projectors had a lifetime of 30 years with relatively small annual expenditures in maintenance and replacement parts. On the other hand, exhibitors felt they would be lucky to get 10 years of service from a digital projector, after which there would have to be a refresh in capital expenditure. Meanwhile, distributors would realize significant savings by eliminating the high cost of film prints with corresponding shipping costs, and instead distributing digital files either by satellite or hard drive. The Virtual Print Fee was designed to better balance savings and expenditures for both exhibitors and distributors. It is intended to primarily assist in the replacement of film projectors, and not assist in the purchase of new projection equipment for new construction. To give confidence to financial institutions that digital cinema technology was stable and worthy of investment, Digital Cinema Initiatives was created in 2002, resulting in the release of the first version of the DCI Digital Cinema System Specification in 2005. The DCI Specification continues to be the core specification for digital cinema, establishing the baseline technology and system requirements for which studios will release digital movies. The first set of VPF agreements executed with four major studios were announced by Christie/AIX in November 2005. Christie/AIX at that time was a subsidiary of Access Integrated Technology, now renamed Cinedigm Digital Cinema Corp. The agreements were for the rollout of digital cinema technology to 4000 screens. Since that time, numerous other Digital Cinema Deployment Agreements have been executed around the world, allowing exhibitors in nearly every territory to benefit from VPF subsidies in the conversion from film projection to digital projection.