AI Data Training Jobs

AI Data Training Jobs — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Spatial anti-aliasing

    Spatial anti-aliasing

    In digital signal processing, spatial anti-aliasing is a technique for minimizing the distortion artifacts (aliasing) when representing a high-resolution image at a lower resolution. Anti-aliasing is used in digital photography, computer graphics, digital audio, and many other applications. Anti-aliasing means removing signal components that have a higher frequency than is able to be properly resolved by the recording (or sampling) device. This removal is done before (re)sampling at a lower resolution. When sampling is performed without removing this part of the signal, it causes undesirable artifacts such as black-and-white noise. In signal acquisition and audio, anti-aliasing is often done using an analog anti-aliasing filter to remove the out-of-band component of the input signal prior to sampling with an analog-to-digital converter. In digital photography, optical anti-aliasing filters made of birefringent materials smooth the signal in the spatial optical domain. The anti-aliasing filter essentially blurs the image slightly in order to reduce the resolution to or below that achievable by the digital sensor (the larger the pixel pitch, the lower the achievable resolution at the sensor level). == Examples == In computer graphics, anti-aliasing improves the appearance of "jagged" polygon edges, or "jaggies", so they are smoothed out on the screen. However, it incurs a performance cost for the graphics card and uses more video memory. The level of anti-aliasing determines how smooth polygon edges are (and how much video memory it consumes). Near the top of an image with a receding checker-board pattern, the image is difficult to recognise and often not considered aesthetically pleasing. In contrast, when anti-aliased the checker-board near the top blends into grey, which is usually the desired effect when the resolution is insufficient to show the detail. Even near the bottom of the image, the edges appear much smoother in the anti-aliased image. 
Multiple methods exist, including the sinc filter, which is considered a better anti-aliasing algorithm. When magnified, it can be seen how anti-aliasing interpolates the brightness of the pixels at the boundaries to produce grey pixels since the space is occupied by both black and white tiles. These help make the sinc filter antialiased image appear much smoother than the original. In a simple diamond image, anti-aliasing blends the boundary pixels; this reduces the aesthetically jarring effect of the sharp, step-like boundaries that appear in the aliased graphic. Anti-aliasing is often applied in rendering text on a computer screen, to suggest smooth contours that better emulate the appearance of text produced by conventional ink-and-paper printing. Particularly with fonts displayed on typical LCD screens, it is common to use subpixel rendering techniques like ClearType. Sub-pixel rendering requires special colour-balanced anti-aliasing filters to turn what would be severe colour distortion into barely-noticeable colour fringes. Equivalent results can be had by making individual sub-pixels addressable as if they were full pixels, and supplying a hardware-based anti-aliasing filter as is done in the OLPC XO-1 laptop's display controller. Pixel geometry affects all of this, whether the anti-aliasing and sub-pixel addressing are done in software or hardware. == Simplest approach to anti-aliasing == The most basic approach to anti-aliasing a pixel is determining what percentage of the pixel is occupied by a given region in the vector graphic - in this case a pixel-sized square, possibly transposed over several pixels - and using that percentage as the colour. 
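The coverage idea above can be sketched in Python. This is a minimal illustration under stated assumptions, not the listing from the original article: the canvas is a hypothetical list-of-rows grid of brightness values in [0, 1], the point is treated as a pixel-sized square whose top-left corner sits at (x, y), and the signature of plot_antialiased_point is an assumption. New coverage is added to the existing value and clamped at 1, so overlapping points brighten a pixel rather than overwrite it.

```python
import math


def make_canvas(width, height):
    """Return a black canvas: `height` rows of `width` brightness values."""
    return [[0.0] * width for _ in range(height)]


def plot_antialiased_point(canvas, x, y):
    """Add a white, pixel-sized square with top-left corner (x, y).

    The square overlaps at most four pixels. Each pixel receives the
    fraction of its area that the square covers (assumed convention).
    """
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0  # fractional offsets inside pixel (x0, y0)
    height, width = len(canvas), len(canvas[0])
    for px, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
        for py, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
            coverage = wx * wy
            if coverage > 0.0 and 0 <= px < width and 0 <= py < height:
                # Accumulate and clamp instead of overwriting.
                canvas[py][px] = min(1.0, canvas[py][px] + coverage)


canvas = make_canvas(4, 4)
plot_antialiased_point(canvas, 1.25, 2.5)
for row in canvas:
    print(row)
```

Plotting at (1.25, 2.5) spreads the point over four pixels with coverages 0.375, 0.375, 0.125 and 0.125, which sum to 1.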
A Python routine, plot_antialiased_point, can produce a basic plot of a single, white-on-black anti-aliased point using this method. The method is generally best suited for simple graphics, such as basic lines or curves, and applications that would otherwise have to convert absolute coordinates to pixel-constrained coordinates, such as 3D graphics. It is a fairly fast function, but it is relatively low-quality, and gets slower as the complexity of the shape increases. For purposes requiring very high-quality graphics or very complex vector shapes, this will probably not be the best approach.
Note: The plot_antialiased_point routine cannot blindly set the colour value to the percentage calculated. It must add the new value to the existing value at that location, up to a maximum of 1. Otherwise, the brightness of each pixel will equal whichever value was calculated last for that location, which produces a very bad result. For example, if one point sets a brightness level of 0.90 for a given pixel and another point calculated later barely touches that pixel and has a brightness of 0.05, the final value set for that pixel should be 0.95, not 0.05.
For more sophisticated shapes, the algorithm may be generalized as rendering the shape to a pixel grid with higher resolution than the target display surface (usually a multiple that is a power of 2 to reduce distortion), then using bicubic interpolation to determine the average intensity of each real pixel on the display surface.
== Signal processing approach to anti-aliasing ==
In this approach, the ideal image is regarded as a signal. The image displayed on the screen is taken as samples, at each (x,y) pixel position, of a filtered version of the signal. Ideally, one would understand how the human brain would process the original signal, and provide an on-screen image that will yield the most similar response by the brain.
The most widely accepted analytic tool for such problems is the Fourier transform; this decomposes a signal into basis functions of different frequencies, known as frequency components, and gives us the amplitude of each frequency component in the signal. The waves are of the form:
$$\cos(2j\pi x)\cos(2k\pi y)$$
where j and k are arbitrary non-negative integers. There are also frequency components involving the sine functions in one or both dimensions, but for the purpose of this discussion, the cosine will suffice. The numbers j and k together are the frequency of the component: j is the frequency in the x direction, and k is the frequency in the y direction. The goal of an anti-aliasing filter is to greatly reduce frequencies above a certain limit, known as the Nyquist frequency, so that the signal will be accurately represented by its samples, or nearly so, in accordance with the sampling theorem; there are many different choices of detailed algorithm, with different filter transfer functions. Current knowledge of human visual perception is not sufficient, in general, to say what approach will look best.
== Two dimensional considerations ==
The previous discussion assumes that the rectangular mesh sampling is the dominant part of the problem. The filter usually considered optimal is not rotationally symmetrical, as shown in this first figure; this is because the data is sampled on a square lattice, not using a continuous image. This sampling pattern is the justification for doing signal processing along each axis, as it is traditionally done on one dimensional data. Lanczos resampling is based on convolution of the data with a discrete representation of the sinc function.
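As a rough illustration of the per-axis approach, the sketch below resamples a one-dimensional signal with a Lanczos-windowed sinc. It is a minimal sketch, not a reference implementation: the function names, the window size a = 3, the pixel-centre alignment, the edge clamping and the weight normalisation are all assumptions chosen for brevity. For a 2D image the same routine would be applied along rows and then along columns.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def lanczos_kernel(x, a=3):
    """Sinc windowed by a wider sinc; zero outside |x| < a."""
    if abs(x) >= a:
        return 0.0
    return sinc(x) * sinc(x / a)


def resample_1d(samples, new_length, a=3):
    """Resample `samples` to `new_length` values.

    When shrinking, the kernel is stretched by the scale factor so that
    it also acts as a low-pass (anti-aliasing) filter.
    """
    n = len(samples)
    scale = n / new_length
    stretch = max(scale, 1.0)
    out = []
    for i in range(new_length):
        center = (i + 0.5) * scale - 0.5  # output position in source coordinates
        lo = math.floor(center - a * stretch) + 1
        hi = math.floor(center + a * stretch)
        total = 0.0
        weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = lanczos_kernel((j - center) / stretch, a)
            total += w * samples[min(max(j, 0), n - 1)]  # clamp at the edges
            weight_sum += w
        out.append(total / weight_sum)
    return out


# A one-pixel checkerboard row shrunk by 4x: interior values blend to grey.
row = [0.0, 1.0] * 32
print(resample_1d(row, 16)[4:12])
```

Away from the clamped edges, the alternating row averages out to mid-grey, which mirrors the checker-board behaviour described in the Examples section.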
If the resolution is not limited by the rectangular sampling rate of either the source or target image, then one should ideally use rotationally symmetrical filter or interpolation functions, as though the data were a two dimensional function of continuous x and y. The sinc function of the radius has too long a tail to make a good filter (it is not even square-integrable). A more appropriate analog to the one-dimensional sinc is the two-dimensional Airy disc amplitude, the 2D Fourier transform of a circular region in 2D frequency space, as opposed to a square region. One might consider a Gaussian plus enough of its second derivative to flatten the top (in the frequency domain) or sharpen it up (in the spatial domain), as shown. Functions based on the Gaussian function are natural choices, because convolution with a Gaussian gives another Gaussian whether applied to x and y or to the radius. Similarly to wavelets, another of its properties is that it is halfway between being localized in the configuration (x and y) and in the spectral (j and k) representation. As an interpolation function, a Gaussian alone seems too spread out to preserve the maximum possible detail, and thus the second derivative is added. As an example, when printing a photographic negative with plentiful processing capability and on a printer with a hexagonal pattern, there is no reason to use sinc function interpolation. Such interpolation would treat diagonal lines differently from horizontal and vertical lines, which is like a weak form of aliasing. == Practical real-time anti-aliasing approximations == There are only a handful of primitives used at the lowest level in a real-time rend

  • Digital cinema

    Digital cinema

    Digital cinema is the digital technology used within the film industry to distribute or project motion pictures as opposed to the historical use of reels of motion picture film, such as 35 mm film. Whereas film reels have to be shipped to movie theaters, a digital movie can be distributed to cinemas in a number of ways: over the Internet or dedicated satellite links, or by sending hard drives or optical discs such as Blu-ray discs, then projected using a digital video projector instead of a film projector. Typically, digital movies are shot using digital movie cameras or in animation transferred from a file and are edited using a non-linear editing system (NLE). The NLE is often a video editing application installed in one or more computers that may be networked to access the original footage from a remote server, share or gain access to computing resources for rendering the final video, and allow several editors to work on the same timeline or project. Alternatively a digital movie could be a film reel that has been digitized using a motion picture film scanner and then restored, or, a digital movie could be recorded using a film recorder onto film stock for projection using a traditional film projector. Digital cinema is distinct from high-definition television and does not necessarily use traditional television or other traditional high-definition video standards, aspect ratios, or frame rates. In digital cinema, resolutions are represented by the horizontal pixel count, usually 2K (2048×1080 or 2.2 megapixels) or 4K (4096×2160 or 8.8 megapixels). The 2K and 4K resolutions used in digital cinema projection are often referred to as DCI 2K and DCI 4K. DCI stands for Digital Cinema Initiatives. As digital cinema technology improved in the early 2010s, most theaters across the world converted to digital video projection. 
Digital cinema technology has continued to develop over the years with RealD 3D, IMAX, RPX, 4DX, Dolby Cinema, and ScreenX, allowing moviegoers more immersive experiences. == History == The transition from film to digital video was preceded by cinema's transition from analog to digital audio, with the release of the Dolby Digital (AC-3) audio coding standard in 1991. Its main basis is the modified discrete cosine transform (MDCT), a lossy audio compression algorithm. It is a modification of the discrete cosine transform (DCT) algorithm, which was first proposed by Nasir Ahmed in 1972 and was originally intended for image compression. The DCT was adapted into the MDCT by J.P. Princen, A.W. Johnson and Alan B. Bradley at the University of Surrey in 1987, and then Dolby Laboratories adapted the MDCT algorithm along with perceptual coding principles to develop the AC-3 audio format for cinema needs. Cinema in the 1990s typically combined analog photochemical images with digital audio. Digital media playback of high-resolution 2K files has at least a 20-year history. Early video data storage units (RAIDs) fed custom frame buffer systems with large memories. In early digital video units, the content was usually restricted to several minutes of material. Transfer of content between remote locations was slow and had limited capacity. It was not until the late 1990s that feature-length films could be sent over the "wire" (Internet or dedicated fiber links). On October 23, 1998, Digital light processing (DLP) projector technology was publicly demonstrated with the release of The Last Broadcast, the first feature-length movie, shot, edited and distributed digitally. In conjunction with Texas Instruments, the movie was publicly demonstrated in five theaters across the United States (Philadelphia, Portland (Oregon), Minneapolis, Providence, and Orlando). 
=== Foundations ===
In the United States, on June 18, 1999, Texas Instruments' DLP Cinema projector technology was publicly demonstrated on two screens in Los Angeles and New York for the release of Lucasfilm's Star Wars Episode I: The Phantom Menace. In Europe, on February 2, 2000, Texas Instruments' DLP Cinema projector technology was publicly demonstrated, by Philippe Binant, on one screen in Paris for the release of Toy Story 2. From 1997 to 2000, the JPEG 2000 image compression standard was developed by a Joint Photographic Experts Group (JPEG) committee chaired by Touradj Ebrahimi (later the JPEG president). In contrast to the original 1992 JPEG standard, which is a DCT-based lossy compression format for static digital images, JPEG 2000 is a discrete wavelet transform (DWT) based compression standard that could be adapted for motion imaging video compression with the Motion JPEG 2000 extension. JPEG 2000 technology was later selected as the video coding standard for digital cinema in 2004. In 1992, Hughes-JVC was founded by JVC and Hughes Electronics to develop ILA (Image Light Amplifier) digital video projectors for commercial movie theaters using liquid crystal on silicon (LCOS) technology. In 1997, JVC introduced D-ILA (Direct-Drive ILA) technology with a 2K resolution digital video projector. In 2000, JVC introduced a 4K resolution video projector using D-ILA technology.
=== Initiatives ===
On January 19, 2000, the Society of Motion Picture and Television Engineers, in the United States, initiated the first standards group dedicated to developing digital cinema. By December 2000, there were 15 digital cinema screens in the United States and Canada, 11 in Western Europe, 4 in Asia, and 1 in South America. Digital Cinema Initiatives (DCI) was formed in March 2002 as a joint project of many motion picture studios (Disney, Fox, MGM, Paramount, Sony Pictures, Universal and Warner Bros.) to develop a system specification for digital cinema.
The same month it was reported that the number of cinemas equipped with digital projectors had increased to about 50 in the US and 30 more in the rest of the world. In April 2004, in collaboration with the American Society of Cinematographers, DCI created standard evaluation material (the ASC/DCI StEM material) for testing of 2K and 4K playback and compression technologies. DCI selected JPEG 2000 as the basis for the compression in the system the same year. Initial tests with JPEG 2000 produced bit rates of around 75–125 Mbit/s for 2K resolution and 100–200 Mbit/s for 4K resolution. === Worldwide deployment === In China, in June 2005, an e-cinema system called "dMs" was established and was used in over 15,000 screens spread across China's 30 provinces. DMs estimated that the system would expand to 40,000 screens in 2009. In 2005, the UK Film Council Digital Screen Network launched in the UK by Arts Alliance Media creating a chain of 250 2K digital cinema systems. The roll-out was completed in 2006. This was the first mass roll-out in Europe. AccessIT/Christie Digital also started a roll-out in the United States and Canada. By mid-2006, about 400 theaters were equipped with 2K digital projectors with the number increasing every month. In August 2006, the Malayalam digital movie Moonnamathoral, produced by Benzy Martin, was distributed via satellite to cinemas, thus becoming the first Indian digital cinema. This was done by Emil and Eric Digital Films, a company based at Thrissur using the end-to-end digital cinema system developed by Singapore-based DG2L Technologies. In January 2007, Guru became the first Indian film mastered in the DCI-compliant JPEG 2000 Interop format and also the first Indian film to be previewed digitally, internationally, at the Elgin Winter Garden in Toronto. This film was digitally mastered at Real Image Media Technologies in India. 
In 2007, the UK became home to Europe's first DCI-compliant fully digital multiplex cinemas; Odeon Hatfield and Odeon Surrey Quays (in London), with a total of 18 digital screens, were launched on 9 February 2007. By March 2007, with the release of Disney's Meet the Robinsons, about 600 screens had been equipped with digital projectors. In June 2007, Arts Alliance Media announced the first European commercial digital cinema Virtual Print Fee (VPF) agreements (with 20th Century Fox and Universal Pictures). In March 2009, AMC Theatres announced that it had closed a $315 million deal with Sony to replace all of its movie projectors with 4K HDR digital projectors starting in the second quarter of 2009; it was anticipated that this replacement would be finished by 2012. In January 2011, the total number of digital screens worldwide was 36,242, up from 16,339 at the end of 2009, a growth rate of 121.8 percent during the year. There were 10,083 digital screens in Europe as a whole (28.2 percent of the global figure), 16,522 in the United States and Canada (46.2 percent of the global figure) and 7,703 in Asia (21.6 percent of the global figure). Progress was slower in some territories, particularly Latin America and Africa. As of 31 March 2015, 38,719 screens (out of a total of 3

  • Web3D

    Web3D

    Web3D, also called 3D Web, is a group of technologies to display and navigate websites using 3D computer graphics. These technologies enable applications such as online games, virtual reality experiences, interactive product demonstrations, and 3D data visualization directly within web browsers. The emergence of Web3D dates back to 1994, with the advent of VRML, a file format designed to store and display 3D graphical data on the World Wide Web. Modern Web3D is primarily powered by WebGL, a JavaScript API that enables hardware-accelerated 3D graphics rendering in web browsers without requiring plug-ins. == Pre-WebGL era == The emergence of Web3D dates back to 1994, with the advent of VRML, a file format designed to store and display 3D graphical data on the World Wide Web. In October 1995, at Internet World, Template Graphics Software demonstrated a 3D/VRML plug-in for the beta release of Netscape 2.0 by Netscape Communications. The Web3D Consortium was formed to further the collective development of the format. VRML and its successor, X3D, have been accepted as international standards by the International Organization for Standardization and the International Electrotechnical Commission. The main drawback of the technology was the requirement to use third-party browser plug-ins to perform 3D rendering, which slowed the adoption of the standard. Between 2000 and 2010, one of these plug-ins, Adobe Flash Player, was widely installed on desktop computers and was used to display interactive web pages and online games and to play video and audio content. Several Flash-based frameworks appeared that used software rendering and ActionScript 3 to perform 3D computations such as transformations, lighting, and texturing. Most notable among them were Papervision3D and Away3D. Eventually, Adobe developed Stage3D, an API for rendering interactive 3D graphics with GPU-acceleration for its Flash player and AIR products, which was adopted by software vendors. 
In 2009, an open-source 3D web technology called O3D was introduced by Google. It also required a browser plug-in, but contrary to Flash/Stage3D, was based on JavaScript API. O3D was geared not only for games but also for advertisements, 3D model viewers, product demos, simulations, engineering applications, control and monitoring systems. == WebGL and glTF == WebGL (short for "Web Graphics Library") evolved out of the Canvas 3D experiments started by Vladimir Vukićević at Mozilla Foundation. Vukićević first demonstrated a Canvas 3D prototype in 2006. By the end of 2007, both Mozilla and Opera had made their own separate implementations. In early 2009, the nonprofit technology consortium Khronos Group started the WebGL Working Group, with initial participation from Apple, Google, Mozilla, Opera, and others. Version 1.0 of the WebGL specification was released in March 2011. Major advantages of the new technology include conformity with web standards and near-native 3D performance without the use of any browser plug-ins. Since WebGL is based on OpenGL ES, it works on mobile devices without any additional abstraction layers. For other platforms, WebGL implementations leverage ANGLE to translate OpenGL ES calls to DirectX, OpenGL, or Vulkan API calls. Among notable WebGL frameworks are A-Frame, which uses HTML-based markup for building virtual reality experiences; PlayCanvas, an open-source engine alongside a proprietary cloud-hosted creation platform for building browser games; Three.js, an MIT-licensed framework used to create demoscene from the early 2000s; Unity, which obtained a WebGL back-end in version 5; and Verge3D, which integrates with Blender, 3ds Max, and Maya to create 3D web content. With the rapid adoption of WebGL, a new problem arose—the lack of a 3D file format optimized for the Web. This issue was addressed by glTF, a format that was conceived in 2012 by members of the COLLADA working group. 
At SIGGRAPH 2012, Khronos presented a demo of glTF, which was then called WebGL Transmissions Format (WebGL TF). On 19 October 2015, the glTF 1.0 specification was released. Version 2.0 glTF uses a physically based rendering material model, proposed by Fraunhofer. Other upgrades include sparse accessors and morph targets for techniques such as facial animation, and schema tweaks and breaking changes for corner cases or performance, such as replacing top-level glTF object properties with arrays for faster index-based access. == Future == "WebGPU" is the working name for a potential web standard and JavaScript API for accelerated graphics and computing, aiming to provide "modern 3D graphics and computation capabilities". It is developed by the W3C "GPU for the Web" Community Group, with engineers from Apple, Mozilla, Microsoft, and Google, among others. WebGPU will not be based on any existing 3D API and will use Rust-like syntax for shaders.
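The glTF 2.0 change mentioned above, replacing keyed top-level objects with arrays, can be illustrated with two heavily simplified fragments written as Python dictionaries. These are hypothetical sketches meant only to contrast ID-based lookup with index-based lookup; they are not complete or valid glTF assets, and the helper names are invented for this example.

```python
# glTF 1.0 style (simplified): top-level collections are objects keyed by
# string IDs, and references are those ID strings.
gltf_v1_like = {
    "meshes": {"mesh_cube": {"name": "Cube"}},
    "nodes": {"node_root": {"meshes": ["mesh_cube"]}},
}

# glTF 2.0 style (simplified): top-level collections are arrays, and
# references are integer indices into those arrays.
gltf_v2_like = {
    "asset": {"version": "2.0"},
    "meshes": [{"name": "Cube"}],
    "nodes": [{"mesh": 0}],
}


def mesh_name_v1(doc, node_id):
    """Resolve a node's first mesh through a dictionary lookup by ID."""
    mesh_id = doc["nodes"][node_id]["meshes"][0]
    return doc["meshes"][mesh_id]["name"]


def mesh_name_v2(doc, node_index):
    """Resolve a node's mesh through direct array indexing."""
    mesh_index = doc["nodes"][node_index]["mesh"]
    return doc["meshes"][mesh_index]["name"]


print(mesh_name_v1(gltf_v1_like, "node_root"))
print(mesh_name_v2(gltf_v2_like, 0))
```

Both lookups resolve to the same mesh; the second does so with integer indexing only, which is the kind of access the array-based layout is meant to speed up.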

  • Industry Dive

    Industry Dive

    Industry Dive is a United States-based business-to-business news organization with an estimated 18 million readers in more than 25 industries, such as banking and waste management. Since 2022, it has been owned by Informa plc. Industry Dive aims to serve business executives who read news on their mobile phones. The company had an estimated revenue of more than $110 million in 2023. As of 2020, it had more than 300 employees, including 80 journalists and 12 engineers. Its headquarters is in Washington, D.C.
    == History ==
    Industry Dive was formed in 2012 by Sean Griffey (president), Eli Dickinson (chief technology officer), and Ryan Willumson (chief revenue officer). It was funded with $900,000 from private investors in 2012 and 2013. The company initially covered five industries: construction, education, marketing, utility, and waste. In 2016, it began its Dive Awards. Industry Dive's revenues quadrupled from 2015 to 2018, putting it in the top half of the Deloitte Technology Fast 500 and the top 20 percent of the Inc. Top 5000 list. In 2019, Falfurrias Capital Partners acquired a majority stake in the company. Industry Dive's content marketing clients included IBM, Siemens, and UPS. In 2020, DCA Live named Industry Dive to its "Red Hot Companies" list, which recognizes the D.C. area's 'fastest-growing' companies. In the same year, Industry Dive acquired CFO. In 2021, Industry Dive acquired PharmaVOICE. In 2022, it was purchased by Informa plc, which bought its majority stake from Falfurrias Capital Partners for about $530 million.
== Publications ==
Industry Dive provides news coverage of a variety of industries including agriculture, banking, construction, education, fashion, healthcare, and manufacturing, each with its own website.
== Awards ==
Industry Dive publications have received several national and regional Awards of Excellence from the American Society of Business Publication Editors, including for a series of 2020 articles about Big Pharma and the race for the coronavirus vaccine. The Washington Post recognized Industry Dive as a top place to work for four consecutive years, from 2016 to 2020.

  • Prompt engineering

    Prompt engineering

    Prompt engineering is the process of structuring natural language inputs (known as prompts) to produce specified outputs from a generative artificial intelligence (GenAI) model. Context engineering is the related area of software engineering that focuses on the management of non-prompt contexts supplied to the GenAI model, such as metadata, API tools, and tokens. It can also be defined as the practice of designing and refining input instructions given to a generative AI model to produce more accurate, relevant, or useful outputs. Effective prompt engineering involves understanding how a model interprets language, and may include techniques such as few-shot prompting, chain-of-thought prompting, and role assignment. It is increasingly considered a skill for working with large language models (LLMs) in both research and professional contexts. During the 2020s AI boom, prompt engineering became regarded as a business capability across corporations and industries. Employees with the title prompt engineer were hired to create prompts that would increase productivity and efficacy, although the individual title has since lost traction amid AI models that produce better prompts than humans and corporate training in prompting for general employees. Common prompting techniques include multi-shot, chain-of-thought, and tree-of-thought prompting, as well as the use of assigning roles to the model. Automated prompt generation methods, such as retrieval-augmented generation (RAG), provide for greater accuracy and a wider scope of functions for prompt engineers. Prompt injection is a type of cybersecurity attack that targets machine learning models through malicious prompts. 
== Terminology == The Oxford English Dictionary defines prompt engineering as "The action or process of formulating and refining prompts for an artificial intelligence program, algorithm, etc., in order to optimize its output or to achieve a desired outcome; the discipline or profession concerned with this." In 2023, prompt ("an instruction given to an artificial intelligence program, algorithm, etc., which determines or influences the content it generates") was the runner-up to Oxford's word of the year. === Prompt === A prompt is some natural language text that describes and prescribes the task that an artificial intelligence (AI) should perform. A prompt for a text-to-text language model can be a query, a command, or a longer statement referencing context, instructions, and conversation history. The process of prompt engineering may involve designing clear queries, refining wording, providing relevant context, specifying the style of output, and assigning a character for the AI to mimic in order to guide the model toward more accurate, useful, and consistent responses. When communicating with a text-to-image or a text-to-audio model, a typical prompt contains a description of a desired output such as "a high-quality photo of an astronaut riding a horse" or "Lo-fi slow BPM electro chill with organic samples". Prompt engineering may be applied to text-to-image models to achieve a desired subject, style, layout, lighting, and aesthetic. === Techniques === Common terms used to describe various specific prompt engineering techniques include chain-of-thought, tree-of-thought, and retrieval-augmented generation (RAG). A 2024 survey of the field identified over 50 distinct text-based prompting techniques, 40 multimodal variants, and a vocabulary of 33 terms used across prompting research, highlighting a present lack of standardised terminology for prompt engineering. 
Vibe coding is an AI-assisted software development method where a user prompts an LLM with a description of what they want and lets it generate or edit the code. In 2025, "vibe coding" was the Collins Dictionary word of the year. === Context engineering === Context engineering is a related process that focuses on the context elements that accompany user prompts, which include system instructions, retrieved knowledge, tool definitions, conversation summaries, and task metadata. Context engineering is performed to improve reliability, provenance and token efficiency in production LLM systems. The concept emphasises operational practices such as token budgeting, provenance tags, versioning of context artifacts, observability (logging which context was supplied), and context regression tests to ensure that changes to supplied context do not silently alter system behaviour. == Rationale == Research has found that the performance of large language models (LLMs) is highly sensitive to choices such as the ordering of examples, the quality of demonstration labels, and even small variations in phrasing. In some cases, reordering examples in a prompt produced accuracy shifts of more than 40 percent. === In-context learning === A model's ability to temporarily learn from prompts is known as in-context learning. In-context learning is an emergent ability of large language models. It is an emergent property of model scale, meaning that breaks in scaling laws occur, leading to its efficacy increasing at a different rate in larger models than in smaller models. Unlike training and fine-tuning, which produce lasting changes, in-context learning is temporary. Training models to perform in-context learning can be viewed as a form of meta-learning, or "learning to learn". === Prompting to estimate model sensitivity === Research consistently demonstrates that LLMs are highly sensitive to subtle variations in prompt formatting, structure, and linguistic properties. 
Some studies have shown differences of up to 76 accuracy points across formatting changes in few-shot settings. Linguistic features such as morphology, syntax, and lexico-semantic changes significantly influence prompt effectiveness and can meaningfully enhance performance across a variety of tasks. Clausal syntax, for example, improves consistency and reduces uncertainty in knowledge retrieval. This sensitivity persists even with larger model sizes, additional few-shot examples, or instruction tuning. To address the sensitivity of models and make them more robust, several evaluative methods have been proposed. FormatSpread facilitates systematic analysis by evaluating a range of plausible prompt formats, offering a more comprehensive performance interval. Similarly, PromptEval estimates performance distributions across diverse prompts, enabling robust metrics such as performance quantiles and accurate evaluations under constrained budgets.
== Prompting techniques ==
=== Multi-shot ===
A prompt may include a few examples for a model to learn from in context, an approach called few-shot learning. For example, the prompt may ask the model to complete "maison → house, chat → cat, chien →", with the expected response being dog.
=== Chain-of-thought ===
Chain-of-thought (CoT) prompting is a technique that allows large language models (LLMs) to solve a problem as a series of intermediate steps before giving a final answer. In 2022, Google Brain reported that chain-of-thought prompting improves reasoning ability by inducing the model to answer a multi-step problem with steps of reasoning that mimic a train of thought. Chain-of-thought techniques were developed to help LLMs handle multi-step reasoning tasks, such as arithmetic or commonsense reasoning questions.
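Returning to the few-shot example above (maison → house, chat → cat, chien →), the prompt itself is just assembled text. The sketch below is a minimal illustration that calls no model; build_few_shot_prompt and its pair_sep and item_sep parameters are hypothetical names introduced here. Varying the separators produces the kind of formatting variants to which, as noted above, model accuracy can be sensitive.

```python
def build_few_shot_prompt(examples, query, pair_sep=" → ", item_sep=", "):
    """Join worked examples and an unanswered query into one prompt string.

    `examples` is a list of (input, output) pairs; the final item repeats
    the pattern with the output left blank for the model to complete.
    """
    shots = [f"{source}{pair_sep}{target}" for source, target in examples]
    shots.append(f"{query}{pair_sep}".rstrip())
    return item_sep.join(shots)


examples = [("maison", "house"), ("chat", "cat")]

# Same content, two surface formats.
print(build_few_shot_prompt(examples, "chien"))
print(build_few_shot_prompt(examples, "chien", pair_sep=": ", item_sep="\n"))
```

The first call reproduces the prompt quoted in the text; the second carries the same examples in a line-per-example layout.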
When applied to PaLM, a 540 billion parameter language model, according to Google, CoT prompting significantly aided the model, allowing it to perform comparably with task-specific fine-tuned models on several tasks, achieving state-of-the-art results at the time on the GSM8K mathematical reasoning benchmark. It is possible to fine-tune models on CoT reasoning datasets to enhance this capability further and stimulate better interpretability. As originally proposed by Google, each CoT prompt is accompanied by a set of input/output examples—called exemplars—to demonstrate the desired model output, making it a few-shot prompting technique. However, according to a later paper from researchers at Google and the University of Tokyo, simply appending the words "Let's think step-by-step" was also effective, which allowed for CoT to be employed as a zero-shot technique. ==== Self-consistency ==== Self-consistency performs several chain-of-thought rollouts, then selects the most commonly reached conclusion out of all the rollouts. === Tree-of-thought === Tree-of-thought prompting generalizes chain-of-thought by generating multiple lines of reasoning in parallel, with the ability to backtrack or explore other paths. It can use tree search algorithms like breadth-first, depth-first, or beam. === Text-to-image prompting === In 2022, text-to-image models like DALL-E 2, Stable Diffusion, and Midjourney were released to the public. These models take text prompts as input and use them to generate images. Early text-to-image models typically do not understand negation, grammar and sentence structure in the same way as large language models, and may thus requi

  • Awwwards

    Awwwards

    Awwwards (Awwwards Online SL) is an organization that hosts web design competitions and conferences across Europe and the United States. Website owners and developers can participate by submitting their websites for review. Submissions are assessed by a jury, and top entries are presented and awarded prizes on a rotational basis. == Nomination process == Web designers submit their websites through Awwwards' platform for consideration for the Site of the Day. A jury, composed of industry professionals, and the Awwwards community evaluate the entries. The best daily sites are published annually in "The 365 Best Websites Around the World" book. == Jury == The jury consists of international designers, developers, and agencies who assess the creativity, technical skills, and insight of the submitted web projects. The panel's expertise ensures a comprehensive review process. === Developer Award === Awwwards, in partnership with Microsoft, created the Developer Award to recognize web developers who demonstrate excellence in creating websites that meet modern standards. The award highlights websites that work seamlessly across various platforms and devices, using best practices in HTML5, JavaScript, and CSS. == Annual winners == Some prominent Site of the Year winners include Mercedes-Benz, Bloomberg L.P., Bose Corporation, Warner Brothers, Volkswagen, Uber, and Google. == Awwwards conference == Awwwards also organizes two-day conferences featuring speakers from major tech companies and industry leaders such as Microsoft, Google, Spotify, Adobe, Opera, and Smashing Magazine. These events focus on the latest trends in web design and development. Speakers at Awwwards conferences have included notable figures in the design and technology industry such as Stefan Sagmeister, Paula Scher, and design leaders from companies including Wix. 
== Corporate affairs == === Platform === Awwwards operates an online platform where web designers and developers submit websites for evaluation and awards. Submitted projects are reviewed by a jury based on design, usability, creativity, and content. The platform also serves as a community hub for discovering digital trends, showcasing work, and accessing educational resources including talks and interviews. Design professionals from international companies have participated in Awwwards events and platform content. For example, Wix, a cloud-based web development company known for its website builder tools, has featured prominently in Awwwards conferences, with its design leadership contributing to discussions on design trends and creative thinking.

  • Mosaik Solutions

    Mosaik Solutions

    Mosaik Solutions (formerly American Roamer) was a company that specialized in wireless coverage data and wireless coverage maps, based in Memphis, Tennessee, before being acquired by Ookla. The company collects and crowdsources carrier signal quality from major telecommunications providers or users who have its consumer or enterprise mobile application installed. The data is used to provide insights into places around the world without access to cellular coverage and the development of new coverage patterns, as well as to provide maps showing what provider offers the best service in an area. In 2011, the Federal Communications Commission (FCC) recognized Mosaik Solutions as the "industry standard" for the presence of wireless service at the census-block level. == History == In 2016, Mosaik purchased Sensorly, a free app developed to crowdsource cellular network performance service and provide coverage mapping for wireless networks worldwide. == Products and services == === MapELEMENTS === MapELEMENTS software is a visualization tool that allows users to analyze data from the largest cellular coverage database in the world. === CellMaps === CellMaps is an interactive mapping solution that allows companies to show their network coverage directly on their website through an iframe or API. In 2013, Mosaik launched an Android app for CellMaps that provides data directly from carriers so that users can determine which carrier meets their needs in a given area. On the map, users can overlay multiple carriers, zoom to street-view level, and drop a pin onto any given spot to get a breakdown of carrier service in that area. === Signal Insights App === Signal Insights is a SaaS platform service available for Android users that measures and analyzes the customer's experience in cellular or Wi-Fi networks. Indoor mode allows a user to upload a building floor plan and then map and test specific points in the building for cellular or Wi-Fi connectivity. 
=== Sensorly App === Sensorly is a free app that crowdsources cellular network performance to provide coverage mapping worldwide and mobile speed data to help consumers make informed decisions when choosing a cellular carrier. In February 2017, Sensorly launched Map Trip, a feature that allows users to map their routes and share with others their signal data at a particular point in real time. === TowerSource === TowerSource is a resource for locating cell towers and identifying ownership, availability, fiber routes, type and height. It was acquired by Mosaik Solutions in September 2014. === Network Validator === Network Validator is a SaaS solution designed for users to quickly determine whether global cellular networks exist, by country, operator and wireless technology. === CoverageRight === CoverageRight is composed of licensed GIS file datasets that identify the marketed coverage of wireless operators in the United States and worldwide. It enables users to perform spatial analyses, monitor competitive build-outs, analyze coverage trends and assemble roaming footprints. This data has been utilized by the FCC to analyze wireless coverage nationwide. === Network QoE === Network QoE is an enterprise platform that uses crowdsourced data from cellular devices to detect wireless network issues including 3G, 4G and Wi-Fi accessibility, network coverage holes and data performance issues. === Wireless Spectrum Report === In March 2017, Mosaik Solutions launched the Wireless Spectrum Report, a tabular dataset detailing facts about spectrum ownership and availability in the United States.

  • TheFWA

    TheFWA

    FWA (Favourite Website Awards) is an international award platform that honors and rewards web designers, developers and agencies around the world for excellence within the field of web design and development. The FWA was founded in May 2000 by Rob Ford. In November 2012, the FWA was the most visited website award program in the history of the internet, with over 170 million site visits. == Jury == The FWA jury is composed of more than 500 web professionals (200 women + 200 men) from 35 countries. == Awards granted == FWA of the Day (FOTD): every day, the FWA jury selects the best project. FWA of the Month (FOTM): every month, the FWA jury selects the best project. People's Choice Award (PCA): every year, a public vote selects the people's favourite project. FWA of the Year (FOTY): every year, the FWA jury selects the best project. == Hall Of Fame == The FWA Hall of Fame was established in May 2007 (to celebrate the seventh anniversary of the FWA), as a recognition of the web's greatest individuals and companies.

  • Foreign key

    Foreign key

    A foreign key is a set of attributes in a table that refers to the primary key of another table, linking these two tables. In the context of relational databases, a foreign key is subject to an inclusion dependency constraint that the tuples consisting of the foreign key attributes in one relation, R, must also exist in some other (not necessarily distinct) relation, S; furthermore, those attributes must also be a candidate key in S. In other words, a foreign key is a set of attributes that references a candidate key. For example, a table called TEAM may have an attribute, MEMBER_NAME, which is a foreign key referencing a candidate key, PERSON_NAME, in the PERSON table. Since MEMBER_NAME is a foreign key, any value existing as the name of a member in TEAM must also exist as a person's name in the PERSON table; in other words, every member of a TEAM is also a PERSON. == Summary == The table containing the foreign key is called the child table, and the table containing the candidate key is called the referenced or parent table. In database relational modeling and implementation, a candidate key is a set of zero or more attributes, the values of which are guaranteed to be unique for each tuple (row) in a relation. The value or combination of values of candidate key attributes for any tuple cannot be duplicated for any other tuple in that relation. Since the purpose of the foreign key is to identify a particular row of the referenced table, it is generally required that the foreign key either be equal to the candidate key in some row of the referenced table or have no value (the NULL value). This rule is called a referential integrity constraint between the two tables. Because violations of these constraints can be the source of many database problems, most database management systems provide mechanisms to ensure that every non-null foreign key corresponds to a row of the referenced table. 
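The TEAM/PERSON example above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: SQLite is only one DBMS and other systems differ in syntax and defaults, the lower-case table and column names and the sample values ('Ada', 'Bob', 'Red') are made up for illustration, and PERSON_NAME is declared as the primary key simply because that is one way to make it a candidate key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent (referenced) table: person_name is the candidate key.
conn.execute("CREATE TABLE person (person_name TEXT PRIMARY KEY)")
# Child (referencing) table: member_name is the foreign key.
conn.execute(
    "CREATE TABLE team ("
    " team_name TEXT NOT NULL,"
    " member_name TEXT REFERENCES person(person_name))"
)

conn.execute("INSERT INTO person VALUES ('Ada')")
conn.execute("INSERT INTO team VALUES ('Red', 'Ada')")  # accepted: Ada is a person
conn.execute("INSERT INTO team VALUES ('Red', NULL)")   # accepted: NULL foreign key

try:
    # 'Bob' is not in person, so referential integrity would be violated.
    conn.execute("INSERT INTO team VALUES ('Red', 'Bob')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

team_rows = conn.execute("SELECT COUNT(*) FROM team").fetchone()[0]
print(rejected, team_rows)
```

The rejected insert leaves the table unchanged, so every non-null `member_name` still corresponds to a row in `person`.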
For example, consider a database with two tables: a CUSTOMER table that includes all customer data and an ORDER table that includes all customer orders. Suppose the business requires that each order must refer to a single customer. To reflect this in the database, a foreign key column is added to the ORDER table (e.g., CUSTOMERID), which references the primary key of CUSTOMER (e.g. ID). Because the primary key of a table must be unique, and because CUSTOMERID only contains values from that primary key field, we may assume that, when it has a value, CUSTOMERID will identify the particular customer which placed the order. However, this can no longer be assumed if the ORDER table is not kept up to date when rows of the CUSTOMER table are deleted or the ID column altered, and working with these tables may become more difficult. Many real world databases work around this problem by 'inactivating' rather than physically deleting master table foreign keys, or by complex update programs that modify all references to a foreign key when a change is needed. Foreign keys play an essential role in database design. One important part of database design is making sure that relationships between real-world entities are reflected in the database by references, using foreign keys to refer from one table to another. Another important part of database design is database normalization, in which tables are broken apart and foreign keys make it possible for them to be reconstructed. Multiple rows in the referencing (or child) table may refer to the same row in the referenced (or parent) table. In this case, the relationship between the two tables is called a one to many relationship between the referencing table and the referenced table. In addition, the child and parent table may, in fact, be the same table, i.e. the foreign key refers back to the same table. Such a foreign key is known in SQL:2003 as a self-referencing or recursive foreign key. 
In database management systems, a self-referencing foreign key is often accomplished by linking a first and second reference to the same table. A table may have multiple foreign keys, and each foreign key can have a different parent table. Each foreign key is enforced independently by the database system. Therefore, cascading relationships between tables can be established using foreign keys. A foreign key is defined as an attribute or set of attributes in a relation whose values match a primary key in another relation. SQL:2003 defines the syntax for adding such a constraint to an existing table: an ALTER TABLE statement with an ADD FOREIGN KEY clause that lists the referencing columns, followed by a REFERENCES clause that names the referenced table and, optionally, its columns. Omitting the column list in the REFERENCES clause implies that the foreign key shall reference the primary key of the referenced table. Likewise, foreign keys can be defined as part of the CREATE TABLE SQL statement. If the foreign key is a single column only, the column can be marked as such by attaching a REFERENCES clause directly to the column definition. Foreign keys can also be defined with a stored procedure statement, whose arguments are as follows. child_table: the name of the table or view that contains the foreign key to be defined. parent_table: the name of the table or view that has the primary key to which the foreign key applies. The primary key must already be defined. col3 and col4: the names of the columns that make up the foreign key. The foreign key must have at least one column and at most eight columns. == Referential actions == Because the database management system enforces referential constraints, it must ensure data integrity if rows in a referenced table are to be deleted (or updated). If dependent rows in referencing tables still exist, those references have to be considered. 
SQL:2003 specifies five different referential actions that shall take place in such occurrences: CASCADE, RESTRICT, NO ACTION, SET NULL, and SET DEFAULT. === CASCADE === Whenever rows in the parent (referenced) table are deleted (or updated), the respective rows of the child (referencing) table with a matching foreign key column will be deleted (or updated) as well. 
This is called a cascade delete (or update). === RESTRICT === A value cannot be updated or deleted when a row exists in a referencing or child table that references the value in the referenced table. Similarly, a row cannot be deleted as long as there is a reference to it from a referencing or child table. To understand RESTRICT (and CASCADE) better, it may be helpful to notice the following difference, which might not be immediately clear. The referential action CASCADE modifies the "behavior" of the (child) table itself where the word CASCADE is used. For example, ON DELETE CASCADE effectively says "When the referenced row is deleted from the other table (master table), then delete also from me". However, the referential action RESTRICT modifies the "behavior" of the master table, not the child table, although the word RESTRICT appears in the child table and not in the master table! So, ON DELETE RESTRICT effectively says: "When someone tries to delete the row from the other table (master table), prevent deletion from that other table (and of course, also don't delete from me, but that's not the main point here)." RESTRICT is not supported by Microsoft SQL 2012 and earlier. === NO ACTION === NO ACTION and RESTRICT are very much alike. The main difference between NO ACTION and RESTRICT is that with NO ACTION the referential integrity check is done after trying to alter the table. RESTRICT does the check before trying to execute the UPDATE or DELETE statement. Both referential actions act the same if the referential integrity check fails: the UPDATE or DELETE statement will result in an error. In other words, when an UPDATE or DELETE statement is executed on the referenced table using the referential action NO ACTION, the DBMS verifies at the end of the statement execution that none of the referential relationships are violated. This is different from RESTRICT, which assumes at the outset that the operation will violate the constraint. 
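The difference between CASCADE and RESTRICT described above can be observed in a small experiment. The following is a sketch only, using Python's built-in `sqlite3` module; SQLite's behaviour is an assumption standing in for "a DBMS" and other systems may differ. The table names `customer`, `orders_cascade` and `orders_restrict` are hypothetical, loosely following the CUSTOMER/ORDER example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for enforcement in SQLite

conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
# Child table whose rows are deleted together with the referenced customer.
conn.execute(
    "CREATE TABLE orders_cascade ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customer(id) ON DELETE CASCADE)"
)
# Child table whose rows block deletion of the referenced customer.
conn.execute(
    "CREATE TABLE orders_restrict ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customer(id) ON DELETE RESTRICT)"
)

conn.execute("INSERT INTO customer VALUES (1), (2)")
conn.execute("INSERT INTO orders_cascade VALUES (10, 1)")
conn.execute("INSERT INTO orders_restrict VALUES (20, 2)")

# ON DELETE CASCADE: removing customer 1 also removes order 10.
conn.execute("DELETE FROM customer WHERE id = 1")
cascade_rows = conn.execute("SELECT COUNT(*) FROM orders_cascade").fetchone()[0]

# ON DELETE RESTRICT: removing customer 2 fails while order 20 refers to it.
try:
    conn.execute("DELETE FROM customer WHERE id = 2")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True

remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(cascade_rows, restrict_blocked, remaining)
```

Note that both actions are declared on the child table, yet RESTRICT takes effect when a statement runs against the parent table, which is the asymmetry the text above points out.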
Using NO ACTION, the triggers or the semantics of the statement itself may yield an end state in which no foreign key relationships are violated by the time the constraint is finally checked, thus allowing the statement to complete successfully. === SET NULL, SET DEFAULT === In general, the action taken by the DBMS for SET NULL or SET DEFAULT is the same for both ON DELETE or ON UPDATE: the value of the affected referencing attributes is changed to NULL for SET NULL, and to the specified default value for SET DEFAULT. === Triggers === Referential actions are generally implemented as implied triggers (i.e. triggers with system-generated names, often hidden.) As such, they are subject to the same limitations as user-defined triggers, and their order of execution relative to other triggers may need to be considered; in some cases it may become necessary to replace the referential action with its equivalent user-defined trigger to ensure proper execution order, or to work around mutating-table limitations. Another important limitation appears with transaction isolation: your changes to a row may not be able to fully cascade because the row is ref

  • HtmlUnit

    HtmlUnit

    HtmlUnit is a headless web browser written in Java. It allows high-level manipulation of websites from other Java code, including filling and submitting forms and clicking hyperlinks. It also provides access to the structure and the details within received web pages. HtmlUnit emulates parts of browser behaviour including the lower-level aspects of TCP/IP and HTTP. A sequence such as getPage(url), getLinkWith("Click here"), click() allows a user to navigate through hypertext and obtain web pages that include HTML, JavaScript, Ajax and cookies. This headless browser can deal with HTTPS security, basic HTTP authentication, automatic page redirection and other HTTP headers. It allows Java test code to examine returned pages either as text, an XML DOM, or as collections of forms, tables, and links. The goal is to simulate real browsers, namely Chrome, Firefox and Edge. The most common use of HtmlUnit is test automation of web pages, but it can sometimes be used for web scraping or downloading website content. == Benefits == Provides a high-level API, taking lower-level details away from the user. Compared to other WebDriver implementations, HtmlUnitDriver is the fastest to implement. It can be configured to simulate a specific browser. == Drawbacks == Element layout and rendering cannot be tested. The JavaScript support is not complete, which is one of the areas of ongoing enhancements. == Used technologies == W3C DOM; HTTP connection, using Apache HttpComponents; JavaScript, using forked Rhino; HTML parsing, using NekoHTML; CSS, using CSS Parser; XPath support, using Xalan. == Libraries using HtmlUnit == Selenium WebDriver; Spring MVC Test Framework; Google Web Toolkit tests; WebTest; Wetator.

  • DiscoVision

    DiscoVision

    DiscoVision is the name of several things related to the video LaserDisc format. It was the original name of the "Reflective Optical Videodisc System" format later known as "LaserVision" or LaserDisc. == Description == MCA DiscoVision, Inc. was a division of entertainment giant MCA (Music Corporation of America), established in 1969 to develop and sell an optical videodisc system. MCA released discs pressed in Carson and Costa Mesa, California on the DiscoVision label from the format's Atlanta, Georgia launch in 1978 to 1982 and the release of the film The Four Seasons. DiscoVision titles included films from Universal Pictures, Paramount Pictures, Warner Bros. Pictures, and Disney content. Agreements were made with Columbia Pictures and United Artists, though no discs were released on the DiscoVision label from either studio. Most of these companies later established their own labels for the format, the first being Paramount with a dozen movies released on the Paramount Home Video label in the summer of 1981. The successor to MCA DiscoVision, DiscoVision Associates (DVA), was the result of a partnership between IBM and MCA. It was hoped that the merger would provide the basis for improvement of the quality of DiscoVision pressings, but no appreciable improvement ever took hold. In 1981, responsibility for the laser videodisc was sold to Pioneer Electronic Corporation, after MCA Discovision had previously started a partnership in 1977 with Pioneer, Universal Pioneer, to produce the Pioneer PR-7820 player (the first industrial model of DiscoVision player from 1978), as well as establishing disc pressing plants in Japan. As part of the partnership, Pioneer, in association with MCA, had a disc replication facility in Kofu, Japan that produced discs. Some of the last DiscoVision label discs were manufactured by Pioneer in Japan. 
In the same year, MCA discontinued its DiscoVision branding because of the sale of the technology to Pioneer (which then rebranded the format as LaserDisc), and rebranded its laserdisc releases, now fabricated by Pioneer, under the MCA Videodisc banner; this was later changed to the "MCA Home Video" name for both its VHS and videodisc releases. Some of DiscoVision's technical staff went on to form MCA Video Games, in an effort to produce video game cartridges. DiscoVision Associates later evolved into a patent holding company which manages and licenses intellectual property related to LaserDisc, Compact Disc, and optical disc technologies, as well as other non-disc related fields. In 1989, Pioneer acquired DiscoVision Associates, which continues to license its technologies independently. As the patents in its portfolio expired, DiscoVision became less visible. However, it demonstrated the success of the patent holding company model, which encouraged other companies to generate royalty income from their own patent portfolios.

  • Mortimer Rogoff

    Mortimer Rogoff

    Mortimer Alan Rogoff (May 2, 1921 – August 1, 2008) was an American inventor, businessman, and author as well as an amateur photographer and radio operator. He is recognized for his work in spread spectrum technology which is the technology that modern cell phones and GPS systems are based on. He is also considered the grandfather of the electronic navigation chart. == Early life == Rogoff was born in Brooklyn, New York. He earned his B.S.E.E. from Rensselaer Polytechnic Institute in 1943 and his M.S.E.E. from Columbia University in 1948. While at Rensselaer he was a member of Kappa Nu fraternity and the Features Editor for the student newspaper. During World War II, he enlisted in the United States Navy and worked on developing radio communication and aerial navigation systems. One of the techniques he developed was undetectable by Axis forces because its power was below that of the background noise and its frequency varied in random ways. This secure transmission was the beginning of spread spectrum technology which would become the basis for GPS and CDMA cellular telephone systems. Although he was never able to patent the technology because it was a military secret he did get some recognition for it almost forty years later when he received the Institute of Electrical and Electronics Engineers’ Pioneer Award in 1981. == Career == Rogoff worked for twenty-two years (1946 to 1968) for ITT Laboratories in New Jersey. In 1958, he became their deputy director of Engineering. He was Vice President of ITT Laboratories from 1962 to 1963. From 1963 to 1968, he was promoted to the corporate staff where he became head of European operations. In 1968 he left ITT to work for the Diebold Group where he became an Executive Vice President. After leaving the Diebold Group he founded several technology and automation businesses, including his own consulting firm, and Teletext Communications Corporation. Later in the 1970s, he was a Principal with Booz Allen Hamilton. 
In 1979, his book Calculator Navigation was published. This book demonstrated practical methods for calculating precise ship locations using radio navigation with a consumer calculator. In 1981, he founded a new company, Navigation Sciences Inc., in Bethesda, Maryland. With this company he patented a method for marine navigation that combined radar maps with electronic charts in 1986. This was a major advancement in the field. Today, this system is known as the Electronic Chart Display and Information System (ECDIS). Rogoff had seen the need for a new charting system in 1968 from his apartment at 180 East End Avenue in New York City. From there, he saw a boating accident where a life was lost and decided there had to be a way to automate navigation. Rogoff then became a member of the International Maritime Organization's (IMO) sub-committee on Safety of Navigation, a representative to the International Electrotechnical Commission, and the chairman of the Radio Technical Commission for Maritime Services Special Committee 109 on Electronic Charts. He was able to use his influence on these boards to push through a proposal of ECDIS standards in 1989, where none had existed before. As his friend Giuseppe Carnevali said, "Although nobody could argue against the need for a standard, no one was ready to endorse one; however, nobody was brave enough to oppose it." A Test Bed project on these proposals was conducted by the United States Coast Guard. The amended standards were accepted by the IMO in November 1995. In 2000, he was named a Fellow of the Institute of Navigation. He was also a Fellow of the Institute of Electrical and Electronics Engineers. During this time, he was also president of the Navigational Electronic Charts System Association. == Personal == In 1979, he moved to Washington, D.C. and bought a home in Nantucket, Massachusetts. He married Sheila Zunser in 1943 and they were together for sixty-five years. 
They had three daughters: Louisa Thompson, Alice Rogoff, and Julia Peach. His sister was sociologist Natalie Rogoff Ramsøy of the University of Oslo. He was a member of the Cosmos Club and President of The Navigational Electronic Chart System Association (NECSA). He was a very good amateur photographer and liked amateur radio (call sign W2EE). He died in Nantucket from bladder cancer. == Patents == Patent number: 4176316 – Secure Communication System – November 27, 1979 With Louis A. DeRosa Patent number: 4590569 – Electronic Navigation System – May 20, 1986 With Peter M. Winkler and John N. Ackley Patent number: RE34004 – Secure Communication System – July 21, 1992 With Louis A. DeRosa == Publications == Rogoff, Mortimer September 1957. Automatic Analysis of Infrared Spectra. Annals of the New York Academy of Sciences; vol. 69: no. 1: 27–37. Gen. P.C. Sandretto and Mortimer Rogoff. 1958 “A Novel Concept for Application to the Control of Airways Traffic.” NAVIGATION: Journal of The Institute of Navigation; vol. 6: no. 2: 102–107 Rogoff, Mortimer 1979. Calculator Navigation; ISBN 0-393-03192-6. Published by W.W. Norton & Company (New York and London). Rogoff, Mortimer December 1985. Electronic Charting. Yachting; vol. 158: no. 6: 54–57. Rogoff, Mortimer Winter 1990. Electronic Charts in the Nineties. NAVIGATION: Journal of The Institute of Navigation; vol. 37: no. 4: 305–318.

  • Language model benchmark

    Language model benchmark

    A language model benchmark is a standardized test designed to evaluate the performance of language models on various natural language processing tasks. These tests are intended for comparing different models' capabilities in areas such as language understanding, generation, and reasoning. Benchmarks generally consist of a dataset and corresponding evaluation metrics. The dataset provides text samples and annotations, while the metrics measure a model's performance on tasks like answering questions, text classification, and machine translation. These benchmarks are developed and maintained by academic institutions, research organizations, and industry players to track progress in the field. In addition to accuracy, the metrics can include throughput, energy efficiency, bias, trust, and sustainability. == Overview == === Types === Benchmarks may be described by the following adjectives, not mutually exclusive: Classical: These tasks are studied in natural language processing, even before the advent of deep learning. Examples include the Penn Treebank for testing syntactic and semantic parsing, as well as bilingual translation benchmarked by BLEU scores. Question answering: These tasks have a text question and a text answer, often multiple-choice. They can be open-book or closed-book. Open-book QA resembles reading comprehension questions, with relevant passages included as annotation in the question, in which the answer appears. Closed-book QA includes no relevant passages. Closed-book QA is also called open-domain question-answering. Before the era of large language models, open-book QA was more common, and understood as testing information retrieval methods. Closed-book QA became common since GPT-2 as a method to measure knowledge stored within model parameters. Omnibus: An omnibus benchmark combines many benchmarks, often previously published. It is intended as an all-in-one benchmarking solution. 
Reasoning: These tasks are usually in the question-answering format, but are intended to be more difficult than standard question answering. Multimodal: These tasks require processing not only text, but also other modalities, such as images and sound. Examples include OCR and transcription. Agency: These tasks are for a language-model–based software agent that operates a computer for a user, such as editing images, browsing the web, etc. Adversarial: A benchmark is "adversarial" if the items in the benchmark are picked specifically so that certain models do badly on them. Adversarial benchmarks are often constructed after state of the art (SOTA) models have saturated (achieved 100% performance) a benchmark, to renew the benchmark. A benchmark is "adversarial" only at a certain moment in time, since what is adversarial may cease to be adversarial as newer SOTA models appear. Public/Private: A benchmark might be partly or entirely private, meaning that some or all of the questions are not publicly available. The idea is that if a question is publicly available, then it might be used for training, which would be "training on the test set" and invalidate the result of the benchmark. Usually, only the guardians of the benchmark have access to the private subsets, and to score a model on such a benchmark, one must send the model weights, or provide API access, to the guardians. The boundary between a benchmark and a dataset is not sharp. Generally, a dataset contains three "splits": training, test, and validation. Both the test and validation splits are essentially benchmarks. In general, a benchmark is distinguished from a test/validation dataset in that a benchmark is typically intended to be used to measure the performance of many different models that are not trained specifically for doing well on the benchmark, while a test/validation set is intended to be used to measure the performance of models trained specifically on the corresponding training set. 
In other words, a benchmark may be thought of as a test/validation set without a corresponding training set. Conversely, certain benchmarks may be used as a training set, such as the English Gigaword or the One Billion Word Benchmark, which in modern language is just the negative log-likelihood loss on a pretraining set with 1 billion words. Indeed, the distinction between benchmark and dataset in language models became sharper after the rise of the pretraining paradigm, whereby a model is first trained on massive, unlabeled datasets to learn general language patterns, syntax, and knowledge (pretraining), and the base model is then adapted to specific, downstream tasks using smaller, labeled datasets (fine-tuning). === Lifecycle === Generally, the life cycle of a benchmark consists of the following steps: Inception: A benchmark is published. It can be simply given as a demonstration of the power of a new model (implicitly) that others then picked up as a benchmark, or as a benchmark that others are encouraged to use (explicitly). Growth: More papers and models use the benchmark, and the performance on the benchmark grows. Maturity, degeneration or deprecation: A benchmark may be saturated, after which researchers move on to other benchmarks. Progress on the benchmark may also be neglected as the field moves to focus on other benchmarks. Renewal: A saturated benchmark can be upgraded to make it no longer saturated, allowing further progress. === Construction === Like datasets, benchmarks are typically constructed by several methods, individually or in combination: Web scraping: Ready-made question-answer pairs may be scraped online, such as from websites that teach mathematics and programming. Conversion: Items may be constructed programmatically from scraped web content, such as by blanking out named entities from sentences, and asking the model to fill in the blank. This was used for making the CNN/Daily Mail Reading Comprehension Task. 
Crowd sourcing: Items may be constructed by paying people to write them, such as on Amazon Mechanical Turk. This was used for making the MCTest. === Evaluation === Generally, benchmarks are fully automated. This limits the questions that can be asked. For example, with mathematical questions, "proving a claim" would be difficult to automatically check, while "calculate an answer with a unique integer answer" would be automatically checkable. With programming tasks, the answer can generally be checked by running unit tests, with an upper limit on runtime. The benchmark scores are of the following kinds: For multiple choice or cloze questions, common scores are accuracy (frequency of correct answer), precision, recall, F1 score, etc. pass@n: The model is given $n$ attempts to solve each problem. If any attempt is correct, the model earns a point. The pass@n score is the model's average score over all problems. k@n: The model makes $n$ attempts to solve each problem, but only $k$ of them are selected for submission. If any submission is correct, the model earns a point. The k@n score is the model's average score over all problems. cons@n: The model is given $n$ attempts to solve each problem. If the most common answer is correct, the model earns a point. The cons@n score is the model's average score over all problems. Here "cons" stands for "consensus" or "majority voting". The pass@n score can be estimated more accurately by making $N > n$ attempts and using the unbiased estimator $1 - \frac{\binom{N-c}{n}}{\binom{N}{n}}$, where $c$ is the number of correct attempts. For less well-formed tasks, where the output can be any sentence, commonly used scores include BLEU, ROUGE, METEOR, NIST, word error rate, LEPOR, CIDEr, and SPICE. === Issues === error: Some benchmark answers may be wrong. 
ambiguity: Some benchmark questions may be ambiguously worded. subjective: Some benchmark questions may not have an objective answer at all. This problem generally prevents creative writing benchmarks. Similarly, this prevents benchmarking writing proofs in natural language, though benchmarking proofs in a formal language is possible. open-ended: Some benchmark questions may not have a single answer of a fixed size. This problem generally prevents programming benchmarks from using more natural tasks such as "write a program for X", so they instead use tasks such as "write a function that implements specification X". inter-annotator agreement: Some benchmark questions may not be fully objective, such that even people would not agree 100% on what the answer should be. This is common in natural language processing tasks, such as syntactic annotation. shortcut: Some benchmark questions may be easily solved by an "unintended" shortcut. For example, in the SNLI benchmark, having a negative word like "not" in the second sentence is a strong signal for the "Contradiction" category, regardless of what the se
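
The pass@n and cons@n scores described in the Evaluation section can be illustrated with a minimal Python sketch. The function names (`pass_at_n`, `cons_at_n`, `mean_score`) and the sample answers are hypothetical and do not come from any particular benchmark harness. The tie-breaking rule in `cons_at_n` (the first answer seen wins) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter
from math import comb


def pass_at_n(num_attempts: int, num_correct: int, n: int) -> float:
    """Unbiased pass@n estimate from N = num_attempts samples, c = num_correct.

    Computes 1 - C(N - c, n) / C(N, n): one minus the probability that a
    random size-n subset of the N attempts contains no correct attempt.
    """
    if not 0 <= num_correct <= num_attempts:
        raise ValueError("need 0 <= num_correct <= num_attempts")
    if not 1 <= n <= num_attempts:
        raise ValueError("need 1 <= n <= num_attempts")
    # math.comb returns 0 when fewer than n incorrect attempts exist,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(num_attempts - num_correct, n) / comb(num_attempts, n)


def cons_at_n(answers: list[str], reference: str) -> int:
    """Score 1 if the most common answer matches the reference, else 0.

    Assumption: ties are broken in favour of the answer seen first.
    """
    majority_answer, _count = Counter(answers).most_common(1)[0]
    return 1 if majority_answer == reference else 0


def mean_score(per_problem_scores: list[float]) -> float:
    """Average per-problem scores into a single benchmark score."""
    return sum(per_problem_scores) / len(per_problem_scores)


if __name__ == "__main__":
    # Hypothetical data: 10 attempts per problem with 3, 0 and 10 correct.
    estimates = [pass_at_n(10, c, n=2) for c in (3, 0, 10)]
    print("pass@2 per problem:", estimates)
    print("benchmark pass@2:", mean_score(estimates))
    print("cons@3:", cons_at_n(["42", "41", "42"], reference="42"))
```

Drawing more attempts than $n$ and applying the estimator reduces the variance of the score compared with running exactly $n$ attempts once per problem.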

    Read more →
  • International World Wide Web Conference Committee

    International World Wide Web Conference Committee

    The International World Wide Web Conference Committee (abbreviated as IW3C2, also written as IW³C²) is a professional non-profit organization registered in Switzerland (Article 60ff of the Swiss Civil Code) that promotes World Wide Web research and development. The IW3C2 organizes and hosts the annual World Wide Web Conference in conjunction with the W3C. The IW3C2 was founded by Joseph Hardin and Robert Cailliau at a meeting held in Boston, United States, on 14 August 1994 to prepare for the upcoming Second International World Wide Web Conference in Chicago. The IW3C2 formally became an incorporated entity in May 1996 at the fifth conference in Paris, France. The organization is governed by the laws of the Swiss Confederation and by its by-laws. == Abbreviation == The abbreviation IW3C2 is derived as follows: I: the leading I in International. W3: the three leading W's in World Wide Web. C2: the two leading C's in Conference Committee. == Mission == The mission of the IW3C2 is: To coordinate the organization and planning of the international WWW conference series and ensure that it remains the foremost conference addressing World Wide Web research and development; To promote a collaborative spirit among conference attendees that is essential to the success of the series; To ensure the global geographical diversity of conference sites and provide support to local organizers at those sites; To make sure that all content arising from these conferences and forums is permanently and openly available on the widest possible scale; To preserve the history of the conference series; To encourage the global development of the World Wide Web through collaboration with WWW standards organizations; To provide a permanent, broad-based international body to achieve these purposes. 
== Conferences == The conferences are organized by the IW3C2 in collaboration with local organizing committees and technical program committees. The series provides an open forum in which all opinions can be presented, subject to a strict process of peer review. The proceedings of the conference are published in the ACM Digital Library. === Endorsed conferences === The IW3C2 has endorsed regional conferences devoted to a special topic of the Web by working with endorsed conferences on cross-promotion, publicity and programs. == Membership == Members of the IW3C2 are ordinary members, ex officio members, non-voting members, and officers. === Ordinary members === Ordinary members are elected for a period of 3 years during a general meeting. Members are nominated due to their recognition in the WWW community and represent themselves. Members can be re-elected only after at least one year of absence. The following are the founding members at the time when IW3C2 was officially incorporated in May 1996: Jean-François Abramatic Tim Berners-Lee Robert Cailliau Dale Dougherty Ira Goldstein Joseph Hardin Tim Krauskopf Detlef Krömker Corinne Moore R. P. Channing Rodgers Albert Vezza Stuart Weibel Yuri Rubinsky (died prior to incorporation) The following are the current (April 2016) ordinary members: Robin Chen Chin-Wan Chung Allan Ellis Wendy Hall - IW3C2 Chair Ivan Herman Arun Iyengar - IW3C2 Vice Chair Irwin King Yoelle Maarek Luc Mariaux - IW3C2 Treasurer Daniel Schwabe - IW3C2 Vice-Chair === Ex officio members === Ex officio members are selected from the immediate past conference general co-chairs and from future conference co-chairs. Their term expires one year after the conference they organized. Ex officio members can be elected as ordinary members. 
The following are current (April 2016) ex officio members and the conference with which they are affiliated: Jacqueline Bourdeau - WWW2016 James Hendler - WWW2016 Rick Barrett - WWW2017 Rick Cummings - WWW2017 Laurent Flory - WWW2018 Fabien Gandon - WWW2018 === Officers === The IW3C2 officers consist of a chairperson, a vice-chair (chairperson-elect), a secretary, a treasurer, and other appointees. Officers are elected during a general meeting (usually at the annual WWW conference) and serve for one year. They can be re-elected an indefinite number of times. == The Seoul Test of Time Award == This annual award, presented at the WWW conference, is made possible by a generous contribution from the organizers of WWW2014 (Seoul, Korea). The award honors the author, or authors, of a paper presented at a previous WWW conference that has "stood the test of time"; recipients are determined by the IW3C2. The first award, announced at WWW2015 (Florence, Italy), recognized Sergey Brin and Larry Page, the founders of Google. The recipients of the WWW2016 award are LinkedIn scientist Dr. Badrul Sarwar and University of Minnesota professors George Karypis, Joseph Konstan, and John Riedl (posthumous) for their work in item-item collaborative filtering.

    Read more →
  • List of operating systems

    List of operating systems

    This is a list of operating systems. Computer operating systems can be categorized by technology, ownership, licensing, working state, usage, and by many other characteristics. In practice, many of these groupings may overlap. Criteria for inclusion is notability, as shown either through an existing Wikipedia article or citation to a reliable source. == Proprietary == === Acorn Computers === Arthur ARX MOS RISC iX RISC OS === Amazon === Fire OS === Amiga Inc. === AmigaOS AmigaOS 1.0-3.9 (Motorola 68000) AmigaOS 4 (PowerPC) Amiga Unix (a.k.a. Amix) === Amstrad === AMSDOS Contiki CP/M 2.2 CP/M Plus SymbOS === Apple === Apple II Apple DOS Apple Pascal ProDOS GS/OS GNO/ME Contiki Apple III Apple SOS Apple Lisa Mac Classic Mac OS A/UX (UNIX System V with BSD extensions) Copland MkLinux Pink Rhapsody macOS (formerly Mac OS X and OS X) macOS Server (formerly Mac OS X Server and OS X Server) Apple Network Server IBM AIX (Apple-customized) Apple MessagePad Newton OS iPhone and iPod Touch iOS (formerly iPhone OS) iPad iPadOS Apple Watch watchOS Apple TV tvOS Embedded operating systems bridgeOS Apple Vision Pro visionOS Embedded operating systems A/ROSE iPod software (unnamed embedded OS for iPod) Unnamed NetBSD variant for Airport Extreme and Time Capsule === Apollo Computer, Hewlett-Packard === Domain/OS – One of the first network-based systems. Run on Apollo/Domain hardware. Later bought by Hewlett-Packard. === Atari === Atari DOS (for 8-bit computers) Atari TOS Atari MultiTOS Contiki (for 8-bit, ST, Portfolio) === BAE Systems === XTS-400 === Be Inc. 
=== BeOS BeIA BeOS r5.1d0 magnussoft ZETA (based on BeOS r5.1d0 source code, developed by yellowTAB) === Bell Labs === Unix ("Ken's new system," for its creator (Ken Thompson), officially Unics and then Unix, the prototypic operating system created in Bell Labs in 1969 that formed the basis for the Unix family of operating systems) UNIX Time-Sharing System v1 UNIX Time-Sharing System v2 UNIX Time-Sharing System v3 UNIX Time-Sharing System v4 UNIX Time-Sharing System v5 UNIX Time-Sharing System v6 MINI-UNIX PWB/UNIX USG CB Unix UNIX Time-Sharing System v7 (It is from Version 7 Unix (and, to an extent, its descendants listed below) that almost all Unix-based and Unix-like operating systems descend.) Unix System III Unix System IV Unix System V Unix System V Releases 2.0, 3.0, 3.2, 4.0, and 4.2 UNIX Time-Sharing System v8 UNIX Time-Sharing System v9 UNIX Time-Sharing System v10 Non-Unix Operating Systems: BESYS Plan 9 from Bell Labs Inferno === Burroughs Corporation, Unisys === Burroughs MCP === CII === Siris 8 === Commodore International === GEOS AmigaOS AROS Research Operating System === Control Data Corporation === ==== Lower 3000 series ==== SCOPE (Supervisory Control Of Program Execution) ==== Upper 3000 series ==== SCOPE (Supervisory Control Of Program Execution) Drum SCOPE ==== 6x00 and related Cyber ==== Chippewa Operating System (COS) MACE (Mansfield and Cahlander Executive) Kronos (Kronographic OS) NOS (Network Operating System) NOS/VE (NOS Virtual Environment) SCOPE (Supervisory Control Of Program Execution) NOS/BE NOS Batch Environment SIPROS (Simultaneous Processing Operating System) ==== Star-100 ==== Multiple Console Time Sharing System (MCTS), from General Motors Research === CloudMosa === Puffin OS === Convergent Technologies === Convergent Technologies Operating System (CTOS) – later acquired by Unisys === Cromemco === Cromemco DOS (CDOS) – a Disk Operating system compatible with CP/M Cromix – a multitasking, multi-user, Unix-like OS for Cromemco 
microcomputers with Z80A and/or 68000 CPU === Data General === AOS for 16-bit Data General Eclipse computers and AOS/VS for 32-bit (MV series) Eclipses, MP/AOS for microNOVA-based computers DG/UX RDOS Real-time Disk Operating System, with variants: RTOS and DOS (not related to PC DOS, MS-DOS etc.) === Datapoint === CTOS Cassette Tape Operating System for the Datapoint 2200 DOS Disk Operating System for the Datapoint 2200, 5500, and 1100 === DDC-I, Inc. === Deos – Time & Space Partitioned RTOS, Certified to DO-178B, Level A since 1998 HeartOS – POSIX-based Hard Real-Time Operating System === Digital Research, Inc. === CP/M CP/M CP/M for Intel 8080/8085 and Zilog Z80 Personal CP/M, a refinement of CP/M CP/M Plus with BDOS 3.0 CP/M-68K CP/M for Motorola 68000 CP/M-8000 CP/M for Zilog Z8000 CP/M-86 CP/M for Intel 8088/8086 CP/M-86 Plus Personal CP/M-86 MP/M Multi-user version of CP/M-80 MP/M II MP/M-86 Multi-user version of CP/M-86 MP/M 8-16, a dual-processor variant of MP/M for 8086 and 8080 CPUs. Concurrent CP/M, the successor of CP/M-80 and MP/M-80 Concurrent CP/M-86, the successor of CP/M-86 and MP/M-86 Concurrent CP/M 8-16, a dual-processor variant of Concurrent CP/M for 8086 and 8080 CPUs. 
Concurrent CP/M-68K, a variant for the 68000 DOS Concurrent DOS, the successor of Concurrent CP/M-86 with PC-MODE Concurrent PC DOS, a Concurrent DOS variant for IBM compatible PCs Concurrent DOS 8-16, a dual-processor variant of Concurrent DOS for 8086 and 8080 CPUs Concurrent DOS 286 Concurrent DOS XM, a real-mode variant of Concurrent DOS with EEMS support Concurrent DOS 386 Concurrent DOS 386/MGE, a Concurrent DOS 386 variant with advanced graphics terminal capabilities Concurrent DOS 68K, a port of Concurrent DOS to Motorola 68000 CPUs with DOS source code portability capabilities FlexOS 1.0 – 2.34, a derivative of Concurrent DOS 286 FlexOS 186, a variant of FlexOS for terminals FlexOS 286, a variant of FlexOS for hosts Siemens S5-DOS/MT, an industrial control system based on FlexOS IBM 4680 OS, a POS operating system based on FlexOS IBM 4690 OS, a POS operating system based on FlexOS Toshiba 4690 OS, a POS operating system based on IBM 4690 OS and FlexOS FlexOS 386, a later variant of FlexOS for hosts IBM 4690 OS, a POS operating system based on FlexOS Toshiba 4690 OS, a POS operating system based on IBM 4690 OS and FlexOS FlexOS 68K, a derivative of Concurrent DOS 68K Multiuser DOS, the successor of Concurrent DOS 386 CCI Multiuser DOS Datapac Multiuser DOS Datapac System Manager, a derivative of Datapac Multiuser DOS IMS Multiuser DOS IMS REAL/32, a derivative of Multiuser DOS IMS REAL/NG, the successor of REAL/32 DOS Plus 1.1 – 2.1, a single-user, multi-tasking system derived from Concurrent DOS 4.1 – 5.0 DR-DOS 3.31 – 6.0, a single-user, single-tasking native DOS derived from Concurrent DOS 6.0 Novell PalmDOS 1.0 Novell "Star Trek" Novell DOS 7, a single-user, multi-tasking system derived from DR DOS Caldera OpenDOS 7.01 Caldera DR-DOS 7.02 and higher === Digital Equipment Corporation, Compaq, Hewlett-Packard, Hewlett Packard Enterprise === Batch-11/DOS-11 OS/8 RSTS/E – multi-user time-sharing OS for PDP-11s RSX-11 – multiuser, multitasking OS for PDP-11s 
RT-11 – single user OS for PDP-11 TOPS-10 – for the PDP-10 TENEX – an ancestor of TOPS-20 from BBN, for the PDP-10 TOPS-20 – for the PDP-10 DEC MICA – for the DEC PRISM Digital UNIX – derived from OSF/1, became HP's Tru64 UNIX Ultrix VMS – originally by DEC (now by VMS Software Inc.) for the VAX mini-computer range; later renamed OpenVMS and ported to Alpha, and subsequently ported to Intel Itanium and then to x86-64 WAITS – for the PDP-6 and PDP-10 === ENEA AB === OSE – Flexible, small footprint, high-performance RTOS for control processors === Fujitsu === Towns OS XSP OS/IV MSP MSP-EX === GEC Computers === COS DOS OS4000 === General Electric, Honeywell, Bull === Real-Time Multiprogramming Operating System GCOS Multics === Google === ChromiumOS is an open source operating system development version of ChromeOS. Both operating systems are based on the Linux kernel. ChromeOS is designed to work exclusively with web applications, though has been updated to run Android apps with full support for Google Play Store. Announced on July 7, 2009, ChromeOS is currently publicly available and was released summer 2011. The ChromeOS source code was released on November 19, 2009, under the BSD license as ChromiumOS. Container-Optimized OS (COS) is an operating system that is optimized for running Docker containers, based on ChromiumOS. Android is an operating system for mobile devices. It consists of Android Runtime (userland) with Linux (kernel), with its Linux kernel modified to add drivers for mobile device hardware and to remove unused Vanilla Linux drivers. gLinux, a Linux distribution that Google uses internally Fuchsia is a capability-based real-time operating system (RTOS) scalable to universal devices, in early development, from the tiniest embedded hardware, wristwatches, tablets to the largest personal computers. Unlike ChromeOS and Android, it is not based on the Linux kernel, but instead began on a new microkernel called "Zircon", derived from "Little Kernel". 
Wear OS is a version of Google's Android operating system designed for smartwatches and other wearables. === Green Hills Software === INTEGRITY – Reliable Operating system INTEGRITY-178B – A DO-178B certified version of INTEGRITY. μ-

    Read more →