HFST

Helsinki Finite-State Technology (HFST) is a computer programming library and set of utilities for natural language processing with finite-state automata and finite-state transducers. It is free and open-source software, released under a mix of the GNU General Public License version 3 (GPLv3) and the Apache License.

== Features ==

The library functions as a common interface to multiple backends, such as OpenFST, foma and SFST. The utilities comprise various compilers, such as hfst-twolc (a compiler for morphological two-level rules), hfst-lexc (a compiler for lexicon definitions) and hfst-regexp2fst (a regular expression compiler). Functions from Xerox's proprietary scripting language xfst are duplicated in hfst-xfst, and the pattern matching utility pmatch in hfst-pmatch, which goes beyond the finite-state formalism in having recursive transition networks (RTNs).

The library and utilities are written in C++, with an interface to the library in Python and a utility for looking up results from transducers ported to Java and Python.

Transducers in HFST may incorporate weights depending on the backend. When performing FST operations, weights are currently supported only via the OpenFST backend. HFST provides two native backends, one designed for fast lookup (hfst-optimized-lookup), the other for format interchange. Both of them can be weighted.

== Uses ==

HFST has been used for writing various linguistic tools, such as spell-checkers, hyphenators, and morphologies. Morphological dictionaries written in other formalisms have also been converted to HFST's formats.

Roposo

Roposo is an Indian video-sharing social media service, owned by Glance, a subsidiary of InMobi. Roposo provides a space where users can share posts related to different topics like food, comedy, music, poetry, fashion and travel. It is a platform where people express themselves visually with homemade videos and photos. The app offers a TV-like browsing experience with user-generated content on its channels. Users can also edit their content with tools on the platform before uploading it.

== History ==

Established in July 2014 under Relevant E-solutions Pvt. Ltd., Roposo is the brainchild of three IIT Delhi alumni: Mayank Bhangadia, Avinash Saxena, and Kaushal Shubhank. Under Bhangadia's leadership, the company pivoted from a fashion-based network into a short-form video platform with AI-powered moderation, and its journey was featured as a Harvard Business Publishing case study. In November 2019, Roposo was acquired by InMobi's Glance Digital Experience Pvt. Ltd. (the mobile content platform and part of the InMobi Group). When the Chinese-owned video-sharing app TikTok was banned on 30 June 2020, the app saw a large spike in users as TikTok users registered on Roposo.

== Technology ==

The open platform's features include TV-like browsing, different channels, a chat feature that lets buyers and sellers converse directly through the platform, and creation tools such as options to add voice-over, music and GIF stickers to videos and photos.

Höhere Graphische Bundes-Lehr- und Versuchsanstalt

The Höhere Graphische Bundes-Lehr- und Versuchsanstalt (HGBLuVA) ("Higher Federal Institution for Graphic Education and Research"), now commonly known as "die Graphische", founded in 1888 in Vienna, is a vocational college for professions in visual communication and media technology in Austria.

== History ==

=== Opening ===

Originally set up as a photographic research institute by the President of the Photographic Society, the graphic teaching and research institute (GLV) was created through the incorporation of the photographic school (a department for photographic reproduction processes connected to the Salzburg State Building School) and the Hörwarter general drawing school in Vienna. Since its foundation, it has made an important contribution to the establishment and development of the graphic professions. According to a resolution of March 14, 1887, the City Council of Vienna made three floors of the municipal building in Vienna VII, Westbahnstraße 25, available to the former Schottenfelder Realschule for the establishment of a teaching and research institute for photography and reproduction processes.

The k. k. Lehr- und Versuchsanstalt für Photographie und Reproductionsverfahren was founded and directed (1888–1923) by Josef Maria Eder. He had previously been at the Technologische Gewerbemuseum (Museum of Applied Technology), for which he established a Section for Photography and Reproduction Techniques, and at the Vienna State Trade School, where, recently qualified as a university lecturer, he began teaching chemistry and physics in 1881. The institute opened on March 1, 1888 with 108 students. In the next school year the number of students rose to 174. In 1890, Eder placed a Wothly solar camera (an early means of enlarging negatives) on the roof.
In the context of the history of vocational schools and the applied arts, pioneering educational reforms in Austria from the 1870s created institutions of this kind outside the format of the classical university; the institute was a special variation on the “state trade school” (“Staats-Gewerbeschule”). Eder based his institution on earlier foreign models such as the Conservatoire des arts et métiers in Paris (founded 1794), which housed a museum of history and technology and hosted evening lectures and demonstrations, with lectures in photography commencing in 1891. From 1897 onwards the name Graphische Lehr- und Versuchsanstalt came into use. In 1906, Emperor Franz Joseph granted the school the designation “Imperial and Royal” in the title, and the Republic of Austria confirmed this distinction when the Federal Chancellery approved the school's use of the national coat of arms.

=== The beginnings ===

The GLV was instituted on August 27, 1887 "by the highest resolution to approve the activation of this teaching and research institute in Vienna on March 1, 1888". The aim of the institute was the “training of specialist photographers, retouchers, collotype printers, photolithographers, etc., the instruction of artists, scholars and technicians who want to learn photography as an auxiliary science, furthermore the testing of equipment, chemicals and the implementation of independent scientific investigations in the areas of Photochemistry and Related Subjects”. The school consisted of two departments: the Institute for Photography and Reproduction Processes and the Research Institute. In 1891 the Board of Book Printers and Type Founders pointed out the urgent need to add a department for book printers to the school.
In 1897 an additional section for the book and illustration trade was opened. The school, now called "KK Graphische Lehr- und Versuchsanstalt", was then divided into four sections:

Section I: Institute for Photography and Reproduction (corresponds to the former Institute for Photography and Reproduction Processes)
Section II: College for the book and illustration trade
Section III: Research institute for photochemistry and graphic printing processes (corresponds to the original research institute)
Section IV: Collections: graphic collection, library and equipment collection

The first original lithographs by famous artists such as Luigi Kasimir and Tina Blau owe their existence to the special course for lithography introduced in 1905. 'Algraphy', a planographic printing process that prints from an aluminum plate instead of the stone used in lithography, was first taught in Austria in 1896 at the GLV. The special course for lithography existed until 1913/14, after which a specialist course for xylography (wood engraving and woodcuts) was offered. In 1908 the graphic arts department was set up on the top floor of the neighbouring house at Westbahnstraße 27, connected by a spiral staircase that still exists in the courtyard at the current location on Leyserstraße.

=== Women in the graphic teaching and research institute ===

From 1908 women were also officially admitted. For the period from 1888 to 1918/19, a total of 718 female students at the Graphische are recorded in the largely preserved class lists. Due to changes and new requirements in the job description, the proportion of women continued to grow, so that in some classes it exceeded two thirds.

=== The Graphics Department ===

In 1916, the school statute was changed: all-day lessons with a photography internship in the 1st and 2nd years as well as training for disabled people were introduced, and a drawing school was added.
After the First World War, the school was renamed several times: in 1919 the name was "Deutsch-Österreichische Graphische Lehr- und Versuchsanstalt", changed in 1920 to "Staatliche Graphische Lehr- und Versuchsanstalt" and in 1923 to "Graphic Education and Research Institute".

=== The school in the time of National Socialism ===

The "annexation of Austria by Germany" resulted in organisational restructuring: semesters were introduced and the GLV was subordinated to a university of the graphic arts administered in Leipzig. In 1939 the school became a state graphic teaching and research institute. Up to this point, two thirds of all Austrian postage stamps had been designed and engraved at the Graphische.

=== Post-war period ===

In 1945 the period of study at the technical school was extended to four years. In 1948, “manual graphics” became “commercial graphics” followed by an honours year. In 1959, two departments were established: department A, a three-class specialist department for photography with a master class, and department B, a specialist department for commercial graphics with four classes and an honours year. Through further school reforms, the university entrance qualification was acquired with the completion of the now five-year course and honours qualification. In 1967, due to a lack of space, the school moved from Westbahnstrasse to the new Carl Appel building in Leyserstrasse.

=== The new building, 1963 ===

On May 22, 1963, the foundation stone of the new campus was laid in the 14th district in the Breitenseer Strasse, Leyserstrasse and Spallartgasse area (Kommandogebäude Theodor Körner). In 1967 the move to the new building began and in 1968 the official opening coincided with the 80th anniversary of the school. In 1963/64 the first year of the five-year high school for reprography and printing technology began. There was also a four-year technical school.
With the advent of personal computers and their use in the graphics industry, change came first in typesetting and later in image processing, and in 1984 the advent of desktop publishing brought a revolution that permanently challenged the distinction between photographer, typesetter, layout artist and printer. In 1988, the Graphische celebrated its 100th anniversary. The rapid development of technology shaped school events in the 1980s, as did the rapid advance of offset printing, albeit at the expense of letterpress printing. In reproduction technology, scanner technology for the production of colour separations displaced reprography.

=== Renovation, 2006 ===

Due to renovation work on the building in Leyserstraße, the management and the photography, multimedia and graphics departments moved to an alternative location in Vienna's first district at Schellinggasse 13. After the work was completed, the school returned in February 2008.

== Notable teachers and students ==

Content creation

Content creation is the act of making and sharing media content, particularly in digital contexts. A content creator is the person or studio behind such content. According to Dictionary.com, content refers to "something that is to be expressed through some medium, as speech, writing or any of various arts" for self-expression, distribution, marketing and/or publication. Content creation encompasses various activities, including maintaining and updating web sites, blogging, article writing, photography, videography, online commentary, managing social media accounts, and editing and distributing digital media. In a survey conducted by the Pew Research Center, the content thus created was defined as "the material people contribute to the online world". Content creation on digital platforms also faces growing challenges related to privacy, copyright, misinformation, platform moderation policies, and the repercussions of violating community guidelines.

== Content creators ==

Content creation is the process of producing and sharing various forms of content such as text, images, audio, and video, designed to engage and inform a specific audience. It plays a crucial role in digital marketing, branding, and online communication. Content can be created for a range of platforms, including social media, websites, blogs, and multimedia channels. Whether through written articles, photography, or videos, content creation helps businesses build a connection with their audience, increase visibility, and drive traffic. The process typically involves identifying the target audience, brainstorming ideas, creating the content, and distributing it across various channels. Successful content creation combines creativity with strategic planning, considering audience preferences, trends, and platform characteristics to achieve marketing and branding goals.
=== News organizations ===

News organizations, especially those with a large and global reach like The New York Times, NPR, and CNN, consistently create some of the most shared content on the Web, especially in relation to current events. In the words of a 2011 report from the Oxford School for the Study of Journalism and the Reuters Institute for the Study of Journalism, "Mainstream media is the lifeblood of topical social media conversations in the UK." While the rise of digital media has disrupted traditional news outlets, many have adapted and have begun to produce content that is designed to function on the web and be shared on social media. The social media site Twitter is a major distributor and aggregator of breaking news from various sources, and the function and value of Twitter in the distribution of news is a frequent topic of discussion and research in journalism. User-generated content, social media blogging and citizen journalism have changed the nature of news content in recent years. The company Narrative Science is now using artificial intelligence to produce news articles and interpret data.

=== Colleges, universities, and think tanks ===

Academic institutions, such as colleges and universities, create content in the form of books, journal articles, white papers, and some forms of digital scholarship, such as blogs that are group edited by academics, class wikis, or video lectures that support a massive open online course (MOOC). Through an open data initiative, institutions may make raw data supporting their experiments or conclusions available on the Web. Academic content may be gathered and made accessible to other academics or the public through publications, databases, libraries, and digital libraries. Academic content may be closed source or open access (OA). Closed-source content is only available to authorized users or subscribers.
For example, an important journal or a scholarly database may be closed source, available only to students and faculty through the institution's library. Open-access articles are open to the public, with the publication and distribution costs shouldered by the institution publishing the content.

=== Companies ===

Corporate content includes advertising and public relations content, as well as other types of content produced for profit, including white papers and sponsored research. Advertising can also include auto-generated content, with blocks of content generated by programs or bots for search engine optimization. Companies also create annual reports, which describe the company's workings and review its financial year in detail. These reports give the company's stakeholders insight into its current and future prospects and direction.

=== Artists and writers ===

Cultural works, like music, movies, literature, and art, are also major forms of content. Examples include traditionally published books and e-books as well as self-published books, digital art, fanfiction, and fan art. Independent artists, including authors and musicians, have found commercial success by making their work available on the Internet.

=== Government ===

Through digitization, sunshine laws, open records laws and data collection, governments may make statistical, legal or regulatory information available on the Internet. National libraries and state archives turn historical documents, public records, and unique relics into online databases and exhibits. This has raised significant privacy issues. In 2012, The Journal News, a New York state paper, sparked an outcry when it published an interactive map of the state's gun owner locations using legally obtained public records. Governments also create online or digital propaganda or misinformation to support domestic and international goals.
This can include astroturfing, or using media to create a false impression of mainstream belief or opinion. Governments can also use open content, such as public records and open data, in service of public health, educational and scientific goals, such as crowdsourcing solutions to complex policy problems. In 2013, the National Aeronautics and Space Administration (NASA) joined the asteroid mining company Planetary Resources to crowdsource the hunt for near-Earth objects. Describing NASA's crowdsourcing work in an interview, technology transfer executive David Locke spoke of the "untapped cognitive surplus that exists in the world" which could be used to help develop NASA technology. In addition to making governments more participatory, open records and open data have the potential to make governments more transparent and less corrupt.

=== Users ===

The introduction of Web 2.0 made it possible for content consumers to be more involved in the generation and sharing of content. With the advent of digital media, the amount of user-generated content, as well as the age and class range of users, has increased. 8% of Internet users are very active in content creation and consumption. Worldwide, about one in four Internet users are significant content creators, and users in emerging markets lead the world in engagement. Research has also found that young adults of a higher socioeconomic background tend to create more content than those from lower socioeconomic backgrounds. 69% of American and European internet users are "spectators", who consume, but do not create, online and digital media. The ratio of content creators to the amount of content they generate is sometimes referred to as the 1% rule, a rule of thumb that suggests that only 1% of a forum's users create nearly all of its content. Motivations for creating new content may include the desire to gain new knowledge, the possibility of publicity, or simple altruism.
Users may also create new content in order to bring about social reforms. However, researchers caution that in order to be effective, context must be considered, a diverse array of people must be included, and all users must participate throughout the process. According to a 2011 study, minorities create content in order to connect with their communities online. African-American users have been found to create content as a means of self-expression that was not previously available. Media portrayals of minorities are sometimes inaccurate and stereotypical, which affects the general perception of these minorities. African-Americans respond to their portrayals digitally through the use of social media such as Twitter and Tumblr. The creation of Black Twitter has allowed a community to share their problems and ideas.

==== Teens ====

Younger users now have greater access to content, content-creating applications, and the ability to publish to different types of media, such as Facebook, Blogger, Instagram, DeviantArt, or Tumblr. As of 2005, around 21 million teens used the internet and 57%, or 12 million teens, considered themselves content creators. This proportion of media creation and sharing is higher than that of adults. With the advent of the Internet, teens have had more access to tools for sharing an

Virtual advertising

Virtual advertising is the use of digital technology to insert virtual advertisements into a live or pre-recorded television show, often during sports events. This technique is often used to allow broadcasters to overlay existing physical advertising panels inside the sports venue with virtual content on the screen when broadcasting the same event in multiple regions; for example, a Spanish football game can be broadcast in Mexico with Mexican advertisements. Similarly, virtual content can be inserted onto empty space within the sports venue such as the pitch, where physical advertising cannot be placed due to regulatory or safety reasons. Virtual advertising content is intended to be photorealistic, so that the viewer has the impression they are seeing the real in-stadium advertising.

== History ==

Throughout the 1980s, 1990s, and 2000s, advertising on television and in newspapers was a popular method of spreading information. The marketer Jeremiah Lynwood stated that "Thirty years ago, [U.S.] consumers viewed an average of 560 ads per day", mostly from newspapers, television shows, gasoline pumps, and so on. Lynwood also stated that, at the time, "American consumers may be exposed to 3,000 commercial messages every day". Over that period, daily exposure to advertisements supported many local and large businesses. With the arrival of the 2000s and 2010s, technological advances created new opportunities for many businesses to grow.

In the 21st century, virtual advertising has been used to create virtual product placements in television shows hours, days, or years after they have been produced. Advertisements can be targeted to regional markets and updated over time to make the most efficient use of advertising money. Sports broadcasting is a prominent example: virtual advertising keeps an ad in position relative to the field of play, regardless of camera motion and of players moving over the logos.
Recently, the NHL has virtually inserted sponsors on the glass above the physical boards in NHL arenas. Large brands whose main goal is to build global brand awareness are unlikely to spend time or money on targeting a single region. Digital signage allows these brands to purchase signage in a stadium during nationally televised games instead. Their reach grows further through online outlets such as Twitter, Facebook, and Amazon. Local businesses, on the other hand, buy signage for smaller games, where it is much more affordable and still reaches a large number of people. Virtual advertising may even make live attendance more attractive to sports fans, because the technology allows the playing field and surrounding areas to be cleared of advertisements while television viewers at home are exposed to commercials.

== Technology ==

Virtual insertion systems often rely on automated processes such as automatic detection of playfield limits, automatic detection of cuts, recognition of the playfield surface, and recognition of existing logos for logo replacement. An operator is usually dedicated to the visual control of the effect, but newer systems allow the instant replay operator to take on this role.

== Examples ==

=== Live events ===

Virtual advertisements can be integrated into live television in real time. For example, Fox Sports Net places a virtual advertisement on the glass behind the goaltender that can only be seen on television. The advertising in the playfields is the property of the club, except in some professional sports where the league or federation owns the advertising rights.
However, the rights to advertising shown on screen belong to the broadcaster or TV channel. This means that secondary rights holders can benefit from selling this virtual advertising. The number of TV viewers is also higher than the number of people in the stadium, generating more visibility for the advertised brands and more income for the broadcasters.

Virtual advertising was first introduced in football during the 2015 Audi Cup at the Allianz Arena in Munich. AIM Sport implemented the technology to digitally overlay advertisements on the stadium's perimeter boards, allowing different sponsors to be displayed to viewers in different broadcast regions.

In Formula One, virtual ads are placed on the grass or as virtual billboards.

In baseball, Major League Baseball places virtual advertisements on a backboard behind the batter, which can be targeted differently in local markets or countries. MLB's international broadcasts of the World Series feature different advertisements on a per-market basis, showing a different ad in the US, Canadian, Latin American and Japanese markets.

In tennis, for example during the 2019 ATP Finals in London's O2 Arena, certain logos in the background were replaced for various country feeds. In table tennis, for example during the ITTF World Tour Australian Open 2019, virtual advertising overlays were used by uniqFEED AG in Switzerland.

Since the 2022–23 season, the National Hockey League (NHL) has used digitally enhanced dasherboards (DED) to erase and replace ads on each arena's boards with up to 120 thirty-second segments on all or part of the rink. Each broadcaster can use a different set of ads. DED were first used at the 2016 World Cup of Hockey, which was organized by the NHL.

At UEFA Euro 2024, AIM Sport provided virtual advertising for all matches, marking one of the largest implementations of the technology in an international tournament.
Virtual advertising was also used in the participating teams' domestic matches, extending region-specific advertising beyond the tournament itself.

Transduction (machine learning)

In logic, statistical inference, and supervised learning, transduction or transductive inference is reasoning from observed, specific (training) cases to specific (test) cases. In contrast, induction is reasoning from observed training cases to general rules, which are then applied to the test cases. The distinction is most interesting in cases where the predictions of the transductive model are not achievable by any inductive model. Note that this is caused by transductive inference on different test sets producing mutually inconsistent predictions.

Transduction was introduced in a computer science context by Vladimir Vapnik in the 1990s, motivated by his view that transduction is preferable to induction since, according to him, induction requires solving a more general problem (inferring a function) before solving a more specific problem (computing outputs for new cases): "When solving a problem of interest, do not solve a more general problem as an intermediate step. Try to get the answer that you really need but not a more general one."

An example of learning which is not inductive would be in the case of binary classification, where the inputs tend to cluster in two groups. A large set of test inputs may help in finding the clusters, thus providing useful information about the classification labels. The same predictions would not be obtainable from a model which induces a function based only on the training cases. Some people may call this an example of the closely related semi-supervised learning, since Vapnik's motivation is quite different. The best-known example of a case-based learning algorithm is the k-nearest neighbor algorithm, which is related to transductive learning algorithms. Another example of an algorithm in this category is the Transductive Support Vector Machine (TSVM).

A third possible motivation of transduction arises through the need to approximate.
If exact inference is computationally prohibitive, one may at least try to make sure that the approximations are good at the test inputs. In this case, the test inputs could come from an arbitrary distribution (not necessarily related to the distribution of the training inputs), which wouldn't be allowed in semi-supervised learning. An example of an algorithm falling in this category is the Bayesian Committee Machine (BCM).

== Historical context ==

The mode of inference from particulars to particulars, which Vapnik came to call transduction, was already distinguished from the mode of inference from particulars to generalizations in part III of the Cambridge philosopher and logician W.E. Johnson's 1924 textbook, Logic. In Johnson's work, the former mode was called 'eduction' and the latter was called 'induction'. Bruno de Finetti developed a purely subjective form of Bayesianism in which claims about objective chances could be translated into empirically respectable claims about subjective credences with respect to observables through exchangeability properties. An early statement of this view can be found in his 1937 La Prévision: ses Lois Logiques, ses Sources Subjectives and a mature statement in his 1970 Theory of Probability. Within de Finetti's subjective Bayesian framework, all inductive inference is ultimately inference from particulars to particulars.

== Example problem ==

The following example problem contrasts some of the unique properties of transduction against induction. A collection of points is given, such that some of the points are labeled (A, B, or C), but most of the points are unlabeled (?). The goal is to predict appropriate labels for all of the unlabeled points. The inductive approach to solving this problem is to use the labeled points to train a supervised learning algorithm, and then have it predict labels for all of the unlabeled points.
With this problem, however, the supervised learning algorithm will only have five labeled points to use as a basis for building a predictive model. It will certainly struggle to build a model that captures the structure of this data. For example, if a nearest-neighbor algorithm is used, then the points near the middle will be labeled "A" or "C", even though it is apparent that they belong to the same cluster as the point labeled "B".

Transduction has the advantage of being able to consider all of the points, not just the labeled points, while performing the labeling task. In this case, transductive algorithms would label the unlabeled points according to the clusters to which they naturally belong. The points in the middle, therefore, would most likely be labeled "B", because they are packed very close to that cluster.

An advantage of transduction is that it may be able to make better predictions with fewer labeled points, because it uses the natural breaks found in the unlabeled points. One disadvantage of transduction is that it builds no predictive model. If a previously unknown point is added to the set, the entire transductive algorithm would need to be repeated with all of the points in order to predict a label. This can be computationally expensive if the data is made available incrementally in a stream. Further, this might cause the predictions of some of the old points to change (which may be good or bad, depending on the application). A supervised learning algorithm, on the other hand, can label new points instantly, with very little computational cost.

== Transduction algorithms ==

Transduction algorithms can be broadly divided into two categories: those that seek to assign discrete labels to unlabeled points, and those that seek to regress continuous labels for unlabeled points. Algorithms that seek to predict discrete labels tend to be derived by adding partial supervision to a clustering algorithm.
Two classes of algorithms can be used: flat clustering and hierarchical clustering. The latter can be further subdivided into two categories: those that cluster by partitioning, and those that cluster by agglomerating. Algorithms that seek to predict continuous labels tend to be derived by adding partial supervision to a manifold learning algorithm. === Partitioning transduction === Partitioning transduction can be thought of as top-down transduction. It is a semi-supervised extension of partition-based clustering. It is typically performed as follows:

 Consider the set of all points to be one large partition.
 While any partition P contains two points with conflicting labels:
   Partition P into smaller partitions.
 For each partition P:
   Assign the same label to all of the points in P.

Of course, any reasonable partitioning technique could be used with this algorithm. Max flow min cut partitioning schemes are very popular for this purpose. === Agglomerative transduction === Agglomerative transduction can be thought of as bottom-up transduction. It is a semi-supervised extension of agglomerative clustering. It is typically performed as follows:

 Compute the pair-wise distances, D, between all the points.
 Sort D in ascending order.
 Consider each point to be a cluster of size 1.
 For each pair of points {a,b} in D:
   If (a is unlabeled) or (b is unlabeled) or (a and b have the same label):
     Merge the two clusters that contain a and b.
     Label all points in the merged cluster with the same label.

=== Continuous Label Transduction === These methods seek to regress continuous labels, often via manifold learning techniques. The idea is to learn a low-dimensional representation of the data and infer values smoothly across the manifold. == Applications and related concepts == Transduction is closely related to:

 Semi-supervised learning – uses both labeled and unlabeled data but typically induces a model.
 Case-based reasoning – such as the k-nearest neighbor (k-NN) algorithm, often considered a transductive method.
 Transductive Support Vector Machines (TSVM) – extend standard SVMs to incorporate unlabeled test data during training.
 Bayesian Committee Machine (BCM) – an approximation method that makes transductive predictions when exact inference is too costly.
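
The agglomerative transduction procedure described earlier can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not a reference implementation: the function name `agglomerative_transduction`, the use of Euclidean distance, and the union-find bookkeeping for clusters are all choices made for this example, and `None` is assumed to mark an unlabeled point.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+); an assumed metric


def agglomerative_transduction(points, labels):
    """Propagate labels bottom-up by merging the closest compatible clusters.

    points: list of coordinate tuples.
    labels: list of the same length; None marks an unlabeled point.
    Returns a list with one (possibly still None) label per point.
    """
    n = len(points)
    parent = list(range(n))        # union-find forest: each point starts alone
    cluster_label = list(labels)   # label of each cluster, stored at its root

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairs of points, sorted by ascending pair-wise distance.
    pairs = sorted(
        combinations(range(n), 2),
        key=lambda p: dist(points[p[0]], points[p[1]]),
    )

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # already in the same cluster
        la, lb = cluster_label[ra], cluster_label[rb]
        # Merge only if at least one side is unlabeled or both labels agree.
        if la is None or lb is None or la == lb:
            parent[rb] = ra
            cluster_label[ra] = la if la is not None else lb

    return [cluster_label[find(i)] for i in range(n)]


# Two well-separated groups, each with a single labeled point.
example_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
example_labels = ["A", None, None, "B", None, None]
print(agglomerative_transduction(example_points, example_labels))
```

In this sketch the label test is applied to the clusters that currently contain a and b, so a label assigned by an earlier merge blocks later merges with a conflicting cluster. Sorting all pairs costs on the order of n² log n, which reflects the point made above that transduction must be rerun over all points when new data arrives.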

Digital cinematography

Digital cinematography is the process of capturing (recording) a motion picture using digital image sensors rather than through film stock. As digital technology has improved in recent years, this practice has become dominant. Since the 2000s, most movies across the world have been captured as well as distributed digitally. Many vendors have brought products to market, including traditional film camera vendors like Arri and Panavision, as well as new vendors like Red, Blackmagic, Silicon Imaging, Vision Research and companies which have traditionally focused on consumer and broadcast video equipment, like Sony, GoPro, and Panasonic. As of 2023, professional 4K digital cameras were approximately equal to 35mm film in their resolution and dynamic range capacity. Some filmmakers still prefer to use film picture formats to achieve the desired results. == History == The basis for digital cameras is the metal–oxide–semiconductor (MOS) image sensor. The first practical semiconductor image sensor was the charge-coupled device (CCD), based on MOS capacitor technology. Following the commercialization of CCD sensors during the late 1970s to early 1980s, the entertainment industry slowly began transitioning to digital imaging and digital video over the next two decades. The CCD was followed by the CMOS active-pixel sensor (CMOS sensor), developed in the 1990s. In the late 1980s, Sony began marketing the concept of "electronic cinematography," utilizing its analog Sony HDVS professional video cameras. The effort met with very little success. However, it led to one of the earliest feature films shot on high-definition video, Julia and Julia (1987). Rainbow (1996) was the world's first film to utilize extensive digital post production techniques. 
It was shot entirely with Sony's first Solid State Electronic Cinematography cameras and featured over 35 minutes of digital image processing and visual effects; all post production, sound effects, editing and scoring were completed digitally. The Digital High Definition image was transferred to a 35mm negative via an electron beam recorder for theatrical release. The first feature both shot on digital video and post produced digitally was Windhorse, shot in Tibet and Nepal in 1996 on the Sony DVW-700WS Digital Betacam and the prosumer Sony DCR-VX1000. The offline editing (Avid) and the online post and color work (Roland House / da Vinci) were also all digital. The film, transferred to 35mm negative for theatrical release, won Best U.S. Feature at the Santa Barbara Film Festival in 1998. In 1997, with the introduction of HDCAM recorders and 1920 × 1080 pixel digital professional video cameras based on CCD technology, the idea, now re-branded as "digital cinematography," began to gain traction in the market. Shot and released in 1998, The Last Broadcast is believed by some to be the first feature-length video shot and edited entirely on consumer-level digital equipment. In May 1999, George Lucas challenged the supremacy of the movie-making medium of film for the first time by including footage filmed with high-definition digital cameras in Star Wars: Episode I – The Phantom Menace. The digital footage blended seamlessly with the footage shot on film, and he announced later that year that he would film its sequels entirely on hi-def digital video. Also in 1999, digital projectors were installed in four theaters for the showing of The Phantom Menace. In May 2000, Vidocq, directed by Pitof, began principal photography, shot entirely using a Sony HDW-F900 camera; the film was released in September of the following year. According to Guinness World Records, Vidocq is the first full length feature filmed in digital high resolution. 
In June 2000, Star Wars: Episode II – Attack of the Clones began principal photography, shot entirely using a Sony HDW-F900 camera, as Lucas had previously stated it would be. The film was released in May 2002. In May 2001, Once Upon a Time in Mexico was also shot in 24 frame-per-second high-definition digital video, a format partially developed by George Lucas, using a Sony HDW-F900 camera, following Robert Rodriguez's introduction to the camera at Lucas' Skywalker Ranch facility whilst editing the sound for Spy Kids. A lesser-known movie, Russian Ark (2002), was also shot with the same camera and was the first tapeless digital movie, recorded on HDD instead of tape. In 2009, Slumdog Millionaire became the first movie shot mainly in digital to be awarded the Academy Award for Best Cinematography. The highest-grossing movie in the history of cinema, Avatar (2009), was not only shot on digital cameras but also earned most of its box-office revenue from digital rather than film projection. Major movies shot on digital video overtook those shot on film in 2013. Since 2016, over 90% of major films have been shot on digital video; as of 2017, the figure was 92%. Only 24 major films released in 2018 were shot on 35mm. Today, cameras from companies like Sony, Panasonic, JVC and Canon offer a variety of choices for shooting high-definition video. At the high end of the market, there has been an emergence of cameras aimed specifically at the digital cinema market. These cameras from Sony, Vision Research, Arri, Blackmagic Design, Panavision, Grass Valley and Red offer resolution and dynamic range that exceed those of traditional video cameras, which are designed for the limited needs of broadcast television. == Technology == Digital cinematography captures motion pictures digitally in a process analogous to digital photography. 
While there is a clear technical distinction that separates the images captured in digital cinematography from video, the term "digital cinematography" is usually applied only in cases where digital acquisition is substituted for film acquisition, such as when shooting a feature film. The term is seldom applied when digital acquisition is substituted for video acquisition, as with live broadcast television programs. === Recording === ==== Cameras ==== Professional cameras include the Sony CineAlta (F) Series, Blackmagic Cinema Camera, Red One, Arri D-20, D-21 and Alexa, Panavision Genesis, Silicon Imaging SI-2K, Thomson Viper, Vision Research Phantom, IMAX 3D camera based on two Vision Research Phantom cores, Weisscam HS-1 and HS-2, GS Vitec noX, and the Fusion Camera System. Independent micro-budget filmmakers have also pressed low-cost consumer and prosumer cameras into service for digital filmmaking. Flagship smartphones like the Apple iPhone have been used to shoot movies like Unsane (shot on the iPhone 7 Plus) and Tangerine (shot on three iPhone 5S phones) and in January 2018, Unsane's director and Oscar winner Steven Soderbergh expressed an interest in filming other productions solely with iPhones going forward. ==== Sensors ==== Digital cinematography cameras capture digital images using image sensors, either charge-coupled device (CCD) sensors or CMOS active-pixel sensors, usually in one of two arrangements. Single chip cameras designed specifically for the digital cinematography market often use a single sensor (much like digital photo cameras), with dimensions similar in size to a 16 or 35 mm film frame or even (as with the Vision 65) a 65 mm film frame. An image can be projected onto a single large sensor exactly the same way it can be projected onto a film frame, so cameras with this design can be made with PL, PV and similar mounts, in order to use the wide range of existing high-end cinematography lenses available. 
Their large sensors also let these cameras achieve the same shallow depth of field as 35 or 65 mm motion picture film cameras, which many cinematographers consider an essential visual tool. ==== Codecs ==== Professional raw video recording codecs include Blackmagic Raw, Red Raw, Arri Raw and Canon Raw. ==== Video formats ==== Unlike other video formats, which are specified in terms of vertical resolution (for example, 1080p, which is 1920×1080 pixels), digital cinema formats are usually specified in terms of horizontal resolution. As a shorthand, these resolutions are often given in "nK" notation, where n is the multiplier of 1024 such that the horizontal resolution of a corresponding full-aperture, digitized film frame is exactly 1024n pixels. Here the "K" has a customary meaning corresponding to the binary prefix "kibi" (ki). For instance, a 2K image is 2048 pixels wide, and a 4K image is 4096 pixels wide. Vertical resolutions vary with aspect ratios though; so a 2K image with an HDTV (16:9) aspect ratio is 2048×1152 pixels, while a 2K image with a SDTV or Academy ratio (4:3) is 2048×1536 pixels, and one with a Panavision ratio (2.39:1) would be 2048×856 pixels, and so on. Due to the "nK" notation not corresponding to specific horizontal resolutions per format a 2K image lacking, for example, the typical 35mm film soundtrack space, is only 182