ID3 algorithm

In decision tree learning, ID3 (Iterative Dichotomiser 3) is a greedy algorithm invented by Ross Quinlan used to generate a decision tree from a dataset. ID3 is the precursor to the C4.5 algorithm. The 3 in the name is meant to signify that this was Quinlan's third attempt at a model based on entropy-based splitting. The term "dichotomiser" is a misnomer, as it implies a binary split, whereas the ID3 algorithm can split on multi-valued attributes.

== Algorithm ==

The ID3 algorithm begins with the original set $S$ as the root node. On each iteration, it goes through every unused attribute $A$ of the set $S$ and calculates the entropy $\mathrm{H}(S)$ or the information gain $IG(S, A)$ of that attribute. It then selects the attribute which has the smallest entropy (or largest information gain) value. The set $S$ is then split or partitioned by the selected attribute to produce subsets of the data. (For example, a node can be split into child nodes based upon the subsets of the population whose ages are less than 50, between 50 and 100, and greater than 100.) The algorithm continues to recurse on each subset, considering only attributes never selected before.

Recursion on a subset may stop in one of these cases:

- Every element in the subset belongs to the same class, in which case the node is turned into a leaf node and labelled with the class of the examples.
- There are no more attributes to be selected, but the examples still do not belong to the same class. In this case, the node is made a leaf node and labelled with the most common class of the examples in the subset.
- There are no examples in the subset, which happens when no example in the parent set was found to match a specific value of the selected attribute. An example could be the absence of a person among the population with age over 100 years.
Then a leaf node is created and labelled with the most common class of the examples in the parent node's set.

Throughout the algorithm, the decision tree is constructed with each non-terminal node (internal node) representing the selected attribute on which the data was split, and terminal nodes (leaf nodes) representing the class label of the final subset of this branch.

=== Summary ===

1. Calculate the entropy of every attribute $a$ of the data set $S$.
2. Partition ("split") the set $S$ into subsets using the attribute for which the resulting entropy after splitting is minimized; or, equivalently, information gain is maximum.
3. Make a decision tree node containing that attribute.
4. Recurse on subsets using the remaining attributes.

=== Properties ===

ID3 does not guarantee an optimal solution. It can converge upon local optima. It uses a greedy strategy by selecting the locally best attribute to split the dataset on each iteration. The algorithm's optimality can be improved by using backtracking during the search for the optimal decision tree at the cost of possibly taking longer.

ID3 can overfit the training data. To avoid overfitting, smaller decision trees should be preferred over larger ones. This algorithm usually produces small trees, but it does not always produce the smallest possible decision tree.

ID3 is harder to use on continuous data than on factored data (factored data has a discrete number of possible values, thus reducing the possible branch points). If the values of any given attribute are continuous, then there are many more places to split the data on this attribute, and searching for the best value to split by can be time-consuming.

=== Usage ===

The ID3 algorithm is used by training on a data set $S$ to produce a decision tree which is stored in memory.
At runtime, this decision tree is used to classify new test cases (feature vectors) by traversing the decision tree using the features of the datum to arrive at a leaf node.

== The ID3 metrics ==

=== Entropy ===

Entropy $\mathrm{H}(S)$ is a measure of the amount of uncertainty in the (data) set $S$ (i.e. entropy characterizes the (data) set $S$).

$$\mathrm{H}(S) = \sum_{x \in X} -p(x) \log_2 p(x)$$

where

- $S$ is the current dataset for which entropy is being calculated. This changes at each step of the ID3 algorithm, either to a subset of the previous set in the case of splitting on an attribute or to a "sibling" partition of the parent in case the recursion terminated previously.
- $X$ is the set of classes in $S$.
- $p(x)$ is the proportion of the number of elements in class $x$ to the number of elements in set $S$.

When $\mathrm{H}(S) = 0$, the set $S$ is perfectly classified (i.e. all elements in $S$ are of the same class).

In ID3, entropy is calculated for each remaining attribute. The attribute with the smallest entropy is used to split the set $S$ on this iteration. Entropy in information theory measures how much information is expected to be gained upon measuring a random variable; as such, it can also be used to quantify the amount to which the distribution of the quantity's values is unknown. A constant quantity has zero entropy, as its distribution is perfectly known. In contrast, a uniformly distributed random variable (discretely or continuously uniform) maximizes entropy.
Therefore, the greater the entropy at a node, the less information is known about the classification of data at this stage of the tree; and therefore, the greater the potential to improve the classification here. As such, ID3 is a greedy heuristic performing a best-first search for locally optimal entropy values. Its accuracy can be improved by preprocessing the data.

=== Information gain ===

Information gain $IG(S, A)$ is the measure of the difference in entropy from before to after the set $S$ is split on an attribute $A$. In other words, it measures how much uncertainty in $S$ was reduced after splitting set $S$ on attribute $A$.

$$IG(S, A) = \mathrm{H}(S) - \sum_{t \in T} p(t)\,\mathrm{H}(t) = \mathrm{H}(S) - \mathrm{H}(S \mid A)$$

where

- $\mathrm{H}(S)$ is the entropy of set $S$.
- $T$ is the set of subsets created from splitting set $S$ by attribute $A$, such that $S = \bigcup_{t \in T} t$.
- $p(t)$ is the proportion of the number of elements in $t$ to the number of elements in set $S$.
- $\mathrm{H}(t)$ is the entropy of subset $t$.

In ID3, information gain can be calculated (instead of entropy) for each remaining attribute. The attribute with the largest information gain is used to split the set $S$ on this iteration.
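
As a rough illustration of the entropy and information gain formulas and of the recursive procedure described in this article, the following is a minimal Python sketch, not a reference implementation. The function names (`entropy`, `information_gain`, `id3`, `classify`), the tuple-and-dictionary tree representation, and the toy weather data are all hypothetical choices made for this example. It assumes categorical attributes and class labels that are plain strings.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """H(S): Shannon entropy (base 2) of a non-empty list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """IG(S, A): H(S) minus the weighted entropy of the subsets split on attr."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    remainder = sum(len(t) / len(labels) * entropy(t) for t in subsets.values())
    return entropy(labels) - remainder


def id3(rows, labels, attrs, values):
    """Return a class label (leaf) or a tuple (attribute, {value: subtree})."""
    if len(set(labels)) == 1:
        return labels[0]                      # every example has the same class
    majority = Counter(labels).most_common(1)[0][0]
    if not attrs:
        return majority                       # no attributes left to split on
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in values[best]:
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        if not idx:
            branches[value] = majority        # empty subset: parent's majority class
        else:
            branches[value] = id3([rows[i] for i in idx],
                                  [labels[i] for i in idx],
                                  remaining, values)
    return (best, branches)


def classify(tree, row):
    """Walk from the root to a leaf using the attribute values in row."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree


# Hypothetical toy data: should we play outside?
values = {"outlook": ["sunny", "rainy", "overcast"], "windy": ["no", "yes"]}
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["yes", "yes", "yes", "yes", "no", "no"]

tree = id3(rows, labels, ["outlook", "windy"], values)
print(tree)
print(classify(tree, {"outlook": "rainy", "windy": "yes"}))
```

In this sketch no training example has the outlook "overcast", so that branch illustrates the empty-subset case: it becomes a leaf labelled with the majority class of the parent set. Passing the full list of possible values per attribute (`values`) is a design choice made here so that such empty branches can be created at all.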

Faceu

FaceU (Chinese: 激萌) is a camera app for smartphones running Android or Apple iOS that edits portrait photographs, typically selfies. The app uses AR technology to let users add stickers or effects in real time when taking selfies and videos. It was launched in 2016 and had 250 million registered users in 2017. Most users of Faceu are women aged 15 to 35. In February 2018, Faceu was acquired by Chinese media startup Toutiao in a deal worth about $300 million. The app was banned in India (along with other Chinese apps) on 2 September 2020 by the Indian government; the move came amid the 2020 China-India skirmish.

== Online marketing ==

FaceU is one of several selfie camera apps in China, including MeituPic, Pitu, and Camera360. The app includes social functions such as instant messaging and video chat. Photos and short videos are deleted after a short period. FaceU has worked with brands to create themed stickers for social media campaigns. In 2016, Faceu collaborated with MeituPic's Meipai and launched a rainbow effect. In October 2017, during the Mid-Autumn Festival and National Day, FaceU released a feature that applied historical or military costumes to selfies. The app has also worked with various social media personalities and celebrities, who have posted content using FaceU effects. The Faceu group also appeals to users' emotions through key opinion leaders (KOLs) and posters on social media.

== Usage and Demographics ==

FaceU had a large user base. According to industry sources, the app had more than 90 million monthly active users (MAU) and over 11 million daily active users (DAU) at certain points. Most of the users were under 30 and mainly women. The app was especially popular in major Chinese cities like Beijing, Shanghai, and Guangzhou. FaceU also caught on in other parts of East Asia, particularly Japan and South Korea.
Some app store listings claim the app had hundreds of millions of users worldwide, but these numbers mostly come from the company’s marketing materials and have not been confirmed by independent sources.

== Product Features ==

FaceU includes face recognition and live augmented reality (AR) effects. It allows users to add filters and stickers in real time while they are recording, rather than having to apply them later. The app integrates beauty filters, tools to create emojis and GIFs, and follow-video functionality that automatically tracks the face and movements as it records.

Studies and market reports indicate that AR filters and beautification tools have become a standard feature of most mobile photography apps. These features have influenced the way people take photos and what they expect photos to look like when shared online.

Virtual Print Fee

Virtual Print Fee (VPF) is a subsidy paid by a film distributor towards the purchase of digital cinema projection equipment for use by a film exhibitor in the presentation of first release motion pictures. The subsidy is paid in the form of a fee per booking of a movie, intended to match the savings that occur by not shipping a film print. The model is designed to help redistribute the savings realized by studios when using digital distribution instead of film print distribution, and is intended to end once the transition phase is over, that is, once the vast majority of cinema screens are equipped.

== History ==

The first public demonstration of digital projection for cinema took place at ShoWest in 1999, and it was readily apparent that the technology was further ahead than the business model. Early technology presentations attempted to claim that the technology would pay for itself through new revenues generated by new forms of content. But exhibitors knew their audience, and could see that digital projection was only a replacement technology, creating new financial liabilities, and not new revenue. It was not until the rollout of digital 3-D years later, in 2005, that digital projection demonstrated that it could be used to generate additional revenue.

The economics were challenging. Film projectors and platters cost in the neighborhood of US$30,000, while early digital projectors cost up to US$150,000. Further, film projectors had a lifetime of 30 years with relatively small annual expenditures in maintenance and replacement parts. On the other hand, exhibitors felt they would be lucky to get 10 years of service from a digital projector, after which there would have to be a refresh in capital expenditure. Meanwhile, distributors would realize significant savings by eliminating the high cost of film prints with corresponding shipping costs, and instead distributing digital files either by satellite or hard drive.
The Virtual Print Fee was designed to better balance savings and expenditures for both exhibitors and distributors. It is intended primarily to assist in the replacement of film projectors, not in the purchase of new projection equipment for new construction.

To give confidence to financial institutions that digital cinema technology was stable and worthy of investment, Digital Cinema Initiatives was created in 2002, resulting in the release of the first version of the DCI Digital Cinema System Specification in 2005. The DCI Specification continues to be the core specification for digital cinema, establishing the baseline technology and system requirements under which studios will release digital movies.

The first set of VPF agreements, executed with four major studios, was announced by Christie/AIX in November 2005. Christie/AIX at that time was a subsidiary of Access Integrated Technology, now renamed Cinedigm Digital Cinema Corp. The agreements were for the rollout of digital cinema technology to 4000 screens. Since that time, numerous other Digital Cinema Deployment Agreements have been executed around the world, allowing exhibitors in nearly every territory to benefit from VPF subsidies in the conversion from film projection to digital projection.

Virtual advertising

Virtual advertising is the use of digital technology to insert virtual advertisements into a live or pre-recorded television show, often in sports events. This technique is often used to allow broadcasters to overlay existing physical advertising panels inside the sports venue with virtual content on the screen when broadcasting the same event in multiple regions; for example, a Spanish football game can be broadcast in Mexico with Mexican advertisements. Similarly, virtual content can be inserted onto empty space within the sports venue such as the pitch, where physical advertising cannot be placed due to regulatory or safety reasons. Virtual advertising content is intended to be photorealistic, so that the viewer has the impression they are seeing the real in-stadium advertising.

== History ==

Throughout the 1980s, 1990s, and 2000s, advertising on television and in newspapers was a popular method of spreading information. The marketer Jeremiah Lynwood stated that "Thirty years ago, [U.S.] consumers viewed an average of 560 ads per day", mostly from newspapers, television shows, gasoline pumps, and so on. Lynwood also stated that, at the time, "American consumers may be exposed to 3,000 commercial messages every day". Within that time frame, daily exposure to ads supported many local and big businesses. With the arrival of the 2000s and 2010s, technological advances created new opportunities for many businesses to grow.

In the 21st century, virtual advertising has been used to create virtual product placements in television shows hours, days, or years after they have been produced. Advertisements can be targeted to regional markets and updated over time to make the most efficient use of advertising money. Sports broadcasting is a good example of how virtual advertising is used in everyday life: the technology keeps an ad in position relative to the field of play, regardless of camera motion and of players moving over the logos.
Recently, the NHL has virtually inserted sponsors on the glass above the physical boards in NHL stadiums. Big brands will not spend their time or money on hitting a certain region when their main goal is to build global brand awareness. Digital signage opportunities allow these larger brands to purchase signage in a stadium during games that are nationally televised instead. This reach grows further thanks to online platforms like Twitter, Facebook, and Amazon. Local businesses, on the other hand, buy signage for smaller games; this signage is much more affordable and still reaches a vast number of people.

Virtual advertising may even make live attendance more attractive to sports fans, because the technology allows the playing field and surrounding areas to be cleared of advertisements while television viewers at home are exposed to commercials: fans in the stadium can enjoy the game without pop-up ads.

== Technology ==

Virtual insertion often relies on automated processes such as automatic detection of playfield limits, automatic detection of cuts, recognition of the playfield surface, and recognition of existing logos for logo replacement. An operator is usually dedicated to the visual control of the effect, but newer systems allow the instant replay operator to handle this task.

== Examples ==

=== Live events ===

Virtual advertisements can be effectively integrated into live television in real time. For example, Fox Sports Net places a virtual advertisement on the glass behind the goaltender that can only be seen on television. The advertising in the playfields is the property of the club, except in some professional sports where the league or federation owns the advertising rights.
However, the rights to advertising shown on the screen are the property of the broadcaster or TV channel. This means that secondary rights holders can benefit from selling this virtual advertising. The number of TV viewers is also higher than the number of people in the stadium, giving the advertised brands more visibility and the broadcasters more income.

Virtual advertising was first introduced in football during the 2015 Audi Cup at the Allianz Arena in Munich. AIM Sport implemented the technology to digitally overlay advertisements on the stadium's perimeter boards, allowing different sponsors to be displayed to viewers in different broadcast regions.

In Formula One, virtual ads are placed on the grass or as virtual billboards.

In baseball, Major League Baseball places virtual advertisements on a back-board behind the batter which can be targeted differently in local markets or countries. MLB international broadcasts of the World Series feature different advertisements on a per-market basis, showing a different ad in the US, Canadian, Latin American and Japanese markets.

In tennis, for example during the 2019 ATP Finals in London's O2 Arena, certain logos in the background were replaced for various country feeds. In table tennis, for example during the ITTF World Tour Australian Open 2019, virtual advertising overlays were used by uniqFEED AG in Switzerland.

Since the 2022–23 season, the National Hockey League (NHL) has used digitally enhanced dasherboards (DED) to erase and replace ads on each arena's boards with up to 120 thirty-second segments on all or part of the rink. Each broadcaster can use a different set of ads. DED were first used at the 2016 World Cup of Hockey, which was organized by the NHL.

At UEFA Euro 2024, AIM Sport provided virtual advertising for all matches, marking one of the largest implementations of the technology in an international tournament.
Virtual advertising was also used in the participating teams' domestic matches, extending region-specific advertising beyond the tournament itself.

Contact cleaner

Contact cleaner, also known as switch-cleaner, is any of various chemicals, or mixtures of chemicals, intended to remove or prevent the build-up of oxides or other unwanted substances on the conductive surfaces of connectors, switches, and other electronic components with moving surface-contacts, and thus reduce the contact resistance encountered. The use of contact cleaner can help to minimize the wetting current across a pair of contacts. An example of a simple contact cleaner is isopropyl alcohol. Some contact cleaners are designed to evaporate completely and rapidly, leaving no residue. Others may contain lubricants. Lubricants themselves should not necessarily be used as contact cleaners, especially if they are designed to leave an unsuitable residue. However, appropriate lubricants may work well as contact cleaners.

The Most Dangerous Writing App

The Most Dangerous Writing App is a web application for free writing that combats writer's block by deleting all progress if the user stops typing for five seconds. It is targeted at creative writers who want to write first drafts without worrying about editing or formatting.

== Features ==

The app is designed to "shut down your inner editor and get you into a state of flow", referring to the psychological concept of being in a flow state. Users start a writing session by choosing a time or word limit, and can only save or download their work if they complete the set limit without interruption. An optional "hardcore mode" blurs out everything the user has written so far, making it impossible to edit before finishing the writing session.

== History ==

The Most Dangerous Writing App was created by software engineer Manuel Ebert and was released as free, open-source software on February 29, 2016. It was reviewed by Wired, Forbes, Vogue, Huffington Post, The Verge, The Next Web, and others. It has been used in free writing contests and is recommended by NaNoWriMo. In April 2019, The Most Dangerous Writing App was acquired by Squibler, but the original version remains freely accessible.

Influence-for-hire

Influence-for-hire, or collective influence, refers to the economy that has emerged around buying and selling influence on social media platforms.

== Overview ==

Companies that engage in the influence-for-hire industry range from content farms to high-end public relations agencies. Traditionally, influence operations have largely been confined to public sector actors like intelligence agencies. In the influence-for-hire industry, the groups conducting the operations are private, with commerce being their primary consideration. However, many of the clients in the influence-for-hire industry are countries or countries acting through proxies. The companies are often located in countries with less expensive digital labor.

== History ==

In May 2021, Facebook took a Ukrainian influence-for-hire network offline. Facebook attributed the network to organizations and consultants linked to Ukrainian politicians including Andriy Derkach. During the COVID-19 pandemic, state-sponsored misinformation was spread through influence-for-hire networks. In August 2021, a report published by the Australian Strategic Policy Institute implicated the Chinese government and the ruling Chinese Communist Party in campaigns of online manipulation conducted against Australia and Taiwan using influence-for-hire.