AI Data Warehouse

AI Data Warehouse — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Avid Free DV

    Avid Free DV

    Avid Free DV is a non-linear editing video editing software application developed by Avid Technology. Avid introduced Free DV in January 2003 at the 2003 MacWorld Expo; the company discontinued it in September 2007. Free DV was intended to give editors a sample of the Avid interface to use in deciding whether or not to purchase Avid software, so when compared with other Avid products its features were relatively minimal. When it was available it was not limited by time or watermarking, so it could be used as a non-linear editor for as long as desired. == Comparisons == When compared with other consumer-end non-linear editors such as iMovie and Windows Movie Maker, it sported more powerful video processing tools, but lacked the ease-of-use and shallow learning curve emphasized in similar programs because it had the full interface of the professional Avid system. However, Avid did offer a number of flash-based tutorials to help new users learn how to use the program for capturing, editing, clipping, processing, and outputting audio/video, among other things. == Limitations == The limitations of Avid Free DV included that it allowed only two video and audio tracks, had fewer editing tools than other Avid products, had few import and export formats, and allowed capture and output of standard-definition DV only, via FireWire. Avid Free DV projects and media were not compatible with other Avid systems. As the name implied, Avid Free DV was available as a free download, although users were required to complete a short survey on the Avid website before they were given a download link and key. In addition to using Free DV to evaluate Avid prior to purchase, it could also act as a stepping stone for people wishing to learn to use Avid's other editing products, such as Xpress Pro, Media Composer and Symphony. While additional skills and techniques are necessary to use these professionally geared systems, the basic operation remains the same. == Operating systems == Avid Free DV was available for Windows XP and Mac OS X. The officially supported Mac OS X versions were Panther versions up to 10.3.5, and Tiger versions up to 10.4.3 only. == Supported formats == Avid Free DV supported QuickTime (MOV) and DV AVIs. == Reception == John P. Mello Jr. of The Boston Globe gave Free DV a negative review, finding the user interface obfuscatory and the process of ingesting video error-prone. He summarized: "Professional video editors who use an Avid competitor may jump at the chance to take a free look at how Avid does things. But for the merely curious, this software is a nightmare". Video Systems's Steve Mullen opined that its lack of interoperability with Avid's professional editing software contracted Avid's stated goal to entice budding video editors into buying into the company's software ecosystem.

    Read more →
  • Digital asset

    Digital asset

    A digital asset is anything that exists only in digital form and comes with a distinct usage right or distinct permission for use. Data that do not possess those rights are not considered assets. Digital assets include, but are not limited to: digital documents, audio content, motion pictures, and other relevant digital data currently in circulation or stored on digital appliances, such as personal computers, laptops, portable media players, tablets, data storage devices, and telecommunication devices. This encompasses any apparatus that currently exists or will exist as technology progresses to accommodate the conception of new modalities capable of carrying digital assets. This holds true regardless of the ownership of the physical device on which the digital asset is located. == Types == Types of digital assets include, but are not limited to: software, photography, logos, illustrations, animations, audiovisual media, presentations, spreadsheets, digital paintings, word documents, electronic mails, websites, and various other digital formats with their respective metadata. The number of different types of digital assets is exponentially increasing due to the rising number of devices that leverage these assets, such as smartphones, serving as conduits for digital media. In Intel's presentation at the 'Intel Developer Forum 2013,' they introduced several new types of digital assets related to medicine, education, voting, friendships, conversations, and reputation, among others. == Digital asset management system == A digital asset management (DAM) is an integrated structure that combines software, hardware, and/or other services to manage, store, ingest, organize, and retrieve digital assets. These systems enable users to find and use content when needed. == Digital asset metadata == Metadata is data about other data. Any structured information that defines a specification of any form of data is referred to as metadata. Metadata is also a claimed relationship between two entities, often used to establish connections or associations. Librarian Lorcan Dempsey says "Think of metadata as data which removes from a user (human or machine) the need to have full advance knowledge of the existence or characteristics of things of potential interest in the environment". At first, the term metadata was used for digital data exclusively, but nowadays metadata can apply to both physical and digital data. Catalogs, inventories, registers, and other similar standardized forms of organizing, managing, and retrieving resources contain metadata. Metadata can be stored and contained directly within the file it refers to or independently from it with the help of other forms of data management such as a DAM system. The more metadata is assigned to an asset the easier it gets to categorize it, especially as the amount of information grows. The asset's value rises the more metadata it has for it becomes more accessible, easier to manage, and more complex. Structured metadata can be shared with open protocols like OAI-PMH to allow further aggregation and processing. Open data sources like institutional repositories have thus been aggregated to form large datasets and academic search engines comprising tens of millions of open access works, like BASE, CORE, and Unpaywall. == Issues == Due to a lack of either legislation or legal precedent, there is limited existing governmental control and regulation surrounding digital assets in the United States and other large economies globally. Many of the control issues relating to access and transferability are maintained by individual companies. Some consequences of this include 'What is to become of the assets once their owner is deceased?' as well as can, and, if so, how, may they be inherited. This subject was broached in a bogus story about Bruce Willis allegedly looking to sue Apple as the end user agreement prevented him from bequeathing his iTunes collection to his children. Another case of this was when a soldier died on duty and the family requested access to the Yahoo! account. When Yahoo! refused to grant access, the probate judge ordered them to give the emails to the family but Yahoo! still was not required to give access. The Music Modernization Act was passed in September 2018 by the U.S. Congress to create a new music licensing system, with the aim to help songwriters get paid more.

    Read more →
  • Digital zombie

    Digital zombie

    A digital zombie is a person so engaged with digital technology or social media they are unable to separate themselves from a persistent online presence. Writing in 2017, University of Sydney researcher Andrew Campbell expressed concerns over whether or not the individual can truly live a full and healthy life while they are preoccupied with the digital world. Other individuals have also begun referencing certain types of behaviour with being a digital zombie. Stefanie Valentic, managing editor of EHS Today, refers to it as people hunting digital creatures through their smartphones in public spaces, always fixed on their phones. The University of Warwick has used the term to argue that further research needs to be done with people who exist in digital form after death to help people grieve their loss. == Modern applications == === Distracted walking === The term digital zombie can refer to a person performing distracted walking, which has been labelled dangerous by the American Academy of Orthopaedic Surgeons. They created the "Digital Deadwalkers" campaign after physicians became aware of the risks associated with walking across intersections and sidewalks while paying attention only to smartphones and not one's surroundings. Also stating that the name is derived from the fact that "they're oblivious to everyone else, so it's like they're dead-walking, sleepwalking." === Living through media === The Department of Sociology, University of Warwick has also identified the term, digital zombie, to refer to an individual who has died but is digitally resurrected, reanimated and socially active. These digital zombies do things in death they did not do when they were alive as they "live" again through a digital self on a digital medium. Dead celebrities sometimes become digital zombies when they are reanimated to appear in commercial advertisements (such as Audrey Hepburn and Bob Monkhouse). Other accidental digital zombies include Tupac Shakur and Michael Jackson who were both digitally resurrected and recreated to perform "live" on stage years after their death. Researchers at the University of Warwick have carried out research into the area of human-computer interaction. in an effort to understand the affect these digital zombies have on grief and bereavement. === Mobile gaming === Writer for EHS Today, Stefanie Valentic, has made observations with the mobile phone video game Pokémon Go, which offers players the experience to hunt and collect digital creatures called Pokémon through their smartphone in real world. Players can be observed simultaneously gazing at their phone while also obliviously walking around their environments looking for Pokémon. Stefanie references these individuals as "digital zombies" since they walk around with no cognition of their surroundings while engaged with their phone. == Health risks == === Heavy use of technology === Research by the University of Sydney has begun looking at how new technology such as digital media and smartphones impact our lives and questioning whether they can create new compulsions and obsessions. The research demonstrates that increased heavy technological use can have negative health consequences similar to drugs, smoking, and alcohol. Marcel O'Gorman, an associate professor of English at the University of Waterloo, has commented on the body of research examining how technology impacts cognition, stating currently that there is no empirical evidence to support any theories that suggest that technology can damage memory and attention span. === Heightened risk to children === Manfred Spitzer, a German psychiatrist, has raised concerns with providing digital devices to children. During the early childhood stage while their brains are rapidly growing, increased exposure to digital devices may deprive them of necessary development required to facilitate brain growth. These concerns are also shared by Korean doctors who believe giving digital devices, like smartphones to children, limits their cognitive development.

    Read more →
  • ProjectExplorer

    ProjectExplorer

    ProjectExplorer is a documentary short film series. The films, directed and produced by ProjectExplorer's Founder, Jenny M Buccos, focus on histories and cultures of foreign places and people using interviews with subject experts, artists, and public figures including Archbishop Desmond Tutu, Dr. John Kani, Greg Marinovich, and Sipho “Hotstix” Mabuse. Produced for a child and young adult audience, segments in each series depict everyday life and the challenges and concerns of those living in the locations and regions featured. Each film is 2–4 minutes in length, with each series containing approximately 40 films. The ProjectExplorer series is distributed internationally without charge via the web by ProjectExplorer, LTD. an American not-for-profit organization. Three series have been produced and distributed. In fall 2009, ProjectExplorer's third series, Jordan, received a GOLD level Parents' Choice Award for excellence in web programming. == Film series == === Shakespeare's England (2006) === The first series was filmed in London, Stratford-upon-Avon, and New York City. The series includes more than 30 film segments. United Kingdom locations and individuals include: The London Eye The Tower of London The Whitechapel Bell Foundry, which demonstrates the process of making a bell Simon Hughes, Member of Parliament and President of the Liberal Democrats The Old Vic The Royal Shakespeare Company The National Archives (UK) Segments filmed in New York City include: Michael Cumpsty discusses and performs monologues from Hamlet (while starring in the Classic Stage Company production) Michael Stuhlbarg discusses and performs a monologue from Macbeth === South Africa (2007) === Filmed in Johannesburg, Cape Town, and KwaZulu Natal, the series contains over 40 film segments including: Ntate Thabong Phosa, a lesiba player from Lesotho. Due to the rarity of lesiba players globally, this is one of the only publicly available examples of the lesiba played on film. A Robben Island piece, filmed at the cell in which Nelson Mandela was held for 18 of his 27-year imprisonment. JSE Securities Exchange with Leigh Roberts, correspondent for CNBC Africa. A 3-part series on HIV/AIDS with amfAR Director of Research, Dr. Rowena Johnson. Dr. Johnson discusses high cost of anti-retroviral drugs and testing in South Africa. The June 16, 1976 Soweto Uprising, with archival film footage and photography from SABC and The Sowetan newspaper. Prominent South Africans featured in the series: Dr. John Kani, Chairperson of the Apartheid Museum and TONY Award Winning Actor Musician Sipho “Hotstix” Mabuse Former U.N. Ambassador Dave A. Steward, Executive Director of the FW de Klerk Foundation Director and producer, Duma Ndlovu Malcolm Purkey, Artistic Director of the Market Theatre === South Africa, Part II (2008) === Filmed in Johannesburg, Cape Town, and New York City, the series contains over 10 film segments. Prominent South Africans featured in the series: Archbishop Desmond Tutu, Nobel Peace Prize laureate Photojournalist Greg Marinovich, Pulitzer Prize winner and co-author of The Bang-Bang Club Vusi Mahlasela, musician Author, Max du Preez === Jordan (2008) === Filmed in Amman, Petra, Umm Qais, Jerash, Madaba, Bethany, the Dead Sea, and New York City, the series contains more than 45 film segments. Jordan series segments include: A tour of the throne room of King Abdullah II, at Raghadan Palace Sharing mansaf with a Bedouin family in the Wadi Rum desert The UNRWA Jabal Hussein refugee camp The Siq, Treasury, and Monastery at Petra The ruins of Gadara at Umm Qais Jerash, the capital and largest city of Jordan's Jerash Governorate Madaba, home of the Madaba Map and the mosaic capital of Jordan The archaeological site at Bethany Traditional clothing from Salt and Ma'an The reintroduction into the wild of the endangered Arabian Oryx The Desert Castles The science of the Dead Sea Her Royal Highness Princess Basma bint Ali and her Royal Botanic Garden

    Read more →
  • StatCrunch

    StatCrunch

    StatCrunch is a web-based statistical software application from Pearson Education. StatCrunch was originally created for use in college statistics courses. As a full-featured statistics package, it is now also used for research and for other statistical analysis purposes. == History == American statistics professor Webster West created StatCrunch in 1997. Over the next 19 years West assisted by others added many more statistical procedures and graphing capabilities, and made user interface improvements. In 2005, West received two awards for StatCrunch: the CAUSEweb Resource of the Year Award and the MERLOT Classics Award. In 2013, the StatCrunch Java code was rewritten in JavaScript in order to avoid Java browser security problems, and so that it would run on iOS and Android. In 2015, new ways of importing data were added, including importing multi-page data directly from Wikipedia tables and other Web sources, and also importing with drag-and-drop for various data formats. In 2016, StatCrunch was acquired by Pearson Education, which had already been serving as the primary distributor of StatCrunch for several years. == Software == A StatCrunch license is included with many of Pearson's statistical textbooks. Because StatCrunch is a web application, it works on multiple platforms, including Windows, macOS, iOS, and Android. Data in StatCrunch is represented in a "data table" view, which is similar to a spreadsheet view, but unlike spreadsheets, the cells in a data table can only contain numbers or text. Formulas cannot be stored in these cells. There are many ways to import data into StatCrunch. Data can be typed directly into cells in the data table. Entire blocks of data may be cut-and-pasted into the data table. Text files (.csv, .txt, etc.) and Microsoft Excel files (.xls and .xlsx) can be drag-and-dropped into the data table. Data can be pulled into StatCrunch directly from Wikipedia tables or other Web tables, including multi-page tables. Data can be loaded directly from Google Drive and Dropbox. Shared data sets saved by other StatCrunch community users can be searched for by title or keyword and opened in a data table. Graphs, results, and reports created by StatCrunch can be shared with other users, in addition to the sharing of data sets. StatCrunch has a library of data transformation functions. StatCrunch can also recode and reorganize data. All data is stored in memory, and all processing happens on the client, so response is fast, even with large data sets. StatCrunch can interact with multiple graphs simultaneously. If a user selects a data point on one graph, then that same data point is highlighted on all other displayed graphs. In addition to standard statistical and graphing procedures, StatCrunch has a collection of about forty "applets" which illustrate statistical concepts interactively.

    Read more →
  • Online exhibition

    Online exhibition

    An online exhibition, also referred to as a virtual exhibition, online gallery, cyber-exhibition, is an exhibition whose venue is cyberspace. Museums and other organizations create online exhibitions for many reasons. For example, an online exhibition may: expand on material presented at, or generate interest in, or create a durable online record of, a physical exhibition; save production costs (insurance, shipping, installation); solve conservation/preservation problems (e.g., handling of fragile or rare objects); reach lots more people: "Access to information is no longer restricted to those who can afford travel and museum visits, but is available to anyone who has access to a computer with an Internet connection. Unlike physical exhibitions, online exhibitions are not restricted by time; they are not forced to open and close but may be available 24 hours a day. In the nonprofit world, many museums, libraries, archives, universities, and other cultural organizations create online exhibitions. A database of such exhibitions is Library and Archival Exhibitions on the Web. Online exhibition organizers may use techniques such as marquee text, display advertisements, and in-event emails to engage patrons. Various guides have been published to help organizations create effective online exhibitions. The earliest museum with a physical existence to create a programme of substantial online exhibitions with high resolution images of artefacts was the Museum of the History of Science in Oxford, the first of which, The Measurers: a Flemish Image of Mathematics in the Sixteenth Century and an exhibition of early photographs, were published on 21 August 1995. == Examples of online exhibitions == International Museum of Women is an online-only museum that does not have a physical building and instead offers online exhibitions about women's issues globally as well as an online community. Online exhibitions include "Imagining Ourselves" (launched 2006) about women's identity, "Women, Power and Politics" (2008), and "Economica: Women and the Global Economy" (2009). Tucson LGBTQ Museum is an online-only museum that does not have a physical building and instead offers online exhibitions about LGBTQ history. The online photographic, audio, video, text, and other historical exhibitions include exhibits from the 1700s to the present day. The effort began in the summer of 1967 and spanned almost 50 years. International New Media Gallery (INMG) is an online museum specialising in moving image and screen-based art. The INMG is dedicated to exploring current debates and topics in art history: touching on areas such as migration, war, environmental activism and the internet itself. The gallery publishes extensive academic catalogues alongside its exhibitions. It also hosts spaces for discussion and debate, both online and offline. Virtual Museum of Modern Nigerian Art – the VMMNA is the first of its kind in Africa. Hosted by the Pan-African University, Lagos, Nigeria this virtual museum offers a good view of the development on Nigerian Art in the past fifty years.

    Read more →
  • Algorithmic amplification

    Algorithmic amplification

    Algorithmic amplification is the process by which automated ranking and recommendation systems on digital platforms increase the visibility of certain content beyond its initial audience. Major platforms including Facebook, YouTube, TikTok, and X (formerly Twitter) use such systems to determine what appears in users' feeds and search results. The term is used in research on social media and digital media regulation to describe how platform design choices influence the distribution of online information. Unlike chronological feeds, algorithmic systems evaluate content using signals such as engagement rates, viewing duration, and predicted relevance to individual users. Content that performs strongly on these metrics may be promoted to progressively larger audiences through feeds, search rankings, or autoplay systems. The process is distinct from content moderation, which involves removing, labelling, or restricting content under platform rules, although the two can interact in practice. The concept is closely connected to the attention economy. Research has linked algorithmic amplification to the spread of misinformation and the circulation of political content, as well as to effects on young users' mental health. The scale and direction of those effects remain debated, in part because independent researchers have limited access to the internal workings of platform recommendation systems. Governments in the European Union, United Kingdom, United States, and China have pursued differing regulatory approaches to recommendation algorithms. The EU's Digital Services Act and the UK's Online Safety Act 2023 impose obligations on large platforms related to recommendation system transparency and risk, while China became the first country to enact binding legislation specifically targeting such systems. Internal documents and whistleblower testimony reported by the BBC in 2026 described how competitive pressure between Meta and TikTok led to trade-offs between engagement and user safety in the design of their recommendation systems. == Terminology == The term algorithmic amplification is used in media studies, platform governance scholarship and regulatory literature to describe how automated systems influence the distribution of content beyond what organic user sharing alone would produce. It is distinct from viral spread, which refers primarily to user-driven sharing behaviour, and from algorithmic bias, which describes systematic errors or unfairness in algorithmic outputs. The related term algorithmic curation is used for the broader process of selecting and ordering content, of which amplification is one possible outcome. The phrase also appears in regulatory and legislative discussion of recommendation systems. The European Union's Digital Services Act (DSA) identifies recommendation systems as a potential source of systemic risk, and the term appears frequently in academic and policy commentary on the regulation. In the United States, proposals including the Filter Bubble Transparency Act and the Kids Online Safety Act (KOSA) have used it to frame requirements around recommendation system transparency. In the United Kingdom, the House of Commons Science, Innovation and Technology Committee used the term in a 2025 report on how recommendation algorithms contributed to the spread of misinformation during the 2024 Southport riots. A Joint Declaration on AI and Freedom of Expression adopted in October 2025 by four international freedom of expression mandate holders, including the UN Special Rapporteur on Freedom of Opinion and Expression and the OSCE Representative on Freedom of the Media, stated that recommender systems and other AI-powered curation tools exert "a large hidden influence and gatekeeper role" over what information people access and consume. == Background == Early internet platforms typically displayed content in reverse-chronological order or through keyword-based search systems. Although the term is most often applied to social media, the underlying logic predates social media itself. A 2021 overview traced the origins of modern recommendation systems to the early 1990s, when they were first used experimentally for personal email and information filtering. The 1992 Tapestry mail system and the 1994 GroupLens news filtering system were early milestones before recommendation systems spread into e-commerce and other online services. As user bases and content volumes grew during the 2000s, major platforms including Google, YouTube, and Facebook developed machine-learning systems to personalise content delivery and prioritise material predicted to generate engagement. Facebook introduced its News Feed in 2006, which gradually shifted from chronological presentation towards algorithmically ranked content. YouTube altered its recommendation system in 2012 to prioritise watch time rather than clicks, a change the platform said was prompted by concerns that click-based metrics encouraged misleading thumbnails and low-quality videos. TikTok, launched internationally in 2018, adopted a model in which its primary content surface, the For You feed, is driven almost entirely by algorithmic recommendation rather than by a user's social graph. An internal document obtained by The New York Times in 2021 showed that the platform's algorithm optimised for retention and time spent, using signals such as watch duration, replays, likes, and comments to score and rank videos. Algorithmic recommendation also became central to platforms outside social media. Spotify's personalised features, including Discover Weekly, Release Radar, and Home recommendations, use behavioural signals and inferred "taste profiles" to surface tracks and artists beyond a listener's existing library. An ethnographic study of music curators at streaming platforms described this blend of algorithmic and human editorial selection as an "algo-torial" model of gatekeeping. Amazon adopted item-based collaborative filtering for product recommendations in 1998, and its recommendation engine has been described as one of the earliest large-scale deployments of recommendation technology in e-commerce. The same dynamics operate on adult content platforms. Law professor Amy Adler has argued that from 2007 onwards the pornography industry migrated to algorithm-driven streaming platforms, most of which are controlled by a single near-monopoly company, Aylo (formerly MindGeek). These platforms use algorithmic search engines, suggestions, rigid categorisation of content, and AI-driven search term optimisation in ways that produce the same distorting effects found on mainstream speech platforms, including filter bubbles, feedback loops, and the tendency of algorithmic recommendations to alter individual preferences. == Mechanisms == Recommendation systems commonly combine collaborative filtering, which predicts a user's preferences from the behaviour of similar users, with machine-learning models that predict which content a user is likely to engage with from their prior activity. In a common two-stage design, a platform first generates a set of candidate items from a large content pool and then ranks them using a scoring model with objectives such as predicted engagement or user satisfaction. Small changes in ranking criteria can shift exposure at scale, particularly when applied repeatedly across multiple browsing sessions. These systems typically rely on signals including engagement rates, viewing duration, click-through rates, and network relationships between users. Modern recommendation pipelines continuously update predictions as new behavioural data arrives, allowing platforms to adjust rankings in near real time. Users' revealed preferences, expressed through behaviour such as clicks and viewing time, do not always align with their stated preferences, expressed through explicit feedback such as surveys or content controls. Popularity signals can create feedback dynamics in which early engagement increases the likelihood that content will be shown to additional users. Experimental research on online cultural markets has demonstrated how such feedback processes can produce unequal visibility outcomes even when initial differences in content quality are small. == Beneficial and public-interest uses == Recommendation systems can help users navigate large volumes of content by surfacing material predicted to match their interests or needs, which can improve discoverability on platforms with large content libraries. In public health communication, platforms can help health authorities distribute timely information at scale, though the same recommendation systems also risk amplifying misinformation alongside official guidance. Sociologist Zeynep Tufekci has argued that the shift from independent blogs to large centralised platforms transferred gatekeeping power from traditional media to corporate algorithms. In the case of the Egyptian uprising of 2011, she noted that ordinary users

    Read more →
  • Mean opinion score

    Mean opinion score

    Mean opinion score (MOS) is a measure used in the domain of Quality of Experience and telecommunications engineering, representing overall quality of a stimulus or system. It is the arithmetic mean over all individual "values on a predefined scale that a subject assigns to his opinion of the performance of a system quality". Such ratings are usually gathered in a subjective quality evaluation test, but they can also be algorithmically estimated. MOS is a commonly used measure for video, audio, and audiovisual quality evaluation, but not restricted to those modalities. ITU-T has defined several ways of referring to a MOS in Recommendation ITU-T P.800.1, depending on whether the score was obtained from audiovisual, conversational, listening, talking, or video quality tests. == Rating scales and mathematical definition == The MOS is expressed as a single rational number, typically in the range 1–5, where 1 is lowest perceived quality, and 5 is the highest perceived quality. Other MOS ranges are also possible, depending on the rating scale that has been used in the underlying test. The Absolute Category Rating scale is very commonly used, which maps ratings between Bad and Excellent to numbers between 1 and 5, as seen in below table. Other standardized quality rating scales exist in ITU-T Recommendations (such as ITU-T P.800 or ITU-T P.910). For example, one could use a continuous scale ranging between 1–100. Which scale is used depends on the purpose of the test. In certain contexts there are no statistically significant differences between ratings for the same stimuli when they are obtained using different scales. The MOS is calculated as the arithmetic mean over single ratings performed by human subjects for a given stimulus in a subjective quality evaluation test. Thus: M O S = ∑ n = 1 N R n N {\displaystyle MOS={\frac {\sum _{n=1}^{N}{R_{n}}}{N}}} Where R {\displaystyle R} are the individual ratings for a given stimulus by N {\displaystyle N} subjects. == Properties of the MOS == The MOS is subject to certain mathematical properties and biases. In general, there is an ongoing debate on the usefulness of the MOS to quantify Quality of Experience in a single scalar value. When the MOS is acquired using a categorical rating scales, it is based on – similar to Likert scales – an ordinal scale. In this case, the ranking of the scale items is known, but their interval is not. Therefore, it is mathematically incorrect to calculate a mean over individual ratings in order to obtain the central tendency; the median should be used instead. However, in practice and in the definition of MOS, it is considered acceptable to calculate the arithmetic mean. It has been shown that for categorical rating scales (such as ACR), the individual items are not perceived equidistant by subjects. For example, there may be a larger "gap" between Good and Fair than there is between Good and Excellent. The perceived distance may also depend on the language into which the scale is translated. However, there exist studies that could not prove a significant impact of scale translation on the obtained results. Several other biases are present in the way MOS ratings are typically acquired. In addition to the above-mentioned issues with scales that are perceived non-linearly, there is a so-called "range-equalization bias": subjects, over the course of a subjective experiment, tend to give scores that span the entire rating scale. This makes it impossible to compare two different subjective tests if the range of presented quality differs. In other words, the MOS is never an absolute measure of quality, but only relative to the test in which it has been acquired. For the above reasons – and due to several other contextual factors influencing the perceived quality in a subjective test – a MOS value should only be reported if the context in which the values have been collected in is known and reported as well. MOS values gathered from different contexts and test designs therefore should not be directly compared. Recommendation ITU-T P.800.2 prescribes how MOS values should be reported. Specifically, P.800.2 says:it is not meaningful to directly compare MOS values produced from separate experiments, unless those experiments were explicitly designed to be compared, and even then the data should be statistically analysed to ensure that such a comparison is valid. == MOS for speech and audio quality estimation == MOS historically originates from subjective measurements where listeners would sit in a "quiet room" and score a telephone call quality as they perceived it. This kind of test methodology had been in use in the telephony industry for decades and was standardized in Recommendation ITU-T P.800. It specifies that "the talker should be seated in a quiet room with volume between 30 and 120 m³ and a reverberation time less than 500 ms (preferably in the range 200–300 ms). The room noise level must be below 30 dBA with no dominant peaks in the spectrum." Requirements for other modalities were similarly specified in later ITU-T Recommendations. == MOS estimation using quality models == Obtaining MOS ratings may be time-consuming and expensive as it requires the recruitment of human assessors. For various use cases such as codec development or service quality monitoring purposes – where quality should be estimated repeatedly and automatically – MOS scores can also be predicted by objective quality models, which typically have been developed and trained using human MOS ratings. A question that arises from using such models is whether the MOS differences produced are noticeable to the users. For example, when rating images on a five point MOS scale, an image with a MOS equal to 5 is expected to be noticeably better in quality than one with a MOS equal to 1. Contrary to that, it is not evident whether an image with a MOS equal to 3.8 is noticeably better in quality than one with a MOS equal to 3.6. Research conducted on determining the smallest MOS difference that is perceptible to users for digital photographs showed that a MOS difference of approximately 0.46 is required in order for 75% of the users to be able to detect the higher quality image. Nevertheless, image quality expectation, and hence MOS, changes over time with the change of user expectations. As a result, minimum noticeable MOS differences determined using analytical methods such as in may change over time.

    Read more →
  • ShowDocument

    ShowDocument

    ShowDocument is an online web application that allows multiple users to conduct web meetings, upload, share and review documents from remote locations. The service was developed by the HBR Labs company, established in 2007. == Features == Users can collaborate on and review documents in real time, with annotations and text being visible to all users and accessible for co-editing. The idea of every user being able to annotate can cause conflicts within the sessions, and so main navigation options are under the "presenter"'s control - which can be given to a different user as well. An earlier version of the application, by contrast, had allowed all users to navigate and edit at once, causing the system to drop all incomplete edits. It is possible to draw and write on a virtual whiteboard, and to stream a YouTube video to a group in full synchronization. A feature also exists for co-browsing of Google Maps. Entering an open session in the application can be done with a given code number, or by receiving a link through an Email message. Different file formats can be uploaded and saved either online or offline, such as PDF. A PDF file's text cannot be edited - text is edited through the separate text editor. Although the platform contains a text chat, it is not intended to replace instant messaging software, as there are no extensive messaging features. The application has a paid and free version, with the free version having a few limitations: audio and video options are disabled, number of participants is limited and sessions are time-limited. == Development == ShowDocument was first developed in 2007. On September 8, 2009, HBR labs released a new update which included features such as secure online document storage and mobile device support.

    Read more →
  • IBM Retail Store Systems

    IBM Retail Store Systems

    This article describes IBM point of sale equipment from 1973 with the introduction of the IBM 3650 till 1986 with the introduction of the IBM 4680. IBM continued to announced new retail products until the sale of the IBM Retail Store Solutions business to Toshiba TEC, announced on 17 April 17 2012. == Background == IBM began selling retail point of sale systems starting in 1973 with the IBM 3650 Retail Store System aimed at department and chain stores and the IBM 3660 Supermarket System designed for supermarkets. The IBM 3650 was announced alongside other IBM vertical industry systems such as the IBM 3600 Finance Communication System, and the IBM 3790 communications system, the combination of which IBM described as a "revolution in terminal based systems". All of these systems relied on a significant number of developments across IBM: New chips: Large Scale Integration allowed advanced Field Effect Transistor logic chips that packed far more transistors onto a new metalized one-inch square ceramic substrate Gas panels: Developed as an alternative to cathode ray tubes, the neon argon gas panel provided clear and flicker-free images. Modem communications: Synchronous Data Link Control provided lower-cost communications over telephone lines New disks: The "Gulliver" disk file that supplied a hard drive smaller than three cubic feet and also the "Igar" diskette drive Smaller printers: A disk printer system called "spica" that used a rotating disk print element with engraved print elements that are struck by a single hammer as the disk rotates Belt printers: A new system, known as "Lynx," using a removable belt that was significantly cheaper, quieter and simpler than earlier chain printers Keyboards: New keyboard technology called "Calico" that could build a wide variety of keyboards using common manufacturing facilities Power supplies: Transistorised Switching Regulators or TsRs: compact power supplies that are one third to one-fourth the size of previous generations === Store Loop (SLOOP) architecture === The 36xx retail terminals are connected to the store controller via a loop also called a Store Loop, similar to that used by the IBM 3600 Finance System. If a terminal detects an error, it runs a self-diagnosis routine, displays an error code to the operator, and uses bypass circuitry to remove itself from the loop and allow the loop to continue operating. If the loop fails, the most downstream terminal transmits an error code to the controller. Intermittent errors are written to disk on the store controller. === Supplies Manufacturing === While IBM's Data Processing Division created the retail store systems, it's Information Record Division (IRD) also saw signifiant opportunity in manufacturing supplies for retail systems. As an example in their Dayton NJ plant they used a high-speed Webtron press to create up to 1 million magnet merchandise tags per shift. == IBM 3650 Retail Store System == The 3650 System is a family of products designed to computerise a retail store, both at the point of sale and for back office store management functions. It includes a method to generate encoded tickets for merchandise, rather than use the Universal Product Code (UPC). The key devices for the system were as follows: === Shop Floor === ==== 3653 Point of Sale Terminal ==== Designed for the store floor, it is a loop attached device with: a wire matrix printer with 3 stations: cash receipt, sales-check and transaction journal. a keyboard with 10 numeric keys and 19 function keys an 8 digit display and description lights. in addition to the 8 digits it also displays the following characters: "$", "." and "-" operator guidance panel with 20 backlit captions status indicators a cash drawer a check verification station. Options include a wand magnet label reader with a 4 foot flexible cord, and locks for the journal tape and the till cover. The terminal effectively loads its software remotely from the 3651 over the loop, which IBM calls an IML (initial microcode load). It can also be IMLed locally using a tape cassette recorder. IBM later offered a choice of OEM Wand Attachments that could be ordered by RPQ that could use OCR or scan UPCs, instead of a wand magnet label reader. Only one wand could be attached to a specific 3653. There are two models: Model 1, which is not programmable. Was announced 10 August 1973. Model P1, which is customer programmable. Has 36 KB of storage expandable to 60 KB. Was announced 13 October 1978. === Back office equipment === ==== 3651 Store Controller ==== Controls data flow inside either a single store or multiple stores and sends retail transactions to a mainframe using a modem. For point of sale it performed functions such as: Automatic price lookup from a master price file Automatic distribution of net sales by up to 54 departments Automatic application of applicable discounts and sales taxes Automatic control of food stamp maximums Check authorization facilities For back office it also helped report preparation such as: store summary individual cashier performance store office reconciliation sales by up to 54 departments Current inquiries for department sales; cashier performance & cash position; store cash position. Inquiries and changes to the master price records and operator authorization control records. Setting the time and date for the internal clock. Running the customer checkouts in training mode. Printing of messages received from the host mainframe Entry of messages to send to the host mainframe Reporting of customer stock returns Updating the system with data received from the mainframe Preparing shelf Labels Basic features include: Each loop attaches up to 63 or 64 terminals depending on traffic volumes and desired response times Has an error and operator panel. There were many models including: A25 Has a 5 MB internal disk. Has 60K of memory expandable to 76KB. Supports one store loop. Attaches to 3275, 3653 and 3663. Announced 19 May 1978, withdrawn 19 February 1981 B25 Same as a A25 with a 9.2 MB internal disk. Announced 19 May 1978 C25 Announced 15 May 1981, withdrawn 15 December 1987 A50 Has a 5 MB internal disk. Announced 5 May 1975. Announced 10 August 1973, withdrawn 15 December 1987 B50 Same as B50 with a 9.2 MB internal disk. Announced 5 May 1975, withdrawn 15 December 1987 A60 Has a 5 MB internal disk. Has an integrated 3669. Attaches up to 24 3663 terminals. Announced 11 October 1973, withdrawn 15 December 1987 B60 Same as A60 with a 9.3 MB internal disk. Announced 17 November 1975, withdrawn 15 December 1987 A75 Has 5 MB internal disk. Has 60K of memory expandable to 124KB. Supports one to three store loops. Attaches to 3275, 3653, 3657, 3784 and 3663 terminals. Announced 19 May 1978 B75 Same as A75 with 9.3 MB internal disk. Announced 19 May 1978, withdrawn 15 December 1987 C75 Same as A75 with 18.6 MB internal disk. Announced 19 May 1978, withdrawn 15 December 1987 D75 Same as A75 with 27.9 MB internal disk. Announced 19 May 1978, withdrawn 15 December 1987 There were also two additional models that could be used instead of the 3651: 7480 Model 1: Has a 18.6 MB internal disk 7480 Model 2: Has a 27.9 MB internal disk ==== 3872 Modem ==== Used to attach to a 3659 for remote loops. Each 3872 can attach three 3659s. ==== 3659 Remote Communication Unit ==== Connected to an IBM 3872 and provides a remote loop for up to 64 point of sale terminals. Announced 10 August 1973, withdrawn 15 December 1987 (Model 2, announced 17 March 1976, withdrawn 20 December 1982) Intended to be used in a back office location like the store manager's office or the data entry office ==== 3275-3 Display Station ==== It is a loop attached display terminal with printer attachment hardware ==== 3784 Line Printer ==== A belt printer for higher-volume end-of-day reporting. The maximum print speed is 155 Ipm using a 48 character set. ==== 3657 Ticket Unit ==== Used to print tickets and encoded labels to attach to store merchandise. It is a loop attached device. It prints the following: 1" by 1" adhesive backed labels with up to 11 characters at 500 tickets per minute. IBM sold these in rolls of 9000 1" x 2" tickets with up to 42 encoded characters and two lines of print of up to 21 characters at 250 tickets per minute. IBM sold these in rolls of 2800 1" x 3" tickets with up to 79 encoded characters and two lines of print of up to 32 characters at 167 tickets per minute. IBM sold these in rolls of 1900 It can also batch read the tickets for validation, separating good tickets from bad ones into two cartridges. Announced 10 August 1973, withdrawn 15 December 1987 ==== 7481 Data Storage Unit ==== This optional unit is used to record transaction data and initialize terminals if the store controller is not available. It uses a built in tape drive to store this data. === Early deployments === The first customer installation of a 3650 was at a Dillard's department store in Little Rock, Arkansas, in late 1974. They placed arou

    Read more →
  • Digital goods

    Digital goods

    Digital goods or e-goods are intangible goods that exist in digital form. Examples are Wikipedia articles; digital media, such as e-books, downloadable music, internet radio, internet television and streaming media; fonts, logos, photos and graphics; digital subscriptions; online ads (as purchased by the advertiser); internet coupons; electronic tickets; electronically treated documentation in many different fields; downloadable software (Digital Distribution) and mobile apps; cloud-based applications and online games; virtual goods used within the virtual economies of online games and communities; community access; workbooks; worksheets; planners; e-learning (online courses); webinars, video tutorials, blog posts; cards; patterns; website themes and templates. == Legal concerns about digital goods == Special legal concerns regarding digital goods include copyright infringement and taxation. Also the question of the ownership (versus licensed use or service only) of purely digital goods is not finally resolved. For instance, the software installers of the digital software distributor gog.com are technically independent to the account but are still subject to the EULA, where a "licensed, not sold" formulation is used. Therefore, it is not clear if the software can be legally used after a hypothetical loss of the account; a question which was also raised before in practice for the similar service Steam. In July 2012, the European Court of Justice ruled in the case UsedSoft GMbH v. Oracle International Corp. that the sale of a software product, either through a physical support or download, constituted a transfer of ownership in EU law, thus the first sale doctrine applies; the ruling thereby breaks the "licensed, not sold" legal theory, but leaves open numerous questions. Therefore, it is also permissible to resell software licenses even if the digital good has been downloaded directly from the Internet, as the first-sale doctrine applied whenever software was originally sold to a customer for an unlimited amount of time, thus prohibiting any software maker from preventing the resale of their software by any of their legitimate owners. The court requires that the previous owner must no longer be able to use the licensed software after the resale, but finds that the practical difficulties in enforcing this clause should not be an obstacle to authorizing resale, as they are also present for software which can be installed from physical supports, where the first-sale doctrine is in force. In several cases, content providers have faced criticism for revoking access to digital goods due to expired licenses or the discontinuation of a product, such as ebooks (which resulted in a lawsuit against Amazon.com, Inc.), digital video (with Sony Interactive Entertainment revoking access to purchased StudioCanal content from its now-defunct PlayStation video store; a similar move involving Warner Bros. Discovery content was averted by an updated license agreement), and video games (such as Ubisoft discontinuing and revoking access to its game The Crew without providing refunds or the ability to redownload the game) In September 2024, the U.S. state of California implemented a consumer protection law that prohibits the use of terms such as "buy" or "purchase" during transactions involving digital goods if there is no way to obtain the purchases in a manner that cannot be revoked by the seller (such as allowing it to be downloaded for permanent, offline access), and requires a disclaimer to be displayed to the customer at the time of purchase.

    Read more →
  • Svelte

    Svelte

    Svelte is a free and open-source component-based front-end software framework and language created by Rich Harris and maintained by the Svelte core team. Svelte is not a monolithic JavaScript library imported by applications: instead, Svelte compiles HTML templates to specialized code that manipulates the DOM directly, which may reduce the size of transferred files and give better client performance. Application code is also processed by the compiler, inserting calls to automatically recompute data and re-render UI elements when the data they depend on is modified. This also avoids the overhead associated with runtime intermediate representations, such as virtual DOM, unlike traditional frameworks (such as React and Vue) which carry out the bulk of their work at runtime, i.e. in the browser. The compiler itself is written in JavaScript. Its source code is licensed under MIT License and hosted on GitHub. Among comparable frontend libraries, Svelte has one of the smallest bundle footprints at merely 2KB. == History == The predecessor of Svelte is Ractive.js, which Rich Harris created in 2013. Version 1 of Svelte was written in JavaScript and was released on 29 November 2016. The name Svelte was chosen by Rich Harris and his coworkers at The Guardian. Version 2 of Svelte was released on 19 April 2018. It set out to correct what the maintainers viewed as mistakes in the earlier version such as replacing double curly braces with single curly braces. Version 3 of Svelte was written in TypeScript and was released on 21 April 2019. It rethought reactivity by using the compiler to instrument assignments behind the scenes. The SvelteKit web framework was announced in October 2020 and entered beta in March 2021. SvelteKit 1.0 was released in December 2022 after two years in development. Version 4 of Svelte was released on 22 June 2023. It was a maintenance release, smaller and faster than version 3. A part of this release was an internal rewrite from TypeScript back to JavaScript, with JSDoc type annotations. This was met with confusion from the developer community, which was addressed by the creator of Svelte, Rich Harris. Version 5 of Svelte was released on October 19, 2024 at Svelte Summit Fall 2024 with Rich Harris cutting the release live while joined by other Svelte maintainers. Svelte 5 was a ground-up rewrite of Svelte, changing core concepts such as reactivity and reusability. Its primary feature, runes, reworked how reactive state is declared and used. Runes are function-like macros that are used to declare a reactive state, or code that uses reactive states. These runes are used by the compiler to indicate values that may change and are depended on by other states or the DOM. Svelte 5 also introduces Snippets, which are reusable "snippets" of code that are defined once and can be reused anywhere else in the component. Svelte 5 was initially met with controversy due to its many changes, and thus deprecations caused primarily by runes. However, most of this has subsided since the initial announcement of runes, and the further refining of Svelte 5. Also at Svelte Summit Fall 2024, Ben McCann announced the Svelte CLI under the sv package name on npm. In early 2025, the Svelte team announced Asynchronous Svelte, an experimental feature set centered around asynchronous reactivity in Svelte using await expressions. As of August 2025, the feature is available via an experimental compiler option. This coincided with the experimental release of remote functions, an RPC feature in SvelteKit, Svelte's metaframework. Key early contributors to Svelte became involved with Conduitry joining with the release of Svelte 1, Tan Li Hau joining in 2019, and Ben McCann joining in 2020. Rich Harris and Simon Holthausen joined Vercel to work on Svelte fulltime in 2022. Dominic Gannaway joined Vercel from the React core team to work on Svelte fulltime in 2023. == Syntax == Svelte applications and components are defined in .svelte files, which are HTML files extended with templating syntax that is based on JavaScript and is similar to JSX. Svelte's core features are accessed through runes, which syntactically look like functions, but are used as macros by the compiler. These runes include: The $state rune, used for declaring a reactive state value The $derived rune, used for declaring reactive state derived from one or more states The $effect rune, used for declaring code that reruns whenever its dependencies change Starting with Svelte 5, the framework introduced a significant reactivity overhaul that replaces the previous `$:` reactive declarations with new runes such as $state, $derived, and $effect. The $effect rune is now used for post-render operations without modifying state, while $derived is used for computations that depend on other reactive values. This change aims to simplify the mental model of reactivity and make component logic more explicit. Additionally, the { JavaScript code } syntax can be used for templating in HTML elements and components, similar to template literals in JavaScript. This syntax can also be used in element attributes for uses such as two-way data binding, event listeners, and CSS styling. A Todo List example made in Svelte is below: == Associated projects == The Svelte maintainers created SvelteKit as the official way to build projects with Svelte. It is a Next.js/Nuxt-style full-stack framework that dramatically reduces the amount of code that gets sent to the browser. The maintainers had previously created Sapper, which was the predecessor of SvelteKit. The Svelte maintainers also maintain a number of integrations for popular software projects under the Svelte organization including integrations for Vite, Rollup, Webpack, TypeScript, VS Code, Chrome Developer Tools, ESLint, and Prettier. A number of external projects such as Storybook have also created integrations with Svelte and SvelteKit. == Influence == Vue.js modeled its API and single-file components after Ractive.js, the predecessor of Svelte. == Adoption == Svelte is widely praised by developers. Taking the top ranking in multiple large scale developer surveys, it was chosen as the Stack Overflow 2021 most loved web framework and 2020 State of JS frontend framework with the most satisfied developers. Recent surveys continue to show Svelte's strong developer satisfaction, with the 2024 State of JS survey maintaining its position among the most praised frontend frameworks. The 2024 Stack Overflow Developer Survey reported that 73% of developers who used Svelte want to continue working with it, and noted that Stack Overflow's own team used Svelte for building their 2024 Developer Survey results site. Svelte has been adopted by a number of high-profile web companies including The New York Times, Google, Apple, Spotify, Radio France, Square, Yahoo, ByteDance, Rakuten, Bloomberg, Reuters, Ikea, Facebook, Logitech, and Brave. A community group of primarily non-maintainers, known as the Svelte Society, run the Svelte Summit conference, write a Svelte newsletter, host a Svelte podcast, and host a directory of Svelte tooling, components, and templates.

    Read more →
  • Feature engineering

    Feature engineering

    Feature engineering is a preprocessing step in supervised machine learning and statistical modeling which transforms raw data into a more effective set of inputs. Each input comprises several attributes, known as features. By providing models with relevant information, feature engineering significantly enhances their predictive accuracy and decision-making capability. Beyond machine learning, the principles of feature engineering are applied in various scientific fields, including physics. For example, physicists construct dimensionless numbers such as the Reynolds number in fluid dynamics, the Nusselt number in heat transfer, and the Archimedes number in sedimentation. They also develop first approximations of solutions, such as analytical solutions for the strength of materials in mechanics. == Clustering == One of the applications of feature engineering has been clustering of feature-objects or sample-objects in a dataset. Especially, feature engineering based on matrix decomposition has been extensively used for data clustering under non-negativity constraints on the feature coefficients. These include Non-Negative Matrix Factorization (NMF), Non-Negative Matrix-Tri Factorization (NMTF), Non-Negative Tensor Decomposition/Factorization (NTF/NTD), etc. The non-negativity constraints on coefficients of the feature vectors mined by the above-stated algorithms yields a part-based representation, and different factor matrices exhibit natural clustering properties. Several extensions of the above-stated feature engineering methods have been reported in literature, including orthogonality-constrained factorization for hard clustering, and manifold learning to overcome inherent issues with these algorithms. Other classes of feature engineering algorithms include leveraging a common hidden structure across multiple inter-related datasets to obtain a consensus (common) clustering scheme. An example is Multi-view Classification based on Consensus Matrix Decomposition (MCMD), which mines a common clustering scheme across multiple datasets. MCMD is designed to output two types of class labels (scale-variant and scale-invariant clustering), and: is computationally robust to missing information, can obtain shape- and scale-based outliers, and can handle high-dimensional data effectively. Coupled matrix and tensor decompositions are popular in multi-view feature engineering. == Predictive modelling == Feature engineering in machine learning and statistical modeling involves selecting, creating, transforming, and extracting data features. Key components include feature creation from existing data, transforming and imputing missing or invalid features, reducing data dimensionality through methods like Principal Components Analysis (PCA), Independent Component Analysis (ICA), and Linear Discriminant Analysis (LDA), and selecting the most relevant features for model training based on importance scores and correlation matrices. Features vary in significance. Even relatively insignificant features may contribute to a model. Feature selection can reduce the number of features to prevent a model from becoming too specific to the training data set (overfitting). Feature explosion occurs when the number of identified features is too large for effective model estimation or optimization. Common causes include: Feature templates - implementing feature templates instead of coding new features Feature combinations - combinations that cannot be represented by a linear system Feature explosion can be limited via techniques such as regularization, kernel methods, and feature selection. == Automation == Automation of feature engineering is a research topic that dates back to the 1990s. Machine learning software that incorporates automated feature engineering has been commercially available since 2016. Related academic literature can be roughly separated into two types: Multi-relational Decision Tree Learning (MRDTL) uses a supervised algorithm that is similar to a decision tree. Deep Feature Synthesis uses simpler methods. === Multi-relational Decision Tree Learning (MRDTL) === Multi-relational Decision Tree Learning (MRDTL) extends traditional decision tree methods to relational databases, handling complex data relationships across tables. It innovatively uses selection graphs as decision nodes, refined systematically until a specific termination criterion is reached. Most MRDTL studies base implementations on relational databases, which results in many redundant operations. These redundancies can be reduced by using techniques such as tuple id propagation. === Open-source implementations === There are a number of open-source libraries and tools that automate feature engineering on relational data and time series: featuretools is a Python library for transforming time series and relational data into feature matrices for machine learning. MCMD: An open-source feature engineering algorithm for joint clustering of multiple datasets. OneBM or One-Button Machine combines feature transformations and feature selection on relational data with feature selection techniques. OneBM helps data scientists reduce data exploration time allowing them to try and error many ideas in short time. On the other hand, it enables non-experts, who are not familiar with data science, to quickly extract value from their data with a little effort, time, and cost. getML community is an open source tool for automated feature engineering on time series and relational data. It is implemented in C/C++ with a Python interface. It has been shown to be at least 60 times faster than tsflex, tsfresh, tsfel, featuretools or kats. tsfresh is a Python library for feature extraction on time series data. It evaluates the quality of the features using hypothesis testing. tsflex is an open source Python library for extracting features from time series data. Despite being 100% written in Python, it has been shown to be faster and more memory efficient than tsfresh, seglearn or tsfel. seglearn is an extension for multivariate, sequential time series data to the scikit-learn Python library. tsfel is a Python package for feature extraction on time series data. kats is a Python toolkit for analyzing time series data. === Deep feature synthesis === The deep feature synthesis (DFS) algorithm beat 615 of 906 human teams in a competition. == Feature stores == The feature store is where the features are stored and organized for the explicit purpose of being used to either train models (by data scientists) or make predictions (by applications that have a trained model). It is a central location where you can either create or update groups of features created from multiple different data sources, or create and update new datasets from those feature groups for training models or for use in applications that do not want to compute the features but just retrieve them when it needs them to make predictions. A feature store includes the ability to store code used to generate features, apply the code to raw data, and serve those features to models upon request. Useful capabilities include feature versioning and policies governing the circumstances under which features can be used. Feature stores can be standalone software tools or built into machine learning platforms. == Alternatives == Feature engineering can be a time-consuming and error-prone process, as it requires domain expertise and often involves trial and error. Deep learning algorithms may be used to process a large raw dataset without having to resort to feature engineering. However, deep learning algorithms still require careful preprocessing and cleaning of the input data. In addition, choosing the right architecture, hyperparameters, and optimization algorithm for a deep neural network can be a challenging and iterative process.

    Read more →
  • Content creation

    Content creation

    Content creation is the act of making and sharing media content, particularly in digital contexts. A content creator is the person or studio behind such content. According to Dictionary.com, content refers to "something that is to be expressed through some medium, as speech, writing or any of various arts" for self-expression, distribution, marketing and/or publication. Content creation encompasses various activities, including maintaining and updating web sites, blogging, article writing, photography, videography, online commentary, social media accounts, and editing and distribution of digital media. In a survey conducted by the Pew Research Center, the content thus created was defined as "the material people contribute to the online world". In addition to traditional forms of content creation, digital platforms face growing challenges related to privacy, copyright, misinformation, platform moderation policies, and the repercussions of violating community guidelines. == Content creators == Content creation is the process of producing and sharing various forms of content such as text, images, audio, and video, designed to engage and inform a specific audience. It plays a crucial role in digital marketing, branding, and online communication and brand awareness. Content can be created for a range of platforms, including social media, websites, blogs, and multimedia channels. Whether it's through written articles, compelling photography, or engaging videos, content creation helps businesses build a connection with their audience, increase visibility, and drive traffic. The process typically involves identifying the target audience, brainstorming ideas, creating the content, and distributing it across various channels. Successful content creation combines creativity with strategic planning, considering audience preferences, trends, and platform characteristics to achieve marketing and branding goals. === News organizations === News organizations, especially those with a large and global reach like The New York Times, NPR, and CNN, consistently create some of the most shared content on the Web, especially in relation to current events. In the words of a 2011 report from the Oxford School for the Study of Journalism and the Reuters Institute for the Study of Journalism, "Mainstream media is the lifeblood of topical social media conversations in the UK." While the rise of digital media has disrupted traditional news outlets, many have adapted and have begun to produce content that is designed to function on the web and be shared on social media. The social media site Twitter is a major distributor and aggregator of breaking news from various sources, and the function and value of Twitter in the distribution of news is a frequent topic of discussion and research in journalism. User-generated content, social media blogging and citizen journalism have changed the nature of news content in recent years. The company Narrative Science is now using artificial intelligence to produce news articles and interpret data. === Colleges, universities, and think tanks === Academic institutions, such as colleges and universities, create content in the form of books, journal articles, white papers, and some forms of digital scholarship, such as blogs that are group edited by academics, class wikis, or video lectures that support a massive open online course (MOOC). Through an open data initiative, institutions may make raw data supporting their experiments or conclusions available on the Web. Academic content may be gathered and made accessible to other academics or the public through publications, databases, libraries, and digital libraries. Academic content may be closed source or open access (OA). Closed-source content is only available to authorized users or subscribers. For example, an important journal or a scholarly database may be a closed source, available only to students and faculty through the institution's library. Open-access articles are open to the public, with the publication and distribution costs shouldered by the institution publishing the content. === Companies === Corporate content includes advertising and public relations content, as well as other types of content produced for profit, including white papers and sponsored research. Advertising can also include auto-generated content, with blocks of content generated by programs or bots for search engine optimization. Companies also create annual reports which are part of their company's workings and a detailed review of their financial year. This gives the stakeholders of the company insight into the company's current and future prospects and direction. === Artists and writers === Cultural works, like music, movies, literature, and art, are also major forms of content. Examples include traditionally published books and e-books as well as self-published books, digital art, fanfiction, and fan art. Independent artists, including authors and musicians, have found commercial success by making their work available on the Internet. === Government === Through digitization, sunshine laws, open records laws and data collection, governments may make statistical, legal or regulatory information available on the Internet. National libraries and state archives turn historical documents, public records, and unique relics into online databases and exhibits. This has raised significant privacy issues. In 2012, The Journal News, a New York state paper, sparked an outcry when it published an interactive map of the state's gun owner locations using legally obtained public records. Governments also create online or digital propaganda or misinformation to support domestic and international goals. This can include astroturfing, or using media to create a false impression of mainstream belief or opinion. Governments can also use open content, such as public records and open data, in service of public health, educational and scientific goals, such as crowdsourcing solutions to complex policy problems. In 2013, the National Aeronautics and Space Administration (NASA) joined the asteroid mining company Planetary Resources to crowdsource the hunt for near-Earth objects. Describing NASA's crowdsourcing work in an interview, technology transfer executive David Locke spoke of the "untapped cognitive surplus that exists in the world" which could be used to help develop NASA technology. In addition to making governments more participatory, open records and open data have the potential to make governments more transparent and less corrupt. === Users === The introduction of Web 2.0 made it possible for content consumers to be more involved in the generation and sharing of content. With the advent of digital media, the amount of user generated content, as well as the age and class range of users, has increased. 8% of Internet users are very active in content creation and consumption. Worldwide, about one in four Internet users are significant content creators, and users in emerging markets lead the world in engagement. Research has also found that young adults of a higher socioeconomic background tend to create more content than those from lower socioeconomic backgrounds. 69% of American and European internet users are "spectators", who consume—but do not create—online and digital media. The ratio of content creators to the amount of content they generate is sometimes referred to as the 1% rule, a rule of thumb that suggests that only 1% of a forum's users create nearly all of its content. Motivations for creating new content may include the desire to gain new knowledge, the possibility of publicity, or simple altruism. Users may also create new content in order to bring about social reforms. However, researchers caution that in order to be effective, context must be considered, a diverse array of people must be included, and all users must participate throughout the process. According to a 2011 study, minorities create content in order to connect with their communities online. African-American users have been found to create content as a means of self-expression that was not previously available. Media portrayals of minorities are sometimes inaccurate and stereotypical which affects the general perception of these minorities. African-Americans respond to their portrayals digitally through the use of social media such as Twitter and Tumblr. The creation of Black Twitter has allowed a community to share their problems and ideas. ==== Teens ==== Younger users now have greater access to content, content creating applications, and the ability to publish to different types of media, such as Facebook, Blogger, Instagram, DeviantArt, or Tumblr. As of 2005, around 21 million teens used the internet and 57%, or 12 million teens, consider themselves content creators. This proportion of media creation and sharing is higher than that of adults. With the advent of the Internet, teens have had more access to tools for sharing an

    Read more →
  • Bootstrap (front-end framework)

    Bootstrap (front-end framework)

    Bootstrap (formerly Twitter Bootstrap) is a free and open-source CSS framework directed at responsive, mobile-first front-end web development. It contains HTML, CSS and (optionally) JavaScript-based design templates for typography, forms, buttons, navigation, and other interface components. As of May 2023, Bootstrap is the 17th most starred project (4th most starred library) on GitHub, with over 164,000 stars. According to W3Techs, Bootstrap is used by 19.2% of all websites. == Features == Bootstrap is an HTML, CSS and JS library that focuses on simplifying the development of informative web pages (as opposed to web applications). The primary purpose of adding it to a web project is to apply Bootstrap's choices of color, size, font and layout to that project. As such, the primary factor is whether the developers in charge find those choices to their liking. Once added to a project, Bootstrap provides basic style definitions for all HTML elements. The result is a uniform appearance for prose, tables and form elements across web browsers. In addition, developers can take advantage of CSS classes defined in Bootstrap to further customize the appearance of their contents. For example, Bootstrap has provisioned for light- and dark-colored tables, page headings, more prominent pull quotes, and text with a highlight. Bootstrap also comes with several JavaScript components which do not require other libraries like jQuery. They provide additional user interface elements such as dialog boxes, tooltips, progress bars, navigation drop-downs, and carousels. Each Bootstrap component consists of an HTML structure, CSS declarations, and in some cases accompanying JavaScript code. They also extend the functionality of some existing interface elements, including for example an auto-complete function for input fields. The most prominent components of Bootstrap are its layout components, as they affect an entire web page. The basic layout component is called "Container", as every other element in the page is placed in it. Developers can choose between a fixed-width container and a fluid-width container. While the latter always fills the width with the web page, the former uses one of the five predefined fixed widths, depending on the size of the screen showing the page: Smaller than 576 pixels 576–768 pixels 768–992 pixels 992–1200 pixels 1200–1400 pixels Larger than 1400 pixels Once a container is in place, other Bootstrap layout components implement a CSS Flexbox layout through defining rows and columns. A precompiled version of Bootstrap is available in the form of one CSS file and three JavaScript files that can be readily added to any project. The raw form of Bootstrap, however, enables developers to implement further customization and size optimizations. This raw form is modular, meaning that the developer can remove unneeded components, apply a theme and modify the uncompiled Sass files. == History == === Early beginnings === Bootstrap, originally named Twitter Blueprint, was developed by Mark Otto and Jacob Thornton at Twitter in 2010 as a framework to encourage consistency across internal tools. Before Bootstrap, various libraries were used for interface development, which led to inconsistencies and a high maintenance burden. According to Otto: A super small group of developers and I got together to design and build a new internal tool and saw an opportunity to do something more. Through that process, we saw ourselves build something much more substantial than another internal tool. Months later, we ended up with an early version of Bootstrap as a way to document and share common design patterns and assets within the company. After a few months of development by a small group, many developers at Twitter began to contribute to the project as a part of Hack Week, a hackathon-style week for the Twitter development team. It was renamed from Twitter Blueprint to Twitter Bootstrap and released as an open-source project on August 19, 2011. It has continued to be maintained by Otto, Thornton, a small group of core developers, and a large community of contributors. === Bootstrap 2 === On January 31, 2012, Bootstrap 2 was released, which added built-in support for Glyphicons, several new components, as well as changes to many of the existing components. This version supports responsive web design, meaning the layout of web pages adjusts dynamically, taking into account the characteristics of the device used (whether desktop, tablet, mobile phone). Shortly before the release of Bootstrap 2.1.2, Otto and Thornton left Twitter, but committed to continue to work on Bootstrap as an independent project. === Bootstrap 3 === On August 19, 2013, Bootstrap 3 was released. It redesigned components to use flat design and a mobile first approach. Bootstrap 3 features new plugin system with namespaced events. Bootstrap 3 dropped Internet Explorer 7 and Firefox 3.6 support, but there is an optional polyfill for these browsers. Bootstrap 3 was also the first version released under the twbs organization on GitHub instead of the Twitter one. === Bootstrap 4 === Otto announced Bootstrap 4 on October 29, 2014. The first alpha version of Bootstrap 4 was released on August 19, 2015. The first beta version was released on August 10, 2017. Otto suspended work on Bootstrap 3 on September 6, 2016, to free up time to work on Bootstrap 4. Bootstrap 4 was finalized on January 18, 2018. Significant changes include: Major rewrite of the code Replacing Less with Sass Addition of Reboot, a collection of element-specific CSS changes in a single file, based on Normalize Dropping support for IE8, IE9, and iOS 6 CSS Flexible Box support Adding navigation customization options Adding responsive spacing and sizing utilities Switching from the pixels unit in CSS to root ems Increasing global font size from 14px to 16px for enhanced readability Dropping the panel, thumbnail, pager, and well components Dropping the Glyphicons icon font Huge number of utility classes Improved form styling, buttons, drop-down menus, media objects and image classes Bootstrap 4 supports the latest versions of Google Chrome, Firefox, Internet Explorer, Opera, and Safari (except on Windows). It additionally supports back to IE10 and the latest Firefox Extended Support Release (ESR). === Bootstrap 5 === Bootstrap 5 was officially released on May 5, 2021. Major changes include: New offcanvas menu component Removing dependence on jQuery in favor of vanilla JavaScript Rewriting the grid to support responsive gutters and columns placed outside of rows Migrating the documentation from Jekyll to Hugo Dropping support for Internet Explorer Moving testing infrastructure from QUnit to Jasmine Adding custom set of SVG icons Adding CSS custom properties Improved API Enhanced grid system Improved customizing docs Updated forms RTL support Built in darkmode support

    Read more →