AI Chatbot Creator

AI Chatbot Creator — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

Cowrie (honeypot)

Cowrie is a medium interaction SSH and Telnet honeypot designed to log brute force attacks and shell interaction performed by an attacker. Cowrie also functions as an SSH and telnet proxy to observe attacker behavior to another system. Cowrie was developed from Kippo. == Reception == Cowrie has been referenced in published papers. The Book "Hands-On Ethical Hacking and Network Defense" includes Cowrie in a list of 5 commercial honeypots. === Prior uses === Discussing a honeypot effort called the Project Heisenberg Cloud by Rapid7, Bob Rudis, the company's chief data scientist, told eWEEK, "There are custom Rapid7-developed low- and medium-interaction honeypots used within the framework, along with open-source ones, such as Cowrie." Doug Rickert has experimented with the open-source Cowrie SSH honeypot and wrote about it on Medium. Putting up a simple honeypot isn't difficult, and there are many open-source products besides Cowrie, including the original Honeyd to MongoDB and NoSQL honeypots, to ones that emulate web servers. Some appear to be SCADA or other more advanced applications. === Best practices === Researchers at the SysAdmin, Audit, Network and Security (SANS) institute urged administrators and security researchers to run the latest version of Cowrie on a honeypot to monitor shifts in the type of passwords being scanned for and pattern of attacks on IoT devices. === Discussion and further resources === Attack Detection and Forensics Using Honeypot in an IoT Environment calls Cowrie a "medium interaction honeypot" and describes results from using it for 40 days to capture "all communicated sessions in log files." The book Advances on Data Science also devotes chapter two to "Cowrie Honeypot Dataset and Logging." ICCWS 2018 13th International Conference on Cyber Warfare and Security describes using Cowrie. On the Move to Meaningful Internet Systems: OTM 2019 Conferences includes details of using Cowrie. Splunk, a security tool that can receive information from honeypots, outlines how to set up a honeypot using the open-source Cowrie package.
Read more →
Influencer speak

Influencer speak is a speech pattern commonly associated with English-speaking digital content creators, particularly on platforms such as TikTok. This style is characterized by linguistic features such as uptalk, where intonation rises at the end of declarative sentences, and vocal fry, a low, creaky vibration in speech. These features are often used to engage audiences. == Characteristics == Influencer speak is commonly associated with: Uptalk – a rising intonation at the end of statements Vocal fry – a creaky sound often occurring at the end of sentences Use of filler words and slang – contributes to a conversational tone that resonates with audiences == Origins == The origins of "influencer speak" are linked to the "Valley Girl" accent, which became prominent in the 1980s. This earlier style included features such as uptalk and vocal fry, which have been adapted for digital platforms. Linguists have noted that these patterns are often led by young women, who are recognized as linguistic innovators in sociolinguistic research. == Sociolinguistic significance == "Influencer speak" is used to maintain audience engagement. Features such as uptalk help speakers retain the "conversational floor," ensuring continuous attention from listeners. A study conducted by UCLA researchers has shown that creators adjust their speech styles based on the platform and audience. For example, a comedic tone may be emphasized on TikTok, while a more professional tone may be used on platforms such as LinkedIn or YouTube.
Read more →
Hardware-based encryption

Hardware-based encryption is the use of computer hardware to assist software, or sometimes replace software, in the process of data encryption. Typically, this is implemented as part of the processor's instruction set. For example, the AES encryption algorithm (a modern cipher) can be implemented using the AES instruction set on the ubiquitous x86 architecture. Such instructions also exist on the ARM architecture. However, more unusual systems exist where the cryptography module is separate from the central processor, instead being implemented as a coprocessor, in particular a secure cryptoprocessor or cryptographic accelerator, of which an example is the IBM 4758, or its successor, the IBM 4764. Hardware implementations can be faster and less prone to exploitation than traditional software implementations, and furthermore can be protected against tampering. == History == Prior to the use of computer hardware, cryptography could be performed through various mechanical or electro-mechanical means. An early example is the Scytale used by the Spartans. The Enigma machine was an electro-mechanical system cipher machine notably used by the Germans in World War II. After World War II, purely electronic systems were developed. In 1987 the ABYSS (A Basic Yorktown Security System) project was initiated. The aim of this project was to protect against software piracy. However, the application of computers to cryptography in general dates back to the 1940s and Bletchley Park, where the Colossus computer was used to break the encryption used by German High Command during World War II. The use of computers to encrypt, however, came later. In particular, until the development of the integrated circuit, of which the first was produced in 1960, computers were impractical for encryption, since, in comparison to the portable form factor of the Enigma machine, computers of the era took the space of an entire building. It was only with the development of the microcomputer that computer encryption became feasible, outside of niche applications. The development of the World Wide Web lead to the need for consumers to have access to encryption, as online shopping became prevalent. The key concerns for consumers were security and speed. This led to the eventual inclusion of the key algorithms into processors as a way of both increasing speed and security. == Implementations == === In the instruction set === ==== x86 ==== The X86 architecture, as a CISC (Complex Instruction Set Computer) Architecture, typically implements complex algorithms in hardware. Cryptographic algorithms are no exception. The x86 architecture implements significant components of the AES (Advanced Encryption Standard) algorithm, which can be used by the NSA for Top Secret information. The architecture also includes support for the SHA Hashing Algorithms through the Intel SHA extensions. Whereas AES is a cipher, which is useful for encrypting documents, hashing is used for verification, such as of passwords (see PBKDF2). ==== ARM ==== ARM processors can optionally support Security Extensions. Although ARM is a RISC (Reduced Instruction Set Computer) architecture, there are several optional extensions specified by ARM Holdings. === As a coprocessor === IBM 4758 – The predecessor to the IBM 4764. This includes its own specialised processor, memory and a Random Number Generator. IBM 4764 and IBM 4765, identical except for the connection used. The former uses PCI-X, while the latter uses PCI-e. Both are peripheral devices that plug into the motherboard. === Proliferation === Advanced Micro Devices (AMD) processors are also x86 devices, and have supported the AES instructions since the 2011 Bulldozer processor iteration. Due to the existence of encryption instructions on modern processors provided by both Intel and AMD, the instructions are present on most modern computers. They also exist on many tablets and smartphones due to their implementation in ARM processors. == Advantages == Implementing cryptography in hardware means that part of the processor is dedicated to the task. This can lead to a large increase in speed. In particular, modern processor architectures that support pipelining can often perform other instructions concurrently with the execution of the encryption instruction. Furthermore, hardware can have methods of protecting data from software. Consequently, even if the operating system is compromised, the data may still be secure (see Software Guard Extensions). == Disadvantages == If, however, the hardware implementation is compromised, major issues arise. Malicious software can retrieve the data from the (supposedly) secure hardware – a large class of method used is the timing attack. This is far more problematic to solve than a software bug, even within the operating system. Microsoft regularly deals with security issues through Windows Update. Similarly, regular security updates are released for Mac OS X and Linux, as well as mobile operating systems like iOS, Android, and Windows Phone. However, hardware is a different issue. Sometimes, the issue will be fixable through updates to the processor's microcode (a low level type of software). However, other issues may only be resolvable through replacing the hardware, or a workaround in the operating system which mitigates the performance benefit of the hardware implementation, such as in the Spectre exploit.
Read more →
Grid network

A grid network is a computer network consisting of a number of computer systems connected in a grid topology. In a regular grid topology, each node in the network is connected with two neighbors along one or more dimensions. If the network is one-dimensional, and the chain of nodes is connected to form a circular loop, the resulting topology is known as a ring. Network systems such as FDDI use two counter-rotating token-passing rings to achieve high reliability and performance. In general, when an n-dimensional grid network is connected circularly in more than one dimension, the resulting network topology is a torus, and the network is called "toroidal". When the number of nodes along each dimension of a toroidal network is 2, the resulting network is called a hypercube. A parallel computing cluster or multi-core processor is often connected in regular interconnection network such as a de Bruijn graph, a hypercube graph, a hypertree network, a fat tree network, a torus, or cube-connected cycles. A grid network is not the same as a grid computer or a computational grid, although the nodes in a grid network are usually computers, and grid computing requires some kind of computer network or "universal coding" to interconnect the computers.
Read more →
How Data Happened

How Data Happened: A History from the Age of Reason to the Age of Algorithms is a 2023 non-fiction book written by Columbia University professors Chris Wiggins and Matthew L. Jones. The book explores the history of data and statistics from the end of the 18th century to the present day. == Content == The book starts at the end of the 18th century, when European states began tabulating physical resources, and ends at the present day, when algorithms manipulate our personal information as a commodity. It looks at the rise of data and statistics, and how early statistical methods were used to justify eugenics, quantify supposed racial differences, and develop military and industrial applications. The authors also discuss the impact of the internet and e-commerce on data collection, the rise of data science, and the consequences of government-run surveillance systems collecting vast amounts of personal data for customized, targeted advertising. They emphasize the importance of privacy and democracy and propose remedies to the problems caused by mass data collection, including stronger regulation of the tech industry and collective action by its employees. The book is a historical analysis that provides context for understanding the debates surrounding data and its control. The book has 336 pages and was published in 2023 by W. W. Norton & Company.
Read more →
Digital media in education

Digital media in education refers to the use of digital technologies to support and enhance teaching and learning processes. This includes the application of multiple digital software applications, devices, and online platforms as tools for learning. Learners interact with these technologies to access, analyze, evaluate, and create media content and communication in various forms. The integration of digital media in education has dramatically increased over time, significantly transforming traditional educational practices. When viewed through a global and inclusive lens, digital education should be guided by principles of equity, inclusion, and public infrastructure to ensure meaningful participation of all learners. == History == === 20th century === Technological advances in the 20th century, particularly the invention of the Internet, laid the foundation for incorporating technology into education. In the early 1900s, the overhead projector and instructional radio broadcasts were among the first technologies used for educational purposes. The introduction of computers in classrooms occurred in 1950, when a flight simulation program was developed to train pilots at the Massachusetts Institute of Technology. However, access to computers remained extremely limited for several decades. In 1964, John Kemeny and Thomas Kurtz developed the BASIC programming language, which simplified computer interaction and introduced time-sharing, enabling multiple users to work on the same system simultaneously. This innovation made computing increasingly accessible for educational settings. By the 1980s, schools began to show more interest in computers as companies released mass-market devices to the public. Networking further enabled the interconnection of computers into unified communication systems, which proved more efficient and cost-effective than previous stand-alone machines. This development prompted wider adoption of computing in educational institutions. The invention of the World Wide Web in 1992 further simplified internet navigation and sparked further interest in educational settings. Initially, computers were integrated into school curricula for tasks such as word processing, spreadsheet creation, and data organization. By the late 1990s, the Internet became a research tool, functioning as a vast library. By 1999, 99% of public school teachers in the United States reported having access to at least one computer in their schools, and 84% had a computer available in their classrooms. The emergence of World Wide Web also contributed to the development of learning management systems (LMS), which allowed educators to create online teaching environments for content storage, student activities, discussions, and assignments. Advances in digital compression and high-speed Internet made video creation and distribution more affordable, fostering the use of the systems designed for recording lectures. These tools were often incorporated into learning management platforms, supporting the expansion of fully online courses. === 21st century === By 2002, the Massachusetts Institute of Technology began offering recorded lectures to the public, marking a significant milestone in the movement toward accessible online education. The launch of YouTube in 2005 further transformed educational content distribution. Educators increasingly uploaded lectures and instructional videos on platforms with initiatives like Khan Academy, which was active in 2006, contributing to You Tube's role as a prominent educational resource. In 2007, Apple launched iTunesU, another platform for sharing educational resources and videos. Meanwhile, learning management systems gained popularity, with Blackboard and Canvas becoming two of the most widely used platforms with Canvas's release in 2008. That same year also marked the introduction of the first Massive Open Online Course (MOOC), which provided open access to webinars and expert-led instructions for global learners. As technology evolved, traditional projectors were gradually replaced by interactive whiteboards, which enabled educators to integrate digital tools more effectively in their classrooms. By 2009, 97% of classrooms in the United States had at least one computer, and 93% had Internet access. The COVID-19 pandemic, which forced schools across the world to close, significantly impacted education with schools shifting to distance education. Students attended classes remotely using devices such as laptops, phones, and tablets, supported by digital platforms that facilitated at-home learning environments. However, adapting assessment methods to the new learning environment posed certain challenges. A study conducted by Eddie M. Mulenga and José M. Marbán on Zambian students during the pandemic revealed difficulties in adapting to digital learning, particularly in subjects like mathematics. Similar issues were reported among students in Romania, where the transition to virtual learning presented significant obstacles in engagement and adaptability. === Post-pandemic developments === In the period following the onset of COVID-19, education systems worldwide rapidly adopted digital solutions to maintain continuity of learning and teaching. By the end of March 2020, all 46 OECD and partners countries closed some or all of their schools nationwide. By June 2020, the length of school closures in these countries ranged from 7 to over 18 weeks. These disruptions in formal education prompted governments and educators to quickly adopt digital learning. This global shift to online education highlighted considerable inequalities in digital access, although many systems struggled with inequitable access, especially in regions lacking devices, stable internet connections, or conducive home learning environments. Stimultaneously, commercial educational technology (ed-tech) companies introduced rapid digital solutions to the disruption caused by the pandemic. This led to what has been described as a "seller's market," where the urgency of implementation may cause the prioritization of availability and scale over pedagogical and equity considerations. In the post-pandemic era, digital media in education continues to evolve. It increasingly intersects with artificial intelligence (AI) technologies such as adaptive learning platforms, AI-enabled content generation, and personalized learning environments. These tools enhance global engagement and access but also raise concerns about infrastructure, inclusivity, ethical implementation as well as critical pedagogies. Scholars recommend that educators and policymakers adopt inclusive practices, prioritize equitable infrastructure, and develop critical digital literacy. Facer and Selwyn also emphasize the need for public digital infrastructure and sustainable and justice-oriented policies that empower all learners. Overall, these perspectives reflect a growing consensus that digital media in education should be implemented critically to promote inclusive, multimodal, and future-oriented learning environments.
Read more →
Filter (social media)

Filters are digital image effects often used on social media. They initially simulated the effects of camera filters, and they have since developed with facial recognition technology and computer-generated augmented reality. Social media filters—especially beauty filters—are often used to alter the appearance of selfies taken on smartphones or other similar devices. While filters are commonly associated with beauty enhancement and feature alterations, there is a wide range of filters that have different functions. From adjusting photo tones to using face animations and interactive elements, users have access to a range of tools. These filters allow users to enhance photos and allow room for creative expression and fun interactions with digital content. == History == Beauty filters originate from Purikura ("print club"), a type of Japanese photographic arcade game machine conceived in 1994 by Sasaki Miho, a female employee at Atlus, and released in 1995 by Atlus and Sega primarily for female visitors at Japanese arcades. They allowed the manipulation of digital selfie photos with kawaii beauty filters similar to later Snapchat filters. Purikura filters included beautifying the image, cat whiskers, bunny ears, writing text, scribbling graffiti, selecting backdrops, borders, insertable decorations, icons, hair extensions, twinkling diamond tiaras, tenderized light effects, and predesigned decorative margins. To capitalize on the Purikura phenomenon in Japan during the late 1990s, Japanese mobile phones began including a front-facing camera, starting with the Kyocera Visual Phone VP‑210 in 1999. The Sanyo SCP-5300 released in 2002 was the first camera phone with filter effects, such as illumination, white‑balance control, sepia, black and white, and negative colors. Purikura-like beauty filters later appeared in smartphone apps such as Instagram and Snapchat in the 2010s. In 2010, Apple introduced the iPhone 4—the first iPhone model with a front-facing camera. It gave rise to a dramatic increase in selfies, which could be touched up with more flattering lighting effects with applications such as Instagram. The American photographer Cole Rise was involved in the creation of the original filters for Instagram around 2010, designing several of them himself, including Sierra, Mayfair, Sutro, Amaro, and Willow. However, the technology for virtual lens filters was invented and patented by Patrick Levy-Rosenthal in 2007. The patent received 100 citations, including Facebook, Nvidia, Microsoft, Samsung, and Snap. In September, 2011, the Instagram 2.0 update for the application introduced "live filters," which allowed the user to preview the effect of the filter while shooting with the application's camera. #NoFilter, a hashtag label to describe an image that had not been filtered, became popular around 2013. An update in 2014 allowed users to adjust the intensity of the filters as well as fine-tune other aspects of the image, features that had been available for years on applications such as VSCO and Litely. In 2014, Snapchat started releasing sponsored filters to monetize the participatory use of the application. In September 2015, Snapchat acquired Looksery and released a feature called "lenses," animated filters using facial recognition technology. Some of the early lenses available on Snapchat at the time were Heart Eyes, Terminator, Puke Rainbows, Old, Scary, Rage Face, Heart Avalanche. The Coachella filter released April 2016 was a popular early augmented reality filter. In April 2017, Facebook released the Camera Effects Platform, which is the first augmented reality platform that allows developers to create their own filters and effects on Facebook's Camera. In December 2017, Snapchat also launched their Lens Studio augmented reality developer tool that allows users and advertisers to do the same on the Snapchat application. In April 2022,TikTok joined the two, and launched their own augmented reality developer platform called Effect house. In February 2023, Effect House gave opened up the access to generative AI tools that allowed creators to change facial features in real time. In November 2023, TikTok released a feature where users no longer needed Effect House to create their own filters, as they are now able to create their own effects on the TikTok application. In August 2024, Meta announced that it would be removing third-party filter effects from its family of apps by January 14, 2025. The AR development software Meta Spark AR will also be retired at the same time; it was at one point the "world's largest mobile AR platform". Brand and creator effects represent the vast majority of filters available on Meta platforms, with over 2 million third-party filters available as of 2021. == Beauty filter == A beauty filter is a filter applied to still photographs, or to video in real time, to enhance the physical attractiveness of the subject. Typical effects of such filters include smoothing skin texture and modifying the proportions of facial features, for example enlarging the eyes or narrowing the nose. Filters may be included as a built-in feature of social media apps such as Instagram or Snapchat, or implemented through standalone applications such as Facetune. In 2020, the "Perfect Skin" filter for Snapchat and Instagram which was created by Brazilian augmented reality developer Brenno Faustino gained more than 36 million impressions in the first 24 hours of its release. In 2021, TikTok users pointed out how the default front-facing camera on the platform automatically applied the retouch and other feature-altering filters. Users noted that these filters slimmed down faces, smoothed skin, whitened teeth, and altered facial features such as nose and eye size, without the option to disable this feature through settings. In March 2023, the "Bold Glamour" filter was released on TikTok and instantly went viral with over 18 million videos created within its first week. This filter subtly enhances the user's facial features seamlessly, giving the illusion of fuller eyebrows, taller cheekbones, enhanced eye make up, a smaller nose, plumper lips, and clearer skin, giving off a natural yet distinct effect. As of May 2024, the filter has been used in over 220 million videos and has become a pivotal moment for beauty filters on digital platforms. Critics have raised concerns that the widespread use of such filters on social media may lead to negative body image, particularly among girls. Though Meta's intention of removing third-party filters will likely see all beauty filters removed, academics feel that the damage of beautifying filters is already done. === Background === The manipulation of photos to enhance attractiveness has long been possible using software such as Adobe Photoshop and, before that, analogue techniques such as airbrushing. However, such tools required considerable technical and artistic skill, and so their use was mostly limited to professional contexts, such as magazines or advertisements. By contrast, filters work in an automated fashion through the use of complex algorithms, requiring little or no input from the user. This ease of use, in combination with the increase in processing power of smartphones, and the rise of social media and selfie culture, have led to photographic manipulation occurring on a much wider scale than ever before. One of the earliest examples of a content-aware digital photographic filter is red-eye reduction. === Effects === Typical changes applied by beauty filters include: Smoothing skin texture; minimizing fine lines and blemishes Erasing under-eye bags Erasing naso-labial lines ("laugh lines") Application of virtual makeup, such as lipstick or eyeshadow Slimming the face; erasing double chins Enlarging the eyes Whitening teeth Narrowing the nose Increasing fullness of the lips Beauty filters most frequently target the face, though in some cases they may affect other body parts. For example, the app "Retouch Me" was reported to have a feature which allows users to superimpose visible abdominal muscles (a "six pack") onto photos featuring the subject's bare stomach. === Reception and psychological effects === Some commentators have expressed concern that beauty filters may create unrealistic beauty standards, particularly among girls, and contribute to rates of body dysmorphic disorder. A correlation has been established between negative body image and the use of beautifying filters, though the direction of causation is unknown. The inability to discern whether a particular image has been filtered is thought to exacerbate their negative psychological effects. Policymakers have advocated for social networks to disclose the use of filters; TikTok, Instagram, and Snapchat all label filtered photos and videos with the name of the filter applied. It has also been noted that beauty filters on social media tend to highlight Eurocentric features, like lighter eyes, a smaller nose, and flushed ch
Read more →
Digital artifactual value

Digital artifactual value, a preservation term, is the intrinsic value of a digital object, rather than the informational content of the object. Though standards are lacking, born-digital objects and digital representations of physical objects may have a value attributed to them as artifacts. == Intrinsic value in analog materials == With respect to analog or non-digital materials, artifacts are determined to have singular research or archival value if they possess qualities and characteristics that make them the only acceptable form for long-term preservation. These qualities and characteristics are commonly referred to as the item's intrinsic value and form the basis upon which digital artifactual value is currently evaluated. Artifactual value based on this idea is predicated upon the artifact's originality, faithfulness, fixity, and stability. The intrinsic value of a particular object, as interpreted by archival professionals, largely determines the selection process for archives. The National Archives and Records Administration Committee on Intrinsic Value in "Intrinsic Value in Archival Material" classified an analog object as having intrinsic value if it possessed one or more of the follow qualities: Physical form that may be the subject for study if the records provide meaningful documentation or significant examples of the form. Aesthetic or artistic quality. Unique or curious physical features. Age that provides a quality of uniqueness. Value for use in exhibits. Questionable authenticity, date, author, or other characteristic that is significant and ascertainable by physical examination. General and substantial public interest because of direct association with famous or historically significant people, places, things, issues or events. Significance as documentation of the establishment or continuing legal basis of an agency or institution. Significance as documentation of the formulation of policy at the highest executive levels when the policy has significance and broad effect throughout or beyond the agency or institution. Other archival professionals such as Lynn Westney have written that the characteristics of materials exhibiting intrinsic value include age, content, usage, particularities of creation, signatures, and attached seals. Westney and others have stated that paper-based artifacts can be thought to have evidentiary value, or significant contextual markings, insofar that the original manifestation of the artifact can attest to the originality, faithfulness or authenticity, fixity, and stability of the content. For other analog materials, properly articulating intrinsic value remains essential for determining artifactual value. Similar to paper-based objects in many respects, artifactual value for images typically takes into account artistic value, age, authorial prestige, significant provenance, and institutional priorities. Analog audio preservation is based upon similar factors, including the cultural value of the item, its historical uniqueness, the estimated longevity of the medium, the current condition of the item, and the state of playback equipment, among other things. == Analog conventions in a digital realm == The standard definition of artifactual value, as it has applied to analog or non-digital materials in the twentieth century, is based upon a set of conventions which do not ordinarily apply to digital objects in toto. The Council on Library and Information Resources (CLIR) has stated that printed texts and other paper-based manuscripts, when considered as objects, are imbued with meaning distilled from a general set of understandings inherent to these conventions: The object is of a fixed and stable composition/form. Authorship and intellectual property are a recognizable concept. Duplication is possible. Fungibility of informational content (or, in other words, the ability to be replaced by another identical object). These conventions are important to consider because they help to describe the physical and even metaphysical relationship between a document's content and its physical manifestation. The underpinnings of this relationship are not identical and do not apply with the same degree of clarity to an immaterial digital realm. The idea of fixity with regard to printed materials, for example, is largely predicated on the notion that an object has been recorded on a relatively stable medium. The physical presence of a print text serves as proof of its authenticity as an object or artifact, as well as its scarcity and uniqueness in relation to other print materials. Variations in the chemical properties and storage conditions of print-based materials, as well as other cultural variables, certainly impact the fixity or stability of print materials, but there is little controversy about determining its fundamental existence or originality. However, uniqueness in the physical, paper-based sense does not translate to a digital realm in which immaterial objects are subject to theoretically infinite levels of reproduction and dissemination. Born-digital and digital surrogates may or may not look any different from each other on a server, and alterations can be made without explicit notice to the user. These alterations are normally called migration events, or actions taken on the digital object that change the original object's composition. They can enact subtle but fundamental alterations to the original document, thereby compromising its existence as an original object. Furthermore, because the tools used to generate and access digital objects have historically evolved quite rapidly, issues of playback obsolescence, incapability, data loss, and broken pathways to information have changed traditional ideas of fixity and stability. Therefore, artifactual value in a digital realm requires a modified set of generalized standards for determining artifactual originality. Michael J. Giarlo and Ronald Jantz, only two of many, have posited a list of methods for establishing digital intrinsic value by way of careful metadata generation and records maintenance. In their report, a digital original possesses three key characteristics that distinguishes it from identical copies. These include continuous verification and re-verification of the document's digital signature starting from the date of creation; retaining versions and recordings of all changes to the object in an audit trail; and having the archival master contain the creation date of the digital object. They also reported that originality in digital sources could be verified or produced by the following techniques: Digital object is given a date-time stamp that's automatically inserted into the METS-XML header upon creation. Date-time is inserted into archival metadata. Encapsulation. Digital signatures. == The role of digital surrogates == Digital surrogates are considered a utility for aiding in the preservation and increased access of certain artifacts. However, digital surrogates can have different utilities for objects depending on the nature of the original artifact and the condition the artifact is in. In 2001 the Council on Library and Information Resources (CLIR) published a report on the artifact in library collections. The CLIR states that the utility of the digital surrogate can be determined by dividing the original material (artifact) into two different categories, artifacts that are rare and those that are not. These two categories can be further divided by two categories, artifacts that are frequently used and those that are not. === Materials that are frequently used and not rare === According to the CLIR "it is not obvious that digital surrogates provide all the functionality, all the information, or all the aesthetic value of originals. Therefore, while it may be sensible to recommend that digital surrogates be used to reduce the cost and increase the availability of library holdings that circulate frequently, the decision to deaccession a physical object in library collections and replace it with a digital surrogate should be based on a careful assessment of the way in which library patrons use the original object or objects of its kind." === Materials that are infrequently used and not rare === Keeping the original is always the best solution for libraries and especially archives but in the case of libraries where an artifact is not rare or used infrequently there must be a barometer that is developed to help "balance functionality with actual use in order to help decide when digital surrogates that provide most of the functionality of originals are acceptable." === Materials that are rare and frequently used === A professional in the field of Library and Information Science (LIS) would almost certainly not argue that a digital surrogate could replace a rare object. However, in the case of a rare object that is falling into poor shape due to heavy use a digital surrogate could be extremely useful in reducing the wear a
Read more →
Referring expression generation

Referring expression generation (REG) is the subtask of natural language generation (NLG) that received most scholarly attention. While NLG is concerned with the conversion of non-linguistic information into natural language, REG focuses only on the creation of referring expressions (noun phrases) that identify specific entities called targets. This task can be split into two sections. The content selection part determines which set of properties distinguish the intended target and the linguistic realization part defines how these properties are translated into natural language. A variety of algorithms have been developed in the NLG community to generate different types of referring expressions. == Types of referring expressions == A referring expression (RE), in linguistics, is any noun phrase, or surrogate for a noun phrase, whose function in discourse is to identify some individual object (thing, being, event...) The technical terminology for identify differs a great deal from one school of linguistics to another. The most widespread term is probably refer, and a thing identified is a referent, as for example in the work of John Lyons. In linguistics, the study of reference relations belongs to pragmatics, the study of language use, though it is also a matter of great interest to philosophers, especially those wishing to understand the nature of knowledge, perception and cognition more generally. Various devices can be used for reference: determiners, pronouns, proper names... Reference relations can be of different kinds; referents can be in a "real" or imaginary world, in discourse itself, and they may be singular, plural, or collective. === Pronouns === The simplest type of referring expressions are pronoun such as he and it. The linguistics and natural language processing communities have developed various models for predicting anaphor referents, such as centering theory, and ideally referring-expression generation would be based on such models. However most NLG systems use much simpler algorithms, for example using a pronoun if the referent was mentioned in the previous sentence (or sentential clause), and no other entity of the same gender was mentioned in this sentence. === Definite noun phrases === There has been a considerable amount of research on generating definite noun phrases, such as the big red book. Much of this builds on the model proposed by Dale and Reiter. This has been extended in various ways, for example Krahmer et al. present a graph-theoretic model of definite NP generation with many nice properties. In recent years a shared-task event has compared different algorithms for definite NP generation, using the TUNA corpus. === Spatial and temporal reference === Recently there has been more research on generating referring expressions for time and space. Such references tend to be imprecise (what is the exact meaning of tonight?), and also to be interpreted in different ways by different people. Hence it may be necessary to explicitly reason about false positive vs false negative tradeoffs, and even calculate the utility of different possible referring expressions in a particular task context. === Criteria for good expressions === Ideally, a good referring expression should satisfy a number of criteria: Referential success: It should unambiguously identify the referent to the reader. Ease of comprehension: The reader should be able to quickly read and understand it. Computational complexity: The generation algorithm should be fast No false inferences: The expression should not confuse or mislead the reader by suggesting false implicatures or other pragmatic inferences. For example, a reader may be confused if he is told Sit by the brown wooden table in a context where there is only one table. == History == === Pre-2000 era === REG goes back to the early days of NLG. One of the first approaches was done by Winograd in 1972 who developed an "incremental" REG algorithm for his SHRDLU program. Afterwards researchers started to model the human abilities to create referring expressions in the 1980s. This new approach to the topic was influenced by the researchers Appelt and Kronfeld who created the programs KAMP and BERTRAND and considered referring expressions as parts of bigger speech acts. Some of their most interesting findings were the fact that referring expressions can be used to add information beyond the identification of the referent as well as the influence of communicative context and the Gricean maxims on referring expressions. Furthermore, its skepticism concerning the naturalness of minimal descriptions made Appelt and Kronfeld's research a foundation of later work on REG. The search for simple, well-defined problems changed the direction of research in the early 1990s. This new approach was led by Dale and Reiter who stressed the identification of the referent as the central goal. Like Appelt they discuss the connection between the Gricean maxims and referring expressions in their culminant paper in which they also propose a formal problem definition. Furthermore, Reiter and Dale discuss the Full Brevity and Greedy Heuristics algorithms as well as their Incremental Algorithm(IA) which became one of the most important algorithms in REG. === Later developments === After 2000 the research began to lift some of the simplifying assumptions, that had been made in early REG research in order to create more simple algorithms. Different research groups concentrated on different limitations creating several expanded algorithms. Often these extend the IA in a single perspective for example in relation to: Reference to Sets like "the t-shirt wearers" or "the green apples and the banana on the left" Relational Descriptions like "the cup on the table" or "the woman who has three children" Context Dependency, Vagueness and Gradeability include statements like "the older man" or "the car on the left" which are often unclear without a context Salience and Generation of Pronouns are highly discourse dependent making for example "she" a reference to "the (most salient) female person" Many simplifying assumptions are still in place or have just begun to be worked on. Also a combination of the different extensions has yet to be done and is called a "non-trivial enterprise" by Krahmer and van Deemter. Another important change after 2000 was the increasing use of empirical studies in order to evaluate algorithms. This development took place due to the emergence of transparent corpora. Although there are still discussions about what the best evaluation metrics are, the use of experimental evaluation has already led to a better comparability of algorithms, a discussion about the goals of REG and more task-oriented research. Furthermore, research has extended its range to related topics such as the choice of Knowledge Representation(KR) Frameworks. In this area the main question, which KR framework is most suitable for the use in REG remains open. The answer to this question depends on how well descriptions can be expressed or found. A lot of the potential of KR frameworks has been left unused so far. Some of the different approaches are the usage of: Graph search which treats relations between targets in the same way as properties. Constraint Satisfaction which allows for a separation between problem specification and the implementation. Modern Knowledge Representation which offers logical inference in for example Description Logic or Conceptual Graphs. == Problem definition == Dale and Reiter (1995) think about referring expressions as distinguishing descriptions. They define: The referent as the entity that should be described The context set as set of salient entities The contrast set or potential distractors as all elements of the context set except the referent A property as a reference to a single attribute–value pair Each entity in the domain can be characterised as a set of attribute–value pairs for example ⟨ {\displaystyle \langle } type, dog ⟩ {\displaystyle \rangle } , ⟨ {\displaystyle \langle } gender, female ⟩ {\displaystyle \rangle } or ⟨ {\displaystyle \langle } age, 10 years ⟩ {\displaystyle \rangle } . The problem then is defined as follows: Let r {\displaystyle r} be the intended referent, and C {\displaystyle C} be the contrast set. Then, a set L {\displaystyle L} of attribute–value pairs will represent a distinguishing description if the following two conditions hold: Every attribute–value pair in L {\displaystyle L} applies to r {\displaystyle r} : that is, every element of L {\displaystyle L} specifies an attribute–value that r {\displaystyle r} possesses. For every member c {\displaystyle c} of C {\displaystyle C} , there is at least one element l {\displaystyle l} of L {\displaystyle L} that does not apply to c {\displaystyle c} : that is, there is an l {\displaystyle l} in L {\displaystyle L} that specifies an attribute–value that c {\displaystyle c} does not possess. l {\displaystyle l} is said
Read more →
Data plan

A data plan is a subscription plan from a cellular or other mobile service provider to provide internet data and connectivity. == Formatting == Data plans are usually created by a contract between the telecommunications carrier and the user of their service. This contract outlines a maximum amount of usable data, usually highlighted in either megabytes or gigabytes, allotted per month for the user. In most cases companies will allow a user to surpass the amount of data allowed in the contract, however, will have to pay a per-gigabyte fee, ranging anywhere from five to fifteen U.S. dollars. === Popularization of unlimited plans === Unlimited data plans have seen a large increase in usage by consumers since their initial introduction by U.S. network T-Mobile. These plans, instead of setting an overall maximum for the user, have an amount set-up that, when surpassed, will slow the speed of the network for that user. Unlimited plans typically cost significantly more than the traditional shared data plans, which is a major reason that carriers have set large boundaries and fees. The limits imposed on unlimited plans are designed to fight against attempts to misuse the network, such as a DDoS attack, but are more commonly reasoned as a method to increase the number of people that can use one tower simultaneously. === Data speed changes === When a network is near reaching peak capacity data speeds may be slowed down by carriers as part of most major telecom contracts. This, as stated previously, allows for more people to be utilizing one tower, reducing needed capital for the company. Since speed changes are allowed at the company's will, the user has no official guarantee of speed on most major networks. === Costs brought upon by additional data === In many cases both the user and carrier have to incur additional costs when a user utilizes more of a given data package, which has helped in the proliferation of data caps and other forms of shared data plans. Most of the charges that the carrier has to incur for additional data usage is partially or fully given to the user of the network. ==== Users ==== Users are required to pay flat-rate additional fees that occur when they go above the amount of data given to them in their contract, utility, or prepaid plan. The cost per gigabyte of this fee is usually higher than what the contract itself offers, which discourages users from over-utilizing data and incurring a charge for the carrier. Certain contracts, which do not offer paying additional fees for an increase in data, may result in a shutdown of service, or in extremely rare cases, termination of the service as a whole. ==== Carriers ==== Carriers incur costs for additional data usage, as it limits the number of customers, and associated contracts, that they can handle on one network. Creating more cell phone towers in a given area would be costly, and largely useless until particular spikes in traffic. When the peak usable amount of one tower is reached, it may cause negative public relations towards the reliability of the corporation as a whole.
Read more →
Creative work

A creative work is a manifestation of creative effort in the world through a creative process involving one or more individuals. The term includes fine artwork (sculpture, paintings, drawing, sketching, performance art), dance, writing (literature), filmmaking, and musical composition. The term is frequently used in the context of copyright. It is an important concept in both philosophy and law. Creative works require a creative mindset and are not typically rendered in an arbitrary fashion, although works may demonstrate (i.e., have in common) a degree of arbitrariness, such that it is improbable that two people would independently create the same work. At its base, creative work involves two main steps – having an idea, and then turning that idea into a substantive form or process. Typically, the creative process results in work that has some aesthetic value, identified as a creative expression. Naturally, this expression generally invokes external stimuli (e.g., influences and experiences) which a person draws on because they view the source as creative or inspirational; the degree to which this is reflected may be used in determinations of the derivativeness of the created work. Alternatively, the creator may draw on imagination, and their references may be clouded even to them, for the nature of imagination is as yet not fully understood philosophically, and the level of necessary self-examination of an artist's internal processing is a challenge for even those most self-aware of their minds and mental processes. == Legal definition == === United Kingdom === For the purpose of section 221(2)(c) of the Income Tax (Trading and Other Income) Act 2005, the expression "creative works" means: (a) literary, dramatic, musical or artistic works, or (b) designs,created by the taxpayer personally or, if the qualifying trade, profession or vocation is carried on in partnership, by one or more of the partners personally.
Read more →
Hardware compatibility list

A hardware compatibility list (HCL) is a list of computer hardware (typically including many types of peripheral devices) that is compatible with a particular operating system or device management software. The list contains both whole computer systems and specific hardware elements including motherboards, sound cards, and video cards. In today's world, there is a vast amount of computer hardware in circulation, and many operating systems too. A hardware compatibility list is a database of hardware models and their compatibility with a certain operating system. HCLs can be centrally controlled (one person or team keeps the list of hardware maintained) or user-driven (users submit reviews on hardware they have used). There are many HCLs. Usually, each operating system will have an official HCL on its website.
Read more →
Shape context

Shape context is a feature descriptor used in object recognition. Serge Belongie and Jitendra Malik proposed the term in their paper "Matching with Shape Contexts" in 2000. == Theory == The shape context is intended to be a way of describing shapes that allows for measuring shape similarity and the recovering of point correspondences. The basic idea is to pick n points on the contours of a shape. For each point pi on the shape, consider the n − 1 vectors obtained by connecting pi to all other points. The set of all these vectors is a rich description of the shape localized at that point but is far too detailed. The key idea is that the distribution over relative positions is a robust, compact, and highly discriminative descriptor. So, for the point pi, the coarse histogram of the relative coordinates of the remaining n − 1 points, h i ( k ) = # { q ≠ p i : ( q − p i ) ∈ bin ( k ) } {\displaystyle h_{i}(k)=\#\{q\neq p_{i}:(q-p_{i})\in {\mbox{bin}}(k)\}} is defined to be the shape context of p i {\displaystyle p_{i}} . The bins are normally taken to be uniform in log-polar space. The fact that the shape context is a rich and discriminative descriptor can be seen in the figure below, in which the shape contexts of two different versions of the letter "A" are shown. (a) and (b) are the sampled edge points of the two shapes. (c) is the diagram of the log-polar bins used to compute the shape context. (d) is the shape context for the point marked with a circle in (a), (e) is that for the point marked as a diamond in (b), and (f) is that for the triangle. As can be seen, since (d) and (e) are the shape contexts for two closely related points, they are quite similar, while the shape context in (f) is very different. For a feature descriptor to be useful, it needs to have certain invariances. In particular it needs to be invariant to translation, scaling, small perturbations, and, depending on the application, rotation. Translational invariance comes naturally to shape context. Scale invariance is obtained by normalizing all radial distances by the mean distance α {\displaystyle \alpha } between all the point pairs in the shape although the median distance can also be used. Shape contexts are empirically demonstrated to be robust to deformations, noise, and outliers using synthetic point set matching experiments. One can provide complete rotational invariance in shape contexts. One way is to measure angles at each point relative to the direction of the tangent at that point (since the points are chosen on edges). This results in a completely rotationally invariant descriptor. But of course this is not always desired since some local features lose their discriminative power if not measured relative to the same frame. Many applications in fact forbid rotational invariance e.g. distinguishing a "6" from a "9". == Use in shape matching == A complete system that uses shape contexts for shape matching consists of the following steps (which will be covered in more detail in the Details of Implementation section): Randomly select a set of points that lie on the edges of a known shape and another set of points on an unknown shape. Compute the shape context of each point found in step 1. Match each point from the known shape to a point on an unknown shape. To minimize the cost of matching, first choose a transformation (e.g. affine, thin plate spline, etc.) that warps the edges of the known shape to the unknown (essentially aligning the two shapes). Then select the point on the unknown shape that most closely corresponds to each warped point on the known shape. Calculate the "shape distance" between each pair of points on the two shapes. Use a weighted sum of the shape context distance, the image appearance distance, and the bending energy (a measure of how much transformation is required to bring the two shapes into alignment). To identify the unknown shape, use a nearest-neighbor classifier to compare its shape distance to shape distances of known objects. == Details of implementation == === Step 1: Finding a list of points on shape edges === The approach assumes that the shape of an object is essentially captured by a finite subset of the points on the internal or external contours on the object. These can be simply obtained using the Canny edge detector and picking a random set of points from the edges. Note that these points need not and in general do not correspond to key-points such as maxima of curvature or inflection points. It is preferable to sample the shape with roughly uniform spacing, though it is not critical. === Step 2: Computing the shape context === This step is described in detail in the Theory section. === Step 3: Computing the cost matrix === Consider two points p and q that have normalized K-bin histograms (i.e. shape contexts) g(k) and h(k). As shape contexts are distributions represented as histograms, it is natural to use the χ2 test statistic as the "shape context cost" of matching the two points: C S = 1 2 ∑ k = 1 K [ g ( k ) − h ( k ) ] 2 g ( k ) + h ( k ) {\displaystyle C_{S}={\frac {1}{2}}\sum _{k=1}^{K}{\frac {[g(k)-h(k)]^{2}}{g(k)+h(k)}}} The values of this range from 0 to 1. In addition to the shape context cost, an extra cost based on the appearance can be added. For instance, it could be a measure of tangent angle dissimilarity (particularly useful in digit recognition): C A = 1 2 ‖ ( cos ⁡ ( θ 1 ) sin ⁡ ( θ 1 ) ) − ( cos ⁡ ( θ 2 ) sin ⁡ ( θ 2 ) ) ‖ {\displaystyle C_{A}={\frac {1}{2}}{\begin{Vmatrix}{\dbinom {\cos(\theta _{1})}{\sin(\theta _{1})}}-{\dbinom {\cos(\theta _{2})}{\sin(\theta _{2})}}\end{Vmatrix}}} This is half the length of the chord in unit circle between the unit vectors with angles θ 1 {\displaystyle \theta _{1}} and θ 2 {\displaystyle \theta _{2}} . Its values also range from 0 to 1. Now the total cost of matching the two points could be a weighted-sum of the two costs: C = ( 1 − β ) C S + β C A {\displaystyle C=(1-\beta )C_{S}+\beta C_{A}\!\,} Now for each point pi on the first shape and a point qj on the second shape, calculate the cost as described and call it Ci,j. This is the cost matrix. === Step 4: Finding the matching that minimizes total cost === Now, a one-to-one matching π ( i ) {\displaystyle \pi (i)} that matches each point pi on shape 1 and qj on shape 2 that minimizes the total cost of matching, H ( π ) = ∑ i C ( p i , q π ( i ) ) {\displaystyle H(\pi )=\sum _{i}C\left(p_{i},q_{\pi (i)}\right)} is needed. This can be done in O ( N 3 ) {\displaystyle O(N^{3})} time using the Hungarian method, although there are more efficient algorithms. To have robust handling of outliers, one can add "dummy" nodes that have a constant but reasonably large cost of matching to the cost matrix. This would cause the matching algorithm to match outliers to a "dummy" if there is no real match. === Step 5: Modeling transformation === Given the set of correspondences between a finite set of points on the two shapes, a transformation T : R 2 → R 2 {\displaystyle T:\mathbb {R} ^{2}\to \mathbb {R} ^{2}} can be estimated to map any point from one shape to the other. There are several choices for this transformation, described below. ==== Affine ==== The affine model is a standard choice: T ( p ) = A p + o {\displaystyle T(p)=Ap+o\!} . The least squares solution for the matrix A {\displaystyle A} and the translational offset vector o is obtained by: o = 1 n ∑ i = 1 n ( p i − q π ( i ) ) , A = ( Q + P ) t {\displaystyle o={\frac {1}{n}}\sum _{i=1}^{n}\left(p_{i}-q_{\pi (i)}\right),A=(Q^{+}P)^{t}} Where P = ( 1 p 11 p 12 ⋮ ⋮ ⋮ 1 p n 1 p n 2 ) {\displaystyle P={\begin{pmatrix}1&p_{11}&p_{12}\\\vdots &\vdots &\vdots \\1&p_{n1}&p_{n2}\end{pmatrix}}} with a similar expression for Q {\displaystyle Q\!} . Q + {\displaystyle Q^{+}\!} is the pseudoinverse of Q {\displaystyle Q\!} . ==== Thin plate spline ==== The thin plate spline (TPS) model is the most widely used model for transformations when working with shape contexts. A 2D transformation can be separated into two TPS function to model a coordinate transform: T ( x , y ) = ( f x ( x , y ) , f y ( x , y ) ) {\displaystyle T(x,y)=\left(f_{x}(x,y),f_{y}(x,y)\right)} where each of the ƒx and ƒy have the form: f ( x , y ) = a 1 + a x x + a y y + ∑ i = 1 n ω i U ( ‖ ( x i , y i ) − ( x , y ) ‖ ) , {\displaystyle f(x,y)=a_{1}+a_{x}x+a_{y}y+\sum _{i=1}^{n}\omega _{i}U\left({\begin{Vmatrix}(x_{i},y_{i})-(x,y)\end{Vmatrix}}\right),} and the kernel function U ( r ) {\displaystyle U(r)\!} is defined by U ( r ) = r 2 log ⁡ r 2 {\displaystyle U(r)=r^{2}\log r^{2}\!} . The exact details of how to solve for the parameters can be found elsewhere but it essentially involves solving a linear system of equations. The bending energy (a measure of how much transformation is needed to align the points) will also be easily obtained. ==== Regularized TPS ==== The TPS formulation above has exact matching requirement for the pairs of points on the two shapes. For noisy data, it is best to
Read more →
Static web page

A static web page, sometimes called a flat page or a stationary page, is a web page that is delivered to a web browser exactly as stored, in contrast to dynamic web pages which are generated by a web application. Consequently, a static web page displays the same information for all users, from all contexts, subject to modern capabilities of a web server to negotiate content-type or language of the document where such versions are available and the server is configured to do so. However, a webpage's JavaScript can introduce dynamic functionality which may make the static web page dynamic. == Overview == Static web pages are often HTML documents, stored as files in the file system and made available by the web server over HTTP (nevertheless URLs ending with ".html" are not always static). However, loose interpretations of the term could include web pages stored in a database, and could even include pages formatted using a template and served through an application server, as long as the page served is unchanging and presented essentially as stored. The content of static web pages remains stationary irrespective of the number of times it is viewed. Such web pages are suitable for the contents that rarely need to be updated, though modern web template systems are changing this. Maintaining large numbers of static pages as files can be impractical without automated tools, such as static site generators. Any personalization or interactivity has to run client-side, which is restricting. Cloud-based website builders, including Wix, Weebly, and Duda, offer no-code platforms for creating static and dynamic web pages through graphical interfaces, without requiring programming expertise. === Advantages === Provide improved security over dynamic websites (dynamic websites are at risk to web shell attacks if a vulnerability is present) Improved performance for end users compared to dynamic websites Fewer or no dependencies on systems such as databases or other application servers Cost savings from utilizing cloud storage, as opposed to a hosted environment Security configurations are easy to set up, which makes it more secure Static files can be cached by content delivery networks (CDNs) and other intermediate caches, which both reduces page load times at the user and also reduces load on the origin server. Static websites can have improved uptime, since they are still available through any available CDN exit node even when other CDN nodes or the origin webserver are temporarily offline. === Disadvantages === Dynamic functionality must be performed on the client side. After each update of a static website, some or all users may see old, stale, outdated previous versions instead of the latest version until the old version is flushed from CDNs and other caches. == Static site generators == Static site generators are applications that compile static websites - typically populating HTML templates in a predefined folder and file structure, with content supplied in a format such as Markdown or AsciiDoc. === Implementations === Jekyll (powers GitHub Pages) Middleman Hugo Next.js Astro.build Pelican Franklin
Read more →
Over-the-top media services in India

As per Govt of India, there are currently about 57 providers of over-the-top media services (OTT) in India, which distribute streaming media or video on demand over the Internet. == History and growth == The first dependent Indian OTT platform was BIGFlix, launched by Reliance Entertainment in 2008. In 2010 Digivive launched India's first OTT mobile app called nexGTv, which provides access to both live TV and on–demand content. nexGTV was the first app to live–stream Indian Premier League matches on smart phones and did so during 2013 and 2014. The livestream of the IPL since 2015, when rights were won, played an important role in the growth of another OTT platform, Hotstar (now JioHotstar) in India. OTT Platforms gained significant momentum in India when both DittoTV (Zee) and Sony Liv were launched in the Indian market around 2013. Following the initial push of Regional OTT platforms like Aha, Hoichoi, Sun NXT, Planet Marathi, Chaupal & MX Player. The Indian OTT industry saw rapid transformation with the entry of global OTT companies such as Netflix and Amazon Prime Video into the Indian market in 2016. Replacement of this competition with global enterprises caused local rivals to innovate in both region and hyper-regional content. === Hotstar === Hotstar (now JioHotstar) is the most subscribed–to OTT platform in India, owned by JioStar as of February 2025, with around 500 million active users and over 650 million downloads. According to Hotstar's India Watch Report 2018, 96% of watch time on Hotstar comes from videos longer than 20 minutes, while one–third of Hotstar subscribers watch television shows. In 2019, Hotstar began investing ₹120 crore in generating original content such as "Hotstar Specials." 80% of the viewership on Hotstar comes from drama, movies and sports programs. Hotstar has the exclusive streaming rights of IPL in India. === Netflix === American streaming service Netflix entered India in January 2016. In April 2017, it was registered as a limited liability partnership (LLP) and started commissioning content. It earned a net profit of ₹2020,000 (₹2.02 million) for fiscal year 2017. In fiscal year 2018, Netflix earned revenues of ₹580 million. According to Morgan Stanley Research, Netflix had the highest average watch time of more than 120 minutes but viewer counts of around 20 million in July 2018. As of 2018, Netflix has six million subscribers, of which 5–6% are paid members. India was not affected by Netflix's July 2018 increase in subscription rates for the US and Latin America. Netflix has stated its intent to invest ₹600 crore in the production of Indian original programming. In late 2018, Netflix bought 150,000 square feet (14,000 m2) of office space in Bandra–Kurla Complex (BKC) in Mumbai as their head office. As of December 2018, Netflix has more than 40 employees in India. === Other OTT providers === Sun NXT is an Indian video on demand service run by Sun TV Network. It was launched in June 2017, streaming in the Tamil language and six other languages. The platform has more than 4,000 Tamil movies and 200 Tamil shows, as well as regional movies and shows. Sun NXT also streams a large library of its own Sun TV shows and movies. Amazon Prime Video was launched in 2016. The platform has 2,300 titles available including 2,000 movies and about 400 shows. It has announced that it will invest ₹20 billion in creating original content in India. Besides English, Prime Video is available in six Indian languages as of December 2018. Amazon India launched Amazon Prime Music in February 2018. Eros Now, an OTT platform launched by Eros International, has the most content among the OTT providers in India, including over 12,000 films, 100,000 music tracks and albums, and 100 TV shows. Eros Now was named the Best OTT Platform of the Year 2019 at the British Asian Media Awards. It has 211.5 million registered users and 36.2 million paying subscribers as of September 2020. In February 2020, Aha OTT platform was launched, broadcasting exclusively Telugu content. In 2021, Planet Marathi became the first OTT platform dedicated to Marathi content in India, including web-series, films, music, theater, fiction and non-fiction reality shows. It is available for both Android and iOS mobile devices along with Android TV and Amazon Fire TV devices. Bollywood actress Madhuri Dixit helped launch the platform. With rising interest for Korean dramas, Rakuten Viki saw its biggest jump of web traffic from India in 2020 due to the COVID-19 lockdown, which led to ad localization on the platform. The OTT market in fiscal year 2020 was estimated to be worth $1.7 billion. === SonyLIV and ZEE5 === In December 2021, Sony and Zee announced their merger, and announced plans to merge their OTT platforms. The merger was called off. === OTT services launched as Amazon Prime video channels === The list is by alphabetical order, not by rank or popularity. == Content regulation == Due to the absence of any rules and regulation regarding OTT content, many OTT providers were accused of showing nudity, vulgarity and obscenity and hurting Hindu religious sentiments in their shows. Series which were the focus of controversy include Four More Shots Please!, Tandav, Paatal Lok, Sacred Games, Mirzapur Lust stories franchise, Rana Naidu. Thank You for Coming, and Annapoorani (2023). According to media reports, between 2018 and 2024, some OTT platforms emerged which started showing porn in the form of web series. Both the Supreme Court and Delhi High Court say that OTT regulation is necessary. === OTT regulation === On 25 Feb 2021, Indian govt introduced self-regulation rules for OTT platforms to stop obscene content and abusive language. On 19 March 2023, I&B minister Anurag Thakur said that self regulation does not mean that OTT should show obscenity and nudity. On 15 April 2023, I&B Secretary Apurva Chandra has said because of the government's soft-touch regulations on OTT industry have led to the creation of content that is undesirable and vulgar. On 26 April 2023, MIB India said that if nudity and obscenity is seen on any OTT platform, strict action will be taken against it. On 16 May 2023, Don't show obscene content, parliamentary panel told to Netflix and Amazon Prime Video. On 20 June 2023, the government told Netflix, Disney+ Hotstar and all other streaming services that their content should be independently reviewed for obscenity and violence before being shown online. On 27 June 2023, DPCGC took punitive action against Ullu for streaming obscene content and asked them to remove all their explicit shows or remove all adult scenes within 15 days. On 18 July 2023, Anarug Thakur said in a meeting with all OTT stakeholders that demeaning Indian culture will not be tolerated. OTT can't show vulgarity and nudity in the garb of 'creative expression'.The cited sources do not mention vulgarity - they say this was about demeaning Indian culture/society. On 22 August 2023, Indian government assured that it will bring rules and regulation to regulate vulgar and obscene content on social media and OTT platforms. On 10 November 2023, MIB India introduces the 'Broadcasting Service Regulation Bill', which included Programme code with Content Evaluation Committee(CEC) for every OTT platforms. Currently public consultation is ongoing till 15 January 2024. The draft bill mandates that all OTT streaming platforms can only broadcast those web series or content, which will be duly certified by Content Evaluation Committee(CEC). On 14 March 2024, the Ministry of Information and Broadcasting banned over 18 OTT apps from Google play store and suspended all of their 57 social media accounts, as well as closed nineteen streaming websites. The banned platforms were MoodX, Prime Play, Hunters, Besharams, Rabbit movies, Voovi, Fugi, Mojflix, Chikooflix, Nuefliks, Xtramood, NeonX VIP, X Prime, Tri Flicks, Uncut Adda, Dreams Films, Hot Shots VIP, and Yessma. On 25 July 2025, the Ministry of Information and Broadcasting banned from 25 OTT apps from Google play store and suspended all of their 40 social media accounts, as well as 26 closed streaming websites. The banned platforms were include ALTT, Ullu, Big Shots App, Desiflix, Boomex, NeonX VIP, Navarasa Lite, Gulab App, Kangan App, Bull App, ShowHit, Jalva App, Wow Entertainment, Look Entertainment, Hitprime, Fugi, Feneo, ShowX, Sol Talkies, Adda TV, HotX VIP, Hulchul App, MoodX, Triflicks, and Mojflix. On 24 February 2026, the Ministry of Information and Broadcasting banned from 5 OTT apps from Google play store and suspended all of their 5 social media accounts, as well as 5 closed streaming websites. The banned platforms were include Feel App, Digi Movieplex, Jugnu App, MoodX VIP, and Koyal Playpro. === Legal action === Currently OTT is regulated under the IT Rules 2021, which clearly stated that 'No content that is prohibited by law at the time being force can be Publishing or transmitted'. MIB has continuously taking action
Read more →