AI Writing Tools

Explore the best AI Writing Tools — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

  • EPages

    EPages

    ePages is an e-commerce software that allows merchants to create and run online shops in the cloud. The number of shops based on ePages is currently 140,000 worldwide. ePages software is regularly updated due to its Software-as-a-Service model. An investor in the company is United Internet, with a 25% stake. ePages focuses upon distributing its products mainly through hosting providers. ePages is headquartered in Hamburg, with additional offices Barcelona, Jena, and Bilbao. == History == The name ePages was used for the first time for software in 1997 to market "Intershop ePages". In 2002, the product line then called Intershop 4 was taken over by ePages GmbH and renamed to ePages. == Features == Depending on the ePages product and packages offered by hosting providers, merchants can sell up to an unlimited number of items. Users can offer their products and services in 15 languages and with all currencies. With ePages, merchants can use web marketing tools; e.g. newsletters, coupons or social media plug-ins for social commerce.

    Read more →
  • Digital entertainment

    Digital entertainment

    Digital entertainment Industry includes, but is not restricted to, any combination of the following industries (that themselves have a considerable degree of overlap): digital media new media video on demand video games interactive entertainment online gambling mobile entertainment social media streaming services "Digital entertainment", largely a hard to define marketing term, rests upon entertainment technology and ultimately on the enabling basic technologies computers, Internet/World Wide Web, digital rights management, multimedia and streaming media. Apart from pure entertainment, the term rests upon the observation that already in 2011 in the UK, for example, "nearly half of people’s waking hours are spent using media content and communications services" ("screen time"). Digital entertainment is inextricably connected with digital marketing. People who follow influencers on social media for entertainment will receive a fair share of advertising at the same time. Digital merchandise is distributed with every computer game and popup ads or similar are ubiquitous in the online (gaming) world.

    Read more →
  • HtmlUnit

    HtmlUnit

    HtmlUnit is a headless web browser written in Java. It allows high-level manipulation of websites from other Java code, including filling and submitting forms and clicking hyperlinks. It also provides access to the structure and the details within received web pages. HtmlUnit emulates parts of browser behaviour including the lower-level aspects of TCP/IP and HTTP. A sequence such as getPage(url), getLinkWith("Click here"), click() allows a user to navigate through hypertext and obtain web pages that include HTML, JavaScript, Ajax and cookies. This headless browser can deal with HTTPS security, basic HTTP authentication, automatic page redirection and other HTTP headers. It allows Java test code to examine returned pages either as text, an XML DOM, or as collections of forms, tables, and links. The goal is to simulate real browsers; namely Chrome, Firefox and Edge. The most common use of HtmlUnit is test automation of web pages, but sometimes it can be used for web scraping, or downloading website content. == Benefits == Provides high-level API, taking away lower-level details away from the user. Compared to other WebDriver implementations, HtmlUnitDriver is the fastest to implement. It can be configured to simulate a specific browser. == Drawbacks == Element layout and rendering can not be tested. The JavaScript support is not complete, which is one of the areas of ongoing enhancements. == Used technologies == W3C DOM HTTP connection, using Apache HttpComponents JavaScript, using forked Rhino HTML Parsing, NekoHTML CSS: using CSS Parser XPath support, using Xalan == Libraries using HtmlUnit == Selenium WebDriver Spring MVC Test Framework Google Web Toolkit tests WebTest Wetator

    Read more →
  • M-DISC

    M-DISC

    M-DISC (Millennial Disc) is a write-once optical disc technology introduced in 2009 by Millenniata, Inc. and available as DVD and Blu-ray discs. == Overview == M-DISC's design is intended to provide archival media longevity. M-Disc claims that properly stored M-DISC DVD recordings will last up to 1000 years. The M-DISC DVD looks like a standard disc, except it is almost transparent with later DVD and BD-R M-Disks having standard and inkjet printable labels. The patents protecting the M-DISC technology assert that the data layer is a glassy carbon material that is substantially inert to oxidation and has a melting point of 200–1000 °C (392–1832 °F). M-Discs are readable by most regular DVD players made after 2005 and Blu-Ray and BDXL disc drives and writable by most made after 2011. Available recording capacities conform to standard DVD/Blu-ray sizes: 4.7 GB DVD+R to 25 GB BD-R, 50 GB BD-R and 100 GB BDXL. == History == M-DISC developer Millenniata, Inc. was co-founded by Brigham Young University professors Barry Lunt, Matthew Linford, CEO Henry O'Connell and CTO Doug Hansen. The company was incorporated on May 13, 2010, in American Fork, Utah. Millenniata, Inc. officially went bankrupt in December 2016. Under the direction of CEO Paul Brockbank, Millenniata had issued convertible debt. When the obligation for conversion was not satisfied, the company defaulted on the debt payment and the debt holders took possession of all of the company's assets. The debt holders subsequently started a new company, Yours.co, to sell M-DISCs and related services. As of the 2020s, there are only 2 licensed manufacturers of M-Discs: Ritek, sold under the Ritek and RiDATA brands, and Verbatim with co-branded discs, marketed as the "Verbatim M-DISC". 128 GB BDXL never made it to market due to the 2016 bankruptcy. Early in 2022, Verbatim changed the formulation of their M-DISC branded Blu-rays. These new discs could be written at a faster rate than the previous ones – 6× speed instead of 4×. The new discs also had different colouration and markings compared with older version. Later in the year customers accused Verbatim of selling an inferior product and deceptive marketing. Verbatim responded that the new discs were a further development of the older discs and should have the same longevity, and that the technical changes therein were responsible for the altered appearance and higher write speeds. The updated M-DISC currently sold on the market uses the same metal ablative layer (MABL) metal oxide inorganic recording layer used in many of Verbatim's regular Blu-ray products. == Durability claims == The original M-DISC DVD+R was tested according to ISO/IEC 10995:2011 and ECMA-379 with a projected rated lifespan of several hundred years in archival use. The glassy carbon layers, in theory if preserved correctly in an environment like a salt mine, could store the data for over 10,000 years before going outside of readable specifications. However, the polycarbonate plastics, which are commonly used by almost all optical media and heavily in CBRN and ballistic protective equipment due to their optical, physical impact and chemical resistant properties, have a lifespan rating of only around 1000 years before degradation. Verbatim Japan claims that M-DISCs now use a titanium layer to prevent moisture ingression and to provide environmental stability. M-DISCs sold in Japan are advertised to have a projected lifespan of 100 years or more based on internal ISO/IEC 16963 testing, while other regional Verbatim websites claim that M-DISCs have a projected lifespan of "several hundred years" based on ISO/IEC 16963 testing. == Durability testing == In 2009, testing was done by the US Department of Defense (DoD) producing the China Lake Report testing Millenniata's M-Disk DVD to current market offerings from Delkin, MAM-A, Mitsubishi, Taiyo Yuden and Verbatim with all brands using organic dyes failing to pass the series of accelerated aging tests. From 2010 to 2012, the French National Laboratory of Metrology and Testing (LNE) used high-temperature accelerated aging testing, at 90 °C (194 °F) and 85% relative humidity inside a CLIMATS Excal 5423-U, for 250 to 1000 hours with a mix of inorganic DVD+R discs from MPO, Verbatim, Maxell, Syylex and DataTresor. The summary of the tests states that Syylex Glass Master Disc was rated for 1000+ hours, DataTresor Disc 250 hours+ and M-Disk under 250 hours. The Syylex disc was a custom-ordered product that could not be burned in a consumer player when they were still purchaseable from Syylex before their bankruptcy, so it was not truly in the same category as the others. In 2016, a consumer Mol Smith did real world stress testing on the 25 GB BD-R M-Disc alongside TDK's standard BD-R 25 GB disc using a copied movie, which demonstrated the reliability of M-Disc's molding compared to standard discs; after 60 days of outdoor direct exposure the M-Disk was played without error, while the TDK disc was physically destroyed. In 2022, the NIST Interagency Report NIST IR 8387 listed the M-Disc as an acceptable archival format rated for 100+ years, citing the aforementioned 2009 and 2012 tests by the US Department of Defense and French National Laboratory of Metrology and Testing as sources. == Commercial support == While recorded discs are readable in conventional DVD and BD drives, M-disc DVDs can only be burned by drives with firmware that supports the slightly higher power mode that M-Disk requires for burning its inorganic layers, as such writing speed is typically 2× speed. Blu-ray M-discs can be both written and read in most standard Blu-ray drives and are certified by the Blu-ray Disc Association to meet all current standard specifications as of 2019. Typically, the M-Discs cost 1.5–3× the price of standard Blu-Ray discs with DVD M-Discs now having sparse availability. With the first-generation DVD M-DISCs, it was difficult to determine which was the writable side of the disc due to being near fully translucent, until coloring and later labels similar to that on standard DVD discs was added to discs to help distinguish the sides preventing user error. Asus, LG Electronics, Lite-On, Pioneer, Buffalo Technology, and Hitachi-LG produce drives that can record M-DISC media while Verbatim and Ritek produce M-DISC discs. == Adoption == The regional government of the U.S. state of Utah has used M-Disc since 2011. Some consumers and avid datahoarders have adopted the format for cold digital data storage. == Alternative technologies == === Optical === Syylex Glass Master Disc: these discs use etched glass and are only typically degradable by physical or chemical damage, but not by normal ageing inside an archival environment. Current BD 25 GB, BD-R DL 50 GB & BDXL 100 GB (three layer) and Sony's BDXL 128 GB (four layer) discs are rated for up to 50 years (Standard inorganic HTL discs). Sony's Optical Disc Archive, is an optical competitor to the LTO tape-based data storage system, currently with up to 5.5 TB cartridges of dual-sided 120mm discs, with desktop readers and automated rackmount standard archival systems allowing for large scale archival and data retrieval rated for an estimated 100+ years. Pioneer DM for Archive is a disc media and drive combination developed by Pioneer to meet the requirements laid out by the Japanese government for preservation of financial data for a minimum of 100 years. The discs use a MABL type recording layer and are manufactured with tight tolerances. Although burnable in any BD Writer, when burned in Pioneers DM for Archive writers using the DM Archiver software the media and burn quality meet ISO/IEC 18630 which defines the testing methods needed for ensuring media and burn quality. === Magnetic === Linear Tape-Open (LTO) is rated for up to 30 years in a climate-controlled environment and is currently in use by most industries, including broadcast and corporate digital data systems. The latest generation released in 2026 is LTO-10, it defines two unique cartridge types which can hold 30 TB or 40 TB each Hard disk drives are currently available up to 30 TB (HDD) capacity in 3.5-inch format and 5 TB in 2.5-inch laptop format. However, unlike optical media, they are limited to 5–25 years of operation lifespan due to inevitable mechanical failure or magnetic instability. == Gallery ==

    Read more →
  • Version space learning

    Version space learning

    Version space learning is a logical approach to machine learning, specifically binary classification. Version space learning algorithms search a predefined space of hypotheses, viewed as a set of logical sentences. Formally, the hypothesis space is a disjunction H 1 ∨ H 2 ∨ . . . ∨ H n {\displaystyle H_{1}\lor H_{2}\lor ...\lor H_{n}} (i.e., one or more of hypotheses 1 through n are true). A version space learning algorithm is presented with examples, which it will use to restrict its hypothesis space; for each example x, the hypotheses that are inconsistent with x are removed from the space. This iterative refining of the hypothesis space is called the candidate elimination algorithm, the hypothesis space maintained inside the algorithm, its version space. == The version space algorithm == In settings where there is a generality-ordering on hypotheses, it is possible to represent the version space by two sets of hypotheses: (1) the most specific consistent hypotheses, and (2) the most general consistent hypotheses, where "consistent" indicates agreement with observed data. The most specific hypotheses (i.e., the specific boundary SB) cover the observed positive training examples, and as little of the remaining feature space as possible. These hypotheses, if reduced any further, exclude a positive training example, and hence become inconsistent. These minimal hypotheses essentially constitute a (pessimistic) claim that the true concept is defined just by the positive data already observed: Thus, if a novel (never-before-seen) data point is observed, it should be assumed to be negative. (I.e., if data has not previously been ruled in, then it's ruled out.) The most general hypotheses (i.e., the general boundary GB) cover the observed positive training examples, but also cover as much of the remaining feature space without including any negative training examples. These, if enlarged any further, include a negative training example, and hence become inconsistent. These maximal hypotheses essentially constitute a (optimistic) claim that the true concept is defined just by the negative data already observed: Thus, if a novel (never-before-seen) data point is observed, it should be assumed to be positive. (I.e., if data has not previously been ruled out, then it's ruled in.) Thus, during learning, the version space (which itself is a set – possibly infinite – containing all consistent hypotheses) can be represented by just its lower and upper bounds (maximally general and maximally specific hypothesis sets), and learning operations can be performed just on these representative sets. After learning, classification can be performed on unseen examples by testing the hypothesis learned by the algorithm. If the example is consistent with multiple hypotheses, a majority vote rule can be applied. == Historical background == The notion of version spaces was introduced by Mitchell in the early 1980s as a framework for understanding the basic problem of supervised learning within the context of solution search. Although the basic "candidate elimination" search method that accompanies the version space framework is not a popular learning algorithm, there are some practical implementations that have been developed (e.g., Sverdlik & Reynolds 1992, Hong & Tsang 1997, Dubois & Quafafou 2002). A major drawback of version space learning is its inability to deal with noise: any pair of inconsistent examples can cause the version space to collapse, i.e., become empty, so that classification becomes impossible. One solution of this problem is proposed by Dubois and Quafafou that proposed the Rough Version Space, where rough sets based approximations are used to learn certain and possible hypothesis in the presence of inconsistent data.

    Read more →
  • Software-defined mobile network

    Software-defined mobile network

    Software-defined mobile networking (SDMN) is an approach to the design of mobile networks where all protocol-specific features are implemented in software, maximizing the use of generic and commodity hardware and software in both the core network and radio access network (RAN). == History == Through the 20th century, telecommunications technology was driven by hardware development, with most functions implemented in special-purpose equipment. In the early 2000s, generally available CPUs became cheap enough to enable commercial software-defined radio (SDR) technology and softswitches. SDMN extends these trends into the design of mobile networks, moving nearly all network functions into software. The term "software-defined mobile network" first appeared in public literature in early 2014, used independently by Lime Microsystems and researchers from University of Oulu, Finland. == Limitations of hardware-based mobile networks == Mobile networks based on special-purpose hardware suffer from the following limitations: They have limited provisions for upgrades and usually must be replaced entirely when new standards are introduced. The individual components are not scalable in terms of performance and capacity, because the capacity of a component is fixed by the hardware implementation. Specialized equipment and its associated specialized software require vendor-specific training for the mobile operator's staff. Specialized hardware systems are usually supported and serviced by a single vendor, resulting in vendor lock-in. == Characteristics of SDMN designs == === Use of software-defined radio === SDR is an important element of SDMN, because it replaces protocol-specific radio hardware with protocol-agnostic digital transceivers. While many earlier digital radio systems used field-programmable gate arrays (FPGAs) or special-purposed digital signal processors (DSPs) for calculations on baseband radio waveforms, the SDMN approach moves all of the baseband processing into general-purpose CPUs. SDMN radio systems also use hardware with publicly-documented interfaces that is designed to be readily reproducible by multiple manufacturers. === Commodity components === SDMN designs avoid the use of components that are specialized as to their functions or that are available from only a single vendor. This is true of both the hardware and software elements of the network. === Software switching and transcoding === The telephony switches of SDMN networks are software-based, including software transcoding for speech codecs. === Centralized, distributed, or hybrid? === A new SDN architecture for wireless distribution systems (WDSs) is explored that eliminates the need for multi-hop flooding of route information and therefore enables WDNs to easily expand. The key idea is to split network control and data forwarding by using two separate frequency bands. The forwarding nodes and the SDN controller exchange link-state information and other network control signaling in one of the bands, while actual data forwarding takes place in the other band. == Advantages of SDMN == The SDMN approach has many advantages over hardware-based mobile network designs. Because SDMN hardware is protocol-agnostic, upgrades are software-only, even across technology generations. In the radio network, these changes can even be made on a site-by-site basis. Because SDMN hardware is designed to be easily sourced and reproduced: SDMN equipment can be serviced by a wider range of vendors, lowering maintenance costs. SDMN equipment can be manufactured anywhere in the world, lowering production costs. Because SDMN software is based on commodity operating systems and development tools: Support staff can be trained more quickly because they are already familiar with the underlying software systems. Many aspects of the SDMN can be monitored and managed with pre-existing tools, because they are already available in the commodity operating systems. Because SDMN network components run on general purpose computers, the network components can be scaled up in capacity by adding more computing power.

    Read more →
  • Fansly

    Fansly

    Fansly is a subscription-based social media platform that allows content creators to monetize exclusive content, including photos, videos, live streams, and direct messages. Operated by Select Media LLC, the platform is headquartered in Baltimore, Maryland. While the platform hosts a variety of content genres, it is primarily known for adult content and is frequently compared to OnlyFans. == History == Fansly was launched in 2020 by Micheal Etelis under Select Media LLC, which was incorporated in February 2020. The platform also operates through CY Media LTD, registered in Kamares, Cyprus, established in May 2021. The company has remained privately held with no disclosed external funding rounds or official valuation, operating as a bootstrapped entity. Based on Fansly's social media presence, which was created in November 2020, the platform did not begin gaining traction until early 2021 when creators started to become concerned about potential content policy changes at OnlyFans. In August 2021, OnlyFans announced it would ban sexually explicit content effective October 2021, citing pressure from banks involved in its payment processing. Although OnlyFans reversed the decision six days later, the announcement triggered a massive influx of users to Fansly; the platform received nearly 4,000 new creator applications in a single hour, causing its servers to crash from the surge in traffic. By August 21, 2021, Fansly had reached 2.1 million users. == Features and business model == Fansly operates as a B2C marketplace, taking a 20% commission on all transactions conducted on the platform, with creators retaining the remaining 80%. This commission rate is the same as that charged by its main competitor, OnlyFans. A distinguishing feature of Fansly is its tiered subscription model, which allows creators to set multiple subscription levels at different price points, each offering different perks such as exclusive content, chat access, or custom requests. By contrast, OnlyFans historically relied on a single-tier subscription model. Revenue streams on the platform include recurring subscriptions, one-time pay-per-view content purchases, tips, paid messaging, and live-streaming fees. The platform also features an algorithmic "For You" feed that helps users discover new creators, addressing a limitation of competitors that lack internal content promotion mechanisms. Additional features include content watermarking, geolocation blocking to control where content is visible, two-factor authentication, community polls, 24-hour stories, and social media integration with platforms such as Twitter and Twitch. Payouts are processed within one to two business days and support multiple methods, including bank transfers, Skrill, Paxum, and cryptocurrency. In December 2025, Fansly expanded its live-streaming capabilities, introducing ticketed access, private list gating, configurable chat permissions, stream goals, and interactive device integration. == Controversies == === OnlyFans anti-competitive allegations === In August 2022, a series of lawsuits were filed in the United States alleging that OnlyFans had bribed employees of Meta Platforms to place Instagram accounts of creators who also sold content on competitor platforms, including Fansly, onto a terrorist blacklist. The lawsuits alleged that adult performers had traffic driven away from their Instagram accounts after being falsely tagged as terror-related. OnlyFans denied awareness of such activity. The plaintiffs withdrew the bribery claim in July 2023, and the case was dismissed in August 2023. === Privacy class action === In June 2025, Select Media LLC (operating as Fansly) was the subject of a digital privacy class action lawsuit filed in Massachusetts District Court. The lawsuit alleged that the platform secretly collected and shared users' sensitive viewing data with Google and other third parties without consent. The case was brought on behalf of an estimated class of over 10,000 users across multiple states.

    Read more →
  • Content strategy

    Content strategy

    Content strategy guides the planning, development, and management of content. It is a recognized field in user experience design, and it also draws from adjacent disciplines such as information architecture, content management, business analysis, digital marketing, and technical communication. == Definitions == Content strategy has been described as planning for "the creation, publication, and governance of useful, usable content." It has also been called "a repeatable system that defines the entire editorial content development process for a website development project." In a 2007 article titled "Content Strategy: The Philosophy of Data," Rachel Lovinger describes the goal of content strategy as using "words and data to create unambiguous content that supports meaningful, interactive experiences." Here, she also provided the analogy that "content strategy is to copywriting as information architecture is to design." She encourages content strategists and collaborators to engage in early discussions about content meaning, models, and tools, to make sure strategy is integrated from the start rather than as an afterthought. The Content Strategy Alliance combines Kevin Nichols' definition with Kristina Halvorson's and defines content strategy as "getting the right content to the right user at the right time through strategic planning of content creation, delivery, and governance." == Practitioners == Content strategists are often familiar with a wide range of approaches, techniques, and tools. The perspectives that content strategists bring also depend heavily on their professional training and education. For instance, some specialize in "front-end strategy," which includes developing personas, journey mapping the user experience, aligning business strategy and user needs, developing a brand strategy, exploring different channels, and creating style guidelines and search engine optimization (SEO) guidelines. Others specialize in "back-end strategy," which includes creating content models, planning taxonomies and metadata, structuring content management systems, and building systems to support content reuse. Both roles involve addressing workflow and governance issues. Many organizations and individuals tend to confuse content strategists with editors. However, content strategy is "about more than just the written word," according to Washington State University associate professor Brett Atwood. For example, Atwood indicates that a practitioner needs to also "consider how content might be re-distributed and/or re-purposed in other channels of delivery." It has also been proposed that the content strategist performs the role of a curator. Just as a museum curator sifts through a collection of content and identifies key pieces that can be juxtaposed against each other to create meaning and spur excitement, a content strategist "must approach a business’s content as a medium that needs to be strategically selected and placed to engage the audience, convey a message, and inspire action."

    Read more →
  • AI content watermarking

    AI content watermarking

    AI content watermarking is the process of embedding imperceptible yet detectable signals into content generated by artificial intelligence systems, such as text, images, audio, or video. The technique allows the content to be traced and identified as machine-generated without compromising its quality for the end user. AI watermarking has emerged as a key approach to address growing concerns about misinformation, deepfakes, copyright infringement, and the traceability of synthetic content in the context of the rapid development of generative artificial intelligence. Unlike traditional visible watermarks used in photography, AI content watermarks are typically invisible to humans and can only be detected and deciphered algorithmically. The concept is distinct from the watermarking of AI models themselves (to prevent model theft) and from the watermarking of training data (to combat unauthorized data use). Modern AI watermarking schemes are typically formalized as a pair of algorithms, an embedding (or generation) algorithm and a detection algorithm, sharing a secret key, whose performance is evaluated along three competing axes: quality (the watermark must not noticeably degrade outputs), detectability (the watermark must be statistically distinguishable from unwatermarked content), and robustness (the watermark must persist under adversarial or incidental modifications). == Background == Digital watermarking has been used for decades to protect physical and digital media, from paper currency to photographs. Classical schemes typically embedded a fixed bit-string into a fixed cover signal, with robustness criteria defined against a small fixed set of distortions such as JPEG compression or additive Gaussian noise. The rapid advancement of generative AI in the early 2020s, however, created a new and qualitatively different demand: rather than protecting a single artifact, watermarks for AI content must be embedded automatically across an open-ended distribution of generated outputs while remaining robust to a much wider class of adversarial transformations, including paraphrasing, image regeneration via diffusion models, and re-recording. Large image generation models such as DALL-E, Stable Diffusion, and Midjourney, along with large language models like ChatGPT, made it possible to produce highly realistic synthetic text, images, audio, and video at scale, raising significant ethical and security concerns. In July 2023, the Biden administration secured voluntary commitments from leading AI companies, including OpenAI, Alphabet, Meta, and Amazon, to develop watermarking and other provenance technologies to help users identify AI-generated content. == Formal definitions and design goals == Most modern AI watermarking schemes can be formalized as a pair of algorithms ( W m , D e t e c t ) {\displaystyle ({\mathsf {Wm}},{\mathsf {Detect}})} parameterized by a secret key k {\displaystyle k} . The embedding algorithm W m {\displaystyle {\mathsf {Wm}}} takes a generative model M {\displaystyle M} (and optionally a prompt) and returns a watermarked output x {\displaystyle x} ; the detection algorithm D e t e c t ( x , k ) {\displaystyle {\mathsf {Detect}}(x,k)} outputs a real-valued score (typically a p-value or log-likelihood ratio) used to decide whether x {\displaystyle x} was produced by the watermarked generator. The literature evaluates such schemes along several largely conflicting criteria: Criteria for evaluation include imperceptibility or quality preservation, measured for text via perplexity and human preference judgments, and for images and audio via metrics such as PSNR, SSIM, LPIPS, or PESQ. Detectability is typically expressed as the true positive rate at a fixed false positive rate (e.g. 1% or 10^-6), or as the number of tokens or pixels needed to reach a given confidence level. Robustness refers to the requirement that the watermark should survive expected modifications like JPEG or MP3 compression, cropping, noise, paraphrasing, or machine translation. Distortion-freeness is a stronger property requiring that the marginal distribution of any single watermarked output be statistically identical to the unwatermarked model's distribution. Schemes due to Aaronson, Christ et al., and Kuditipudi et al. are distortion-free in this sense, while the original Kirchenbauer et al. scheme is not. Forgery resistance or unforgeability means an adversary without the secret key should be unable to produce content that passes detection. == Techniques == AI watermarking techniques vary significantly depending on the type of content being watermarked. At its core, the process involves two main stages: embedding (or encoding) the watermark, and detection. There are two primary methods for embedding: watermarking during content generation, which requires access to the AI model itself but is generally more robust, and post-generation watermarking, which can be applied to content from any source, including closed-source models. Watermarks can be broadly classified as visible, including overt marks such as logos or text overlays, or imperceptible, which are detectable only by algorithms. They can also be classified by durability: robust watermarks are designed to withstand common transformations such as compression, cropping, and re-encoding, while fragile watermarks are easily destroyed by any alteration, making them useful for tamper detection. A further axis distinguishes zero-bit watermarks, which only signal "this content was generated by model M," from multi-bit watermarks, which embed an arbitrary payload (such as a user identifier) that can be recovered at detection time. === Text === Text watermarking is considered one of the most challenging modalities because natural language offers relatively limited redundancy compared to images or audio. Modern approaches for large language models alter the autoregressive sampling process so that some statistical signature is left in the choice of tokens, while leaving the surface form of the text unchanged. The literature distinguishes three main families of generation-time text watermarks. Logit-biasing schemes (e.g. KGW) add a fixed bias δ {\displaystyle \delta } to a pseudorandomly selected subset of vocabulary logits before softmax sampling. Reweighting or sampling-based schemes (e.g. SynthID-Text) compose multiple pseudorandom tournaments over the model's full distribution. Distortion-free schemes based on the Gumbel-max trick or inverse transform sampling (Aaronson 2022; Kuditipudi et al. 2023; Christ et al. 2024) preserve the marginal output distribution of the model. ==== KGW: token-probability shifting ==== The pioneering "green list / red list" scheme of Kirchenbauer et al. (KGW), introduced at ICML 2023, is the foundation for most subsequent text watermarks. At each decoding step t {\displaystyle t} , a pseudorandom function (PRF) keyed by a secret k {\displaystyle k} is applied to a context window of h {\displaystyle h} previous tokens to deterministically partition the vocabulary V {\displaystyle V} of size N {\displaystyle N} into a "green list" G ⊂ V {\displaystyle G\subset V} of size γ N {\displaystyle \gamma N} and its complement, the "red list" R = V ∖ G {\displaystyle R=V\setminus G} , where γ ∈ ( 0 , 1 ) {\displaystyle \gamma \in (0,1)} (typically γ = 1 / 2 {\displaystyle \gamma =1/2} ) is the green fraction. A logits processor then increments every green-list logit by a fixed bias δ > 0 {\displaystyle \delta >0} before softmax: ℓ v ′ = ℓ v + δ ⋅ 1 [ v ∈ G ] {\displaystyle \ell '_{v}=\ell _{v}+\delta \cdot \mathbf {1} [v\in G]} so that, after sampling, green tokens are over-represented but generation is not constrained to green tokens alone; high-entropy positions tolerate the bias gracefully, while low-entropy positions (where one token dominates the logits) override the watermark and preserve correctness on factual content. Detection requires only the secret key and the candidate text, not the language model itself. The detector recomputes the partition g ( ⋅ ) {\displaystyle g(\cdot )} for each token, counts the number of green hits | G | hits {\displaystyle |G|_{\text{hits}}} in a sequence of length T {\displaystyle T} , and computes a one-proportion z-test statistic: z = | G | hits − γ T T γ ( 1 − γ ) {\displaystyle z={\frac {|G|_{\text{hits}}-\gamma T}{\sqrt {T\gamma (1-\gamma )}}}} Under the null hypothesis that the text was written by an unwatermarked source (human or another model), the green-hit count is approximately binomially distributed with mean γ T {\displaystyle \gamma T} ; a large positive z {\displaystyle z} rejects the null hypothesis. The original paper reports that fewer than 25 watermarked tokens are sufficient to detect a watermark with a false positive rate below 10^-5 on the OPT-1.3B model. A follow-up study by the same group documented robustness under temperature sampling, top-p (nucleus) sampling, and human paraphrasing, and proposed sliding-window

    Read more →
  • FaceApp

    FaceApp

    FaceApp is a photo and video editing application for iOS and Android developed by FaceApp Technology Limited, a company based in Cyprus. The app generates highly realistic transformations of human faces in photographs by using neural networks. The app can transform a face to make it smile, look younger, look older, or change gender. == History == FaceApp was launched on iOS in January 2017 and on Android in February 2017. It was developed by Yaroslav Goncharov, a former executive at Yandex, and created by the Russian company Wireless Lab. == Features == There are multiple options to manipulate the photo uploaded such as editor options of adding an impression, make-up, smiles, hair colors, hairstyles, glasses, age or beards. Filters, lens blur and backgrounds along with overlays, tattoos, and vignettes are also a part of the app. The gender change transformations of FaceApp have attracted particular interest from the LGBT and transgender communities, due to their ability to realistically simulate the appearance of a person as the opposite gender. == Criticism == In 2017, FaceApp faced criticism for a "hot" filter that appeared to lighten users' skin tones, prompting accusations of racial bias. The feature was briefly renamed "spark" before being removed. Founder Yaroslav Goncharov attributed the issue to training data bias and apologized. In August of that year, more criticism arose when it featured "ethnicity filters" depicting "White", "Black", "Asian", and "Indian". The filters were immediately removed from the app. In 2019, FaceApp faced criticism over its handling of user data, including concerns that it stored users' photos on its servers and could use them for commercial purposes. Founder Yaroslav Goncharov stated that images were processed on cloud servers like Google Cloud Platform and Amazon Web Services, not transferred to Russia, and were temporarily stored only to support editing functions before being deleted. U.S. Senator Chuck Schumer raised concerns about data privacy and called for an FBI investigation.

    Read more →
  • Texas House Bill 20

    Texas House Bill 20

    An Act Relating to censorship of or certain other interference with digital expression, including expression on social media platforms or through electronic mail messages, also known as Texas House Bill 20 (HB20), is a Texas anti-deplatforming law enacted on September 9, 2021. It prohibits large social media platforms from removing, moderating, or labeling posts made by users in the state of Texas based on their "viewpoints", unless considered illegal under federal law or otherwise falling into exempted categories. It also requires them to make various public disclosures relating to their business practices (including the impact of algorithmic and moderation decisions on the content that is delivered to users). The bill is part of a wider array of Republican-backed legislation seeking to prohibit the censorship of political speech, based on allegations that the moderation policies of large social media platforms are not politically neutral. It has been challenged in NetChoice, LLC v. Paxton, and is currently the subject of a circuit split between the Fifth Circuit, and a decision by the Eleventh Circuit that struck down a similar bill in the state of Florida. In September 2023, the U.S. Supreme Court agreed to hear NetChoice v. Paxton jointly with NetChoice v. Moody on questions of whether the Florida and Texas state laws are in compliance with the 1st Amendment. == Content == The law applies to "social media platforms" that serve users in the state of Texas, and have more than 50 million monthly active users in the United States. They are defined as any public internet website or application that allows users to "communicate with other users for the primary purpose of posting information, comments, messages, or images", excluding internet service providers, electronic mail, and services where communication features are "incidental to, directly related to, or dependent on" content that is pre-selected by the operator. In the bill, to "censor" is defined as to "block, ban, remove, deplatform, demonetize, de-boost, restrict, deny equal access or visibility to, or otherwise discriminate against" expression. The law prohibits social media platforms from "censoring on the basis of user viewpoint, user expression, or the ability of a user to receive the expression of others", or on the basis of a user's geographic location in Texas. This includes removal or labeling posts with warnings and disclaimers. Social media platforms may only censor content if it is unlawful, they are "specifically authorized" to do so by federal law, based on requests from "an organization with the purpose of preventing the sexual exploitation of children or protecting survivors of sexual abuse from ongoing harassment", or "directly incites" criminal activity or contains threats of violence against persons based on protected categories. It is disputed over whether this provision is actually enforceable, as it may be preempted by Section 230 of the Communications Decency Act (which states that the operators of interactive computer services are not responsible for the actions of their users). Social media platforms must make public disclosures regarding the algorithmic techniques and moderation polices that are used to determine the content provided to users, must publish a compliant acceptable use policy (AUP), and must publish a biannual transparency report containing specific details on all actions made by the service regarding the moderation of users and content. The law also prohibits email providers from "intentionally imped[ing] the transmission of another person's electronic mail message based on the content." == Legislative history == Texas Governor Greg Abbott signed the bill into law on September 9, 2021. Democrat-proposed amendments excluding Holocaust denial, terrorism content, and vaccine misinformation from the bill were rejected. Following a suit by the industry groups Computer & Communications Industry Association (CCIA) and NetChoice, NetChoice, LLC v. Paxton, the bill was blocked by U.S. District Judge Robert Pitman in December 2021, on First Amendment grounds. Texas appealed to the United States Court of Appeals for the Fifth Circuit. Judges Edith Jones, Andrew Oldham, and Leslie H. Southwick, lifted the injunction on May 11, 2022, but the decision was appealed to the Supreme Court which suspended the bill pending a full review in the Fifth Circuit. On September 16, 2022, the Fifth Circuit reversed the injunction, allowing the bill to take effect; Judge Oldham stated that the bill "chills censorship" and "does not chill speech", and accused the plaintiffs of "attempt[ing] to extract a freewheeling censorship right from the Constitution's free speech guarantee. The Platforms are not newspapers. Their censorship is not speech." Southwick dissented, stating that "we are in a new arena, a very extensive one, for speakers and for those who would moderate their speech. None of the precedents fit seamlessly." The CCIA and NetChoice requested a stay on the ruling and that the case be taken to the Supreme Court, arguing that the reversal conflicts with an Eleventh Circuit decision in NetChoice v. Moody which struck down a similar anti-moderation bill imposed by the state of Florida. On October 12, 2022, the Fifth Circuit granted the stay.

    Read more →
  • VibeOS

    VibeOS

    VibeOS is an operating system built from scratch entirely by generative artificial intelligence, using code produced through prompts to Claude (vibe coding). It is capable of running on QEMU and was successfully tested on a Raspberry Pi Zero. It has been released under the MIT license. == Features == === Core === Custom kernel with cooperative multitasking (preemptive backup) FAT32 filesystem with long filename support Memory allocator, process scheduler, interrupt handling GIC-400 (QEMU) and BCM2836/BCM2835 (Pi) interrupt controllers Configurable boot (splash screen, boot target) === GUI === Desktop environment with draggable windows Menu bar, dock, window minimize/maximize/close Mouse and keyboard input Modern macOS-inspired aesthetic === Networking === Full TCP/IP stack (Ethernet, ARP, IP, ICMP, UDP, TCP) DNS resolver HTTP client TLS 1.2 with HTTPS support === Apps === Web browser with HTML/CSS rendering Terminal emulator with readline-style shell Text editor (vim clone) with syntax highlighting File manager with drag-and-drop Music player (MP3/WAV) Calculator, system monitor VibeCode IDE Doom port === Development === TCC (Tiny C Compiler) - compile C programs directly on VibeOS MicroPython interpreter with full kernel API bindings 60+ userspace programs (coreutils, games, GUI apps) === Hardware === Runs on Raspberry Pi Zero 2W USB keyboard and mouse via DWC2 driver SD card via EMMC driver 1920×1080 framebuffer == Further projects == There are other independent projects under the VibeOS name, including an independent development by Ben, also developed using vibe coding, aimed at creating a Unix-like operating system for educational purposes. Another project is Vib-OS, an operating system also built using vibe coding, capable of booting on a Raspberry Pi. It offers a desktop environment with a customizable wallpaper, a file manager, and a web browser currently in an early stage of development, a functional Doom port, among other features that are not very polished given the state of development.

    Read more →
  • Social software engineering

    Social software engineering

    Social software engineering (SSE) is a branch of software engineering that is concerned with the social aspects of software development and the developed software. SSE focuses on the socialness of both software engineering and developed software. On the one hand, the consideration of social factors in software engineering activities, processes and CASE tools is deemed to be useful to improve the quality of both development process and produced software. Examples include the role of situational awareness and multi-cultural factors in collaborative software development. On the other hand, the dynamicity of the social contexts in which software could operate (e.g., in a cloud environment) calls for engineering social adaptability as a runtime iterative activity. Examples include approaches which enable software to gather users' quality feedback and use it to adapt autonomously or semi-autonomously. SSE studies and builds socially-oriented tools to support collaboration and knowledge sharing in software engineering. SSE also investigates the adaptability of software to the dynamic social contexts in which it could operate and the involvement of clients and end-users in shaping software adaptation decisions at runtime. Social context includes norms, culture, roles and responsibilities, stakeholder's goals and interdependencies, end-users perception of the quality and appropriateness of each software behaviour, etc. The participants of the 1st International Workshop on Social Software Engineering and Applications (SoSEA 2008) proposed the following characterization: Community-centered: Software is produced and consumed by and/or for a community rather than focusing on individuals Collaboration/collectiveness: Exploiting the collaborative and collective capacity of human beings Companionship/relationship: Making explicit the various associations among people Human/social activities: Software is designed consciously to support human activities and to address social problems Social inclusion: Software should enable social inclusion enforcing links and trust in communities Thus, SSE can be defined as "the application of processes, methods, and tools to enable community-driven creation, management, deployment, and use of software in online environments". One of the main observations in the field of SSE is that the concepts, principles, and technologies made for social software applications are applicable to software development itself as software engineering is inherently a social activity. SSE is not limited to specific activities of software development. Accordingly, tools have been proposed supporting different parts of SSE, for instance, social system design or social requirements engineering. Consequently vertical market software, such as software development tools, engineering tools, marketing tools or software that helps users in a decision-making process can profit from social components. Such vertical social software differentiates strongly in its user-base from traditional social software such as Yammer.

    Read more →
  • AMiner (database)

    AMiner (database)

    AMiner (formerly ArnetMiner) is a free online service used to index, search, and mine big scientific data. == Overview == AMiner (ArnetMiner) is designed to search and perform data mining operations against academic publications on the Internet, using social network analysis to identify connections between researchers, conferences, and publications. This allows it to provide services such as expert finding, geographic search, trend analysis, reviewer recommendation, association search, course search, academic performance evaluation, and topic modeling. AMiner was created as a research project in social influence analysis, social network ranking, and social network extraction. A number of peer-reviewed papers have been published arising from the development of the system. It has been in operation for more than three years, and has indexed 130,000,000 researchers and more than 265 million publications. The research was funded by the Chinese National High-tech R&D Program and the National Science Foundation of China. AMiner is commonly used in academia to identify relationships between and draw statistical correlations about research and researchers. It has attracted more than 10 million independent IP accesses from 220 countries and regions. The product has been used in Elsevier's SciVerse platform, and academic conferences such as SIGKDD, ICDM, PKDD, WSDM. == Operation == AMiner automatically extracts the researcher profile from the web. It collects and identifies the relevant pages, then uses a unified approach to extract data from the identified documents. It also extracts publications from online digital libraries using heuristic rules. It integrates the extracted researchers’ profiles and the extracted publications. It employs the researcher name as the identifier. A probabilistic framework has been proposed to deal with the name ambiguity problem in the integration. The integrated data is stored into a researcher network knowledge base (RNKB). The principal other product in the area are Google Scholar, Elsevier's Scirus, and the open source project CiteSeer. == History == It was initiated and created by professor Jie Tang from Tsinghua University, China. It was first launched in March 2006. The following provide a list of updates in the past years: March 2006, Version 0.1, Functions include researcher profiling, expert search, conference search, and publication search. The system was developed in Perl; August 2006, Version 1.0, The system was re-implemented in Java; July 2007, Version 2.0, New functions include researcher interest mining, association search, survey paper finding (unavailable now); April 2008, Version 3.0, New functions include query understanding, new GUI, and search log analysis; November 2008, Version 4.0, New functions include graph search, topic modeling, NSF/NSFC funding information extraction; April 2009, Version 5.0, New functions include Profile edition, open API service, Bole search, course search (unavailable now); December 2009, Version 6.0, New functions include academic performance evaluation, user feedback, conference analysis; May 2010, Version 7.0, New functions include name disambiguation, paper-reviewer recommendation, ArnetPage creation; March 2012, Version II, renamed as AMiner, rewrote all the codes and redesign the GUI. New functions include: geographic search, ArnetAPP platform. June 2014, Version II, renamed as AMiner, rewrote all the codes and redesign the GUI. New functions include: geographic search, ArnetAPP platform. December 2015, a completely new version got online. May 2017, professional version got online. April 2018, New functions include Trend Analysis, a deep learning based Name Disambiguation == Resources == AMiner published several datasets for academic research purpose, including Open Academic Graph, DBLP+citation (a data set augmenting citations into the DBLP data from Digital Bibliography & Library Project), Name Disambiguation, Social Tie Analysis. For more available datasets and source codes for research, please refer to.

    Read more →
  • Facebook

    Facebook

    Facebook is an American social networking service owned by the American technology conglomerate Meta Platforms. It was founded in 2004 by Mark Zuckerberg, along with his Harvard College roommates and fellow students Eduardo Saverin, Andrew McCollum, Dustin Moskovitz, and Chris Hughes. The name Facebook derives from the face book directories often given to American university students. The service was initially limited to Harvard students before gradually expanding to other universities in North America. Since 2006, Facebook has permitted registration by individuals aged 13 and older, with the exception of South Korea, Spain, and Quebec, where the minimum age is 14. As of December 2023, Facebook reported approximately 3.07 billion monthly active users worldwide. As of July 2025, it was ranked as the third-most-visited website in the world, with 23 percent of its traffic originating from the United States. It was the most downloaded mobile application of the 2010s. Facebook can be accessed from devices with Internet connectivity, such as personal computers, tablets and smartphones. After registering, users can create a profile revealing personal information about themselves. They can post text, photos and multimedia which are shared either publicly or exclusively with other users who have agreed to be their friend, depending on privacy settings. Users can also communicate directly with each other with Messenger, edit messages (within 15 minutes after sending), join common-interest groups, and receive notifications on the activities of their Facebook friends and the pages they follow. Facebook has often been criticized over issues such as user privacy (as with the Facebook–Cambridge Analytica data scandal), political manipulation (as with the 2016 U.S. elections) and mass surveillance. The company has also been subject to criticism over its psychological effects such as addiction and low self-esteem, and over content such as fake news, conspiracy theories, copyright infringement, and hate speech. Commentators have accused Facebook of willingly facilitating the spread of such content, as well as overemphasizing its number of users to appeal to advertisers. == History == The history of Facebook traces its growth from a college networking site to a global social networking service. While attending Phillips Exeter in the early 2000s, Zuckerberg met Kris Tillery. Tillery, a one-time project collaborator with Zuckerberg, would create a school-based social networking project called Photo Address Book. Photo Address Book was a digital face book, created through a linked database composed of student information derived from the official records of the Exeter Student Council. The database contained linkages such as name, dorm-specific landline numbers, and student headshots. Mark Zuckerberg built a website called "Facemash" in 2003 while attending Harvard University. The site was comparable to Hot or Not and used photos from online face books, asking users to choose the 'hotter' person". Zuckerberg was reported and faced expulsion, but the charges were dropped. A "face book" is a student directory featuring photos and personal information. In January 2004, Zuckerberg coded a new site known as "TheFacebook", stating, "It is clear that the technology needed to create a centralized Website is readily available ... the benefits are many." Zuckerberg met with Harvard student Eduardo Saverin, and each agreed to invest $1,000. On February 4, 2004, Zuckerberg launched "TheFacebook". Membership was initially restricted to students of Harvard College. Dustin Moskovitz, Andrew McCollum, and Chris Hughes joined Zuckerberg to help manage the growth of the site. It became available successively to most universities in the US and Canada. In 2004, Napster co-founder Sean Parker became company president and the company moved to Palo Alto, California. PayPal co-founder Peter Thiel gave Facebook its first investment. In 2005, the company dropped "the" from its name after purchasing the domain name Facebook.com. In 2006, Facebook opened to everyone at least or only 13 years old with a valid email address. Facebook introduced key features like the News Feed, which became central to user engagement. By late 2007, Facebook had 100,000 pages on which companies promoted themselves. Facebook had surpassed MySpace in global traffic and became the world's most popular social media platform. Microsoft announced that it had purchased a 1.6% share of Facebook for $240 million ($373 million in 2025 dollars), giving Facebook an implied value of around $15 billion ($23.3 billion in 2025 dollars). Facebook focused on generating revenue through targeted advertising based on user data, a model that drove its rapid financial growth. In 2012, Facebook went public with one of the largest IPOs in tech history. Acquisitions played a significant role in Facebook's dominance. In 2012, it purchased Instagram, followed by WhatsApp and Oculus VR in 2014, extending its influence beyond social networking into messaging and virtual reality. The Facebook–Cambridge Analytica data scandal in 2018 revealed misuse of user data to influence elections, sparking global outcry and leading to regulatory fines and hearings. Facebook's role in global events, including its use in organizing movements like the Arab Spring and its impact on events like the Rohingya genocide in Myanmar, highlighted its dual nature as a tool for both empowerment and harm. In 2021, Facebook rebranded as Meta, reflecting its shift toward building the "metaverse" and focusing on virtual reality and augmented reality technologies. == Features == Facebook does not officially publish a maximum character limit for posts; however, user posts can be lengthy, with unofficial sources suggesting a high character limit. Posts may also include images and videos. According to Facebook's official business documentation, videos can be up to 240 minutes long and 10 GB in file size, with supported resolutions up to 1080p. Users can "friend" users, both sides must agree to being friends. Posts can be changed to be seen by everyone (public), friends, people in a certain group (group) or by selected friends (private). Users can join groups. Groups are composed of persons with shared interests. For example, they might go to the same sporting club, live in the same suburb, have the same breed of pet or share a hobby. Posts posted in a group can be seen only by those in a group, unless set to public. Users are able to buy, sell, and swap things on Facebook Marketplace or in a Buy, Swap and Sell group. Facebook users may advertise events, which can be offline, on a website other than Facebook, or on Facebook. == Website == === Technical aspects === The site's primary color is blue as Zuckerberg is red–green colorblind, a realization that occurred after a test taken around 2007. Facebook was initially built using PHP, a popular scripting language designed for web development. PHP was used to create dynamic content and manage data on the server side of the Facebook application. Zuckerberg and co-founders chose PHP for its simplicity and ease of use, which allowed them to quickly develop and deploy the initial version of Facebook. As Facebook grew in user base and functionality, the company encountered scalability and performance challenges with PHP. In response, Facebook engineers developed tools and technologies to optimize PHP performance. One of the most significant was the creation of the HipHop Virtual Machine (HHVM). This significantly improved the performance and efficiency of PHP code execution on Facebook's servers. The site upgraded from HTTP to the more secure HTTPS in January 2011. ==== 2012 architecture ==== Facebook is developed as one monolithic application. According to an interview in 2012 with Facebook build engineer Chuck Rossi, Facebook compiles into a 1.5 GB binary blob which is then distributed to the servers using a custom BitTorrent-based release system. Rossi stated that it takes about 15 minutes to build and 15 minutes to release to the servers. The build and release process has zero downtime. Changes to Facebook are rolled out daily. Facebook used a combination platform based on HBase to store data across distributed machines. Using a tailing architecture, events are stored in log files, and the logs are tailed. The system rolls these events up and writes them to storage. The user interface then pulls the data out and displays it to users. Facebook handles requests as AJAX behavior. These requests are written to a log file using Scribe (developed by Facebook). Data is read from these log files using Ptail, an internally built tool to aggregate data from multiple Scribe stores. It tails the log files and pulls data out. Ptail data are separated into three streams and sent to clusters in different data centers (Plugin impression, News feed impressions, Actions (plugin + news feed)). Puma is used to manage periods of high data

    Read more →