AI Code Base

AI Code Base — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • OpenFog Consortium

    OpenFog Consortium

    The OpenFog Consortium (sometimes stylized as Open Fog Consortium) was a consortium of high tech industry companies and academic institutions across the world aimed at the standardization and promotion of fog computing in various capacities and fields. The consortium was founded by Cisco Systems, Intel, Microsoft, Princeton University, Dell, and ARM Holdings in 2015 and now has 57 members across the North America, Asia, and Europe, including Forbes 500 companies and noteworthy academic institutions. The OpenFog consortium merged with the Industrial Internet Consortium, now the Industry IoT Consortium, on January 31, 2019. == History == OpenFog was created on November 19, 2015, by ARM Holdings, Cisco Systems, Dell, Intel, Microsoft, and Princeton University. The idea for a consortium centered on the advancement and dissemination of fog computing was thought up by Helder Antunes, a Cisco executive with a history in IoT, Mung Chiang, then a Princeton University professor and now President of Purdue University, and Dr. Tao Zhang, a Cisco Distinguished Engineer and CIO for the IEEE Communications Society then and now a manager at the National Institute of Standards and Technologies (NIST). The project was executed from concept to launch by Armando Pereira at PVentures Consulting, a Silicon Valley–based high-tech consulting firm. OpenFog released its reference architecture for fog computing on February 13, 2017. The Fog World Congress 2017, with Dr. Tao Zhang as its General Chair, was hosted in October 2017 by OpenFog, in conjunction with the IEEE Communications Society, as the first congress devoted to fog computing. == Administration == The OpenFog Consortium was governed by its board of directors, which is chaired by Cisco Senior Director Helder Antunes. The board of directors is made up of 11 seats, each representing one of the following companies and institutions: ARM, AT&T, Cisco, Dell, Intel, Microsoft, Princeton University, IEEE, GE, ZTE and Shanghai Tech University. The consortium's general membership comprised 13 academic members: Aalto University, Arizona State University, California Institute of Technology, Georgia State University, National Chiao Tung University, National Taiwan University, Shanghai Research Centre for Wireless Communication, Chinese University of Hong Kong, University of Colorado Boulder, University of Southern California, University of Pisa, Vanderbilt University, Wayne State University, and 20 additional members: Hitachi, Internet Initiative Japan, Itochu, Kii, Nebbiolo, PrismTech, NEC, NGD Systems, NTT Communications, OSIsoft, Real-time Innovations, relayr, Sakura Internet, Stichting imec Nederland, Toshiba, TTT Tech, Fujitsu, FogHorn Systems, TTTech and MARSEC. == Published work == The OpenFog Consortium published the white paper, "OpenFog Reference Architecture". This document outlines the eight pillars of an OpenFog architecture:Security; Scalability; Open; Autonomy; Programmability; RAS (reliability, availability and serviceability); Agility; and Hierarchy. It also incorporates a glossary for fog computing terms. In July 2018, the IEEE Standards Association announced it had adopted the OpenFog Reference Architecture as the first standard for fog computing.

    Read more →
  • NIS2 Directive

    NIS2 Directive

    The Directive (EU) 2022/2555, commonly known as NIS2 is a directive of the European Union aimed at protecting digital infrastructure, in particular critical infrastructure. It broadened the sectors covered by EU network and information security rules and updated incident reporting and oversight compared to the NIS1. Member States were required to transpose NIS2 by 17 October 2024, and the earlier NIS Directive was repealed on 18 October 2024. Only 23 Member States have fully implemented the measures contained with the NIS Directive. Infringement proceedings against them to enforce the Directive have not taken place, and they are not expected to take place in the near future. This failed implementation has led to the fragmentation of cybersecurity capabilities across the EU, with differing standards, incident reporting requirements and enforcement requirements being implemented in different Member States. From the EFTA countries (to April 2026) only Liechtenstein has fully transposed the NIS2 Directive. While the EFTA commission is conducting preparations to transpose the directive into its legislation. == National implementations == === Czech Republic === It is implemented through the Act No. 264/2025 Coll. also called Zákon o kybernetické bezpečnosti (Cybersecurity law) and through another five implementing regulations. The transposing legislation came into force on November 1st, 2025. === Germany === It is implemented through the Gesetz zur Umsetzung der NIS-2-Richtlinie und zur Regelung wesentlicher Grundzüge des Informationssicherheitsmanagements in der Bundesverwaltung. === Ireland === It is implemented through the National Cyber Security Bill. === The Netherlands === It is implemented through the Cyberbeveiligingswet (Cbw). === Slovakia === It is implemented through via an amendment of the Act No. 69/2018 Coll. also called Zákon o kybernetickej bezpečnosti a o zmene a doplnení niektorých zákonov (Law on Cybersecurity and change and amendment of certain laws). It came into force on November 1st, 2025. === Spain === It is implemented through the Esquema Nacional de Seguridad (ENS).

    Read more →
  • Clipmap

    Clipmap

    In computer graphics, clipmapping is a method of clipping a mipmap to a subset of data pertinent to the geometry being displayed. This is useful for loading as little data as possible when memory is limited, such as on a graphics processing unit. The technique is used for LODing in NVIDIA’s implementation of voxel cone tracing. The high-resolution levels of the mipmapped scene representation are clipped to a region near the camera, while lower resolution levels are clipped further away. == MegaTexture == MegaTexture is a clipmap implementation developed by id Software. It was introduced in their id Tech 4 engine and also appeared in id Tech 5 and id Tech 6 before being removed in id Tech 7. MegaTexture is a texture allocation technique that uses a single, extremely large texture rather than repeating multiple smaller textures. It is also featured in Splash Damage's game Enemy Territory: Quake Wars, and was developed by id Software former technical director John Carmack. MegaTexture employs a single large texture space for static terrain. The texture is stored on removable media or a computer's hard drive and streamed as needed, allowing large amounts of detail and variation over a large area with comparatively little RAM usage. Depending on the pixel resolution per square meter, covering a large area could require several gigabytes of memory. However, RAM is also filled by the rest of the game and the underlying operating system, limiting the amount available for texturing. As the player moves around the game, different sections of the MegaTexture are loaded into memory. They are then scaled to the correct size and applied to the 3D models of the terrain. Id has presented a more advanced technique that builds upon the MegaTexture idea and virtualizes both the geometry and the textures to obtain unique geometry down to the equivalent of the texel: the sparse voxel octree (SVO). It works by raycasting the geometry represented by voxels (instead of triangles) stored in an octree. The goal is to stream parts of the octree into video memory, going further down along the tree for nearby objects to give them more details, and to use higher level, larger voxels for farther objects, which give an automatic level of detail (LOD) system for both geometry and textures at the same time. The geometric detail that can be obtained using this method is nearly infinite, which removes the need for faking 3-dimensional details with techniques such as normal mapping. Despite that most voxel rendering tests use very large amounts of memory (up to several GB), Jon Olick of id Software claimed the technology is able to compress such SVO to 1.15 bits per voxel of position data. == Virtual texturing == Unlike clipmaps, which clip each mip level around a viewpoint-dependent clipcenter and therefore work best for terrain, virtual texturing preprocesses texture data into equally sized tiles that can be streamed for arbitrary textured geometry. Rage, powered by the id Tech 5 engine, uses a more advanced technique called virtual texturing. Textures can measure up to 128000×128000 pixels and are also used for in-game models and sprites, etc. and not just the terrain. Wolfenstein: The New Order and the 2016 version of Doom also use these. Carmageddon: Reincarnation also uses virtual texturing, though unlike id's virtual texturing system, which is designed for unique texture-mapping everywhere, their system is designed to use storage space sparingly while still offering good blend of texture variation and resolution.

    Read more →
  • TikTok

    TikTok

    TikTok is a social media and short-form online video platform. It hosts user-submitted videos, which range in duration from three seconds to 60 minutes. It can be accessed through a mobile app or through its website. Since its launch, TikTok has become one of the world's most popular social media platforms, using recommendation algorithms to connect content creators and influencers with new audiences. In April 2020, TikTok surpassed two billion mobile downloads worldwide. The popularity of TikTok has allowed viral trends in food, fashion, and music to take off and increase the platform's cultural impact worldwide. TikTok has come under scrutiny due to data privacy violations, mental health concerns, misinformation, offensive content, addictive algorithm, its role during the Gaza war, and, following its 2026 divestiture in the U.S., alleged censorship of criticism of Donald Trump and discussions of Jeffrey Epstein. While TikTok remains accessible to users in most countries, a minority of countries (including India and Afghanistan) have implemented full or partial bans. Many other countries limit TikTok's use on government-issued devices for security or privacy reasons. == Corporate structure == TikTok Ltd was incorporated in the Cayman Islands in the Caribbean and is based in both Singapore and Los Angeles. It owns entities which are based respectively in Australia (which also runs the New Zealand business), United Kingdom (also owns subsidiaries in the European Union), and Singapore (owns operations in Southeast Asia and India). A spin-off company, TikTok USDS Joint Venture LLC was formed on 22 January 2026 to handle TikTok and other ByteDance properties in the United States, Oracle Corporation, MGX Fund Management Limited, Silver Lake each holding a 15% stake, ByteDance holds a 19.9% stake and the remaining 35.1% is shared between Dell Technologies founder Michael Dell and Vastmere Strategic Investments. Its parent company, Beijing-based ByteDance, is owned by founders and Chinese investors, other global investors, and employees. One of ByteDance's main domestic subsidiaries is owned by Chinese state funds and entities through a 1% golden share. Employees have reported that multiple overlaps exist between TikTok and ByteDance in terms of personnel management and product development. TikTok says that since 2020, its US-based CEO is responsible for making important decisions, and has downplayed its China connection. == History == === Douyin === Douyin (Chinese: 抖音; pinyin: Dǒuyīn; lit. 'Shaking Sound') was launched on 20 September 2016, by ByteDance, originally under the name A.me, before changing its name to Douyin in December 2016. Douyin was developed in nearly 7 months and within a year had 100 million users, with more than one billion videos viewed every day. While TikTok and Douyin share a similar user interface, the platforms operate separately. Douyin includes an in-video search feature that can search by people's faces for more videos of them, along with other features such as buying, booking hotels, and making geo-tagged reviews. === TikTok === ByteDance planned on Douyin expanding overseas. The founder of ByteDance, Zhang Yiming, stated that "China is home to only one-fifth of Internet users globally. If we don't expand on a global scale, we are bound to lose to peers eyeing the four-fifths. So, going global is a must." ByteDance created TikTok as an overseas version of Douyin. TikTok was launched in the international market in September 2017. On 9 November 2017, ByteDance spent nearly $1 billion to purchase Musical.ly, a startup headquartered in Shanghai with an overseas office in Santa Monica, California. Musical.ly was a social media video platform that allowed users to create short lip-sync and comedy videos, initially released in August 2014. TikTok merged with Musical.ly on 2 August 2018 with existing accounts and data consolidated into one app, keeping the title TikTok. On 23 January 2018, the TikTok app ranked first among free application downloads on app stores in Thailand and other countries. TikTok has been downloaded more than 130 million times in the United States and has reached 2 billion downloads worldwide, according to data from mobile research firm Sensor Tower (those numbers exclude Android users in China). In the United States, Jimmy Fallon, Tony Hawk, and other celebrities began using the app in 2018. Other celebrities like Jennifer Lopez, Jessica Alba, Will Smith, and Justin Bieber joined TikTok. In January 2019, TikTok allowed creators to embed merchandise sale links into their videos. On 3 September 2019, TikTok and the US National Football League (NFL) announced a multi-year partnership. The agreement came just two days before the NFL's 100th season kick-off at Soldier Field in Chicago where TikTok hosted activities for fans in honor of the deal. The partnership entails the launch of an official NFL TikTok account, which is to bring about new marketing opportunities such as sponsored videos and hashtag challenges. In July 2020, TikTok, excluding Douyin, reported close to 800 million monthly active users worldwide after less than four years of existence. In May 2021, TikTok appointed Shou Zi Chew as their new CEO who assumed the position from interim CEO Vanessa Pappas, following the resignation of Kevin A. Mayer on 27 August 2020. In September 2021, TikTok reported that it had reached 1 billion users. In 2021, TikTok earned $4 billion in advertising revenue. In October 2022, TikTok was reported to be planning an expansion into the e-commerce market in the US, following the launch of TikTok Shop in the United Kingdom. The company posted job listings for staff for a series of order fulfillment centers in the US and was reportedly planning to start the new live shopping business before the end of the year. The Financial Times reported that TikTok will launch a video gaming channel, but the report was denied in a statement to Digiday, with TikTok instead aiming to be a social hub for the gaming community. According to data from app analytics group Sensor Tower, advertising on TikTok in the US grew by 11% in March 2023, with companies including Pepsi, DoorDash, Amazon, and Apple among the top spenders. According to estimates from research group Insider Intelligence, TikTok is projected to generate $14.15 billion in revenue in 2023, up from $9.89 billion in 2022. In March 2024, The Wall Street Journal reported that TikTok's growth in the US had stagnated. ==== Plans to sell TikTok's US operations ==== Since at least 2020, following calls to ban TikTok in the country, the Committee on Foreign Investment in the United States (CFIUS) has been investigating the company's 2017 merger with Musical.ly but has not finalized any of its negotiations with TikTok, such as the Project Texas proposal, waiting instead for Congress to act. In January 2025, Chinese officials began preliminary talks about potentially selling TikTok's US operations to Elon Musk if the app faced an impending ban due to national security concerns. While Beijing preferred TikTok remain under ByteDance's control, the sale could happen through a competitive process or with US government involvement. One possibility involved Musk's platform, X, taking over TikTok's US business. The move came ahead of a Supreme Court case that upheld the constitutionality of a law that would force a sale or ban of TikTok in the US by 19 January 2025, due to national security concerns regarding its ties to China. Other potential buyers included Project Liberty's "The People's Bid For TikTok" consortium of Frank McCourt with Kevin O'Leary, Steven Mnuchin, MrBeast and Bobby Kotick, the seriousness of these potential buyers was unclear. The day before the impending ban, California-based conversational search engine company Perplexity AI submitted a bid for a merger with TikTok US. On 14 September 2025, the Wall Street Journal reported the US and China have reached the "framework of a deal" for the US operations of TikTok to be sold to a consortium of investors in the US including close Trump ally Larry Ellison of Oracle. The deal was completed by 22 January 2026, with a consortium of investors—including Oracle, Silver Lake, MGX, and others including the personal investment entity for Michael Dell—owning more than 80% of the new venture. ByteDance retained 19.9% ownership. Under the deal, the app would remain the same, and the algorithm would be adjusted over time to favor American topics for those users. === Expansion in other markets === TikTok was downloaded over 104 million times on Apple's App Store during the first half of 2018, according to data provided to CNBC by Sensor Tower. After merging with musical.ly in August, downloads increased and TikTok subsequently became the most downloaded app in the US in October 2018, which musical.ly had done once before. In February 2019, TikTok, together with Douyin, hit one billion downloads globally, excluding Android

    Read more →
  • MeeMix

    MeeMix

    MeeMix Ltd is a company specializing in personalizing media-related content recommendations, discovery and advertising for the telecommunication industry, founded in 2006. On January 1, 2008, MeeMix launched meemix.com, a public personalized internet radio serving as an online testbed for the development of music taste-prediction technologies. Subsequently, MeeMix released in 2009 a line of Business-to-business commercial services intended to personalize media recommendations, discovery and advertising. MeeMix hybrid taste-prediction technology relies on integrating machine learning algorithms, digital signal processing, behavior analysis, metadata analysis and collaborative filtering, and is provided via API web service. In August 2009, MeeMix was announced as Innovator Nominee in the GSM Association’s Mobile Innovation Grand Prix worldwide contest. As of 2013, MeeMix no longer features internet radios on meemix.com. On Sep 28, 2014, meemix.com went offline.

    Read more →
  • Order-independent transparency

    Order-independent transparency

    Order-independent transparency (OIT) is a class of techniques in rasterisational computer graphics for rendering transparency in a 3D scene, which do not require rendering geometry in sorted order for alpha compositing. == Description == Commonly, 3D geometry with transparency is rendered by blending (using alpha compositing) all surfaces into a single buffer (think of this as a canvas). Each surface occludes existing color and adds some of its own color depending on its alpha value, a ratio of light transmittance. The order in which surfaces are blended affects the total occlusion or visibility of each surface. For a correct result, surfaces must be blended from farthest to nearest or nearest to farthest, depending on the alpha compositing operation, over or under. Ordering may be achieved by rendering the geometry in sorted order, for example sorting triangles by depth, but can take a significant amount of time, not always produce a solution (in the case of intersecting or circularly overlapping geometry) and the implementation is complex. Instead, order-independent transparency sorts geometry per-pixel, after rasterisation. For exact results this requires storing all fragments before sorting and compositing. == History == The A-buffer is a computer graphics technique introduced in 1984 which stores per-pixel lists of fragment data (including micro-polygon information) in a software rasteriser, REYES, originally designed for anti-aliasing but also supporting transparency. More recently, depth peeling in 2001 described a hardware accelerated OIT technique. With limitations in graphics hardware the scene's geometry had to be rendered many times. A number of techniques have followed, to improve on the performance of depth peeling, still with the many-pass rendering limitation. For example, Dual Depth Peeling (2008). In 2009, two significant features were introduced in GPU hardware/drivers/Graphics APIs that allowed capturing and storing fragment data in a single rendering pass of the scene, something not previously possible. These are, the ability to write to arbitrary GPU memory from shaders and atomic operations. With these features a new class of OIT techniques became possible that do not require many rendering passes of the scene's geometry. The first was storing the fragment data in a 3D array, where fragments are stored along the z dimension for each pixel x/y. In practice, most of the 3D array is unused or overflows, as a scene's depth complexity is typically uneven. To avoid overflow the 3D array requires large amounts of memory, which in many cases is impractical. Two approaches to reducing this memory overhead exist. Packing the 3D array with a prefix sum scan, or linearizing, removed the unused memory issue but requires an additional depth complexity computation rendering pass of the geometry. The "Sparsity-aware" S-Buffer, Dynamic Fragment Buffer, "deque" D-Buffer, Linearized Layered Fragment Buffer all pack fragment data with a prefix sum scan and are demonstrated with OIT. Storing fragments in per-pixel linked lists provides tight packing of this data and in late 2011, driver improvements reduced the atomic operation contention overhead making the technique very competitive. == Exact OIT == Exact, as opposed to approximate, OIT accurately computes the final color, for which all fragments must be sorted. For high depth complexity scenes, sorting becomes the bottleneck. One issue with the sorting stage is local memory limited occupancy, in this case a SIMT attribute relating to the throughput and operation latency hiding of GPUs. Backwards memory allocation (BMA) groups pixels by their depth complexity and sorts them in batches to improve the occupancy and hence performance of low depth complexity pixels in the context of a potentially high depth complexity scene. Up to a 3× overall OIT performance increase is reported. Sorting is typically performed in a local array, however performance can be improved further by making use of the GPU's memory hierarchy and sorting in registers, similarly to an external merge sort, especially in conjunction with BMA. == Approximate OIT == Approximate OIT techniques relax the constraint of exact rendering to provide faster results. Higher performance can be gained from not having to store all fragments or only partially sorting the geometry. A number of techniques also compress, or reduce, the fragment data. These include: Stochastic Transparency: draw in a higher resolution in full opacity but discard some fragments. Downsampling will then yield transparency. Adaptive Transparency, a two-pass technique where the first constructs a visibility function which compresses on the fly (this compression avoids having to fully sort the fragments) and the second uses this data to composite unordered fragments. Intel's pixel synchronization avoids the need to store all fragments, removing the unbounded memory requirement of many other OIT techniques. Weighted Blended Order-Independent Transparency replaced the over operator with a commutative approximation. Feeding depth information into the weight produces visually-acceptable occlusion. == OIT in Hardware == The Sega Dreamcast games console included hardware support for automatic OIT.

    Read more →
  • International Clinical Trials Registry Platform

    International Clinical Trials Registry Platform

    The International Clinical Trials Registry Platform (ICTRP) is a platform for the registration of clinical trials operated by the World Health Organization. The ICTRP combines data from multiple cooperating clinical trials registries to generate a global view of clinical trials worldwide, with a search portal that allows access to the entire dataset. It requires a minimum standard set of database fields, the WHO Trial Registration Data Set, to be present for a trial to be registered. All entries are given a Universal Trial Number (UTN) that identifies them uniquely. The organization has sought to assist various national governments in establishing their own clinical trials databases. It combines data from the following primary registries and data providers: Australian New Zealand Clinical Trials Registry (ANZCTR) Brazilian Clinical Trials Registry (ReBec) Chinese Clinical Trial Registry (ChiCTR) Clinical Research Information Service (CRiS), Republic of Korea ClinicalTrials.gov Clinical Trials Information System (CTIS), European Medicines Agency Clinical Trials Registry - India (CTRI) Cuban Public Registry of Clinical Trials (RPCEC) EU Clinical Trials Register (EU-CTR) German Clinical Trials Register (DRKS) Iranian Registry of Clinical Trials (IRCT) ISRCTN (UK) International Traditional Medicine Clinical Trial Registry (ITMCTR) Japan Registry of Clinical Trials (jRCT) Japan Primary Registries Network (JPRN) Lebanese Clinical Trials Registry (LBCTR) Overview of Medical Research in the Netherlands (OMON) Thai Clinical Trials Registry (TCTR) Pan African Clinical Trial Registry (PACTR) Peruvian Clinical Trial Registry (REPEC) Sri Lanka Clinical Trials Registry (SLCTR)

    Read more →
  • Snap (computer graphics)

    Snap (computer graphics)

    In computer graphics, snapping allows an object to be easily positioned in alignment with grid lines, guide lines or another object, by causing it to automatically jump to an exact position when the user drags it to the proximity of the desired location. Some CAD software provides a "Snap" pull-down menu with diverse options as preferences for the practice of the operation. In Windows, with the "snap windows" option enabled, snapping a window against the top (or side) edge of the screen causes it to change into full screen (or half-screen for multitasking). Software snapping is analogous to hardware detents which serve to indicate discrete values or steps of an input device.

    Read more →
  • Feature hashing

    Feature hashing

    In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix. It works by applying a hash function to the features and using their hash values as indices directly (after a modulo operation), rather than looking the indices up in an associative array. In addition to its use for encoding non-numeric values, feature hashing can also be used for dimensionality reduction. This trick is often attributed to Weinberger et al. (2009), but there exists a much earlier description of this method published by John Moody in 1989. == Motivation == === Motivating example === In a typical document classification task, the input to the machine learning algorithm (both during learning and classification) is free text. From this, a bag of words (BOW) representation is constructed: the individual tokens are extracted and counted, and each distinct token in the training set defines a feature (independent variable) of each of the documents in both the training and test sets. Machine learning algorithms, however, are typically defined in terms of numerical vectors. Therefore, the bags of words for a set of documents is regarded as a term-document matrix where each row is a single document, and each column is a single feature/word; the entry i, j in such a matrix captures the frequency (or weight) of the j'th term of the vocabulary in document i. (An alternative convention swaps the rows and columns of the matrix, but this difference is immaterial.) Typically, these vectors are extremely sparse—according to Zipf's law. The common approach is to construct, at learning time or prior to that, a dictionary representation of the vocabulary of the training set, and use that to map words to indices. Hash tables and tries are common candidates for dictionary implementation. E.g., the three documents John likes to watch movies. Mary likes movies too. John also likes football. can be converted, using the dictionary to the term-document matrix ( John likes to watch movies Mary too also football 1 1 1 1 1 0 0 0 0 0 1 0 0 1 1 1 0 0 1 1 0 0 0 0 0 1 1 ) {\displaystyle {\begin{pmatrix}{\textrm {John}}&{\textrm {likes}}&{\textrm {to}}&{\textrm {watch}}&{\textrm {movies}}&{\textrm {Mary}}&{\textrm {too}}&{\textrm {also}}&{\textrm {football}}\\1&1&1&1&1&0&0&0&0\\0&1&0&0&1&1&1&0&0\\1&1&0&0&0&0&0&1&1\end{pmatrix}}} (Punctuation was removed, as is usual in document classification and clustering.) The problem with this process is that such dictionaries take up a large amount of storage space and grow in size as the training set grows. On the contrary, if the vocabulary is kept fixed and not increased with a growing training set, an adversary may try to invent new words or misspellings that are not in the stored vocabulary so as to circumvent a machine learned filter. To address this challenge, Yahoo! Research attempted to use feature hashing for their spam filters. Note that the hashing trick isn't limited to text classification and similar tasks at the document level, but can be applied to any problem that involves large (perhaps unbounded) numbers of features. === Mathematical motivation === Mathematically, a token is an element t {\displaystyle t} in a finite (or countably infinite) set T {\displaystyle T} . Suppose we only need to process a finite corpus, then we can put all tokens appearing in the corpus into T {\displaystyle T} , meaning that T {\displaystyle T} is finite. However, suppose we want to process all possible words made of the English letters, then T {\displaystyle T} is countably infinite. Most neural networks can only operate on real vector inputs, so we must construct a "dictionary" function ϕ : T → R n {\displaystyle \phi :T\to \mathbb {R} ^{n}} . When T {\displaystyle T} is finite, of size | T | = m ≤ n {\displaystyle |T|=m\leq n} , then we can use one-hot encoding to map it into R n {\displaystyle \mathbb {R} ^{n}} . First, arbitrarily enumerate T = { t 1 , t 2 , . . , t m } {\displaystyle T=\{t_{1},t_{2},..,t_{m}\}} , then define ϕ ( t i ) = e i {\displaystyle \phi (t_{i})=e_{i}} . In other words, we assign a unique index i {\displaystyle i} to each token, then map the token with index i {\displaystyle i} to the unit basis vector e i {\displaystyle e_{i}} . One-hot encoding is easy to interpret, but it requires one to maintain the arbitrary enumeration of T {\displaystyle T} . Given a token t ∈ T {\displaystyle t\in T} , to compute ϕ ( t ) {\displaystyle \phi (t)} , we must find out the index i {\displaystyle i} of the token t {\displaystyle t} . Thus, to implement ϕ {\displaystyle \phi } efficiently, we need a fast-to-compute bijection h : T → { 1 , . . . , m } {\displaystyle h:T\to \{1,...,m\}} , then we have ϕ ( t ) = e h ( t ) {\displaystyle \phi (t)=e_{h(t)}} . In fact, we can relax the requirement slightly: It suffices to have a fast-to-compute injection h : T → { 1 , . . . , n } {\displaystyle h:T\to \{1,...,n\}} , then use ϕ ( t ) = e h ( t ) {\displaystyle \phi (t)=e_{h(t)}} . In practice, there is no simple way to construct an efficient injection h : T → { 1 , . . . , n } {\displaystyle h:T\to \{1,...,n\}} . However, we do not need a strict injection, but only an approximate injection. That is, when t ≠ t ′ {\displaystyle t\neq t'} , we should probably have h ( t ) ≠ h ( t ′ ) {\displaystyle h(t)\neq h(t')} , so that probably ϕ ( t ) ≠ ϕ ( t ′ ) {\displaystyle \phi (t)\neq \phi (t')} . At this point, we have just specified that h {\displaystyle h} should be a hashing function. Thus we reach the idea of feature hashing. == Algorithms == === Feature hashing (Weinberger et al. 2009) === The basic feature hashing algorithm presented in (Weinberger et al. 2009) is defined as follows. First, one specifies two hash functions: the kernel hash h : T → { 1 , 2 , . . . , n } {\displaystyle h:T\to \{1,2,...,n\}} , and the sign hash ζ : T → { − 1 , + 1 } {\displaystyle \zeta :T\to \{-1,+1\}} . Next, one defines the feature hashing function: ϕ : T → R n , ϕ ( t ) = ζ ( t ) e h ( t ) {\displaystyle \phi :T\to \mathbb {R} ^{n},\quad \phi (t)=\zeta (t)e_{h(t)}} Finally, extend this feature hashing function to strings of tokens by ϕ : T ∗ → R n , ϕ ( t 1 , . . . , t k ) = ∑ j = 1 k ϕ ( t j ) {\displaystyle \phi :T^{}\to \mathbb {R} ^{n},\quad \phi (t_{1},...,t_{k})=\sum _{j=1}^{k}\phi (t_{j})} where T ∗ {\displaystyle T^{}} is the set of all finite strings consisting of tokens in T {\displaystyle T} . Equivalently, ϕ ( t 1 , . . . , t k ) = ∑ j = 1 k ζ ( t j ) e h ( t j ) = ∑ i = 1 n ( ∑ j : h ( t j ) = i ζ ( t j ) ) e i {\displaystyle \phi (t_{1},...,t_{k})=\sum _{j=1}^{k}\zeta (t_{j})e_{h(t_{j})}=\sum _{i=1}^{n}\left(\sum _{j:h(t_{j})=i}\zeta (t_{j})\right)e_{i}} ==== Geometric properties ==== We want to say something about the geometric property of ϕ {\displaystyle \phi } , but T {\displaystyle T} , by itself, is just a set of tokens, we cannot impose a geometric structure on it except the discrete topology, which is generated by the discrete metric. To make it nicer, we lift it to T → R T {\displaystyle T\to \mathbb {R} ^{T}} , and lift ϕ {\displaystyle \phi } from ϕ : T → R n {\displaystyle \phi :T\to \mathbb {R} ^{n}} to ϕ : R T → R n {\displaystyle \phi :\mathbb {R} ^{T}\to \mathbb {R} ^{n}} by linear extension: ϕ ( ( x t ) t ∈ T ) = ∑ t ∈ T x t ζ ( t ) e h ( t ) = ∑ i = 1 n ( ∑ t : h ( t ) = i x t ζ ( t ) ) e i {\displaystyle \phi ((x_{t})_{t\in T})=\sum _{t\in T}x_{t}\zeta (t)e_{h(t)}=\sum _{i=1}^{n}\left(\sum _{t:h(t)=i}x_{t}\zeta (t)\right)e_{i}} There is an infinite sum there, which must be handled at once. There are essentially only two ways to handle infinities. One may impose a metric, then take its completion, to allow well-behaved infinite sums, or one may demand that nothing is actually infinite, only potentially so. Here, we go for the potential-infinity way, by restricting R T {\displaystyle \mathbb {R} ^{T}} to contain only vectors with finite support: ∀ ( x t ) t ∈ T ∈ R T {\displaystyle \forall (x_{t})_{t\in T}\in \mathbb {R} ^{T}} , only finitely many entries of ( x t ) t ∈ T {\displaystyle (x_{t})_{t\in T}} are nonzero. Define an inner product on R T {\displaystyle \mathbb {R} ^{T}} in the obvious way: ⟨ e t , e t ′ ⟩ = { 1 , if t = t ′ , 0 , else. ⟨ x , x ′ ⟩ = ∑ t , t ′ ∈ T x t x t ′ ⟨ e t , e t ′ ⟩ {\displaystyle \langle e_{t},e_{t'}\rangle ={\begin{cases}1,{\text{ if }}t=t',\\0,{\text{ else.}}\end{cases}}\quad \langle x,x'\rangle =\sum _{t,t'\in T}x_{t}x_{t'}\langle e_{t},e_{t'}\rangle } As a side note, if T {\displaystyle T} is infinite, then the inner product space R T {\displaystyle \mathbb {R} ^{T}} is not complete. Taking its completion would get us to a Hilbert space, which allows well-behaved infinite sums. Now we have an inner product space, with enough structure to describe the geometry of the feature hashing function ϕ : R T → R n {\displaystyle \phi :\ma

    Read more →
  • Adrozek

    Adrozek

    Adrozek is malware that injects fake ads into online search results. Microsoft announced the malware threat on 10 December 2020, and noted that many different browsers are affected, including Google Chrome, Microsoft Edge, Mozilla Firefox and Yandex Browser. The malware was first detected in May 2020 and, at its peak in August 2020, controlled over 30,000 devices a day. But during the December 2020 announcement, Microsoft claimed "hundreds of thousands" of infected devices worldwide between May and September 2020. According to Microsoft, if not detected and blocked, Adrozek adds browser extensions, modifies a specific DLL per target browser, and changes browser settings to insert additional, unauthorized ads into web pages, often on top of legitimate ads from search engines. For each user tricked into clicking on the fake ads, the scammers earn affiliate advertising dollars. The malware has been observed to extract device data and, in some cases, steal credentials, sending them to remote servers. Users may unintentionally install the malware because of a drive-by download, by visiting a tampered website, opening an e-mail attachment, or clicking on a deceptive link or a deceptive pop-up window. The main malware program is downloaded to the “Programs Files” folder using file names such as Audiolava.exe, QuickAudio.exe, and converter.exe. According to PC Magazine, a good way to avoid, or mitigate, infection by Adrozek is to keep browser and related software programs up to date.

    Read more →
  • Comparison of color models in computer graphics

    Comparison of color models in computer graphics

    This article provides introductory information about the RGB, HSV, and HSL color models from a computer graphics (web pages, images) perspective. An introduction to colors is also provided to support the main discussion. == Basics of color == === Primary colors and hue === First, "color" refers to the human brain's subjective interpretation of combinations of a narrow band of wavelengths of light. For this reason, the definition of "color" is not based on a strict set of physical phenomena. Therefore, even basic concepts like "primary colors" are not clearly defined. For example, traditional "Painter's Colors" use red, blue, and yellow as the primary colors, "Printer's Colors" use cyan, yellow, and magenta, and "Light Colors" use red, green, and blue. "Light colors", more formally known as additive colors, are formed by combining red, green, and blue light. This article refers to additive colors and refers to red, green, and blue as the primary colors. Hue is a term describing a pure color, that is, a color not modified by tinting or shading (see below). In additive colors, hues are formed by combining two primary colors. When two primary colors are combined in equal intensities, the result is a "secondary color". === Color wheel === A color wheel is a tool that provides a visual representation of the relationships between all possible hues. The primary colors are arranged around a circle at equal (120 degree) intervals. (Warning: Color wheels frequently depict "Painter's Colors" primary colors, which leads to a different set of hues than additive colors.) The illustration shows a simple color wheel based on the additive colors. Note that the position (top, right) of the starting color, typically red, is arbitrary, as is the order of green and blue (clockwise, counter-clockwise). The illustration also shows the secondary colors, yellow, cyan, and magenta, located halfway between (60 degrees) the primary colors. == Complementary color == The complement of a hue is the hue that is opposite it (180 degrees) on the color wheel. Using additive colors, mixing a hue and its complement in equal amounts produces white. === Tints and shades === The following discussion uses an illustration involving three projectors pointing to the same spot on a screen. Each projector is capable of generating one hue. The "intensities" of each projector are "matched" and can be equally adjusted from zero to full. (Note: "Intensity" is used here in the same sense as the RGB color model. The subject of matching, or "gamma correction", is beyond the level of this article.) A shade is produced by "dimming" a maximum chroma color. Painters refer to this as "adding black". In our illustration, one projector is set to full intensity, a second is set to some intensity between zero and full, and third is set to zero. "Dimming" is accomplished by decreasing each projector's intensity setting to the same fraction of its start setting. In the shade example, with any fully shaded hue, that all three projectors are set to zero intensity, resulting in black. A tint is produced by "lightening" a maximum chroma color. Painters refer to this as "adding white". In our illustration, one projector is set to full intensity, a second is set to some intensity between zero and full, and third is set to zero. "Lightening" is accomplished by increasing each projector's intensity setting by the same fraction from its start setting to full. In the tinting example, note that the third projector is now contributing. When the hue is fully lightened, all three projectors are each at full intensity, and the result is white. Note an attribute of the total intensity in the additive model. If full intensity for one projector is 1, then a primary color has a combined intensity of 1. A secondary color has a total intensity of 2. White has a total intensity of 3. Tinting, or "adding white", increases the total intensity of the hue. While this is simply a fact, the HSL model will take this fact into account in its design. === Tones === Tone is a general term, typically used by painters, to refer to the effects of reducing the "colorfulness" of a maximum chroma color; painters refer to it as "adding gray". Note that gray is not a color or even a single concept but refers to all the range of values between black and white where all three primary colors are equally represented. The general term is provided as more specific terms have conflicting definitions in different color models. Thus, shading takes a hue toward black, tinting takes a hue towards white, and tones cover the range between. == Choosing a color model == No one color model is necessarily "better" than another. Typically, the choice of a color model is dictated by external factors, such as a graphics tool or the need to specify colors according to the CSS2 or CSS3 standard. The following discussion only describes how the models function, centered on the concepts of hue, shade, tint, and tone. === RGB === The RGB model's approach to colors is important because: It directly reflects the physical properties of "Truecolor" displays As of 2011, most graphic cards define pixel values in terms of the colors red, green, and blue. The typical range of intensity values for each color, 0–255, is based on taking a binary number with 32 bits and breaking it up into four bytes of 8 bits each. 8 bits can hold a value from 0 to 255. The fourth byte is used to specify the "alpha", or the opacity, of the color. Opacity comes into play when layers with different colors are stacked. If the color in the top layer is less than fully opaque (alpha < 255), the color from underlying layers "shows through". In the RGB model, hues are represented by specifying one color as full intensity (255), a second color with a variable intensity, and the third color with no intensity (0). The following provides some examples using red as the full-intensity and green as the partial-intensity colors; blue is always zero: Shades are created by multiplying the intensity of each primary color by 1 minus the shade factor, in the range 0 to 1. A shade factor of 0 does nothing to the hue, a shade factor of 1 produces black: new intensity = current intensity (1 – shade factor) The following provides examples using orange: Tints are created by modifying each primary color as follows: the intensity is increased so that the difference between the intensity and full intensity (255) is decreased by the tint factor, in the range 0 to 1. A tint factor of 0 does nothing, a tint factor of 1 produces white: new intensity = current intensity + (255 – current intensity) tint factor The following provides examples using orange: Tones are created by applying both a shade and a tint. The order in which the two operations are performed does not matter, with the following restriction: when a tint operation is performed on a shade, the intensity of the dominant color becomes the "full intensity"; that is, the intensity value of the dominant color must be used in place of 255. The following provides examples using orange: === HSV === The HSV, or HSB, model describes colors in terms of hue, saturation, and value (brightness). Note that the range of values for each attribute is arbitrarily defined by various tools or standards. Be sure to determine the value ranges before attempting to interpret a value. Hue corresponds directly to the concept of hue in the Color Basics section. The advantages of using hue are The angular relationship between tones around the color circle is easily identified Shades, tints, and tones can be generated easily without affecting the hue Saturation corresponds directly to the concept of tint in the Color Basics section, except that full saturation produces no tint, while zero saturation produces white, a shade of gray, or black. Value corresponds directly to the concept of intensity in the Color Basics section. Pure colors are produced by specifying a hue with full saturation and value Shades are produced by specifying a hue with full saturation and less than full value Tints are produced by specifying a hue with less than full saturation and full value Tones are produced by specifying a hue and both less than full saturation and value White is produced by specifying zero saturation and full value, regardless of hue Black is produced by specifying zero value, regardless of hue or saturation Shades of gray are produced by specifying zero saturation and between zero and full value The advantage of HSV is that each of its attributes corresponds directly to the basic color concepts, which makes it conceptually simple. The perceived disadvantage of HSV is that the saturation attribute corresponds to tinting, so desaturated colors have increasing total intensity. For this reason, the CSS3 standard plans to support RGB and HSL but not HSV. === HSL === The HSL model describes colors in terms of hue, saturation, and lightness (also called luminance). (Note: the definition of sa

    Read more →
  • Texture atlas

    Texture atlas

    In computer graphics, a texture atlas (also called a spritesheet or an image sprite in 2D game development) is an image containing multiple smaller images, usually packed together to reduce overall dimensions. An atlas can consist of uniformly-sized images or images of varying dimensions. A sub-image is drawn using custom texture coordinates to pick it out of the atlas. == Benefits == In an application where many small textures are used frequently, it is often more efficient to store the textures in a texture atlas which is treated as a single unit by the graphics hardware. This reduces both the disk I/O overhead and the overhead of a context switch by increasing memory locality. Careful alignment may be needed to avoid bleeding between sub textures when used with mipmapping and texture compression. In web development, images are packed into a sprite sheet to reduce the number of image resources that need to be fetched in order to display a page. == Gallery ==

    Read more →
  • Powerset (company)

    Powerset (company)

    Powerset was an American company based in San Francisco, California, that, in 2006, was developing a natural language search engine for the Internet. On July 1, 2008, Powerset was acquired by Microsoft for an estimated $100 million (~$143 million in 2024). Powerset was working on building a natural language search engine that could find targeted answers to user questions (as opposed to keyword based search). For example, when confronted with a question like "Which U.S. state has the highest income tax?", conventional search engines ignore the question phrasing and instead do a search on the keywords "state", "highest", "income", and "tax". Powerset on the other hand, attempts to use natural language processing to understand the nature of the question and return pages containing the answer. The company was in the process of "building a natural language search engine that reads and understands every sentence on the Web". The company has licensed natural language technology from PARC, the former Xerox Palo Alto Research Center. On May 11, 2008, the company unveiled a tool for searching a fixed subset of English Wikipedia using conversational phrases rather than keywords. Acquisition by Microsoft: One significant milestone in Powerset's history was its acquisition by Microsoft on July 1, 2008, for an estimated $100 million. This acquisition was part of Microsoft's broader strategy to enhance its search capabilities and compete more effectively with other search engine providers, particularly Google. Natural Language Search Engine: Powerset's primary focus was on developing a natural language search engine capable of understanding and interpreting user queries in a more human-like manner. Instead of simply matching keywords, Powerset aimed to comprehend the meaning behind the words, allowing for more accurate and contextually relevant search results. Technology and Partnerships: Powerset had licensed natural language technology from PARC, the Xerox Palo Alto Research Center. This technology likely played a crucial role in the development of Powerset's NLP capabilities. Wikipedia Search Tool: In May 2008, Powerset unveiled a search tool that allowed users to search a fixed subset of English Wikipedia using conversational phrases rather than traditional keywords. This demonstrated the potential of Powerset's NLP technology in providing more precise and relevant search results. == Powerlabs == In a form of beta testing, Powerset opened an online community called Powerlabs on September 17, 2007. Business Week said: "The company hopes the site will marshal thousands of people to help build and improve its search engine before it goes public next year." Said The New York Times: "[Powerset Labs] goes far beyond the 'alpha' or 'beta' testing involved in most software projects, when users put a new product through rigorous testing to find its flaws. Powerset doesn’t have a product yet, but rather a collection of promising natural language technologies, which are the fruit of years of research at Xerox PARC." Powerlabs' initial search results are taken from Wikipedia. == Notable people == Barney Pell (born March 18, 1968, in Hollywood, California) was co-founder and CEO of Powerset. Pell received his Bachelor of Science degree in symbolic systems from Stanford University in 1989, where he graduated Phi Beta Kappa and was a National Merit Scholar. Pell received a PhD in computer science from Cambridge University in 1993, where he was a Marshall Scholar. He has worked at NASA, as chief strategist and vice president of business development at StockMaster.com (acquired by Red Herring in March, 2000) and at Whizbang! Labs. Prior to joining Powerset, Pell was an Entrepreneur-in-Residence at Mayfield Fund, a venture capital firm in Silicon Valley. Pell is also a founder of Moon Express, Inc., a U.S. company awarded a $10M commercial lunar contract by NASA and a competitor in the Google Lunar X PRIZE. Steve Newcomb was the COO and co-founder of Powerset. Prior to joining Powerset, he was a co-founder of Loudfire, General Manager at Promptu, and was on the board of directors at Jaxtr. He left Powerset in October 2007 to form Virgance, a social startup incubator. Lorenzo Thione (born in Como, Italy) was the product architect and co-founder of Powerset. Prior to joining Powerset, he worked at FXPAL in natural language processing and related research fields. Thione earned his master's degree in software engineering from the University of Texas at Austin. Ronald Kaplan, former manager of research in Natural Language Theory and Technology at PARC, served as the company's CTO and CSO. Ryan Ferrier is a member of the founding team of Powerset. He managed personnel and internal operations. After 2008 he went on to co-found Serious Business, which made Facebook applications and was later bought by Zynga. Another Powerset alumnus, Alex Le, became CTO of Serious Business and went on to become an executive producer at Zynga when it bought the company. Siqi Chen founded a stealth startup in mobile computing after leaving Powerset. Tom Preston-Werner worked at Powerset and left after the acquisition to found GitHub. == Investors == Powerset attracted a wide range of investors, many of whom had considerable experience in the venture capital field. The company received $12.5 million (~$18.2 million in 2024) in Series A funding during November 2007, co-led by the venture capital firms Foundation Capital and The Founders Fund. Among the better-known investors: Esther Dyson, founding chairman of ICANN, founder of the newsletter Release 1.0 and editor at Cnet Peter Thiel, founder and former CEO of PayPal Luke Nosek, founder of PayPal Todd Parker. Managing Partner, Hidden River Ventures Reid Hoffman, executive vice president of PayPal and founder of LinkedIn First Round Capital, seed-stage venture firm

    Read more →
  • Continuous Exposure Management

    Continuous Exposure Management

    Continuous Exposure Management (CEM) is a cybersecurity approach that provides continuous, real-time monitoring, assessment, and prioritization of an organization’s security vulnerabilities and exposures. CEM focuses on identifying and mitigating risks by analyzing attack paths and providing recommendations, ensuring organizations maintain a resilient cybersecurity posture. == Overview == CEM platforms enable organizations to detect and remediate cybersecurity exposures, such as vulnerabilities, misconfigurations and weak credentials, across their entire ecosystem, including on-premises, cloud environments, and hybrid infrastructures. By simulating potential attack scenarios and mapping attack paths, these platforms help organizations understand how exposures could be exploited and which ones pose the greatest risk to critical assets. The XM Cyber Continuous Exposure Management platform, for example, integrates automated attack path mapping and contextual risk analysis, allowing security teams to prioritize remediation efforts effectively. In 2023, the platform uncovered over 40 million exposures affecting 11.5 million critical business entities. As cyber threats evolve, CEM platforms are becoming indispensable for modern enterprises. According to Gartner, organizations implementing continuous exposure management are three times less likely to experience a breach by 2026. In addition to risk mapping and simulation, some CEM approaches incorporate automated security validation to verify the exploitability of identified vulnerabilities. Platforms such as Pentera utilize automated security testing to emulate real-world adversary behavior across the network, identifying how security gaps could be leveraged to gain access to critical assets. This process aims to move beyond theoretical risk assessments by providing empirical evidence of exposure, allowing security teams to focus remediation efforts on validated attack vectors. By integrating this validation phase into the broader exposure management lifecycle, organizations can refine their prioritization strategies based on the actual effectiveness of their existing security controls and the proven reachability of their most sensitive data. == Key features == CEM platforms are designed to address the dynamic nature of cybersecurity risks through the following features: Attack Path Simulation: Continuously maps attack paths to critical assets, highlighting exploitable exposures and chokepoints. Risk Prioritization: Focuses on exposures with the highest impact on critical assets, ensuring efficient allocation of resources. Remediation Guidance: Provides clear, actionable recommendations to resolve exposures and strengthen defenses. Integration with Existing Tools: Seamlessly works with Security Information and Event Management (SIEM), ticketing, and Security Orchestration, Automation, and Response (SOAR) systems. Real-time Monitoring: Offers continuous visibility into exposures, ensuring that new ones are quickly identified and addressed.

    Read more →
  • IEEE Transactions on Visualization and Computer Graphics

    IEEE Transactions on Visualization and Computer Graphics

    IEEE Transactions on Visualization and Computer Graphics is a peer-reviewed scientific journal published by the IEEE Computer Society. It covers subjects related to computer graphics and visualization techniques, systems, software, hardware, and user interface issues. TVCG has been considered the top journal in the field of visualization. Since 2011, TVCG has allowed authors to present recently accepted papers at partner conferences. These include: IEEE Visualization (VIS), including VAST, InfoVis, and SciVis. IEEE Virtual Reality Conference (IEEE VR) IEEE International Symposium on Mixed and Augmented Reality (ISMAR) ACM Symposium on Interactive 3D Graphics and Games (I3D) IEEE Pacific Visualization Conference (IEEE PacificVis) ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA) Eurographics Symposium on Geometry Processing (SGP) Pacific Graphics Conference (PG) Eurovis - The EG and VGTC Conference on Visualization Graphics Interfaces (GI)

    Read more →