Latent Dirichlet allocation

Latent Dirichlet allocation

In natural language processing, latent Dirichlet allocation (LDA) is a generative statistical model that explains how a collection of text documents can be described by a set of unobserved "topics." For example, given a set of news articles, LDA might discover that one topic is characterized by words like "president", "government", and "election", while another is characterized by "team", "game", and "score". It is one of the most common topic models. The LDA model was first presented as a graphical model for population genetics by J. K. Pritchard, M. Stephens and P. Donnelly in 2000. The model was subsequently applied to machine learning by David Blei, Andrew Ng, and Michael I. Jordan in 2003. Although its most frequent application is in modeling text corpora, it has also been used for other problems, such as in clinical psychology, social science, and computational musicology. The core assumption of LDA is that documents are represented as a random mixture of latent topics, and each topic is characterized by a probability distribution over words. The model is a generalization of probabilistic latent semantic analysis (pLSA), differing primarily in that LDA treats the topic mixture as a Dirichlet prior, leading to more reasonable mixtures and less susceptibility to overfitting. Learning the latent topics and their associated probabilities from a corpus is typically done using Bayesian inference, often with methods like Gibbs sampling or variational Bayes. == History == In the context of population genetics, LDA was proposed by J. K. Pritchard, M. Stephens and P. Donnelly in 2000. LDA was applied in machine learning by David Blei, Andrew Ng and Michael I. Jordan in 2003. == Overview == === Population genetics === In population genetics, the model is used to detect the presence of structured genetic variation in a group of individuals. The model assumes that alleles carried by individuals under study have origin in various extant or past populations. The model and various inference algorithms allow scientists to estimate the allele frequencies in those source populations and the origin of alleles carried by individuals under study. The source populations can be interpreted ex-post in terms of various evolutionary scenarios. In association studies, detecting the presence of genetic structure is considered a necessary preliminary step to avoid confounding. === Clinical psychology, mental health, and social science === In clinical psychology research, LDA has been used to identify common themes of self-images experienced by young people in social situations. Other social scientists have used LDA to examine large sets of topical data from discussions on social media (e.g., tweets about prescription drugs). Additionally, supervised Latent Dirichlet Allocation with covariates (SLDAX) has been specifically developed to combine latent topics identified in texts with other manifest variables. This approach allows for the integration of text data as predictors in statistical regression analyses, improving the accuracy of mental health predictions. One of the main advantages of SLDAX over traditional two-stage approaches is its ability to avoid biased estimates and incorrect standard errors, allowing for a more accurate analysis of psychological texts. In the field of social sciences, LDA has proven to be useful for analyzing large datasets, such as social media discussions. For instance, researchers have used LDA to investigate tweets discussing socially relevant topics, like the use of prescription drugs and cultural differences in China. By analyzing these large text corpora, it is possible to uncover patterns and themes that might otherwise go unnoticed, offering valuable insights into public discourse and perception in real time. === Musicology === In the context of computational musicology, LDA has been used to discover tonal structures in different corpora. === Machine learning === One application of LDA in machine learning – specifically, topic discovery, a subproblem in natural language processing – is to discover topics in a collection of documents, and then automatically classify any individual document within the collection in terms of how "relevant" it is to each of the discovered topics. A topic is considered to be a set of terms (i.e., individual words or phrases) that, taken together, suggest a shared theme. For example, in a document collection related to pet animals, the terms dog, spaniel, beagle, golden retriever, puppy, bark, and woof would suggest a DOG_related theme, while the terms cat, siamese, Maine coon, tabby, manx, meow, purr, and kitten would suggest a CAT_related theme. There may be many more topics in the collection – e.g., related to diet, grooming, healthcare, behavior, etc. that we do not discuss for simplicity's sake. (Very common, so called stop words in a language – e.g., "the", "an", "that", "are", "is", etc., – would not discriminate between topics and are usually filtered out by pre-processing before LDA is performed. Pre-processing also converts terms to their "root" lexical forms – e.g., "barks", "barking", and "barked" would be converted to "bark".) If the document collection is sufficiently large, LDA will discover such sets of terms (i.e., topics) based upon the co-occurrence of individual terms, though the task of assigning a meaningful label to an individual topic (i.e., that all the terms are DOG_related) is up to the user, and often requires specialized knowledge (e.g., for collection of technical documents). The LDA approach assumes that: The semantic content of a document is composed by combining one or more terms from one or more topics. Certain terms are ambiguous, belonging to more than one topic, with different probability. (For example, the term training can apply to both dogs and cats, but are more likely to refer to dogs, which are used as work animals or participate in obedience or skill competitions.) However, in a document, the accompanying presence of specific neighboring terms (which belong to only one topic) will disambiguate their usage. Most documents will contain only a relatively small number of topics. In the collection, e.g., individual topics will occur with differing frequencies. That is, they have a probability distribution, so that a given document is more likely to contain some topics than others. Within a topic, certain terms will be used much more frequently than others. In other words, the terms within a topic will also have their own probability distribution. When LDA machine learning is employed, both sets of probabilities are computed during the training phase, using Bayesian methods and an expectation–maximization algorithm. LDA is a generalization of older approach of probabilistic latent semantic analysis (pLSA), The pLSA model is equivalent to LDA under a uniform Dirichlet prior distribution. pLSA relies on only the first two assumptions above and does not care about the remainder. While both methods are similar in principle and require the user to specify the number of topics to be discovered before the start of training (as with k-means clustering) LDA has the following advantages over pLSA: LDA yields better disambiguation of words and a more precise assignment of documents to topics. Computing probabilities allows a "generative" process by which a collection of new "synthetic documents" can be generated that would closely reflect the statistical characteristics of the original collection. Unlike LDA, pLSA is vulnerable to overfitting especially when the size of corpus increases. The LDA algorithm is more readily amenable to scaling up for large data sets using the MapReduce approach on a computing cluster. == Model == With plate notation, which is often used to represent probabilistic graphical models (PGMs), the dependencies among the many variables can be captured concisely. The boxes are "plates" representing replicates, which are repeated entities. The outer plate represents documents, while the inner plate represents the repeated word positions in a given document; each position is associated with a choice of topic and word. The variable names are defined as follows: M denotes the number of documents N is number of words in a given document (document i has N i {\displaystyle N_{i}} words) α is the parameter of the Dirichlet prior on the per-document topic distributions β is the parameter of the Dirichlet prior on the per-topic word distribution θ i {\displaystyle \theta _{i}} is the topic distribution for document i φ k {\displaystyle \varphi _{k}} is the word distribution for topic k z i j {\displaystyle z_{ij}} is the topic for the j-th word in document i w i j {\displaystyle w_{ij}} is the specific word. The fact that W is grayed out means that words w i j {\displaystyle w_{ij}} are the only observable variables, and the other variables are latent variables. As proposed in the original paper, a sparse Dirichlet prior can be used to model the to

Application performance engineering

Application performance engineering is a method to develop and test application performance in various settings, including mobile computing, the cloud, and conventional information technology (IT). == Methodology == According to the American National Institute of Standards and Technology, nearly four out of every five dollars spent on the total cost of ownership of an application is directly attributable to finding and fixing issues post-deployment. A full one-third of this cost could be avoided with better software testing. Application performance engineering attempts to test software before it is published. While practices vary among organizations, the method attempts to emulate the real-world conditions that software in development will confront, including network deployment and access by mobile devices. Techniques include network virtualization.

Pepper (cryptography)

In cryptography, a pepper is a secret added to an input such as a password during hashing with a cryptographic hash function. This value differs from a salt in that it is not stored alongside a password hash, but rather the pepper is kept separate using another meachanism, such as a Hardware Security Module. Note that the National Institute of Standards and Technology refers to this value as a secret key rather than a pepper. A pepper is similar in concept to a salt or an encryption key. It is like a salt in that it is a randomized value that is added to a password hash, and it is similar to an encryption key in that it should be kept secret. A pepper performs a comparable role to a salt or an encryption key, but while a salt is not secret (merely unique) and can be stored alongside the hashed output, a pepper is secret and must not be stored with the output. The hash and salt are usually stored in a database, but, if stored, a pepper must be stored separately to prevent it from being obtained by the attacker in case of a database breach. == History == The idea of a site- or service-specific salt (in addition to a per-user salt) has a long history, with Steven M. Bellovin proposing a local parameter in a Bugtraq post in 1995. In 1996 Udi Manber also described the advantages of such a scheme, terming it a secret salt. However, he suggested not storing the value of the secret salt, but instead rediscovering it by trial and error at password verification time. The term pepper has been used, by analogy to salt, but with a variety of meanings. For example, when discussing a challenge-response scheme, pepper has been used for a salt-like quantity, though not used for password storage; it has been used for a data transmission technique where a pepper must be guessed; and even as a part of jokes. The term pepper was proposed for a secret or local parameter stored separately from the password in a discussion of protecting passwords from rainbow table attacks. This usage did not immediately catch on: for example, Fred Wenzel added support to Django password hashing for storage based on a combination of bcrypt and HMAC with separately stored nonces, without using the term. Usage has since become more common. == Types == There are multiple different types of pepper: A shared secret that is common to all users. A randomly-selected number that must be re-discovered on every password input. These mechanisms could be combined with password salting, iterated hashing or even one another. == Shared-secret pepper == Bellovin and Webster suggest prepend a shared secret to the password before hashing, which allows easy use of existing hash functions. For example, consider two users to be added to a database. This table contains two combinations of username and password. The password is not saved, and the 8-byte (64-bit) 44534C70C6883DE2 pepper is saved in a safe place separate from the output values of the hash, in this case SHA256. Unlike the salt, the pepper does not provide protection to users who use the same password, but protects against dictionary attacks, unless the attacker has the pepper value available. Since the same pepper is not shared between different applications, an attacker is unable to reuse the hashes of one compromised database to another. A complete scheme for saving passwords may include both salt and pepper use. For example, it has been suggested to combine the pepper by encrypting salted password hashes, which allows rotation of the pepper. In the case of a shared-secret pepper, a single compromised password (via password reuse or other attack) along with a user's salt can lead to an attack to discover the pepper, rendering it ineffective. If an attacker knows a plaintext password and a user's salt, as well as the algorithm used to hash the password, then discovering the pepper can be a matter of brute forcing the values of the pepper. This is why NIST recommends the secret value be at least 112 bits, so that discovering it by exhaustive search is prohibitively expensive. The pepper must be generated anew for every application it is deployed in, otherwise a breach of one application would result in lowered security of another application. Without knowledge of the pepper, other passwords in the database will be far more difficult to extract from their hashed values, as the attacker would need to guess the password as well as the pepper. A pepper adds security to a database of salts and hashes because unless the attacker is able to obtain the pepper, cracking even a single hash is intractable, no matter how weak the original password. Even with a list of (salt, hash) pairs, an attacker must also guess the secret pepper in order to find the password which produces the hash. The NIST specification for a secret salt suggests using a Password-Based Key Derivation Function (PBKDF) with an approved Pseudorandom Function such as HMAC with SHA-3 as the hash function of the HMAC. The NIST recommendation is also to perform at least 1000 iterations of the PBKDF, and a further minimum 1000 iterations using the secret salt in place of the non-secret salt. == Randomly-selected pepper that must be re-discovered == The aim of this mechanism is to slow down password the password verification step, thus slowing attacks. The aim is similar increasing the iteration count on bcrypt or Argon2, but the mechanism is different. The secret salt or pepper must be rediscovered by the verifier or attacker each time by guessing. In this situation, the password hashing function is calculated using both the password and the pepper. At password storage time, the pepper is chosen randomly from a range between 1 and R, the hash output is calculated using the password and the pepper. The hash output is stored with the username. The pepper is then discarded. At password verification time, the verifier is provided with a username and password to verify. The originally calculated hash is retrieved for the given username, and then the hash of the password and each value between 1 and R is calculated. If any of these hash values match the stored password hash, the password is considered valid. Note, the possible values of the pepper should not be tested in a fixed order known to an attacker, otherwise a timing attack may reveal the pepper. If the password is correct, the correct pepper will be found in R/2 hash evaluations on average. If the password is incorrect, all R values must be tested before the password can be rejected.

Social media and suicide

Since the rise of social media, there have been numerous cases of individuals being influenced towards committing suicide or self-harm through their use of social media, and even of individuals arranging to broadcast suicide attempts, some successful, on social media. Researchers have studied social media and suicide to determine what, if any, risks social media poses in terms of suicide, and to identify methods of mitigating such risks, if they exist. The search for a correlation has not yet uncovered a clear answer. == Background == Suicide is one of the leading causes of death worldwide, and as of 2020, the second leading cause of death in the United States for those aged 15–34. According to the Center for Disease Control and Prevention, suicide was the third leading cause of death among adolescents in the US, from 1999 to 2006. In 2020, people in the US had a suicide rate of 13.5 per 100,000. Suicide was a leading cause of death in the United States accounting for 48,183 deaths in 2021. Suicide rates increased by 30 per cent from 2000 to 2018 and declined in 2019 and 2020. Suicide remains a significant public health issue worldwide, despite prevention efforts and treatments. Suicide has been identified not only as an individual phenomenon but also as being influenced by social and environmental factors. There is growing evidence that online activity has influenced suicide-related behavior. The use of social media throughout the 21st century has grown exponentially. For this reason, there are a variety of sources that are accessible to the public in various forms, especially social media sites such as Facebook, Instagram, Twitter, YouTube, Snapchat, TikTok and many more. Although these platforms were intended to allow people to connect virtually, these platforms can lead to cyber-bullying, insecurity, and emotional distress, and sometimes may influence a person to attempt suicide. Bullying, whether on social media or elsewhere, physical or not, significantly increases victims' risk of suicidal behavior. Since social media was introduced some people have taken their lives as a result of cyberbullying. Furthermore, suicide rates among teenagers have increased from 2010 to 2022 as social media has become something that people interact with more throughout their day-to-day lives. Media algorithms tend to popularize videos and posts to inform the country of the rising trouble, which may create a popular appeal to the young and immature minds of teenagers. This is why, social media could provide higher risks with the promotion of different kinds of pro-suicidal sites, message boards, chat rooms, and forums. Moreover, the Internet not only reports suicide incidents but documents suicide methods (for example, suicide pacts, an agreement between two or more people to kill themselves at a particular time and often by the same lethal means). Therefore, the role the Internet plays, particularly social media, in suicide-related behavior is a topic of growing interest. == Cyberbullying == There is substantial evidence that the Internet and social media can influence suicide-related behavior. Such evidence includes an increase in exposure to graphic content. A research study conducted by Sameer Hinduja and Justin Patchin found a correlation between cyberbullying and suicide. According to their findings, cyber-bullying increases suicidal thoughts by 14.5 percent and suicide attempts by 8.7 percent. Particularly alarming is the fact that children and young people under 25 who are victims of cyberbullying are more than twice as likely to self-harm and engage in suicidal behavior. Overall, teen suicide rates have increased within the past decade.This presents a significant public health concern, with over 40,000 suicides in the United States and nearly one million worldwide annually. Adolescents involved in cyberbullying often downplay its seriousness by calling it a joke or blaming the victim. These moral disengagement strategies can normalize harmful behavior and reduce feelings of guilt. This normalization may increase emotional distress and contribute to risks like depression and suicidal thoughts. Recent data from the Centers for Disease Control and Prevention reveals that 14.9 per cent of teenagers have experienced online bullying, while 13.6 per cent of teenagers have seriously attempted suicide. Both of these incidents are in increasing numbers in the United States. Furthermore, in numerous recent incidents, cyber-bullying led the victim to commit suicide; this phenomenon is now known as cyberbullicide. Many parents and children are unaware of the dangers and potential legal consequences of cyberbullying. As a response, anti-bullying regulations implemented by schools aim to prevent any form of bullying, including through technology, and protect students from online harassment. While some states have enacted laws against cyberbullying, there are currently no federal regulations addressing this issue. == Social media's influence on suicide == The media may portray suicidal behavior or language which can potentially influence people to act on these suicidal ideation. This may include news reports of actual suicides that have occurred or television shows and films that reenact suicides. Some organizations have proposed guidelines about how the media should report suicide. There is evidence that compliance with the guidelines varies. Some research showed that it is unclear whether the guidelines have successfully reduced the number of suicides. On the contrary, other research studies stated that the guidelines have worked in some cases. == Impact of pro-suicidal sites, message boards, chat rooms and forums == Social media platforms have transformed traditional methods of communication by allowing instantaneous and interactive sharing of information created and controlled by individuals, groups, organizations, and governments. As of the third quarter of 2022, Facebook had 266 million monthly active users, between Canada and the US. An immense quantity of information on the topic of suicide is available on the Internet and via social media. The information available on social media on the topic of suicide can influence suicidal behavior, both negatively and positively. The social cognitive theory plays a vital role in suicide attempts influenced through social media. This theory is demonstrated when one is influenced by what they see through various processes that form into modeled behaviors. This can be shown when people post their suicide attempts online or promote suicidal behavior in general. Contributors to these social media platforms may also exert peer pressure and encourage others to take their own lives, idolize those who have killed themselves, and facilitate suicide pacts. These pro-suicidal sites reported the following. For example, on a Japanese message board in 2008, it was shared that people can kill themselves using hydrogen sulfide gas. Shortly afterwards, 220 people attempted suicide in this way, and 208 were successful. Biddle et al. conducted a systematic Web search of 12 suicide-associated terms (e.g., suicide, suicide methods, how to kill yourself, and best suicide methods) to analyze the search results, and found that pro-suicide sites and chat rooms that discussed general issues associated with suicide most often occurred within the first few hits of a search. In another study, 373 suicide-related websites were found using Internet search engines and examined. Among them, 31% were suicide-neutral, 29% were anti-suicide, and 11% were pro-suicide. Together, these studies have shown that obtaining pro-suicide information on the Internet, including detailed information on suicide methods, is very easy. While social media has been prevalent in young adult suicide, some young adults find comfort and solace through these platforms. Young adults are making connections with people in like situations that are helping them feel less lonely. Although the public opinion is that message boards are harmful, the following studies show how they point to suicide prevention and have positive influences. A study using content analysis analyzed all of the postings on the AOL Suicide Bulletin Board over 11 months and concluded that most contributions contained positive, empathetic, and supportive postings. Then, a multi-method study was able to demonstrate that the users of such forums experience a great deal of social support and only a small amount of social strain. Lastly, in the survey participants were asked to assess the extent of their suicidal thoughts on a 7-level scale (0, absolutely no suicidal thoughts, to 7, very strong suicidal thoughts) for the time directly before their first forum visit and at the time of the survey. The study found a significant reduction after using the forum. The study however cannot conclude the forum is the only reason for the decrease. Together, these studies show how forums can reduce the number of

Classora

Classora is a knowledge base for the Internet oriented to data analysis. From a practical point of view, Classora is a digital repository that stores structured information and allows it to be displayed in multiple formats: analytically, graphically, geographically (through maps); as well as carry out OLAP analysis. The information contained in Classora comes from public sources and is uploaded into the system through bots and ETL processes. The Knowledge Base has a commercial API for semantic enhancement, and an open web through which any user can access to part of the information collected (it also allows users to complete data and share opinions). Internally, Classora is organized into Knowledge Units and Reports. A «Knowledge Unit» is any element of the World about which information may be stored and presented in the form of a data sheet (a person, a company, a country, etc.) A «Report» is a group of Knowledge Units: a ranking of companies, a sport classification table, a survey about people, etc. In fact, one of the technical capabilities of Classora is that it allows the comparison of reports and knowledge units gathered from different sources, thereby generating an added value for the media in which this information is published: digital media, interactive TV, etc. == Key definitions == === Knowledge unit === The units of knowledge (also known as entries) in Classora are data sheets that have a certain semantic equivalence with the articles on the Wikipedia: they store information about any element of the world, be it a film, a country, a company or an animal. However, they differ from Wikipedia in that Classora stores structured information, enriched with a metadata layer; and therefore it is able to automatically interpret the meaning of each unit of knowledge. === Data report === A report is a group of units of knowledge in which the repetition of elements is not allowed. This definition includes any list, poll, ranking, etc.; and, in general, any consultation that involves more than one unit of knowledge. Classora excels at the reports management due to its visualization capabilities, being able to display data in the form of tables, graphs and maps. Types of reports: Sports scores: Sports competitions results sanctioned by the competent institution. Rankings and lists: All types of interesting and curious lists, whether they have an implicit order or not. Polls: Units of knowledge that are ranked according to users’ votes. Queries to the Knowledge Base: Questions from users using CQL. Networks of connections: automatically calculated from the reports and the taxonomy of each Knowledge Unit. === Organizational taxonomy === An organizational taxonomy (also referred to as entry type) is a data sheet that brings together the common attributes of a set of units of knowledge. For instance, the organizational taxonomy F1 Driver displays attributes such as date of debut, team, etc.; and the organizational taxonomy Football Club presents attributes such as city, stadium, etc. In Classora, taxonomies are hierarchically organized, so that they inherit attributes from their parent taxonomies. For instance, F1 Driver is a subsidiary taxonomy of Sportsperson, which is a subsidiary taxonomy of Person, which in turn is a subsidiary taxonomy of Organism. The simplest type of entry in Classora is Classora Object. All the other taxonomies are its subsidiaries and inherit its attributes. In fact, the only attribute Classora Object possesses is name (all units of knowledge are required to have one name at least). == Architecture of Classora == === Data Extraction Module === The Data Extraction Module consists of a set of robots coordinated by software that also manages the potential incidents. Most of the information available in Classora is automatically uploaded through those robots, which connect to the main online public sources to gather all types of data. There are three categories of robots: Extraction robots: responsible for the massive uploading of reports from official public sources (FIFA, CIA, IMF, Eurostat...). They are used for either absolute or incremental data uploading. Data scanner robots: responsible for looking for and updating the data of a unit of knowledge. They use specific sources to perform this task: Wikipedia, IMDB, World Bank, etc. Content aggregators: they don’t connect to external sources. Instead, they generate new information using Classora’s internal database. === Participatory Module === In Classora’s Open Website, Internet users may participate providing their knowledge as they would on the Wikipedia. There are different ways to participate: adding or correcting data in the Knowledge Base, voting in surveys (participatory rankings) and creating new Knowledge Units and Data Reports. === Connectivity Module === The Knowledge Base is designed to be embedded in multi-platform, multi-channel systems, thus enabling its integration into mobile devices, tablets, interactive TV, etc. This integration may be carried out through specific plugins (for navigators or other devices) or an API REST that provides content in XML or JSON formats. The API is divided into three blocks of operations. The first one is the block of general utility tools (ranging from autosuggest components about geographical hierarchies to operations to obtain the list of today’s celebrity birthdays, using CQL). The second one is the block of operations for widget generation (graphs, maps, rankings) using information from the knowledge base. Finally, there is a block of operations designed for the publication of free-source content. == Project statistics == As of April 2012, 2,000,000 Knowledge Units, 15,000 Reports, around 10,000 Maps and several million potential Comparative Analyses had been added to Classora. According to the site of web metrics Alexa, Classora Open Website is ranked at 100,557 globally and at 2,880 in the Spanish traffic ranking. Users spend an average of 9 ½ minutes in Classora.

Software component

A software component is a modular unit of software that encapsulates specific functionality. The desired characteristics of a component are reusability and maintainability. == Value == Components allow software developers to assemble software with reliable parts rather than writing code for every aspect. It makes implementation more like factory assembly than custom building. == Attributes == Desirable attributes of a component include but are not limited to: Cohesive – encapsulates related functionality Reusable Robust Substitutable – can be replaced by another component with the same interface Documented Tested == Third-party == Some components are built in-house by the same organization or team building the software system. Some are third-party, developed elsewhere and assembled into the software system. == Component-based software engineering == For large-scale systems, component-based development encourages a disciplined process to manage complexity. == Framework == Some components conform to a framework technology that allows them to be consumed in a well-known way. Examples include: CORBA, COM, Enterprise JavaBeans, and the .NET Framework. == Modeling == Component design is often modeled visually. In Unified Modeling Language (UML) 2.0 a component is shown as a rectangle, and an interface is shown as a lollipop to indicate a provided interface and as a socket to indicate consumption of an interface. == History == The idea of reusable software components was promoted by Douglas McIlroy in his presentation at the NATO Software Engineering Conference of 1968. (One goal of that conference was to resolve the so-called software crisis of the time.) In the 1970s, McIlroy put this idea into practice with the addition of the pipeline feature to the Unix operating system. Brad Cox refined the concept of a software component in the 1980s. He attempted to create an infrastructure and market for reusable third-party components by inventing the Objective-C programming language. IBM introduced System Object Model (SOM) in the early 1990s. Microsoft introduced Component Object Model (COM) in the early 1990s. Microsoft built many domain-specific component technologies on COM, including Distributed Component Object Model (DCOM), Object Linking and Embedding (OLE), and ActiveX.

Group key

In cryptography, a group key is a cryptographic key that is shared between a group of users. Typically, group keys are distributed by sending them to individual users, either physically, or encrypted individually for each user using either that user's pre-distributed private key. A common use of group keys is to allow a group of users to decrypt a broadcast message that is intended for that entire group of users, and no one else. For example, in the Second World War, group keys (known as "iodoforms", a term invented by a classically educated non-chemist, and nothing to do with the chemical of the same name) were sent to groups of agents by the Special Operations Executive. These group keys allowed all the agents in a particular group to receive a single coded message. In present-day applications, group keys are commonly used in conditional access systems, where the key is the common key used to decrypt the broadcast signal, and the group in question is the group of all paying subscribers. In this case, the group key is typically distributed to the subscribers' receivers using a combination of a physically distributed secure cryptoprocessor in the form of a smartcard and encrypted over-the-air messages.