AI Face Kiss Video

AI Face Kiss Video — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Supermind AI

    Supermind AI

    Supermind is a state-funded Chinese artificial intelligence platform that tracks scientists and researchers internationally. The platform is the flagship project of Shenzhen's International Science and Technology Information Center. It mines data from science and technology databases such as Springer, Wiley, Clarivate and Elsevier. It is intended to detect technological breakthroughs and to identify possible sources of talent as part of China's efforts to advance technologically. The platform also uses government data security and security intelligence organizations such as Peng Cheng Laboratory, the China National GeneBank, BGI Group and the Key Laboratory of New Technologies of Security Intelligence. According to Hong Kong-based Asia Times, the platform, "While not an overt espionage tool...may be used to identify key personnel who could be bribed, deceived or manipulated into divulging classified information". The Organisation for Economic Co-operation and Development (OECD) flagged the project as an incident, meaning it may be of interest to policymakers and other stakeholders. US technology group American Edge Project criticized the project as a global risk of China's security services using the platform to place agents in jobs with access to important information, recruit technical personnel, and identify targets for hacking operations.

  • Social media as a news source

    Social media as a news source

    Social media as a news source is defined as the use of online social media platforms such as Instagram, TikTok, and Facebook, rather than traditional media platforms such as newspapers or live TV, to obtain news. Just as television turned a nation of people who once listened to media content into watchers of media content between the 1950s and the 1980s, the popularity of social media has begun creating a nation of media content creators. Almost half of Americans use social media as a news source, according to the Pew Research Center. As social media's role in news consumption grows, questions have emerged about its impact on knowledge, the formation of echo chambers, and the effectiveness of fact-checking efforts in combating misinformation. Social media platforms allow user-generated content and the sharing of content within one's own virtual network.

    Using social media as a news source allows users to engage with news in a variety of ways, including:
      • Consuming and discovering news
      • Sharing or reposting news
      • Posting one's own photos, videos, or reports of news (i.e., engaging in citizen or participatory journalism)
      • Commenting on news posts

    Using social media as a news source has become an increasingly popular way for people of all age groups to obtain current and important information. As with many other new forms of technology, there are pros and cons: social media affects the world of news and journalism positively in some ways and negatively in others. With this accessibility, people now have more ways to consume false news, biased news, and even disturbing content. In 2019, the Pew Research Center conducted a poll that found Americans are wary about the ways that social media sites share news and certain content.
    This wariness about accuracy grew with the awareness that social media sites could be exploited by bad actors who concoct false narratives and fake news.

    == Relationship to traditional news sources ==

    Unlike traditional news platforms such as newspapers and news shows, social media platforms allow people without professional journalistic backgrounds to create news and cover events that news agencies might not cover. Social media users may read a set of news that differs slightly from what newspaper editors prioritize in the print press. A 2019 study found that Facebook and Twitter users are more likely to share politics, public affairs, and visual media news. Social media users also tend to gravitate towards posting about negative news. A study of tweets found that while optimistic-sounding and neutral-sounding tweets were equally likely to express certainty or uncertainty, pessimistic tweets were nearly twice as likely to appear certain of an outcome as uncertain. These results could imply that posts of a more pessimistic nature that are also written with an air of certainty are more likely to be shared or otherwise permeate groups on Twitter. A similar bias towards negativity has developed on Facebook, where internal memos revealed that an algorithm built to promote "meaningful social interaction" actually incentivized publishers to promote negative and sensational news. Biases towards negativity need to be considered when the utility of new media is addressed, as the potential for human opinion to overemphasize any particular news story is greater despite general improvement.

    In order to compete in this rapidly changing technological environment, traditional news sources have undergone an upheaval, moving into online spaces. The production and circulation of print newspapers have continued to decline globally as the presence of news outlets on social media has increased.
    Prominent platforms such as Twitter and Facebook have been key in engaging users through the integration of journalistic news into their newsfeeds. This feature has now become a foundational part of these apps' interfaces. Social media incentivizes both legacy news brands and individual professional journalists to share their reporting and interact with audiences on social platforms to boost engagement. However, most people who consume news on social media report that accessing news is not their main motivation for being on social media; rather, they see and consume news incidentally. Nonetheless, informational interviews reveal that these consumers rely on being informed through social media. Some news consumers attest that a news brand's participation in social media does not improve their trust in the brand and that more in-depth reporting and more transparency about biases would improve trust instead.

    == Use as a news source ==

    Globally, data from 2020 shows that over 70% of adult participants from Kenya, South Africa, Chile, Bulgaria, Greece, and Argentina used social media for news, while the share among participants from France, the UK, the Netherlands, Germany, and Japan was reportedly less than 40%. According to the Pew Research Center, 20% of adults in the United States in 2018 said they get their news from social media "often," compared to 16% who said they often get news from print newspapers, 26% who often get it from the radio, 33% who often get it from news websites, and 49% who often get it from TV. The same survey found that social media was the most popular way for American adults age 18–29 to get news, the second-to-last most popular way for Americans age 30–49 to get news, and the least popular way for American adults age 50–64 and 65+ to get the news.
    In 2019, the Pew Research Center found that over half of Americans (54%) got their news either "sometimes" or "often" from social media, and Facebook was the most popular social media site where American adults got their news. However, at least 50% of all respondents reported that each of the following was either a "very big problem" or a "moderately big problem" for getting news on social media:
      • One-sided news (83%)
      • Inaccurate news (81%)
      • Censorship of the news (69%)
      • Uncivil discussions about the news (69%)
      • Harassment of journalists (57%)
      • News organizations or personalities being banned (53%)
      • Violent or disturbing news images or videos (51%)

    In a later survey from the same year, the Pew Research Center reported that 18% of American adults said that the most common way they get news about politics and the election was from social media.

    Other sources indicate that, around politics and the 2016 United States presidential election, fake news grew to attract global attention. The same study reports that more than 60 percent of adults receive their news from social media, most commonly from Facebook. The increase in fake news, combined with the large number of adults active on these social media sites, made it much harder for those searching for news to find a source that they considered credible.

    Another study found that adult participants considered their own friends on Facebook to be a more reliable source of information online than a professional news organization. However, news posted online by a news organization was found to be more reliable than news shared by participants' online friends. The study took this to show that participants found news posted on Facebook and other social media much more credible than news spread by other means.
    The study further suggests a potential explanation for these outcomes: the topic of the news article played a part in how participants were affected. This could have shaped the way adult participants interacted with the different news sources, such as their online friends compared to a news organization, principally because, depending on the story, they want to have the correct information about the news from the most credible source.

    === By young people ===

    Social media platforms are some of the most easily accessible forms of news, and as new generations grow up, the technology is only going to grow. The use of social media among younger generations is going to grow alongside it. Technology in the hands of young children can be a concern moving into the future. Globally, there is evidence that, through social media, youth have become more directly involved in protests, in social campaigns, and generally in the sharing of news across multiple platforms. The number of people who use social media platforms such as Twitter, Facebook, Instagram, or Snapchat to seek information has increased significantly in recent years, especially among the younger generation. TikTok is a rapidly expanding platform that young adults can use to find news content on social media. TikTok is one of the sites that young adults and teens use to get news about trending themes and controversial topics. The younger generation accepts without hesitation the information that thei

  • G.9963

    G.9963

    Recommendation G.9963 is a home networking standard under development at the International Telecommunication Union standards sector, the ITU-T. It was begun in 2010 by ITU-T to add multiple-input and multiple-output (known as MIMO) capabilities to the G.hn standard originally defined in Recommendation G.9960. The standard is also known as "G.hn-mimo". As part of the family of G.hn standards, G.9963 was endorsed by the HomeGrid Forum.

  • Knapsack problem

    Knapsack problem

    The knapsack problem is the following problem in combinatorial optimization: Given a set of items, each with a weight and a value, determine which items to include in the collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most valuable items. The problem often arises in resource allocation where the decision-makers have to choose from a set of non-divisible projects or tasks under a fixed budget or time constraint, respectively. The knapsack problem has been studied for more than a century, with early works dating back to 1897. The subset sum problem is a special case of the decision and 0-1 problems where, for each kind of item, the weight equals the value: $w_i = v_i$. In the field of cryptography, the term knapsack problem is often used to refer specifically to the subset sum problem. The subset sum problem is one of Karp's 21 NP-complete problems.

    == Applications ==

    Knapsack problems appear in real-world decision-making processes in a wide variety of fields, such as finding the least wasteful way to cut raw materials, selection of investments and portfolios, selection of assets for asset-backed securitization, and generating keys for the Merkle–Hellman and other knapsack cryptosystems. One early application of knapsack algorithms was in the construction and scoring of tests in which the test-takers have a choice as to which questions they answer. For small examples, it is a fairly simple process to provide the test-takers with such a choice. For example, if an exam contains 12 questions each worth 10 points, the test-taker need only answer 10 questions to achieve a maximum possible score of 100 points. However, on tests with a heterogeneous distribution of point values, it is more difficult to provide choices.
    Feuerman and Weiss proposed a system in which students are given a heterogeneous test with a total of 125 possible points. The students are asked to answer all of the questions to the best of their abilities. Of the possible subsets of problems whose total point values add up to 100, a knapsack algorithm would determine which subset gives each student the highest possible score. A 1999 study of the Stony Brook University Algorithm Repository showed that, out of 75 algorithmic problems related to the field of combinatorial algorithms and algorithm engineering, the knapsack problem was the 19th most popular and the third most needed after suffix trees and the bin packing problem.

    == Definition ==

    The most common problem being solved is the 0-1 knapsack problem, which restricts the number $x_i$ of copies of each kind of item to zero or one. Given a set of $n$ items numbered from 1 up to $n$, each with a weight $w_i$ and a value $v_i$, along with a maximum weight capacity $W$,

    maximize $\sum_{i=1}^{n} v_i x_i$
    subject to $\sum_{i=1}^{n} w_i x_i \leq W$ and $x_i \in \{0, 1\}$.

    Here $x_i$ represents the number of instances of item $i$ to include in the knapsack. Informally, the problem is to maximize the sum of the values of the items in the knapsack so that the sum of the weights is less than or equal to the knapsack's capacity.
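The 0-1 formulation can be made concrete with a small table-filling sketch. It is only an illustration, assuming non-negative integer weights and an integer capacity; the function name `knapsack_01` and the sample numbers are made up for this example. The table entry `best[i][c]` holds the largest value reachable with the first `i` items and capacity `c`, and each item is either skipped ($x_i = 0$) or taken once ($x_i = 1$).

```python
from typing import List, Tuple


def knapsack_01(weights: List[int], values: List[int], capacity: int) -> Tuple[int, List[int]]:
    """Return (best total value, chosen item indices) for the 0-1 knapsack problem.

    Assumes non-negative integer weights and a non-negative integer capacity.
    """
    n = len(weights)
    # best[i][c] = max value using only the first i items with capacity c
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # x_i = 0: skip item i
            if w <= c:
                # x_i = 1: take item i if that gives a larger value
                best[i][c] = max(best[i][c], best[i - 1][c - w] + v)

    # Walk back through the table to recover which x_i were set to 1.
    chosen, c = [], capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen


if __name__ == "__main__":
    # Hypothetical instance: five items, capacity 15.
    total, items = knapsack_01([12, 2, 1, 1, 4], [4, 2, 1, 2, 10], 15)
    print(total, items)  # 15 [1, 2, 3, 4]
```

The table has $(n+1)(W+1)$ entries, so the running time grows with the numeric value of $W$ rather than with the number of digits needed to write it down.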
    The bounded knapsack problem (BKP) removes the restriction that there is only one of each item, but restricts the number $x_i$ of copies of each kind of item to a maximum non-negative integer value $c$:

    maximize $\sum_{i=1}^{n} v_i x_i$
    subject to $\sum_{i=1}^{n} w_i x_i \leq W$ and $x_i \in \{0, 1, 2, \dots, c\}$.

    The unbounded knapsack problem (UKP) places no upper bound on the number of copies of each kind of item and can be formulated as above, except that the only restriction on $x_i$ is that it is a non-negative integer:

    maximize $\sum_{i=1}^{n} v_i x_i$
    subject to $\sum_{i=1}^{n} w_i x_i \leq W$ and $x_i \in \mathbb{N}$.

    One example of the unbounded knapsack problem is given using the figure shown at the beginning of this article and the text "if any number of each book is available" in the caption of that figure.

    == Computational complexity ==

    The knapsack problem is interesting from the perspective of computer science for many reasons:
      • The decision problem form of the knapsack problem (Can a value of at least V be achieved without exceeding the weight W?) is NP-complete, thus there is no known algorithm that is both correct and fast (polynomial-time) in all cases.
      • There is no known polynomial algorithm which can tell, given a solution, whether it is optimal (which would mean that there is no solution with a larger V). This problem is co-NP-complete.
      • There is a pseudo-polynomial time algorithm using dynamic programming.
      • There is a fully polynomial-time approximation scheme, which uses the pseudo-polynomial time algorithm as a subroutine, described below.
      • Many cases that arise in practice, and "random instances" from some distributions, can nonetheless be solved exactly.
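To illustrate what "pseudo-polynomial" means in the list above, here is a minimal dynamic-programming sketch for the unbounded variant. It assumes positive integer weights and an integer capacity; the function name `unbounded_knapsack` and the sample instance are illustrative only. The loop does work proportional to $n \cdot W$, which is polynomial in the numeric value of $W$ but exponential in the number of bits used to encode it.

```python
from typing import List


def unbounded_knapsack(weights: List[int], values: List[int], capacity: int) -> int:
    """Best total value when any number of copies of each item may be used.

    Assumes positive integer weights and a non-negative integer capacity.
    """
    # m[c] = best value achievable with total weight at most c
    m = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for w, v in zip(weights, values):
            if w <= c:
                # Use one more copy of this item on top of the best
                # solution for the remaining capacity c - w.
                m[c] = max(m[c], m[c - w] + v)
    return m[capacity]


if __name__ == "__main__":
    # Hypothetical instance: three copies of the (weight 4, value 10) item
    # plus three copies of the (weight 1, value 2) item fill capacity 15.
    print(unbounded_knapsack([12, 2, 1, 1, 4], [4, 2, 1, 2, 10], 15))  # 36
```

Doubling the capacity doubles the work even though it adds only one bit to the input, which is why this does not contradict NP-completeness.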
    There is a link between the "decision" and "optimization" problems in that if there exists a polynomial algorithm that solves the "decision" problem, then one can find the maximum value for the optimization problem in polynomial time by applying this algorithm iteratively while increasing the value of k. On the other hand, if an algorithm finds the optimal value of the optimization problem in polynomial time, then the decision problem can be solved in polynomial time by comparing the value of the solution output by this algorithm with the value of k. Thus, both versions of the problem are of similar difficulty.

    One theme in research literature is to identify what the "hard" instances of the knapsack problem look like, or viewed another way, to identify what properties of instances in practice might make them more amenable than their worst-case NP-complete behaviour suggests. The goal in finding these "hard" instances is for their use in public-key cryptography systems, such as the Merkle–Hellman knapsack cryptosystem. More generally, better understanding of the structure of the space of instances of an optimization problem helps to advance the study of the particular problem and can improve algorithm selection.

    Furthermore, the hardness of the knapsack problem notably depends on the form of the input. If the weights and profits are given as integers, it is weakly NP-complete, while it is strongly NP-complete if the weights and profits are given as rational numbers. However, in the case of rational weights and profits it still admits a fully polynomial-time approximation scheme.

    === Unit-cost models ===

    The NP-hardness of the knapsack problem relates to computational models in which the size of integers matters (such as the Turing machine). In contrast, decision trees count each decision as a single step.
    Dobkin and Lipton show a $\frac{1}{2}n^{2}$ lower bound on linear decision trees for the knapsack problem, that is, trees where decision nodes test the sign of affine functions. This was generalized to algebraic decision trees by Steele and Yao. If the elements in the problem are real numbers or rationals, the decision-tree lower bound extends to the real random-access machine model with an instruction set that includes addition, subtraction and multiplication of real numbers, as well as comparison and either division or remaindering ("floor"). This model covers more algorithms than the algebraic decision-tree model, as it encompasses algorithms that use indexing into tables. However, in this model all program steps are counted, not just decisions. An upper bound for a decision-tree model was given by Meyer auf der Heide, who showed that for every $n$ there exists an $O(n^{4})$-deep linear decision tree that solves the subset-sum problem with $n$ items. Note that this does not imply any upper bound for an algorithm that should solve the problem for any given $n$.

    == Solving ==

    Several algorithms are available to solve knapsack problems, based on the dynamic programming approach, the branch and bound approach or hybridizations of both approaches.

    === Dynamic programming in-advance algorithm ===

    The unbounded knapsack problem (UKP) places no restriction on the number of copies of each kind of item. Besides, here we assume that $x_i > 0$:

    $m[w'] = \max\left(\sum_{i=1}^{n} v_i x_i\right)$ subject to ∑

  • FMLLR

    FMLLR

    In signal processing, Feature space Maximum Likelihood Linear Regression (fMLLR) is a global feature transform that is typically applied in a speaker-adaptive way: fMLLR transforms acoustic features into speaker-adapted features by multiplying them with a transformation matrix. In some literature, fMLLR is also known as Constrained Maximum Likelihood Linear Regression (cMLLR).

    == Overview ==

    fMLLR transformations are trained in a maximum likelihood sense on adaptation data. These transformations may be estimated in many ways, but only maximum likelihood (ML) estimation is considered in fMLLR. The fMLLR transformation is trained on a particular set of adaptation data, such that it maximizes the likelihood of that adaptation data given a current model-set. This technique is a widely used approach for speaker adaptation in HMM-based speech recognition. Later research also shows that fMLLR is an excellent acoustic feature for DNN/HMM hybrid speech recognition models.

    The advantages of fMLLR include the following:
      • The adaptation process can be performed within a pre-processing phase, and is independent of the ASR training and decoding process.
      • This type of adapted feature can be applied to deep neural networks (DNN) to replace the traditionally used mel-spectrogram in end-to-end speech recognition models.
      • fMLLR's speaker adaptation process leads to a significant performance boost for ASR models, hence outperforming other transforms or features such as MFCCs (Mel-Frequency Cepstral Coefficients) and FBANKs (filter bank coefficients).
      • fMLLR features can be efficiently realized with speech toolkits like Kaldi.

    The major problem and disadvantage of fMLLR is that, when the amount of adaptation data is limited, the transformation matrices tend to easily overfit the given data.
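The multiplication with a transformation matrix described above can be sketched in a few lines. This is a minimal illustration rather than Kaldi's implementation: it assumes the transform is already estimated and stored as a single matrix `W` with D rows and D + 1 columns, applied to each D-dimensional feature frame with a 1 appended, so that the last column acts as an offset. The function name `apply_affine_transform` and the toy numbers are hypothetical; estimating `W` from adaptation data is not shown.

```python
from typing import List, Sequence

Vector = List[float]


def apply_affine_transform(W: Sequence[Sequence[float]], frames: Sequence[Sequence[float]]) -> List[Vector]:
    """Apply y = W @ [x; 1] to every feature frame x.

    W has D rows and D + 1 columns: the first D columns act as a matrix A
    and the last column as an offset b, so y = A x + b.
    """
    adapted = []
    for x in frames:
        x_ext = list(x) + [1.0]  # feature vector with a 1 appended
        if any(len(row) != len(x_ext) for row in W):
            raise ValueError("each row of W must have len(x) + 1 entries")
        adapted.append([sum(w * xi for w, xi in zip(row, x_ext)) for row in W])
    return adapted


if __name__ == "__main__":
    # Hypothetical 2-dimensional example: scale by 2 and shift by (1, -1).
    W = [[2.0, 0.0, 1.0],
         [0.0, 2.0, -1.0]]
    frames = [[1.0, 2.0], [0.5, 0.0]]
    print(apply_affine_transform(W, frames))  # [[3.0, 3.0], [2.0, -1.0]]
```

In a speaker-adaptive setup, one such matrix would be kept per speaker and applied to all of that speaker's frames before they reach the acoustic model.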

    == Computing fMLLR transform ==

    The fMLLR feature transform can be easily computed with the open source speech tool Kaldi. The Kaldi script uses the standard estimation scheme described in Appendix B of the original paper, in particular the section Appendix B.1 "Direct method over rows". In the Kaldi formulation, fMLLR is an affine feature transform of the form $x \to Ax + b$, which can be written in the form $x \to W\hat{x}$, where $\hat{x} = \begin{bmatrix} x \\ 1 \end{bmatrix}$ is the acoustic feature $x$ with a 1 appended. Note that this differs from some of the literature where the 1 comes first, as $\hat{x} = \begin{bmatrix} 1 \\ x \end{bmatrix}$.

    The sufficient statistics stored are:

    $K = \sum_{t,j,m} \gamma_{j,m}(t) \, \Sigma_{jm}^{-1} \mu_{jm} \, x(t)^{+}$

    where $\Sigma_{jm}^{-1}$ is the inverse covariance matrix, and, for $0 \leq i \leq D$ where $D$ is the feature dimension:

    $G^{(i)} = \sum_{t,j,m} \gamma_{j,m}(t) \left( \frac{1}{\sigma_{j,m}^{2}(i)} \right) x(t)^{+} x(t)^{+T}$

    For a thorough review that explains fMLLR and the commonly used estimation techniques, see the original paper "Maximum likelihood linear transformations for HMM-based speech recognition". Note that the Kaldi script that performs the fMLLR feature transform differs by using a column of the inverse in place of the cofactor row.
    In other words, the factor of the determinant is ignored, as it does not affect the transform result and can cause numerical underflow or overflow.

    == Comparing with other features or transforms ==

    Experimental results show that using the fMLLR feature in speech recognition yields consistent improvement over other acoustic features on various commonly used benchmark datasets (TIMIT, LibriSpeech, etc.). In particular, fMLLR features outperform MFCC and FBANK coefficients, which is mainly due to the speaker adaptation process that fMLLR performs. Phoneme error rates (PER, %) have been reported for the TIMIT test set with various neural architectures. As expected, fMLLR features outperform MFCC and FBANK coefficients regardless of the model architecture used. MLP (multi-layer perceptron) serves as a simple baseline, while RNN, LSTM, and GRU are all well-known recurrent models. The Li-GRU architecture is based on a single gate and thus saves 33% of the computations over a standard GRU model; it also effectively addresses the vanishing gradient problem of recurrent models. As a result, the best performance is obtained with the Li-GRU model on fMLLR features.

    == Extract fMLLR features with Kaldi ==

    fMLLR features can be extracted as reported in the s5 recipe of Kaldi. Kaldi scripts can extract fMLLR features on different datasets; below are the basic example steps to extract fMLLR features from the open source speech corpus LibriSpeech. Note that the instructions below are for the subsets train-clean-100, train-clean-360, dev-clean, and test-clean, but they can be easily extended to support the other sets dev-other, test-other, and train-other-500. These instructions are based on the code provided in a GitHub repository that contains Kaldi recipes for the LibriSpeech corpus. To execute the fMLLR feature extraction process, replace the files under $KALDI_ROOT/egs/librispeech/s5/ with the files in the repository.
    1. Install Kaldi.
    2. Install Kaldiio.
    3. If running on a single machine, change the lines in $KALDI_ROOT/egs/librispeech/s5/cmd.sh that use queue.pl so that they use run.pl instead.
    4. Change the data path in run.sh to your LibriSpeech data path; the directory LibriSpeech/ should be under that path.
    5. Install flac with: sudo apt-get install flac
    6. Run the Kaldi recipe run.sh for LibriSpeech at least until Stage 13 (included); for simplicity, you can use the modified run.sh.
    7. Copy the exp/tri4b/trans. files into exp/tri4b/decode_tgsmall_train_clean_/.
    8. Compute the fMLLR features by running the provided script (also available for download).
    9. Compute the alignments.
    10. Apply CMVN and dump the fMLLR features to new .ark files using the provided script (also available for download).
    11. Use a Python script to convert the Kaldi-generated .ark features to .npy for your own dataloader; an example Python script is provided.

  • Utah Social Media Regulation Act

    Utah Social Media Regulation Act

    S.B. 152 and H.B. 311, collectively known as the Utah Social Media Regulation Act, were social media regulation bills that were passed by the Utah State Legislature in March 2023. The bills would have collectively imposed restrictions on how social networking services serve minors in the state of Utah, including mandatory age verification and age restrictions, as well as restrictions on data collection and on algorithmic recommendations. The Act was intended to take effect in March 2024. However, following a lawsuit over the Act by NetChoice, a tech industry lobby group, the Utah attorney general stated in January 2024 that its implementation had been delayed to October 2024 and that the Act was likely to be repealed and amended. On September 10, 2024, Chief Judge Robert J. Shelby issued a written order granting a request from NetChoice for a preliminary injunction, meaning that Utah will be unable to enforce its social media law as litigation plays out. The ruling was appealed to the 10th Circuit on October 11, 2024 and is awaiting a decision.

    == Provisions ==

    The Act comprises two bills, S.B. 152 and H.B. 311, which respectively regulate access to social network accounts registered to minors, and impose obligations on social networking services to follow design practices that protect the privacy of minors. The bills would apply to social networks with more than 5 million active users in the United States. Social networking services would have been required to verify the age of all users in the state of Utah, or else delete their accounts. The Act does not specify a method of age verification. Users who are under 18 must have consent from a parent or guardian to open an account, and the parent must be able to access the account and its data for monitoring.
    Unless required to comply with state or federal law, social networks were prohibited from collecting data based on the activity of minors, and could not display targeted advertising or algorithmic recommendations of content, users, or groups to minors. A social network must not allow minors to access the service between the hours of 10:30 p.m. and 6:30 a.m. without parental consent. H.B. 311 prohibits social networks from exposing features to minors that cause them to have an "addiction" to the platform; the service must perform quarterly audits, and may be sued by users for harms caused by providing "addictive" features; there is a rebuttable presumption of harm if the plaintiff is 16 or younger. The bills prescribed fines of $2,500 per violation for violations of the provisions of S.B. 152, and up to $250,000 in liabilities (plus fines of $2,500 per user) for violations of the addiction rules.

    == History ==

    The two bills were passed in early March 2023, and signed by Governor Spencer Cox on March 23, 2023. Cox cited studies linking social media addiction to increases in depression and suicide among youth. The bills were originally intended to take effect on March 1, 2024. In the wake of a lawsuit in Arkansas by the trade association NetChoice over a similar bill, state senator and bill author Mike McKell stated that he planned to introduce amendments when the legislature resumed in 2024. In December 2023, NetChoice filed a lawsuit in Utah seeking to block the Act, arguing that its definition of a social network was too vague, and that it "restricts who can express themselves, what can be said, and when and how speech on covered websites can occur, down to the very hours of the day minors can use covered websites. The First Amendment, reinforced by decades of precedent, allows none of this."
    Regarding its age verification requirements, NetChoice argued that "it may not be enough to simply verify the age of whatever person may be listed on a form of identification (even if they have such a record) because that record may not accurately reflect who the individual actually is." The office of the attorney general stated that the state was "reviewing the lawsuit but remains intently focused on the goal of this legislation: Protecting young people from negative and harmful effects of social media use." In January 2024, Attorney General Sean Reyes asked the court to delay a hearing over the bill, stating that its effective date had been delayed to October 2024, and that the legislature planned to repeal and replace the bills. On September 10, 2024, Federal Chief Judge Robert Shelby granted a preliminary injunction to stop enforcement of the law as litigation continues. The injunction was later appealed on October 11, 2024, by the state of Utah, and a court hearing on the appeal was held on November 20, 2025.

  • KLJN Secure Key Exchange

    KLJN Secure Key Exchange

    Random-resistor-random-temperature Kirchhoff-law-Johnson-noise key exchange, also known as RRRT-KLJN or simply KLJN, is an approach for distributing cryptographic keys between two parties that claims to offer unconditional security. This claim, which has been contested, is significant, as the only other key exchange approach claiming to offer unconditional security is quantum key distribution. The KLJN secure key exchange scheme was proposed in 2005 by Laszlo Kish and Granqvist. Its advantage over quantum key distribution is that it can be performed over a metallic wire with just four resistors, two noise generators, and four voltage measuring devices, all of which are low-priced and can be readily manufactured. Its disadvantage is that several attacks against KLJN have been identified, which must be defended against. "Given that the amount of effort and funding that goes into Quantum Cryptography is substantial (some even mock it as a distraction from the ultimate prize which is quantum computing), it seems to me that the fact that classic thermodynamic resources allow for similar inherent security should give one pause," wrote Henning Dekant, the founder of the Quantum Computing Meetup, in April 2013. The Cybersecurity Curricula 2017, a joint project of the Association for Computing Machinery, the IEEE Computer Society, the Association for Information Systems, and the International Federation for Information Processing Technical Committee on Information Security Education (IFIP WG 11.8), recommends teaching the KLJN scheme as part of teaching "Advanced concepts" in its knowledge unit on cryptography.

    Read more →
  • Social advertising (social relationships)

    Social advertising (social relationships)

    Social advertising is advertising that relies on social information or networks in generating, targeting, and delivering marketing communications. Many current examples of social advertising use a particular Internet service to collect social information, establish and maintain relationships with consumers, and deliver communications. For example, the advertising platforms provided by Google, Twitter, and Facebook involve targeting and presenting ads based on relationships articulated on those same services. Social advertising can be part of a broader social media marketing strategy designed to connect with consumers. == Social targeting == Since a pair of consumers connected via a relationship are more likely to be similar than an unconnected pair, information about such relationships can be used to infer characteristics of consumers useful for targeting. For example, predictions of an individual's home location can be improved using geographic information about their peers. Existing advertising platforms can allow advertisers to explicitly target the peers (e.g., Facebook friends, Twitter followers) of consumers who have a known affiliation with their brand. Thus, one reason social advertising is expected to be effective is that social networks encode information about unobserved characteristics of consumers, including their susceptibility to adopt a product and to influence their peers to adopt. Social advertising also targets audiences by demographics inferred from customers' browsing histories. This helps companies understand users' interests and target a specific group of users. Whether the basis is location or personal interest, companies in different categories can lead consumers on social media to rely heavily on their advertisements. This is one of the reasons why social advertising has grown over time. Targeting advertisements at audiences who are real-life stakeholders generally increases the attention an advertised deal receives, which brings more profit for companies. 
In addition, the psychological effects that social media has on its users play a large role in helping advertisers keep their customers online. One of the main reasons users rely on social media is that it is a source of entertainment that gives them a feeling of inclusiveness. To foster that feeling, social advertising aimed at a specific group of users is presented as if it were customized for each user, giving users attention that they do not often feel in the real world. A social signals checker tool can be used to find more information about links. Social signals are metrics that measure how much people interact with content on social media. Likes, shares, and comments each contribute to an overall number that tells search engines such as Google how much people like the content. The more social signals a website gets, the more likely it is to rank higher in Google, for two reasons. First, social media is used by millions of people every day, and content that is shared and interacted with on these sites is shown to be worth seeing. Second, social media sites are highly trusted by Google, so content that is seen and interacted with on these platforms starts from a strong position. == Social cues in advertisements == Social ads often include information about the affiliation of a peer with an advertised entity. For example, a social ad might indicate a friend has endorsed a product, highly rated a restaurant, or watched a particular film. In fact, some definitions make these personalized social signals a necessary condition for the advertising being social advertising. Inclusion of personalized social signals creates a channel for social influence. 
Experiments that remove peers' names or images from social advertisements provide evidence that their presence increases proximal outcomes (e.g., clicks on advertisements). This is how trends are started on social media. Since social media links a single profile to thousands of other accounts, some belonging to real-life friends or acquaintances, a user's opinions of and biases toward other users who are already customers of an advertiser in the feed can heavily affect whether the user clicks on the advertisement. As this pattern continues, the brand benefits from more customers, profit, and attention. Social networking can spread rapidly because 71 percent of the world's population contributes to and uses social media, which means social advertising gives companies a better marketing technique than a physical poster advertisement. == Word of mouth == Advertisers often attempt to use word of mouth to affect consumers and their decisions to adopt products and services. Ads and other inducements targeted at a seed set of individuals can be designed to produce a larger cascade of adoption through influence. Businesses are also using social media to attempt to identify and persuade influential consumers to spread positive messages about their products or services. Consequently, users start talking to each other not only on social platforms but also in physical settings. When individuals develop an intimate relationship with each other, it is often based on shared characteristics, interests, and personalities. If one social media user becomes a regular customer of a well-known company that advertises often, the people who have close relationships with that customer are more likely to be exposed to the online advertisement than a user who is completely new to the advertised brand. 
In reality, this happens not to one user but to most users, which means a single online brand advertisement has the potential to be talked about by billions of people around the globe. == Relationship marketing == To conduct relationship marketing accurately, businesses must develop and manage six marketplaces: internal, customer, referral, supplier, influencer and employee. As part of maintaining relationship marketing, brands often give social media influencers free sponsorships or PR boxes in exchange for advertising their products, which customers often see. At times, users who become customers through these influencers get a better deal than regular customers, a very commonly used marketing technique. As a result, users think they are receiving special treatment when in reality the arrangement mainly benefits influencers and brands. Brands that are just starting out in particular use this technique to get their names known and to get people talking, which is their initial goal.

    Read more →
  • History of natural language processing

    History of natural language processing

    The history of natural language processing describes the advances of natural language processing. There is some overlap with the history of machine translation, the history of speech recognition, and the history of artificial intelligence. == Early history == The history of machine translation dates back to the seventeenth century, when philosophers such as Leibniz and Descartes put forward proposals for codes which would relate words between languages. All of these proposals remained theoretical, and none resulted in the development of an actual machine. The first patents for "translating machines" were applied for in the mid-1930s. One proposal, by Georges Artsrouni, was simply an automatic bilingual dictionary using paper tape. The other proposal, by Peter Troyanskii, a Russian, was more detailed. Troyanskii’s proposal included both the bilingual dictionary and a method for dealing with grammatical roles between languages, based on Esperanto. == Logical period == In 1950, Alan Turing published his famous article "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence. This criterion depends on the ability of a computer program to impersonate a human in a real-time written conversation with a human judge, sufficiently well that the judge is unable to distinguish reliably — on the basis of the conversational content alone — between the program and a real human. In 1957, Noam Chomsky’s Syntactic Structures revolutionized Linguistics with 'universal grammar', a rule-based system of syntactic structures. The Georgetown experiment in 1954 involved fully automatic translation of more than sixty Russian sentences into English. The authors claimed that within three or five years, machine translation would be a solved problem. 
However, real progress was much slower, and after the ALPAC report in 1966, which found that ten years of research had failed to fulfill expectations, funding for machine translation was dramatically reduced. Little further research in machine translation was conducted until the late 1980s, when the first statistical machine translation systems were developed. A notably successful NLP system developed in the 1960s was SHRDLU, a natural language system working in restricted "blocks worlds" with restricted vocabularies. In 1969 Roger Schank introduced the conceptual dependency theory for natural language understanding. This model, partially influenced by the work of Sydney Lamb, was extensively used by Schank's students at Yale University, such as Robert Wilensky, Wendy Lehnert, and Janet Kolodner. In 1970, William A. Woods introduced the augmented transition network (ATN) to represent natural language input. Instead of phrase structure rules, ATNs used an equivalent set of finite-state automata that were called recursively. ATNs and their more general format, called "generalized ATNs", continued to be used for a number of years. During the 1970s many programmers began to write 'conceptual ontologies', which structured real-world information into computer-understandable data. Examples are MARGIE (Schank, 1975), SAM (Cullingford, 1978), PAM (Wilensky, 1978), TaleSpin (Meehan, 1976), QUALM (Lehnert, 1977), Politics (Carbonell, 1979), and Plot Units (Lehnert, 1981). During this time, many chatterbots were written, including PARRY, Racter, and Jabberwacky. == Statistical period == Up to the 1980s, most NLP systems were based on complex sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in NLP with the introduction of machine learning algorithms for language processing. 
This was due both to the steady increase in computational power resulting from Moore's law and the gradual lessening of the dominance of Chomskyan theories of linguistics (e.g. transformational grammar), whose theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-learning approach to language processing. Some of the earliest-used machine learning algorithms, such as decision trees, produced systems of hard if-then rules similar to existing hand-written rules. Increasingly, however, research has focused on statistical models, which make soft, probabilistic decisions based on attaching real-valued weights to the features making up the input data. The cache language models upon which many speech recognition systems now rely are examples of such statistical models. Such models are generally more robust when given unfamiliar input, especially input that contains errors (as is very common for real-world data), and produce more reliable results when integrated into a larger system comprising multiple subtasks. === Datasets === The emergence of statistical approaches was aided by both increase in computing power and the availability of large datasets. At that time, large multilingual corpora were starting to emerge. Notably, some were produced by the Parliament of Canada and the European Union as a result of laws calling for the translation of all governmental proceedings into all official languages of the corresponding systems of government. Many of the notable early successes occurred in the field of machine translation. In 1993, the IBM alignment models were used for statistical machine translation. Compared to previous machine translation systems, which were symbolic systems manually coded by computational linguists, these systems were statistical, which allowed them to automatically learn from large textual corpora. 
However, these systems do not work well in situations where only small corpora are available, so data-efficient methods continue to be an area of research and development. In 2001, a one-billion-word text corpus scraped from the Internet, referred to as "very very large" at the time, was used for word disambiguation. To take advantage of large, unlabelled datasets, algorithms were developed for unsupervised and self-supervised learning. Generally, this task is much more difficult than supervised learning, and typically produces less accurate results for a given amount of input data. However, there is an enormous amount of non-annotated data available (including, among other things, the entire content of the World Wide Web), which can often make up for the inferior results. == Neural period == Neural language models were developed in the 1990s. In 1990, the Elman network, using a recurrent neural network, encoded each word in a training set as a vector, called a word embedding, and the whole vocabulary as a vector database, allowing it to perform such tasks as sequence prediction that are beyond the power of a simple multilayer perceptron. A shortcoming of static embeddings was that they did not differentiate between multiple meanings of homonyms. Yoshua Bengio developed the first neural probabilistic language model in 2000. Novel algorithms, the availability of larger datasets, and higher processing power made it possible to train larger and larger language models. The attention mechanism was introduced by Bahdanau et al. in 2014. This work laid the foundations for the famous "Attention Is All You Need" paper that introduced the Transformer architecture in 2017. The concept of the large language model (LLM) emerged in the late 2010s. An LLM is a language model trained with self-supervised learning on a vast amount of text. The earliest public LLMs had hundreds of millions of parameters, but this number quickly rose to billions and even trillions. 
In recent years, advancements in deep learning and large language models have significantly enhanced the capabilities of natural language processing, leading to widespread applications in areas such as healthcare, customer service, and content generation.
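The contrast drawn in the statistical period, between hard if-then rules and soft, probabilistic decisions based on real-valued feature weights, can be illustrated with a minimal sketch. The task (deciding whether an ambiguous word such as "book" is a verb), the feature names, and the weights are all hypothetical and chosen by hand for illustration; a real system would learn the weights from a corpus.

```python
import math

def hard_rule(features):
    """Hand-written style rule: one feature decides the outcome outright."""
    return "verb" if features.get("prev_is_to", 0) else "noun"

# Hypothetical real-valued feature weights (assumed, not learned here).
WEIGHTS = {"prev_is_to": 2.0, "prev_is_determiner": -3.0, "next_is_noun": 1.0}
BIAS = -0.5

def prob_verb(features):
    """Soft decision: weighted feature sum squashed to a probability."""
    score = BIAS + sum(WEIGHTS.get(name, 0.0) * value
                       for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

# A clean input where only the "previous word is 'to'" feature fires.
clear_case = {"prev_is_to": 1}
# A noisy input where two conflicting features fire at once.
noisy_case = {"prev_is_to": 1, "prev_is_determiner": 1}

print(hard_rule(clear_case), round(prob_verb(clear_case), 3))
print(hard_rule(noisy_case), round(prob_verb(noisy_case), 3))
```

The hard rule gives the same answer for both inputs, while the weighted model lowers its confidence when conflicting evidence appears, which is one way to read the claim that statistical models are more robust on unfamiliar or erroneous input.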

    Read more →
  • Internet Security Alliance

    Internet Security Alliance

    Internet Security Alliance (ISA) was founded in 2001 as a non-profit collaboration between Carnegie Mellon University's CyLab and Electronic Industries Alliance, a federation of trade associations. The Internet Security Alliance is focused on cyber security, acting as a forum for information sharing and leadership on information security, and lobbying for corporate security interests. == International operations == The Internet Security Alliance operates with a global membership to provide international security for its partners. The organization's membership includes companies located on four continents, and the Executive Committee always includes at least one non-U.S.-based company. The Internet Security Alliance believes that international communication is crucial for long-term greater information security, as it allows for a more realistic approach to addressing the many challenges faced by users of the Internet. == Publications == Published in 2009, The Financial Impact of Cyber Risk is the first known guidance document to attempt to approach the financial impact of cyber risks from the perspective of core business functions. It claims to provide guidance to CFOs and their colleagues responsible for legal issues, business operations and technology, privacy and compliance, risk assessment and insurance, and corporate communications.

    Read more →
  • Sysomos

    Sysomos

    Sysomos Inc. is a Toronto-based social media analytics company owned by Meltwater. The company developed text analytics and machine learning technologies for user-generated content, and served 80% of the top agencies and Fortune 500 companies. == History == Sysomos was founded by Nilesh Bansal and Nick Koudas. The company is a spinoff of the University of Toronto research project BlogScope. The BlogScope project, which started in 2005, resulted in the creation of the underlying content aggregation and analysis engine commercialized by Sysomos. The company raised venture capital in 2008 and was acquired by Marketwire in 2010. The company's original flagship product, Media Analysis Platform (MAP), mines and analyzes content from social media or user-generated content to create a picture of media coverage. Sysomos launched MAP in September 2007 and added Heartbeat to its product suite in 2009. In addition to the two main products, in March 2010 the company released FourWhere, a free location-based social search service that mashes up Foursquare. Heartbeat provides social media monitoring and engagement capabilities to communication professionals, brand managers and customer support groups. In 2013, Heartbeat was extended with publishing components to deliver an end-to-end social media marketing platform. On July 6, 2010, it was announced that Marketwire, a press release distribution company, had acquired Sysomos. After the acquisition, Sysomos founders Nick Koudas and Nilesh Bansal left Sysomos to start Aislelabs. In February 2015, Sysomos split from Marketwired to become an independent company and appointed Adnan Ahmed as the new CEO. In March 2015, the newly independent Sysomos launched a redesign of its Heartbeat product and a new API for its MAP product. In the same year, the company acquired Expion. In September 2016, Peter Heffring was announced as the new CEO. 
In April 2017, Sysomos showcased a new unified platform offering new insights. In April 2018, media monitoring firm Meltwater announced it had acquired Sysomos. The CEO of Sysomos, Peter Heffring, said the company would continue to operate as an independent unit of Meltwater. Heffring was to run the social analytics division of Meltwater. == Reports == The Inside Twitter series of reports is the most extensive third-party survey on Twitter's growth and demographics. Another extensive survey, covering the top 5% of most active Twitter users, found that over 25% of all tweets are machine created. The report also confirms Twitter's international growth. The Inside Facebook Pages report found that only four percent of pages have more than 10,000 fans, 0.76% of pages have more than 100,000 fans, and 0.05% of pages (or 297 in total) have more than a million fans. The Inside YouTube reports focus more on video hosting services and YouTube.

    Read more →
  • Personal web page

    Personal web page

    Personal web pages are World Wide Web pages created by an individual to contain content of a personal nature rather than content pertaining to a company, organization or institution. Personal web pages are primarily used for informative or entertainment purposes but can also be used for personal career marketing (by containing a list of the individual's skills, experience and a CV), social networking with other people with shared interests, or as a space for personal expression. These terms do not usually refer to just a single "page" or HTML file, but to a website—a collection of webpages and related files under a common URL or Web address. In strictly technical terms, a site's actual home page (index page) often only contains sparse content with some catchy introductory material and serves mostly as a pointer or table of contents to the more content-rich pages inside, such as résumés, family, hobbies, family genealogy, a web log/diary ("blog"), opinions, online journals and diaries or other writing, examples of written work, digital audio sound clips, digital video clips, digital photos, or information about a user's other interests. Many personal pages only include information of interest to friends and family of the author. However, some webpages set up by hobbyists or enthusiasts of certain subject areas can be valuable topical web directories. == History == In the 1990s, most Internet service providers (ISPs) provided a free small personal, user-created webpage along with free Usenet News service. These were all considered part of full Internet service. Also several free web hosting services such as GeoCities provided free web space for personal web pages. These free web hosting services would typically include web-based site management and a few pre-configured scripts to easily integrate an input form or guestbook script into the user's site. 
Early personal web pages were often called "home pages" and were intended to be set as a default page in a web browser's preferences, usually by their owner. These pages would often contain links, to-do lists, and other information their author found useful. In the days when search engines were in their infancy, these pages (and the links they contained) could be an important resource in navigating the web. Since the early 2000s, the rise of blogging and the development of user friendly web page designing software made it easier for amateur users who did not have computer programming or website designer training to create personal web pages. Some website design websites provided free ready-made blogging scripts, where all the user had to do was input their content into a template. At the same time, a personal web presence became easier with the increased popularity of social networking services, some with blogging platforms such as LiveJournal and Blogger. These websites provided an attractive and easy-to-use content management system for regular users. Most of the early personal websites were Web 1.0 style, in which a static display of text and images or photos was displayed to individuals who came to the page. About the only interaction that was possible on these early websites was signing the virtual "guestbook". With the collapse of the dot-com bubble in the late 1990s, the ISP industry consolidated, and the focus of web hosting services shifted away from the surviving ISP companies to independent Internet hosting services and to ones with other affiliations. For example, many university departments provided personal pages for professors and television broadcasters provided them for their on-air personalities. These free webpages served as a perquisite ("perk") for staff, while at the same time boosting the Web visibility of the parent organization. 
Web hosting companies either charge a monthly fee, or provide service that is "free" (advertising based) for personal web pages. These are priced or limited according to the total size of all files in bytes on the host's hard drive, or by bandwidth, (traffic), or by some combination of both. For those customers who continue to use their ISP for these services, national ISPs commonly continue to provide both disk space and help including ready-made drop-in scripts. With the rise of Web 2.0-style websites, both professional websites and user-created, amateur websites tended to contain interactive features, such as "clickable" links to online newspaper articles or favourite websites, the option to comment on content displayed on the website, the option to "tag" images, videos or links on the site, the option of "clicking" on an image to enlarge it or find out more information, the option of user participation for website guests to evaluate or review the pages, or even the option to create new user-generated content for others to see. A key difference between Web 1.0 personal webpages and Web 2.0 personal pages was while the former tended to be created by hackers, computer programmers and computer hobbyists, the latter were created by a much wider variety of users, including individuals whose main interests lay in hobbies or topics outside of computers (e.g., indie music fans, political activists, and social entrepreneurs). == Motivations == In a study done by Zinkhan, participants had four main reasons to create personal web pages. First, people use personal web pages as a portrayal of self, in a sense marketing themselves, since creators have the freedom to portray their own identities. Second, personal web pages are a way to interact with people who have similar interests as the creator, possible employers, or colleagues. 
Third, personal web pages can gain social acceptance with groups that the creator is interested in, depending on the information that the creator reveals about themselves. Fourth, personal web pages can give creators a sense of connection to the world since these web pages are public and a way to introduce oneself to other people around the globe. People may maintain personal web pages to serve as a showcase for their professional skills, their creative skills, or the self-promotion of their business, charity or band. The use of personal web pages to display an individual's professional life has become more common in the 21st century. Mary Madden, a researcher on privacy and technology, conducted a study that found a tenth of American jobs require personal web pages that advertise an individual online. Personal web pages have become a source of employers' initial impressions of prospective employees. They can also be used to express opinions on issues ranging from news and politics to movies. Others may use their personal web page as a communication method. For example, an aspiring artist might give out business cards with their personal web page, and invite people to visit their page and see their artwork, "like" their page or sign their guestbook. A personal web page generally gives the owner more control over their presence in search results and how they wish to be viewed online. It also allows more freedom in types and quantity of content than a social network profile offers, and can link various social media profiles with each other. It can be used to correct the record on something, or to clear up potential confusion between the owner and someone with the same name. In the 2010s, some amateur writers, bands and filmmakers released digital versions of their stories, songs and short films online, with the aim of gaining an audience and becoming more well-known. 
While the huge number of aspiring artists posting their work online makes it unlikely for individuals and groups to become popular via the Internet, there are a small number of YouTube stars who were unknown until their online performances garnered them a huge audience. == Sites of academics == Academic professionals (especially at the college and university level), including professors and researchers, are often given online space for creating and storing personal web documents, including personal web pages, CVs and a list of their books, academic papers and conference presentations, on the websites of their employers. This goes back to the early decade of the World Wide Web and its original purpose of providing a quick and easy way for academics to share research papers and data. Researchers may have a personal website to share more information about themselves, about their academic activities and for sharing (unpublished) results of their research. This has been noted as part of the success of open-access repositories such as arXiv.

    Read more →
  • Data remanence

    Data remanence

    Data remanence is the residual representation of digital data that remains even after attempts have been made to remove or erase the data. This residue may result from data being left intact by a nominal file deletion operation, by reformatting of storage media that does not remove data previously written to the media, or through physical properties of the storage media that allow previously written data to be recovered. Data remanence may make inadvertent disclosure of sensitive information possible should the storage media be released into an uncontrolled environment (e.g., thrown in refuse containers or lost). Various techniques have been developed to counter data remanence. These techniques are classified as clearing, purging/sanitizing, or destruction. Specific methods include overwriting, degaussing, encryption, and media destruction. Effective application of countermeasures can be complicated by several factors, including media that are inaccessible, media that cannot effectively be erased, advanced storage systems that maintain histories of data throughout the data's life cycle, and persistence of data in memory that is typically considered volatile. Several standards exist for the secure removal of data and the elimination of data remanence. == Causes == Many operating systems, file managers, and other software provide a facility where a file is not immediately deleted when the user requests that action. Instead, the file is moved to a holding area (i.e. the "trash"), making it easy for the user to undo a mistake. Similarly, many software products automatically create backup copies of files that are being edited, to allow the user to restore the original version, or to recover from a possible crash (autosave feature). 
Even when an explicit deleted file retention facility is not provided or when the user does not use it, operating systems do not actually remove the contents of a file when it is deleted unless they are aware that explicit erasure commands are required, like on a solid-state drive. (In such cases, the operating system will issue the Serial ATA TRIM command or the SCSI UNMAP command to let the drive know to no longer maintain the deleted data.) Instead, they simply remove the file's entry from the file system directory because this requires less work and is therefore faster, and the contents of the file—the actual data—remain on the storage medium. The data will remain there until the operating system reuses the space for new data. In some systems, enough filesystem metadata are also left behind to enable easy undeletion by commonly available utility software. Even when undelete has become impossible, the data, until it has been overwritten, can be read by software that reads disk sectors directly. Computer forensics often employs such software. Likewise, reformatting, repartitioning, or reimaging a system is unlikely to write to every area of the disk, though all will cause the disk to appear empty or, in the case of reimaging, empty except for the files present in the image, to most software. Finally, even when the storage media is overwritten, physical properties of the media may permit recovery of the previous contents. In most cases however, this recovery is not possible by just reading from the storage device in the usual way, but requires using laboratory techniques such as disassembling the device and directly accessing/reading from its components. § Complications below gives further explanations for causes of data remanence. 
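The cause described above, in which deletion removes only the directory entry while the contents stay on the medium until the space is reused, can be sketched with a toy in-memory model. The class `ToyDisk` and its methods are hypothetical and do not correspond to any real file system or API; they only mirror the behaviour described in the text.

```python
class ToyDisk:
    """Toy model of a disk: a directory maps names to block indices."""

    def __init__(self, n_blocks):
        self.blocks = [b""] * n_blocks    # raw storage
        self.directory = {}               # file name -> list of block indices
        self.free = list(range(n_blocks)) # indices available for new data

    def write(self, name, chunks):
        indices = [self.free.pop(0) for _ in chunks]
        for index, chunk in zip(indices, chunks):
            self.blocks[index] = chunk
        self.directory[name] = indices

    def delete(self, name):
        # Only the directory entry is dropped; the blocks are marked free
        # but their contents are left untouched.
        self.free.extend(self.directory.pop(name))

    def raw_scan(self):
        """Read every block directly, as sector-level recovery software would."""
        return [block for block in self.blocks if block]

disk = ToyDisk(n_blocks=4)
disk.write("secret.txt", [b"top ", b"secret"])
disk.delete("secret.txt")

file_is_listed = "secret.txt" in disk.directory
remnants = disk.raw_scan()
print(file_is_listed, remnants)
```

After the delete, the file no longer appears in the directory, yet a raw scan of the blocks still returns its contents. They would only disappear once new data happened to be written into the same blocks.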
== Countermeasures == There are three levels commonly recognized for eliminating remnant data: === Clearing === Clearing is the removal of sensitive data from storage devices in such a way that there is assurance that the data may not be reconstructed using normal system functions or software file/data recovery utilities. The data may still be recoverable, but not without special laboratory techniques. Clearing is typically an administrative protection against accidental disclosure within an organization. For example, before a hard drive is re-used within an organization, its contents may be cleared to prevent their accidental disclosure to the next user. === Purging === Purging or sanitizing is the physical rewrite of sensitive data from a system or storage device done with the specific intent of rendering the data unrecoverable at a later time. Purging, proportional to the sensitivity of the data, is generally done before releasing media beyond control, such as before discarding old media, or moving media to a computer with different security requirements. === Destruction === The storage media is made unusable for conventional equipment. Effectiveness of destroying the media varies by medium and method. Depending on recording density of the media, and/or the destruction technique, this may leave data recoverable by laboratory methods. Conversely, destruction using appropriate techniques is the most secure method of preventing retrieval. == Specific methods == === Overwriting === A common method used to counter data remanence is to overwrite the storage media with new data. This is often called wiping or shredding a disk or file, by analogy to common methods of destroying print media, although the mechanism bears no similarity to these. Because such a method can often be implemented in software alone, and may be able to selectively target only part of the media, it is a popular, low-cost option for some applications. 
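A minimal sketch of software overwriting that targets a single file, using only the Python standard library. The function name `overwrite_file` and its parameters are illustrative assumptions. The sketch assumes the file system writes in place; on media with wear levelling, journaling, or copy-on-write behaviour, the old data may survive elsewhere, so this should not be read as a guaranteed sanitization method.

```python
import os
import tempfile

def overwrite_file(path, patterns=(b"\x00",), chunk_size=1 << 20):
    """Overwrite a file's contents in place, one pass per single-byte pattern."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for pattern in patterns:
            handle.seek(0)
            remaining = size
            while remaining > 0:
                count = min(chunk_size, remaining)
                handle.write(pattern * count)
                remaining -= count
            # Ask the OS to push this pass to the device before the next one.
            handle.flush()
            os.fsync(handle.fileno())

# Usage on a throwaway temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"sensitive data")
    demo_path = tmp.name

overwrite_file(demo_path)  # single pass of zeros
with open(demo_path, "rb") as handle:
    result = handle.read()
os.remove(demo_path)
```

Passing several byte values in `patterns` gives a multi-pass overwrite. The file length is left unchanged, and the file name and other metadata are not touched by this sketch.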
Overwriting is generally an acceptable method of clearing, as long as the media is writable and not damaged. The simplest overwrite technique writes the same data everywhere, often just a pattern of all zeros. At a minimum, this will prevent the data from being retrieved simply by reading from the media again using standard system functions. The UEFI in modern machines may offer an ATA class disk erase function as well. The ATA-6 standard governs secure erase specifications. BitLocker is whole-disk encryption, and an encrypted disk is illegible without the key. Writing a fresh GPT allows a new file system to be established. Blocks will then appear empty, but a direct LBA read of the old contents remains illegible. New data written afterward is unaffected and works normally. In an attempt to counter more advanced data recovery techniques, specific overwrite patterns and multiple passes have often been prescribed. These may be generic patterns intended to eradicate any trace signatures; an example is the seven-pass pattern 0xF6, 0x00, 0xFF, random byte, 0x00, 0xFF, random byte, sometimes erroneously attributed to US standard DOD 5220.22-M. One challenge with overwriting is that some areas of the disk may be inaccessible, due to media degradation or other errors. Software overwrite may also be problematic in high-security environments, which require stronger controls on data commingling than can be provided by the software in use. The use of advanced storage technologies may also make file-based overwrite ineffective (see the related discussion below under § Complications). There are specialized machines and software that are capable of overwriting. The software can sometimes be a standalone operating system specifically designed for data destruction. There are also machines specifically designed to wipe hard drives to the Department of Defense specification DOD 5220.22-M. Writing zeros to each block on hard disks and SSDs has the advantage of allowing the firmware to deploy spare blocks when bad blocks are identified.
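The single-pattern and multi-pass overwrites described above can be sketched in a few lines of Python. This is only an illustration at the file level: the function name `overwrite_file` and its parameters are hypothetical, and on SSDs or on journaling and copy-on-write file systems an in-place write may never reach the blocks that held the original data, so it should not be read as a sanitization tool.

```python
import os


def overwrite_file(path, passes=(b"\x00",), chunk_size=1024 * 1024):
    """Overwrite a file in place, once per entry in `passes`.

    Each entry is a single pattern byte (for example b"\\x00"),
    or None to request a pass of random bytes.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                data = os.urandom(n) if pattern is None else pattern * n
                f.write(data)
                remaining -= n
            # Ask the OS to push this pass to the device before the next one.
            f.flush()
            os.fsync(f.fileno())
```

Calling `overwrite_file("old.bin")` performs one pass of zeros, while `overwrite_file("old.bin", passes=(b"\xff", None))` writes 0xFF and then random bytes. The file keeps its length and its directory entry; only the contents visible through normal reads change.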
BitLocker has the advantage that data is illegible without the key. SeaTools and other tools can erase disks by writing zeros, which is typically done to revive old consumer-class disks; they can also wipe server disks, albeit slowly. Modern disks of 28 TB and larger have an enormous number of LBA48 blocks, and 40 TB and 60 TB disks will take proportionately longer to wipe. ==== Feasibility of recovering overwritten data ==== Peter Gutmann investigated data recovery from nominally overwritten media in the mid-1990s. He suggested magnetic force microscopy may be able to recover such data, and developed specific patterns, for specific drive technologies, designed to counter such recovery. These patterns have come to be known as the Gutmann method. Gutmann's belief in the possibility of data recovery is based on many questionable assumptions and factual errors that indicate a low level of understanding of how hard drives work. Daniel Feenberg, an economist at the private National Bureau of Economic Research, claims that the chances of overwritten data being recovered from a modern hard drive amount to "urban legend". He also points to the "18½-minute gap" Rose Mary Woods created on a tape of Richard Nixon discussing the Watergate break-in. Erased information in the gap has not been recovered, and Feenberg claims doing so would be an easy task compared to recovery of a modern high-density digital signal. As of November 2007, the United States Department of Defense considers overwriting acceptable for clearing magnetic media within the same security area/

    Read more →
  • G.9970

    G.9970

    G.9970 (also known as G.hnta) is a Recommendation developed by ITU-T that describes the generic transport architecture for home networks and their interfaces to a provider's access network. G.9970 was developed by Study Group 15, Question 1. G.9970 received Consent on December 12, 2008 and was Approved on January 13, 2009. == Relationship with G.hn == G.9970 (G.hnta) and G.9960 (G.hn) are two ITU-T Recommendations that address home networking in a complementary manner. While G.9970 addresses layer 3 (network layer) of the home network architecture, G.9960 addresses layers 1 (physical layer) and 2 (data link layer).

    Read more →
  • Honey encryption

    Honey encryption

    Honey encryption is a type of data encryption that "produces a ciphertext, which, when decrypted with an incorrect key as guessed by the attacker, presents a plausible-looking yet incorrect plaintext." == Creators == Ari Juels and Thomas Ristenpart of the University of Wisconsin, the developers of the encryption system, presented a paper on honey encryption at the 2014 Eurocrypt cryptography conference. == Method of protection == A brute-force attack involves repeated decryption with random keys; this is equivalent to picking random plaintexts from the space of all possible plaintexts with a uniform distribution. This is effective because even though the attacker is equally likely to see any given plaintext, most plaintexts are extremely unlikely to be legitimate i.e. the distribution of legitimate plaintexts is non-uniform. Honey encryption defeats such attacks by first transforming the plaintext into a space such that the distribution of legitimate plaintexts is uniform. Thus an attacker guessing keys will see legitimate-looking plaintexts frequently and random-looking plaintexts infrequently. This makes it difficult to determine when the correct key has been guessed. In effect, honey encryption "[serves] up fake data in response to every incorrect guess of the password or encryption key." The security of honey encryption relies on the fact that the probability of an attacker judging a plaintext to be legitimate can be calculated (by the encrypting party) at the time of encryption. This makes honey encryption difficult to apply in certain applications e.g. where the space of plaintexts is very large or the distribution of plaintexts is unknown. It also means that honey encryption can be vulnerable to brute-force attacks if this probability is miscalculated. 
For example, it is vulnerable to known-plaintext attacks: if the attacker has a crib that a plaintext must match to be legitimate, they will be able to brute-force even Honey Encrypted data if the encryption did not take the crib into account. == Example == An encrypted credit card number is susceptible to brute-force attacks because not every string of digits is equally likely. The number of digits can range from 13 to 19, though 16 is the most common. Additionally, it must have a valid IIN and the last digit must match the checksum. An attacker can also take into account the popularity of various services: an IIN from MasterCard is probably more likely than an IIN from Diners Club Carte Blanche. Honey encryption can protect against these attacks by first mapping credit card numbers to a larger space where they match their likelihood of legitimacy. Numbers with invalid IINs and checksums are not mapped at all (i.e. have probability 0 of legitimacy). Numbers from large brands like MasterCard and Visa map to large regions of this space, while less popular brands map to smaller regions, etc. An attacker brute-forcing such an encryption scheme would only see legitimate-looking credit card numbers when they brute-force, and the numbers would appear with the frequency the attacker would expect from the real world. == Application == Juels and Ristenpart aim to use honey encryption to protect data stored on password manager services. Juels stated that "password managers are a tasty target for criminals," and worries that "if criminals get a hold of a large collection of encrypted password vaults they could probably unlock many of them without too much trouble." Hristo Bojinov, CEO and founder of Anfacto, noted that "Honey Encryption could help reduce their vulnerability. But he notes that not every type of data will be easy to protect this way. … Not all authentication or encryption system yield themselves to being honeyed."
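The mapping step described above can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: the four-entry message space and its weights, the 32-bit seed space, and the use of a single SHA-256 hash as the key-derived pad (a real design would use a slow password-based key derivation function and a far richer message model). It is not the construction from the Juels and Ristenpart paper, only the general idea of encoding a message into a uniform seed before encrypting it.

```python
import hashlib
import secrets
from bisect import bisect_right

SEED_BITS = 32
SEED_SPACE = 1 << SEED_BITS

# Hypothetical message space with relative weights (sum to 100).
MESSAGES = [("visa", 50), ("mastercard", 35), ("amex", 10), ("diners", 5)]


def _boundaries():
    """Exclusive upper bound of each message's slice of the seed space."""
    total = sum(w for _, w in MESSAGES)
    bounds, acc = [], 0
    for _, w in MESSAGES:
        acc += w
        bounds.append(acc * SEED_SPACE // total)
    return bounds


BOUNDS = _boundaries()


def encode(message):
    """Map a message to a random seed inside its slice."""
    idx = [m for m, _ in MESSAGES].index(message)
    lo = BOUNDS[idx - 1] if idx else 0
    return lo + secrets.randbelow(BOUNDS[idx] - lo)


def decode(seed):
    """Map any seed back to the message whose slice contains it."""
    return MESSAGES[bisect_right(BOUNDS, seed)][0]


def _pad(key, salt):
    # Simplification: one SHA-256 call stands in for a proper KDF.
    digest = hashlib.sha256(salt + key.encode()).digest()
    return int.from_bytes(digest[: SEED_BITS // 8], "big")


def encrypt(key, message):
    salt = secrets.token_bytes(16)
    return salt, encode(message) ^ _pad(key, salt)


def decrypt(key, salt, ciphertext):
    return decode(ciphertext ^ _pad(key, salt))
```

With `salt, ct = encrypt("correct horse", "amex")`, decrypting under the same key returns `"amex"`, while decrypting under any other key yields an effectively random seed and therefore one of the four brands, drawn roughly in proportion to the weights. The attacker never sees an obviously invalid plaintext that would rule a guessed key out.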

    Read more →