AI Coding Book

AI Coding Book — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Weak artificial intelligence

    Weak artificial intelligence

    Weak artificial intelligence (weak AI) is artificial intelligence that implements a limited part of the mind, or, as narrow AI, artificial narrow intelligence (ANI), is focused on one narrow task. Weak AI is contrasted with strong AI, which can be interpreted in various ways: Artificial general intelligence (AGI): a machine with the ability to apply intelligence to any problem, rather than just one specific problem. Artificial superintelligence (ASI): a machine with a vastly superior intelligence to the average human being. Artificial consciousness: a machine that has consciousness, sentience and mind (John Searle uses "strong AI" in this sense). Narrow AI can be classified as being "limited to a single, narrowly defined task. Most modern AI systems would be classified in this category." Artificial general intelligence is conversely the opposite. == Applications and risks == Some examples of narrow AI are AlphaGo, self-driving cars, robot systems used in the medical field, and diagnostic doctors. Narrow AI systems are sometimes dangerous if unreliable. And the behavior that it follows can become inconsistent. It could be difficult for the AI to grasp complex patterns and get to a solution that works reliably in various environments. This "brittleness" can cause it to fail in unpredictable ways. Narrow AI failures can sometimes have significant consequences. It could for example cause disruptions in the electric grid, damage nuclear power plants, cause global economic problems, and misdirect autonomous vehicles. Medicines could be incorrectly sorted and distributed. Also, medical diagnoses can ultimately have serious and sometimes deadly consequences if the AI is faulty or biased. Simple AI programs have already worked their way into society, oftentimes unnoticed by the public. Autocorrection for typing, speech recognition for speech-to-text programs, and vast expansions in the data science fields are examples. Narrow AI has also been the subject of some controversy, including resulting in unfair prison sentences, discrimination against women in the workplace for hiring, resulting in death via autonomous driving, among other cases. Despite being "narrow" AI, recommender systems are efficient at predicting user reactions based on their posts, patterns, or trends. For instance, TikTok's "For You" algorithm can determine a user's interests or preferences in less than an hour. Some other social media AI systems are used to detect bots that may be involved in propaganda or other potentially malicious activities. == Weak AI versus strong AI == John Searle contests the possibility of strong AI (by which he means conscious AI). He further believes that the Turing test (created by Alan Turing and originally called the "imitation game", used to assess whether a machine can converse indistinguishably from a human) is not accurate or appropriate for testing whether an AI is "strong". Scholars such as Antonio Lieto have argued that the current research on both AI and cognitive modelling are perfectly aligned with the weak-AI hypothesis (that should not be confused with the "general" vs "narrow" AI distinction) and that the popular assumption that cognitively inspired AI systems espouse the strong AI hypothesis is ill-posed and problematic since "artificial models of brain and mind can be used to understand mental phenomena without pretending that that they are the real phenomena that they are modelling" (as, on the other hand, implied by the strong AI assumption).

    Read more →
  • Sentiment analysis

    Sentiment analysis

    Sentiment analysis (also known as opinion mining) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models, such as RoBERTa, more difficult data domains can be analyzed, e.g., news texts where authors typically express their opinion/sentiment less explicitly. == Types == A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level—whether the expressed opinion in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as enjoyment, anger, disgust, sadness, fear, and surprise. Precursors to sentimental analysis include the General Inquirer, which provided hints toward quantifying patterns in text and, separately, psychological research that examined a person's psychological state based on analysis of their verbal behavior. Subsequently, the method described in a patent by Volcani and Fogel, looked specifically at sentiment and identified individual words and phrases in text with respect to different emotional scales. A current system based on their work, called EffectCheck, presents synonyms that can be used to increase or decrease the level of evoked emotion in each scale. Many other subsequent efforts were less sophisticated, using a mere polar view of sentiment, from positive to negative, such as work by Turney, and Pang who applied different methods for detecting the polarity of product reviews and movie reviews respectively. This work is at the document level. One can also classify a document's polarity on a multi-way scale, which was attempted by Pang and Snyder among others: Pang and Lee expanded the basic task of classifying a movie review as either positive or negative to predict star ratings on either a 3- or a 4-star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale). First steps to bringing together various approaches—learning, lexical, knowledge-based, etc.—were taken in the 2004 AAAI Spring Symposium where linguists, computer scientists, and other interested researchers first aligned interests and proposed shared tasks and benchmark data sets for the systematic computational research on affect, appeal, subjectivity, and sentiment in text. Even though in most statistical classification methods, the neutral class is ignored under the assumption that neutral texts lie near the boundary of the binary classifier, several researchers suggest that, as in every polarity problem, three categories must be identified. Moreover, it can be proven that specific classifiers such as the Max Entropy and SVMs can benefit from the introduction of a neutral class and improve the overall accuracy of the classification. There are in principle two ways for operating with a neutral class. Either, the algorithm proceeds by first identifying the neutral language, filtering it out and then assessing the rest in terms of positive and negative sentiments, or it builds a three-way classification in one step. This second approach often involves estimating a probability distribution over all categories (e.g. naive Bayes classifiers as implemented by the NLTK). Whether and how to use a neutral class depends on the nature of the data: if the data is clearly clustered into neutral, negative and positive language, it makes sense to filter the neutral language out and focus on the polarity between positive and negative sentiments. If, in contrast, the data are mostly neutral with small deviations towards positive and negative affect, this strategy would make it harder to clearly distinguish between the two poles. A different method for determining sentiment is the use of a scaling system whereby words commonly associated with having a negative, neutral, or positive sentiment are given an associated number on a −10 to +10 scale (most negative up to most positive) or simply from 0 to a positive upper limit such as +4. This makes it possible to adjust the sentiment of a given term relative to its environment (usually on the level of the sentence). When a piece of unstructured text is analyzed using natural language processing, each concept in the specified environment is given a score based on the way sentiment words relate to the concept and its associated score. This allows movement to a more sophisticated understanding of sentiment, because it is now possible to adjust the sentiment value of a concept relative to modifications that may surround it. Words, for example, that intensify, relax or negate the sentiment expressed by the concept can affect its score. Alternatively, texts can be given a positive and negative sentiment strength score if the goal is to determine the sentiment in a text rather than the overall polarity and strength of the text. There are various other types of sentiment analysis, such as aspect-based sentiment analysis, grading sentiment analysis (positive, negative, neutral), multilingual sentiment analysis and detection of emotions. === Subjectivity/objectivity identification === This task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification. The subjectivity of words and phrases may depend on their context and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Moreover, as mentioned by Su, results are largely dependent on the definition of subjectivity used when annotating texts. However, Pang showed that removing objective sentences from a document before classifying its polarity helped improve performance. Subjective and objective identification, emerging subtasks of sentiment analysis to use syntactic, semantic features, and machine learning knowledge to identify if a sentence or document contains facts or opinions. Awareness of recognizing factual and opinions is not recent, having possibly first presented by Carbonell at Yale University in 1979. The term objective refers to the incident carrying factual information. Example of an objective sentence: 'To be elected president of the United States, a candidate must be at least thirty-five years of age.' The term subjective describes the incident contains non-factual information in various forms, such as personal opinions, judgment, and predictions, also known as 'private states'. In the example down below, it reflects a private states 'We Americans'. Moreover, the target entity commented by the opinions can take several forms from tangible product to intangible topic matters stated in Liu (2010). Furthermore, three types of attitudes were observed by Liu (2010), 1) positive opinions, 2) neutral opinions, and 3) negative opinions. Example of a subjective sentence: 'We Americans need to elect a president who is mature and who is able to make wise decisions.' This analysis is a classification problem. Each class's collections of words or phrase indicators are defined for to locate desirable patterns on unannotated text. For subjective expression, a different word list has been created. Lists of subjective indicators in words or phrases have been developed by multiple researchers in the linguist and natural language processing field states in Riloff et al. (2003). A dictionary of extraction rules has to be created for measuring given expressions. Over the years, in subjective detection, the features extraction progression from curating features by hand to automated features learning. At the moment, automated learning methods can further separate into supervised and unsupervised machine learning. Patterns extraction with machine learning process annotated and unannotated text have been explored extensively by academic researchers. However, researchers recognized several challenges in developing fixed sets of rules for expressions respectably. Much of the challenges in rule development stems from the nature of textual information. Six challenges have been recognized by several researchers: 1) metaphorical expressions, 2) discrepancies in writings, 3) context-sensitive, 4) represented words with fewer usages, 5) time-sensitive, and 6) ever-growing volume. Metaphorical expressions. The text contains metaphoric expression may impact on the performance on the extraction. Besides, metaphors take in different forms, which may have been contribu

    Read more →
  • Cryptosystem

    Cryptosystem

    In cryptography, a cryptosystem is a suite of cryptographic algorithms needed to implement a particular security service, such as confidentiality (encryption). Typically, a cryptosystem consists of three algorithms: one for key generation, one for encryption, and one for decryption. The term cipher (sometimes cypher) is often used to refer to a pair of algorithms, one for encryption and one for decryption. Therefore, the term cryptosystem is most often used when the key generation algorithm is important. For this reason, the term cryptosystem is commonly used to refer to public key techniques; however both "cipher" and "cryptosystem" are used for symmetric key techniques. == Formal definition == Mathematically, a cryptosystem or encryption scheme can be defined as a tuple ( P , C , K , E , D ) {\displaystyle ({\mathcal {P}},{\mathcal {C}},{\mathcal {K}},{\mathcal {E}},{\mathcal {D}})} with the following properties. P {\displaystyle {\mathcal {P}}} is a set called the "plaintext space". Its elements are called plaintexts. C {\displaystyle {\mathcal {C}}} is a set called the "ciphertext space". Its elements are called ciphertexts. K {\displaystyle {\mathcal {K}}} is a set called the "key space". Its elements are called keys. E = { E k : k ∈ K } {\displaystyle {\mathcal {E}}=\{E_{k}:k\in {\mathcal {K}}\}} is a set of functions E k : P → C {\displaystyle E_{k}:{\mathcal {P}}\rightarrow {\mathcal {C}}} . Its elements are called "encryption functions". D = { D k : k ∈ K } {\displaystyle {\mathcal {D}}=\{D_{k}:k\in {\mathcal {K}}\}} is a set of functions D k : C → P {\displaystyle D_{k}:{\mathcal {C}}\rightarrow {\mathcal {P}}} . Its elements are called "decryption functions". For each e ∈ K {\displaystyle e\in {\mathcal {K}}} , there is d ∈ K {\displaystyle d\in {\mathcal {K}}} such that D d ( E e ( p ) ) = p {\displaystyle D_{d}(E_{e}(p))=p} for all p ∈ P {\displaystyle p\in {\mathcal {P}}} . Note; typically this definition is modified in order to distinguish an encryption scheme as being either a symmetric-key or public-key type of cryptosystem. == Examples == A classical example of a cryptosystem is the Caesar cipher. A more contemporary example is the RSA cryptosystem. Another example of a cryptosystem is the Advanced Encryption Standard (AES). AES is a widely used symmetric encryption algorithm that has become the standard for securing data in various applications. Paillier cryptosystem is another example used to preserve and maintain privacy and sensitive information. It is featured in electronic voting, electronic lotteries and electronic auctions.

    Read more →
  • Data Transformation Services

    Data Transformation Services

    Data Transformation Services (DTS) is a Microsoft database tool with a set of objects and utilities to allow the automation of extract, transform and load operations to or from a database. The objects are DTS packages and their components, and the utilities are called DTS tools. DTS was included with earlier versions of Microsoft SQL Server, and was almost always used with SQL Server databases, although it could be used independently with other databases. DTS allows data to be transformed and loaded from heterogeneous sources using OLE DB, ODBC, or text-only files, into any supported database. DTS can also allow automation of data import or transformation on a scheduled basis, and can perform additional functions such as FTPing files and executing external programs. In addition, DTS provides an alternative method of version control and backup for packages when used in conjunction with a version control system, such as Microsoft Visual SourceSafe. DTS has been superseded by SQL Server Integration Services in later releases of Microsoft SQL Server though there was some backwards compatibility and ability to run DTS packages in the new SSIS for a time. == History == In SQL Server versions 6.5 and earlier, database administrators (DBAs) used SQL Server Transfer Manager and Bulk Copy Program, included with SQL Server, to transfer data. These tools had significant shortcomings, and many DBAs used third-party tools such as Pervasive Data Integrator to transfer data more flexibly and easily. With the release of SQL Server 7 in 1998, "Data Transformation Services" was packaged with it to replace all these tools. The concept, design, and implementation of the Data Transformation Services was led by Stewart P. MacLeod (SQL Server Development Group Program Manager), Vij Rajarajan (SQL Server Lead Developer), and Ted Hart (SQL Server Lead Developer). The goal was to make it easier to import, export, and transform heterogeneous data and simplify the creation of data warehouses from operational data sources. SQL Server 2000 expanded DTS functionality in several ways. It introduced new types of tasks, including the ability to FTP files, move databases or database components, and add messages into Microsoft Message Queue. DTS packages can be saved as a Visual Basic file in SQL Server 2000, and this can be expanded to save into any COM-compliant language. Microsoft also integrated packages into Windows 2000 security and made DTS tools more user-friendly; tasks can accept input and output parameters. DTS comes with all editions of SQL Server 7 and 2000, but was superseded by SQL Server Integration Services in the Microsoft SQL Server 2005 release in 2005. == DTS packages == The DTS package is the fundamental logical component of DTS; every DTS object is a child component of the package. Packages are used whenever one modifies data using DTS. All the metadata about the data transformation is contained within the package. Packages can be saved directly in a SQL Server, or can be saved in the Microsoft Repository or in COM files. SQL Server 2000 also allows a programmer to save packages in a Visual Basic or other language file (when stored to a VB file, the package is actually scripted—that is, a VB script is executed to dynamically create the package objects and its component objects). A package can contain any number of connection objects, but does not have to contain any. These allow the package to read data from any OLE DB-compliant data source, and can be expanded to handle other sorts of data. The functionality of a package is organized into tasks and steps. A DTS Task is a discrete set of functionalities executed as a single step in a DTS package. Each task defines a work item to be performed as part of the data movement and data transformation process or as a job to be executed. Data Transformation Services supplies a number of tasks that are part of the DTS object model and that can be accessed graphically through the DTS Designer or accessed programmatically. These tasks, which can be configured individually, cover a wide variety of data copying, data transformation and notification situations. For example, the following types of tasks represent some actions that you can perform by using DTS: executing a single SQL statement, sending an email, and transferring a file with FTP. A step within a DTS package describes the order in which tasks are run and the precedence constraints that describe what to do in the case damage or of failure. These steps can be executed sequentially or in parallel. Packages can also contain global variables which can be used throughout the package. SQL Server 2000 allows input and output parameters for tasks, greatly expanding the usefulness of global variables. DTS packages can be edited, password protected, scheduled for execution, and retrieved by version. == DTS tools == DTS tools packaged with SQL Server include the DTS wizards, DTS Designer, and DTS Programming Interfaces. === DTS wizards === The DTS wizards can be used to perform simple or common DTS tasks. These include the Import/Export Wizard and the Copy of Database Wizard. They provide the simplest method of copying data between OLE DB data sources. There is a great deal of functionality that is not available by merely using a wizard. However, a package created with a wizard can be saved and later altered with one of the other DTS tools. A Create Publishing Wizard is also available to schedule packages to run at certain times. This only works if SQL Server Agent is running; otherwise the package will be scheduled, but will not be executed. === DTS Designer === The DTS Designer is a graphical tool used to build complex DTS Packages with workflows and event-driven logic. DTS Designer can also be used to edit and customize DTS Packages created with the DTS wizard. Each connection and task in DTS Designer is shown with a specific icon. These icons are joined with precedence constraints, which specify the order and requirements for tasks to be run. One task may run, for instance, only if another task succeeds (or fails). Other tasks may run concurrently. The DTS Designer has been criticized for having unusual quirks and limitations, such as the inability to visually copy and paste multiple tasks at one time. Many of these shortcomings have been overcome in SQL Server Integration Services, DTS's successor. === DTS Query Designer === A graphical tool used to build queries in DTS. === DTS Run Utility === DTS Packages can be run from the command line using the DTSRUN Utility. The utility is invoked using the following syntax: dtsrun /S server_name[\instance_name] { {/[~]U user_name [/[~]P password]} | /E } ] { {/[~]N package_name } | {/[~]G package_guid_string} | {/[~]V package_version_guid_string} } [/[~]M package_password] [/[~]F filename] [/[~]R repository_database_name] [/A global_variable_name:typeid=value] [/L log_file_name] [/W NT_event_log_completion_status] [/Z] [/!X] [/!D] [/!Y] [/!C] ] When passing in parameters which are mapped to Global Variables, you are required to include the typeid. This is rather difficult to find on the Microsoft site. Below are the TypeIds used in passing in these values.

    Read more →
  • Isotropic position

    Isotropic position

    In the fields of machine learning, the theory of computation, and random matrix theory, a probability distribution over vectors is said to be in isotropic position if its covariance matrix is proportional to the identity matrix. == Formal definitions == Let D {\textstyle D} be a distribution over vectors in the vector space R n {\textstyle \mathbb {R} ^{n}} . Then D {\textstyle D} is in isotropic position if, for vector v {\textstyle v} sampled from the distribution, E v v T = I d . {\displaystyle \mathbb {E} \,vv^{\mathsf {T}}=\mathrm {Id} .} A set of vectors is said to be in isotropic position if the uniform distribution over that set is in isotropic position. In particular, every orthonormal set of vectors is isotropic. As a related definition, a convex body K {\textstyle K} in R n {\textstyle \mathbb {R} ^{n}} is called isotropic if it has volume | K | = 1 {\textstyle |K|=1} , center of mass at the origin, and there is a constant α > 0 {\textstyle \alpha >0} such that ∫ K ⟨ x , y ⟩ 2 d x = α 2 | y | 2 , {\displaystyle \int _{K}\langle x,y\rangle ^{2}dx=\alpha ^{2}|y|^{2},} for all vectors y {\textstyle y} in R n {\textstyle \mathbb {R} ^{n}} ; here | ⋅ | {\textstyle |\cdot |} stands for the standard Euclidean norm.

    Read more →
  • Cryptographic nonce

    Cryptographic nonce

    In cryptography, a nonce is an arbitrary number that can be used just once in a cryptographic communication. It is often a random or pseudo-random number issued in an authentication protocol to ensure that each communication session is unique, and therefore that old communications cannot be reused in replay attacks. Nonces can also be useful as initialization vectors and in cryptographic hash functions. == Definition == A nonce is an arbitrary number used only once in a cryptographic communication, in the spirit of a nonce word. They are often random or pseudo-random numbers. Many nonces also include a timestamp to ensure exact timeliness, though this requires clock synchronisation between organisations. The addition of a client nonce ("cnonce") helps to improve the security in some ways as implemented in digest access authentication. To ensure that a nonce is used only once, it should be time-variant (including a suitably fine-grained timestamp in its value), or generated with enough random bits to ensure an insignificantly low chance of repeating a previously generated value. Some authors define pseudo-randomness (or unpredictability) as a requirement for a nonce. Nonce is a word dating back to Middle English for something only used once or temporarily (often with the construction "for the nonce"). It descends from the construction "then anes" ("the one [purpose]"). A false etymology claiming it to stand for "number used once" or similar is incorrect. == Usage == === Authentication === Authentication protocols may use nonces to ensure that old communications cannot be reused in replay attacks. For instance, nonces are used in HTTP digest access authentication to calculate an MD5 digest of the password. The nonces are different each time the 401 authentication challenge response code is presented, thus making replay attacks virtually impossible. The scenario of ordering products over the Internet can provide an example of the usefulness of nonces in replay attacks. An attacker could take the encrypted information and—without needing to decrypt—could continue to send a particular order to the supplier, thereby ordering products over and over again under the same name and purchase information. The nonce is used to give 'originality' to a given message so that if the company receives any other orders from the same person with the same nonce, it will discard those as invalid orders. A nonce may be used to ensure security for a stream cipher. Where the same key is used for more than one message and then a different nonce is used to ensure that the keystream is different for different messages encrypted with that key; often the message number is used. Secret nonce values are used by the Lamport signature scheme as a signer-side secret which can be selectively revealed for comparison to public hashes for signature creation and verification. === Hashing === Nonces are used in proof-of-work systems to vary the input to a cryptographic hash function so as to obtain a hash for a certain input that fulfils certain arbitrary conditions. In doing so, it becomes far more difficult to create a "desirable" hash than to verify it, shifting the burden of work onto one side of a transaction or system. For example, proof of work, using hash functions, was considered as a means to combat email spam by forcing email senders to find a hash value for the email (which included a timestamp to prevent pre-computation of useful hashes for later use) that had an arbitrary number of leading zeroes, by hashing the same input with a large number of values until a "desirable" hash was obtained. Similarly, the Bitcoin blockchain hashing algorithm can be tuned to an arbitrary difficulty by changing the required minimum/maximum value of the hash so that the number of bitcoins awarded for new blocks does not increase linearly with increased network computation power as new users join. This is likewise achieved by forcing Bitcoin miners to add nonce values to the value being hashed to change the hash algorithm output. As cryptographic hash algorithms cannot easily be predicted based on their inputs, this makes the act of blockchain hashing and the possibility of being awarded bitcoins something of a lottery, where the first "miner" to find a nonce that delivers a desirable hash is awarded bitcoins.

    Read more →
  • Consumer relationship system

    Consumer relationship system

    Consumer relationship systems (CRS) are specialized customer relationship management (CRM) software applications that are used to handle a company's dealings with its customers. Current consumer relationship systems integrate the software with telephone and call recording systems as well as with corporate systems for input and reporting. Customers can provide input from the company's website directly into the CRS. These systems are popular because they can deliver the 'voice of the consumer' that contributes to product quality improvement and that ultimately increases corporate profits. Consumer relationship systems that provide automated support as well as advanced systems may have artificial intelligence (AI) interfaces that can extract and analyse data collected, or handle basic questions and complaints. == History == The first CRS was developed in the 1980s. In 1981 Michael Wilke and Robert Thornton founded Wilke/Thornton, Inc in Columbus, Ohio, to develop new CRS software.

    Read more →
  • Social recruiting

    Social recruiting

    Social recruiting (social hiring or social media recruitment) is recruiting candidates by using social platforms as talent databases or for advertising. Social recruiting uses social media profiles, blogs, and other Internet sites to find information on candidates. It also uses social media to advertise jobs either through HR vendors or through crowdsourcing where job seekers and others share job openings within their online social networks. Social recruiting's effectiveness and return on investment have been difficult to determine, since applicants do not usually apply through the social channels which first attracted them. In May 2013, Maximum Employment Marketing Group released the Social Recruitment Monitor, which ranks the reach, engagement, and interactivity of employers' social recruiting efforts around the world. == Social recruitment software == The social recruitment software market (a form of e-recruitment) is often included in the wider talent management software sector. Bersin & Associates valued the wider talent management market at over $2bn in 2007. Social recruitment increasingly sits at an intersection of a number of fast-moving areas including social networking, recruitment and now cloud computing. Additionally, mobile recruiting has become another hot topic, especially with the rise in tablet and smartphone usage. In 2012, there was a rise of tech companies using social recruiting applications to find and screen applicants. As more companies saw value in filling jobs by putting them on the social platforms where millions of people spend at least 37 minutes daily, there developed a much larger focus on social recruiting among the talent acquisition community. By mid-2013, many major enterprise companies such as Pepsi, Gap, AIG, and Oracle had begun effectively utilizing social recruiting software, making it clear that large corporations were open to automating or streamlining (and ultimately investing in) their social recruiting processes.

    Read more →
  • Expectation propagation

    Expectation propagation

    Expectation propagation (EP) is a technique in Bayesian machine learning. EP finds approximations to a probability distribution. It uses an iterative approach that uses the factorization structure of the target distribution. It differs from other Bayesian approximation approaches such as variational Bayesian methods. More specifically, suppose we wish to approximate an intractable probability distribution p ( x ) {\displaystyle p(\mathbf {x} )} with a tractable distribution q ( x ) {\displaystyle q(\mathbf {x} )} . Expectation propagation achieves this approximation by minimizing the Kullback–Leibler divergence K L ( p | | q ) {\displaystyle \mathrm {KL} (p||q)} . Variational Bayesian methods minimize K L ( q | | p ) {\displaystyle \mathrm {KL} (q||p)} instead. If q ( x ) {\displaystyle q(\mathbf {x} )} is a Gaussian N ( x | μ , Σ ) {\displaystyle {\mathcal {N}}(\mathbf {x} |\mu ,\Sigma )} , then K L ( p | | q ) {\displaystyle \mathrm {KL} (p||q)} is minimized with μ {\displaystyle \mu } and Σ {\displaystyle \Sigma } being equal to the mean of p ( x ) {\displaystyle p(\mathbf {x} )} and the covariance of p ( x ) {\displaystyle p(\mathbf {x} )} , respectively; this is called moment matching. == Applications == Expectation propagation via moment matching plays a vital role in approximation for indicator functions that appear when deriving the message passing equations for TrueSkill.

    Read more →
  • Instagram egg

    Instagram egg

    The Instagram egg is a photo of an egg posted by the account @world_record_egg on the social media platform Instagram. It became a global phenomenon and an internet meme within days of its publication on 4 January 2019. It is the second most-liked Instagram post and was the most-liked Instagram post from 14 January 2019 until 20 December 2022, when it was overtaken by Lionel Messi's post showing him and his teammates celebrating after Argentina won the 2022 FIFA World Cup. The owner of the account was revealed to be Chris Godfrey, a British advertising creative, who later worked with his two friends Alissa Khan-Whelan and CJ Brown on a Hulu commercial featuring the egg, intended to raise mental health awareness. == Background == The photo was originally taken by Serghei Platanov, who then posted it to Shutterstock on 23 June 2015 with the title "eggs isolated on white background". == History == On 4 January 2019, the @world_record_egg account was created, and posted an image of a bird egg with the caption, "Let's set a world record together and get the most liked post on Instagram. Beating the current world record held by Kylie Jenner (18 million)! We got this." Jenner's previous record, the first photo of her daughter Stormi, had garnered a total of 18.4 million likes. The post quickly reached 18.4 million likes in just under 10 days, becoming the most-liked Instagram post at the time. It then continued to rise over 45 million likes in the next 48 hours, surpassing the "Despacito" music video and taking the world record for the most-liked online post (on any media platform) in history. After the account became verified on 14 January 2019, the post rose in popularity and likes, which snowballed into coverage in various media outlets. By 18 March 2019, the post had accumulated over 53.3 million likes, nearly three times the previous record of 18.4 million. It posted frequent updates for a few days in the form of Instagram Stories. Alongside the like tally, as of January 2023 the post has 3.8 million comments. Several individuals tried to claim that they were the account's creator, the claims being dismissed by "the egg" on Instagram direct messages. On 3 February 2019, the creator of the Instagram egg was revealed by Hulu and The New York Times to be Chris Godfrey, a British advertising creative. Alissa Khan-Whelan, his colleague, was also outed. On 18 January 2019, the account posted a second picture of an egg, almost identical to the first one apart from a small crack at the top left. As of 25 February 2019, the post accumulated 11.8 million likes. On 22 January 2019, the account posted a third picture of an egg, this time having two larger cracks. In less than 25 minutes, the post accumulated 1 million likes, and by 25 February 2019, it had accumulated 9.5 million likes. On 29 January 2019, a fourth picture of an egg was posted to the account which has another large crack on the right hand side, attracting 7.6 million likes by 25 February 2019. On 1 February 2019, a fifth picture of an egg was posted with stitching like that of a football, referencing the upcoming Super Bowl. That post had accumulated 6.5 million likes by 25 February 2019. The account promised that it would reveal what was inside the egg on 3 February, on the subscription video on demand service Hulu. The Hulu Instagram egg reveal was used to promote an animation about a mental health campaign. A caption from the clip read, "Recently I've started to crack, the pressure of social media is getting to me. If you're struggling too, talk to someone." The video was later posted on the @world_record_egg Instagram account, and this post received over 33 million views by May 2019. As of May 2020, it had received over 41 million views. On 16 July 2019, Chris Godfrey (the creator of the account) was listed as one of the top 25 most influential people on the internet. On 20 December 2022, the record for the most-liked Instagram post was surpassed by a post from Argentine footballer Lionel Messi, showing him and his teammates celebrating after winning the 2022 FIFA World Cup with their national team. The world record egg responded to being overtaken in likes by Messi with "Today [Lionel Messi] has taken the crown, for now. But I'm still left with one question… Who is the greatest of all time – Cristiano Ronaldo or Leo Messi?" The account sold to Dubai-based investor Mustafa El Fishawy in April 2024 for an undisclosed seven-figure sum. Reed Smith, who advised Godfrey, Brown, and Khan-Whelan in the transaction, stated they opted to sell it to "focus on new ventures." On 3 June, @world_record_egg posted an egg with the flag of Palestine in support of the country during the Gaza war; the post's caption described it as an "Egg for Peace" and hoped to "set a new world record together and get the most liked post on Instagram for a good cause." == Reception == In response to breaking the world record for the most-liked Instagram post, the account's owner wrote "This is madness. What a time to be alive." Hours later, Jenner posted a video on Instagram of her cracking open an egg and pouring its yolk onto the ground, with the caption: "Take that little egg." Pundits pontificated on the meaning of the egg picture's dominance over social media's "first family". As Vogue observed, tapping a heart pictogram is easy, and eggs are "lovable". More pointedly: [T]he attention economy is a scam based on requiring little to no labor from both producer and consumer despite commanding the most space, and therefore value, in our digital lives... but it very well could be: As a metaphor for the fragility of the influencer ecosystem, the egg has broken the Internet. The significance of the event and its massive republishing are a topic of discussion. A University of Westminster researcher of internet memes compared it to the movement to name a scientific research vessel in the United Kingdom as Boaty McBoatface. The Instagrammer's success is a rare victory for the unpaid viral campaign on social media. "There is a bit of an anti-celebrity revolt here – 'look what we can do with a simple egg'" The researcher suggests that the accomplishment of becoming such a widely heralded unpaid viral post may become increasingly rare, as social networks rely more on paid and business promotion. The post's spread has been characterized as a populist backlash against "consumerism" and is seen by some as a triumph of community over celebrity. However, propelled by their popular success, the creators promised to release 'egg-centric' memorabilia. Hundreds of games based on the Instagram egg have appeared on Apple's App Store. The creators of the Instagram egg also reached a deal to promote Hulu.

    Read more →
  • Tropical cryptography

    Tropical cryptography

    In tropical analysis, tropical cryptography refers to the study of a class of cryptographic protocols built upon tropical algebras. In many cases, tropical cryptographic schemes have arisen from adapting classical (non-tropical) schemes to instead rely on tropical algebras. The case for the use of tropical algebras in cryptography rests on at least two key features of tropical mathematics: in the tropical world, there is no classical multiplication (a computationally expensive operation), and the problem of solving systems of tropical polynomial equations has been shown to be NP-hard. == Basic Definitions == The key mathematical object at the heart of tropical cryptography is the tropical semiring ( R ∪ { ∞ } , ⊕ , ⊗ ) {\displaystyle (\mathbb {R} \cup \{\infty \},\oplus ,\otimes )} (also known as the min-plus algebra), or a generalization thereof. The operations are defined as follows for x , y ∈ R ∪ { ∞ } {\displaystyle x,y\in \mathbb {R} \cup \{\infty \}} : x ⊕ y = min { x , y } {\displaystyle x\oplus y=\min\{x,y\}} x ⊗ y = x + y {\displaystyle x\otimes y=x+y} It is easily verified that with ∞ {\displaystyle \infty } as the additive identity, these binary operations on R ∪ { ∞ } {\displaystyle \mathbb {R} \cup \{\infty \}} form a semiring.

    Read more →
  • Social media measurement

    Social media measurement

    Social media measurement, also called social media controlling, is the management practice of evaluating successful social media communications of brands, companies, or other organizations. Key performance indicators may be measured by extracting information from social media channels, such as blogs, wikis, micro-blogs such as Twitter, social networking sites, or video/photo sharing websites, forums from time to time. It is also used by companies to gauge current trends in the industry. The process first gathers data from different websites and then performs analysis based on different metrics like time spent on the page, click through rate, content share, comments, text analytics to identify positive or negative emotions about the brand. Some other social media metrics include share of voice, owned mentions, and earned mentions. The social media measurement process starts with defining a goal that needs to be achieved and defining the expected outcome of the process. The expected outcome varies per the goal and is usually measured by a variety of metrics. This is followed by defining possible social strategies to be used to achieve the goal. Then the next step is designing strategies to be used and setting up configuration tools that ease the process of collecting the data. In the next step, strategies and tools are deployed in real-time. This step involves conducting Quality Assurance tests of the methods deployed to collect the data. And in the final step, data collected from the system is analyzed and if the need arises, it is refined on the run time to enhance the methodologies used. The last step ensures that the result obtained is more aligned with the goal defined in the first step. == Data Acquisition == Acquiring data from social media is in demand of an exploring the user participation and population with the purpose of retrieving and collecting so many kinds of data(ex: comments, downloads etc.). There are several prevalent techniques to acquire data such as Network traffic analysis, Ad-hoc application and Crawling Network Traffic Analysis - Network traffic analysis is the process of capturing network traffic and observing it closely to determine what is happening in the network. It is primarily done to improve the performance, security and other general management of the network. However concerned about the potential tort of privacy on the Internet, network traffic analysis is always restricted by the government. Furthermore, high-speed links are not adaptable to traffic analysis because of the possible overload problem according to the packet sniffing mechanism Ad-hoc Application - Ad-hoc application is a kind of application that provides services and games to social network users by developing the APIs offered by social network companies (Facebook Developer Platform). The infrastructure of Ad-hoc application allows the user to interact with the interface layer instead of the application servers. The API provides a path for application to access information after the user login. Moreover, the size of the data set collected vary with the popularity of the social media platform i.e. social media platforms having high number of users will have more data than platforms having less user base. Scraping is a process in which the APIs collect online data from social media. The data collected from Scraping is in raw format. However, having access to these types of data is a bit difficult because of its commercial value. Crawling - Crawling is a process in which a web crawler creates indexes of all the words in a web-page, stores them, then follows all the hyperlinks and indexes on that page and again stores them. It is the most popular technique for data acquisition and is also well known for its easy operation based on prevalent Object-Orientated Programming Language (Java or Python etc.). And most important, social network companies (YouTube, Flicker, Facebook, Instagram, etc.) are friendly to crawling techniques by providing public APIs == Applications == === For branding === Monitoring social media allows researchers to find insights into a brand's overall visibility on social media, to measure the impact of campaigns, to identify opportunities for engagement, to assess competitor activity and share of voice, and to detect impending crises. It can also provide valuable information about emerging trends and what consumers and clients think about specific topics, brands or products. This is the work of a cross-section of groups that include market researchers, PR staff, marketing teams, social-engagement, and community staff, agencies and sales teams. Several different providers have developed tools to facilitate the monitoring of a variety of social media channels - from blogging to internet video to internet forums. This allows companies to track what consumers say about their brands and actions. Companies can then react to these conversations and interact with consumers through social media platforms. === In government === Apart from commercial applications, social media monitoring has become a pervasive technique applied by public organizations and governments. Monitoring is a tradition within the public sector, and social-media monitoring provides a real-time approach to detecting and responding to social developments. Governments have come to realize the need for strategies to cope with surprises from the rapid expansion of public issues. Sobkowicz introduced a framework with three blocks of social-media opinion tracking, simulating and forecasting. It includes: real-time detection of emotions, topics and opinions information-flow modelling and agent-based simulation modeling of opinion networks Bekkers introduced the application of social media monitoring in the Netherlands. Public organizations in the Netherlands (such as the Tax Agency and the Education Ministry) have started to use social media monitoring to obtain better insights into the sentiments of target groups. On the one hand, the public sector will be enabled to provide timely and efficient answers to the public by using social media monitoring techniques, but on the other hand, they also have to deal with concerns about ethical issues such as transparency and privacy. == Quantifying social media == Social media management software (SMMS) is an application program or software that facilitates an organization's ability to successfully engage in social media across different communication channels. SMMS is used to monitor inbound and outbound conversations, support customer interaction, audit or document social marketing initiatives and evaluate the usefulness of a social media presence. It can be difficult to measure all social media conversations. Due to privacy settings and other issues, not all social media conversations can be found and reported by monitoring tools. However, whilst social media monitoring cannot give absolute figures, it can be extremely useful for identifying trends and for benchmarking, in addition to the uses mentioned above. These findings can, in turn, influence and shape future business decisions. In order to access social media data (posts, Tweets, and meta-data) and to analyze and monitor social media, many companies use software technologies built for business. These range from in-platform analytics dashboards to dedicated third-party platforms, which offer more advanced capabilities including cross-platform audience intelligence, sentiment analysis, and trend detection at scale. == Location-based == Most social media networks allow users to add a location to their posts (reference all of our feeds). The location can be classified as either 'at-the-location' or 'about-the-location'. "'At-the-location' services can be defined as services where location-based content is created at the geographic location. 'About-the-location' services can be defined as services which are referring to a particular location but the content is not necessarily created in this particular physical place." The added information available from geotagged (link to Geotagging article) posts means that they can be displayed on a map. This means that a location can be used as the start of a social media search rather than a keyword or hashtag. This has major implications for disaster relief, event monitoring, safety and security professionals since a large portion of their job is related to tracking and monitoring specific locations. == Technologies used == Various monitoring platforms use different technologies for social media monitoring and measurement. These technology providers may connect to the API provided by social platforms that are created for 3rd party developers to develop their own applications and services that access data. Facebook's Graph API is one such API that social media monitoring solution products would connect to pull data from. Some social media monitoring and analytics companies use calls to data providers each time an end-user d

    Read more →
  • Instance (computer science)

    Instance (computer science)

    In computer science, an instance or token (from metalogic and metamathematics) is a specific occurrence of a software element that is based on a type definition. When created, an occurrence is said to have been instantiated, and both the creation process and the result of creation are called instantiation. == Examples == Chat AI instance In chat-based AI systems, an assistant can be invoked across many independent conversation sessions (often called a thread), each with its own message history. A specific execution of the assistant over that session may be represented as a run (an execution on a thread). Class instance In object-oriented programming, an object created from a class type. Each instance of a class shares the class-defined structure and behavior but has its own identity and state. Procedural instance In some contexts (including Simula), each procedure call can be viewed as an instance of that procedure—an activation with its own parameters and local variables. Computer instance In cloud computing and virtualization, an instance commonly refers to a provisioned virtual machine or virtual server with an allocated combination of compute, memory, network, and storage resources. Polygonal model In computer graphics, a model may be instanced so it can be drawn multiple times with different transforms and parameters, improving performance by reusing shared geometry data. Program instance In a POSIX-oriented operating system, a running process is an instance of a program. It can be instantiated via system calls such as fork() and exec(). Each executing process is an instance of a program it has been instantiated from.

    Read more →
  • BREACH

    BREACH

    BREACH (a backronym: Browser Reconnaissance and Exfiltration via Adaptive Compression of Hypertext) is a security vulnerability against HTTPS when using HTTP compression. BREACH is built based on the CRIME security exploit. BREACH was announced at the August 2013 Black Hat USA conference by security researchers Angelo Prado, Neal Harris and Yoel Gluck. == Details == While the CRIME attack was presented as a general attack that could work effectively against a large number of protocols, only exploits against SPDY request compression and TLS compression were demonstrated and largely mitigated in browsers and servers. The CRIME exploits against HTTP compression has not been mitigated at all, even though the authors of CRIME have warned that this vulnerability might be even more widespread than SPDY and TLS compression combined. BREACH is an instance of the CRIME attack against HTTP compression—the use of gzip or DEFLATE data compression algorithms via the content-encoding option within HTTP by many web browsers and servers. Given this compression oracle, the rest of the BREACH attack follows the same general lines as the CRIME exploit, by performing an initial blind brute-force search to guess a few bytes, followed by divide-and-conquer search to expand a correct guess to an arbitrarily large amount of content. == Mitigation == BREACH exploits the compression in the underlying HTTP protocol. Therefore, turning off TLS compression makes no difference to BREACH, which can still perform a chosen-plaintext attack against the HTTP payload. As a result, clients and servers are either forced to disable HTTP compression completely (thus reducing performance), or to adopt workarounds to try to foil BREACH in individual attack scenarios, such as using cross-site request forgery (CSRF) protection. Another suggested approach is to disable HTTP compression whenever the referrer header indicates a cross-site request, or when the header is not present. This approach allows effective mitigation of the attack without losing functionality, only incurring a performance penalty on affected requests. Another approach is to add padding at the TLS, HTTP header, or payload level. Around 2013–2014, there was an IETF draft proposal for a TLS extension for length-hiding padding that, in theory, could be used as a mitigation against this attack. It allows the actual length of the TLS payload to be disguised by the insertion of padding to round it up to a fixed set of lengths, or to randomize the external length, thereby decreasing the likelihood of detecting small changes in compression ratio that is the basis for the BREACH attack. However, this draft has since expired without further action. A very effective mitigation is HTB (Heal-the-BREACH) that adds random-sized padding to compressed data, providing some variance in the size of the output contents. This randomness delays BREACH from guessing the correct characters in the secret token by a factor of 500 (10-byte max) to 500,000 (100-byte max). HTB protects all websites and pages in the server with minimal CPU usage and minimal bandwidth increase.

    Read more →
  • Upworthy

    Upworthy

    Upworthy is a media brand that focuses on positive storytelling. It was started in March 2012 by Eli Pariser, the former executive director of MoveOn, and Peter Koechley, the former managing editor of The Onion. One of Facebook's co-founders, Chris Hughes, was an early investor. At its peak between 2012 and 2014, it reached up to 100 million people a month. In 2017, the company was acquired by Good Worldwide. == History == Upworthy was launched in 2012 with a focus on aggregating positive content, which aligned with Facebook's algorithm. Originally, Upworthy curators searched the internet for existing content to feature on the site. Once selected as an option, curators brainstormed different headlines and shareable images for the content, and tested it with a small sample of Upworthy's visitors before sharing it on the site. The site popularized a clickbait style of two-phrase headlines. The company simplifies issues that are controversial by nature, which are presented from a politically liberal point of view and are heavily fact-checked for accuracy. In June 2013, an article in Fast Company called Upworthy "the fastest growing media site of all time". It had 8.7 million unique monthly visitors in the first six months, and in November 2013, had a high of 87 million unique visitors in a single month. In 2013, Facebook changed its algorithm, leading to a significant decline in readers from that platform. Upworthy fired one round of writers in 2015, and another in 2016, after an unionization effort by some of the staff. The union involved, the Writers Guild of America, East, has organized several online "viral" news publishers. In January 2017, Upworthy was acquired by media company GOOD Worldwide. The newsrooms of the two organizations would merge as part of the acquisition. About 20 staffers were laid off as part of the merger. In March 2020, Upworthy saw a 65% increase in Instagram followers and a 47% increased interest in positive content on-site page views as a result of increased interest in positive content during the COVID-19 pandemic. In January 2023, National Geographic Books bought Good People: Stories From the Best of Humanity from Upworthy, with a publication date of September 3, 2024. The book is described as "a heartwarming collection of first-person tales that will provide comfort and inspiration to anyone who could use a little dose of joy right now". It was created by two senior Upworthy team members, Gabriel Reilich and Lucia Knell, and features 101 stories from Upworthy's audience. The co-creators encouraged Upworthy followers to connect with the brand through questions on their posts, opening the door for organic and personal stories to be shared in the comment sections. The book debuted on The New York Times nonfiction bestseller list on September 22, 2024, and remained on the list for two weeks. The book is seen in the top 10 on Publishers Weekly Fall 2024 Adult Preview: Lifestyle and on The Washington Post's "5 feel-good books".

    Read more →