AI Assistant Vs AI Agent


  • Perplexity AI


    Perplexity AI, Inc., or simply Perplexity, is an American privately held software company offering a web search engine that processes user queries and synthesizes responses. Perplexity products use large language models and incorporate real-time web search capabilities, providing responses based on current Internet content, citing sources used. Its real-time search engine is called Sonar and is based on Meta's Llama model. A free public version is available, while a paid Pro subscription offers access to more advanced language models and additional features. Perplexity AI, Inc., was founded in August 2022 by Aravind Srinivas, Denis Yarats, Johnny Ho, and Andy Konwinski. As of September 2025, the company was valued at US$20 billion. Perplexity AI has attracted legal scrutiny over allegations of copyright infringement, unauthorized content use, and trademark issues from several major media organizations, including the BBC, Dow Jones, and The New York Times. According to separate analyses by Wired and later Cloudflare, Perplexity uses undisclosed web crawlers with spoofed user-agent strings to scrape the content of websites which prohibit, or explicitly block, web scraping. == History == In August 2022, Perplexity AI, Inc., was founded by Aravind Srinivas, Denis Yarats, Johnny Ho, and Andy Konwinski, engineers with backgrounds in back-end systems, artificial intelligence (AI) and machine learning. It launched its main search engine on December 7, 2022, and has since released a Google Chrome extension and apps for iOS and Android. In February 2023, Perplexity reported two million unique visitors. By April 2024, Perplexity had raised $165 million in funding, valuing the company at over $1 billion. As of June 2025, Perplexity closed a $500 million round of funding that elevated its valuation to $14 billion. Investors in Perplexity AI have included Jeff Bezos, Tobias Lütke, Nat Friedman, Nvidia, and Databricks. 
Perplexity has also received funding from 1789 Capital, a venture capital firm notable for its association with Donald Trump Jr. During Bloomberg’s Tech Summit 2025, Srinivas shared that the company processed 780 million queries in May 2025, experiencing more than 20% month-over-month growth, processing around 30 million queries daily. In July 2024, Perplexity announced the launch of a new publishers' program to share advertising revenue with partners. On January 18, 2025, the day before the impending U.S. ban on the social media app TikTok, Perplexity submitted a proposal for a merger with TikTok US. On August 12, 2025, Perplexity made a bid to buy Chrome from Google for $34.5 billion. Perplexity stated that the sale could remedy anti-trust litigation against Google, in which a judge was considering compelling the sale of Chrome. In December 2025, Cristiano Ronaldo took an undisclosed stake in Perplexity AI and entered a global brand partnership with the company. === Business Strategy and Finance (2026) === As of early 2026, Perplexity AI reached a valuation of $21.21 billion following its Series E-6 funding round. The company's Annual Recurring Revenue (ARR) grew from $80 million in late 2024 to an estimated $200 million by February 2026. In January 2026, the company entered into a three-year, $750 million commitment with Microsoft Azure to secure the GPU capacity required for its advanced "Deep Research" and "Model Council" features. In February 2026, Perplexity transitioned to a subscription-first model by discontinuing its AI-integrated advertising strategy. Leadership stated the move was intended to preserve user trust in the "answer engine," prioritizing objective results over ad revenue. The company also introduced the "Model Council" feature on February 5, 2026, which allows users to compare outputs from multiple large language models, such as GPT-5.2 and Claude 4.6, simultaneously. 
To expand its user base, Perplexity began offering a free year of Pro access to students, U.S. Military Veterans, and government employees. == Products and services == === Search engine web portal === Perplexity’s primary offering is an online information retrieval system (search engine) that uses large language models to generate responses to user queries by searching and summarizing web-based content. Perplexity offers a feature known as Perplexity Pages that generates structured summaries and report-like content from user queries by aggregating cited sources. Perplexity is available without charge or registration to Web users, a freemium model. === Perplexity Pro === Perplexity Pro is a subscription tier, a more capable paid "enterprise" service, including stronger security and data protection and additional tools, including the ability to search uploaded documents alongside web content and access to a programmatic application programming interface (API). It allows the user to select between backend models such as GPT-5.4, Claude 4.6 and Gemini 3.1 Pro. The company has also developed its own models, Sonar (based on Llama 3.3) and R1 1776 (based on DeepSeek R1). === Internal Knowledge Search === Internal Knowledge Search enables Pro and Enterprise Pro users to simultaneously search across web content and internal documents. Users can upload and search through Excel, Word, PDF, and other common file formats. Enterprise Pro users can upload and index up to 500 files. === Search API === Perplexity's Search API provides AI developers with programmatic access to the company's search infrastructure. The September 2025 release includes a software development kit, an open-source evaluation framework called search_evals, and documentation detailing the API's design and optimization. 
=== Shopping hub === Perplexity's Shopping Hub is an online shopping platform that provides AI-generated product recommendations, and enables users to purchase products directly through Perplexity's interface. It was launched in November 2024 with backing by Amazon and Nvidia. === Finance === In October 2024, Perplexity AI introduced new finance-related features, including looking up stock prices and company earnings data. The tool provides real-time stock quotes and price tracking, industry peer comparisons and basic financial analysis tools. The platform sources its financial data from Financial Modeling Prep. === Assistant === In January 2025, Perplexity launched the Perplexity Assistant, an AI-powered tool designed to enhance the functionality of its search engine. It can perform tasks across multiple apps, such as hailing a ride or searching for a song, and can maintain context across actions. The assistant is also multi-modal, meaning it can use a phone's camera to provide answers about the user's surroundings or on-screen content. Perplexity has acknowledged that the assistant is still in development and may not always function as expected. For instance, certain features, such as summarizing unread emails or upcoming calendar events, require users to enable a workaround based on notifications. === Comet === In July 2025, Perplexity launched Comet, an AI browser based on Chromium. Initially, access to the browser was limited to users subscribed to the most expensive subscription tier. The browser was later released for free download in October 2025. A key feature is integration of the Perplexity search engine, which can perform a variety of tasks such as generating article summaries, describing an image, conducting research about a topic and composing emails. === Truth Social chatbot === Perplexity has been contracted to produce a chatbot for Donald Trump's social media platform Truth Social. 
== Leadership == Aravind Srinivas is the CEO and co-founder of Perplexity AI. He previously held research positions at OpenAI, Google DeepMind, and other AI research institutions focusing on machine learning and artificial intelligence. In a March 2026 All-In episode, Srinivas said the incoming AI-related layoffs were "glorious future" to "look forward", as it freed people from jobs they didn't like and gave them opportunities to pursue entrepreneurship. == Controversies == === Copyright and trademark infringement allegations === In June 2024, Forbes publicly criticized Perplexity for using their content. According to Forbes, Perplexity published a story largely copied from a proprietary Forbes article without mentioning or prominently citing Forbes. In response, Srinivas said that the feature had some "rough edges" and accepted feedback but maintained that Perplexity only "aggregates" rather than plagiarizes information. In October 2024, The New York Times sent a cease-and-desist notice to Perplexity to stop accessing and using NYT content, claiming that Perplexity is violating its copyright by scraping data from its website. In June 2024, Dow Jones and New York Post filed a lawsuit against Perplexity, alleging copyright infringement. The lawsuit also alleged that Perplexity harmed their brand by attributing hallucinated quotes, for example on F-16 jets for Ukraine, to artic

  • Social media reach


    Social media reach is a media analytics metric that refers to the number of users who have come across a particular content on a particular social media platform. Social media platforms have their own individual ways of tracking, analyzing and reporting the traffic on each of the individual platforms. As these platforms are a main source of communication between companies and their target audiences, by conducting research, companies are able to utilize analytical information, such as the reach of their posts, to better understand the interactions between the users and their content. There are multiple underlying factors that will determine what shows up on a newsfeed or timeline. Algorithms, for example, are a type of factor that can alter the reach of a post due to the way the algorithm is coded, which can affect who sees a post and when. Other examples of factors that can impede the reach can include the time at which posts are made, as well as how frequent the posts are between one another. In comparison, an impression is the total number of circumstances where content has been shown on a social timeline, meanwhile, engagement looks at how people interact with the content that they see on a social platform such as like, share or retweet. == Reach on Facebook == Facebook has their own analytic platform which allows the user to see how other users are interacting with their posts, with the use of multiple metrics. This is not something the average user uses, but rather a tool that is used by pages or public figures. For example, Facebook pages that represent a business often look at the activity their posts have generated. There are three types of reach that can be looked at on the Facebook analytic platform. === Types of reach === ==== Organic Reach ==== This type of reach regards the number of distinct users that have seen a specific post on their feed. 
Organic reach, in other words is the number of people who have seen the post being analyzed on their Facebook newsfeed. Data gathered from this type of reach can give intel to those doing the analysis, such as the demographics of those who have seen the post. ==== Paid Reach ==== This type of reach regards the number of times that distinct users have come across sponsored posts, ads or content. In other words, paid reach is the number of times Facebook users have seen a post that has been paid for by a company. Data collected can give insight, to advertisers or marketers for example, on the activity based around the reach of their post. ==== Viral Reach ==== This type of reach regards the number of views by distinct users on posts that have been commented on or shared by their friends on Facebook. In other words, viral reach looks at the number of people who have seen a post after a friend of theirs commented or shared the original post, therefore it showed on their timeline. Viral reach can be looked at in terms of a collective number of times that the post has been on individual user's timelines. Data collected from viral reach can be used in multiple ways, for example, it can be used to analyze the type of content that gets shared or commented on and can be further used to compare to other posts. === Engaged users === This refers to the number of individual users who have clicked and interacted with a post on Facebook. == Reach on Twitter == Twitter gives access to any of their users to analytics of their tweets as well as their followers. Their dashboard is user friendly, which allows anyone to take a look at the analytics behind their Twitter account. This open access is useful for both the average user and companies as it can provide a quick glance or general outlook of who has seen their tweets. The way that Twitter works is slightly different than the way of Facebook in terms of the reach. 
On Twitter, users, especially those with a higher profile, engage not only with the people who follow them but also with the followers of their own followers. The reach metric on Twitter looks at the number of Twitter users who have been engaged, as well as the number of users that follow them. This metric is useful to see whether the tweets/content being shared on Twitter are contributing to the growth of the audience on this platform. == Reach on Instagram == Instagram gives its users access to their reach in the Instagram Insights section. Instagram Insights can be used to learn more about an account's followers and performance. Reach indicates the total number of unique Instagram accounts that have seen a given Instagram post or story. This data can be found in the insights for each individual post. == Uses of reach == Reach can be a useful metric for marketers and advertisers to analyze. Social media is used by marketers to directly target their intended audience with ease. These platforms not only allow marketers to get a better understanding of their audience, but also allow advertisers to insert their ads onto the timelines of specific users and later conduct research on the reach of their posts/content. The basic goal of marketers is to increase their reach as much as possible in order to reach larger audiences of their target customers and, in the end, make more sales. Whether marketers use organic social media marketing, paid methods such as ads, or influencer marketing (paid or free), reach allows them to track the performance of their strategy and tweak it based on what works and what does not. == Analytics and reach == Social analytics looks at the data collected based on the interactions of users on social media platforms. A lot of information can be gathered which can provide intel based on user activities on social media. 
When looking into analytics in regard to social media, each company or group has a different goal in mind to engage their audience. At a glance, reach, impressions, and engagement might seem very similar; however, the differences between them are significant. Many aspects of the data gathered from social media platforms can be analyzed; depending on what is being observed, the appropriate metric is selected for further analysis. One example of the many metrics that can be used through social analytics is the reach. == Reach formula == To calculate social media reach one can use the following formula: $R = \frac{I}{\bar{f}}$, where $R$ is the social media reach, $I$ is the number of impressions, and $\bar{f}$ is the average frequency of impressions per user. The frequency for a particular user is the number of times the ad is shown to that user, and $\bar{f}$ is its average over users. The average value should be calculated over a time period with stable settings of the advertisement campaign. == Commenting For Better Reach == Commenting For Better Reach, also known as "CFBR", is a widely used strategy for organically boosting post reach on social media platforms. Algorithms tend to favor posts with substantial likes and comments, granting them broader exposure compared to less engaging content. Primarily seen on LinkedIn, a platform geared toward professional networking and business connections, the use of CFBR signals active engagement aimed at enhancing post visibility. It is important to note that genuine and meaningful comments are key to effective engagement. Spammy or irrelevant comments not only detract from the conversation but may also limit a post's potential reach and impact.
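The reach formula given above, R = I / f̄, can be illustrated with a minimal Python sketch. The function names and the per-user impression counts below are hypothetical, chosen only for illustration; they do not correspond to any platform's analytics API.

```python
def average_frequency(impressions_per_user):
    """Average number of impressions per user (the f-bar of the reach formula)."""
    if not impressions_per_user:
        raise ValueError("need at least one user")
    return sum(impressions_per_user.values()) / len(impressions_per_user)


def social_media_reach(impressions, avg_frequency):
    """Reach R = I / f-bar: total impressions divided by average frequency."""
    if avg_frequency <= 0:
        raise ValueError("average frequency must be positive")
    return impressions / avg_frequency


# Hypothetical data: how many times each user was shown the same ad.
views = {"user_a": 3, "user_b": 1, "user_c": 2}

total_impressions = sum(views.values())   # I = 6
f_bar = average_frequency(views)          # f-bar = 2.0
reach = social_media_reach(total_impressions, f_bar)
```

In this toy data set the computed reach equals the number of distinct users, which is what the formula is meant to recover when only the impression total and the average frequency are reported.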

  • Undeniable signature


    An undeniable signature is a digital signature scheme which allows the signer to be selective about whom they allow to verify signatures. The scheme adds explicit signature repudiation, preventing a signer from later refusing to verify a signature by omission, a situation that would devalue the signature in the eyes of the verifier. It was invented by David Chaum and Hans van Antwerpen in 1989. == Overview == In this scheme, a signer possessing a private key can publish a signature of a message. However, the signature reveals nothing to a recipient/verifier of the message and signature unless the signer takes part in one of two interactive protocols: the confirmation protocol, which confirms that a candidate is a valid signature of the message issued by the signer, identified by the public key; and the disavowal protocol, which confirms that a candidate is not a valid signature of the message issued by the signer. The motivation for the scheme is to allow the signer to choose to whom signatures are verified. However, if the signer could claim at any later point that a signature is invalid, simply by refusing to take part in verification, signatures would be devalued for verifiers. The disavowal protocol distinguishes these cases, removing the signer's plausible deniability. It is important that the confirmation and disavowal exchanges are not transferable. They achieve this by having the property of zero-knowledge: both parties can create transcripts of both confirmation and disavowal that are indistinguishable, to a third party, from correct exchanges. The designated verifier signature scheme improves upon undeniable signatures by allowing, for each signature, the interactive portion of the scheme to be offloaded onto another party, a designated verifier, reducing the burden on the signer. == Zero-knowledge protocol == The following protocol was suggested by David Chaum. 
A group, G, is chosen in which the discrete logarithm problem is intractable, and all operations in the scheme take place in this group. Commonly, this will be the finite cyclic group of order p contained in Z/nZ, with p being a large prime number; this group is equipped with the group operation of integer multiplication modulo n. An arbitrary primitive element (or generator), g, of G is chosen; computed powers of g then combine obeying fixed axioms. Alice generates a key pair: she randomly chooses a private key, x, and then derives and publishes the public key, y = g^x. === Message signing === Alice signs the message, m, by computing and publishing the signature, z = m^x. === Confirmation (i.e., avowal) protocol === Bob wishes to verify the signature, z, of m by Alice under the key, y. (1) Bob picks two random numbers, a and b, and uses them to blind the message, sending to Alice: c = m^a g^b. (2) Alice picks a random number, q, uses it to blind c, and then signs the result using her private key, x, sending to Bob: s1 = c g^q and s2 = s1^x. Note that s1^x = (c g^q)^x = (m^a g^b)^x g^(qx) = (m^x)^a (g^x)^(b+q) = z^a y^(b+q). (3) Bob reveals a and b. (4) Alice verifies that a and b are the correct blind values, then, if so, reveals q. Revealing these blinds makes the exchange zero knowledge. (5) Bob verifies s1 = c g^q, proving q has not been chosen dishonestly, and s2 = z^a y^(b+q), proving z is a valid signature issued by Alice's key. Note that z^a y^(b+q) = (m^x)^a (g^x)^(b+q). Alice can cheat at step 2 by attempting to randomly guess s2. === Disavowal protocol === Alice wishes to convince Bob that z is not a valid signature of m under the key, g^x; i.e., z ≠ m^x. Alice and Bob have agreed an integer, k, which sets the computational burden on Alice and the likelihood that she should succeed by chance. (1) Bob picks random values, s ∈ {0, 1, ..., k} and a, and sends: v1 = m^s g^a and v2 = z^s y^a, where exponentiating by a is used to blind the sent values. Note that, when z = m^x, v2 = z^s y^a = (m^x)^s (g^x)^a = v1^x. 
(2) Alice, using her private key, computes v1^x and then the quotient, v1^x v2^(-1) = (m^s g^a)^x (z^s g^(xa))^(-1) = m^(sx) z^(-s) = (m^x z^(-1))^s. Thus, v1^x v2^(-1) = 1, unless z ≠ m^x. (3) Alice then tests v1^x v2^(-1) for equality against the values (m^x z^(-1))^i for i ∈ {0, 1, …, k}, which are calculated by repeated multiplication of m^x z^(-1) (rather than exponentiating for each i). If the test succeeds, Alice conjectures the relevant i to be s; otherwise, she conjectures a random value. Where z = m^x, (m^x z^(-1))^i = v1^x v2^(-1) = 1 for all i, so s is unrecoverable. (4) Alice commits to i: she picks a random r and sends hash(r, i) to Bob. (5) Bob reveals a. (6) Alice confirms that a is the correct blind (i.e., v1 and v2 can be generated using it), then, if so, reveals r. Revealing these blinds makes the exchange zero knowledge. (7) Bob checks hash(r, i) = hash(r, s), proving Alice knows s, hence z ≠ m^x. If Alice attempts to cheat at step 3 by guessing s at random, the probability of succeeding is 1/(k + 1). So, if k = 1023 and the protocol is conducted ten times, her chances are 1 to 2^100.
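The signing, confirmation and disavowal exchanges described above can be simulated in a short Python sketch. It is only an illustration under stated assumptions: the modulus 2039, the subgroup order 1019 and the generator 4 are toy parameters chosen here and are far too small to be secure, both parties are run honestly inside one function, and SHA-256 stands in for the unspecified hash. None of the function names come from the original description.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration; insecure sizes).
N = 2039   # prime modulus, N = 2 * P + 1
P = 1019   # prime order of the subgroup of squares modulo N
G = 4      # 4 = 2^2 is a square other than 1, so it generates that subgroup


def keygen():
    x = secrets.randbelow(P - 1) + 1          # private key in [1, P - 1]
    return x, pow(G, x, N)                    # public key y = g^x


def sign(m, x):
    return pow(m, x, N)                       # z = m^x


def confirm(m, z, x, y):
    """Run the confirmation protocol with an honest Alice; return Bob's verdict."""
    a = secrets.randbelow(P - 1) + 1          # Bob's blinds (a kept non-zero)
    b = secrets.randbelow(P)
    c = pow(m, a, N) * pow(G, b, N) % N       # c = m^a g^b
    q = secrets.randbelow(P)                  # Alice's blind
    s1 = c * pow(G, q, N) % N                 # s1 = c g^q
    s2 = pow(s1, x, N)                        # s2 = s1^x
    # Bob reveals a, b; Alice checks them before revealing q.
    assert c == pow(m, a, N) * pow(G, b, N) % N
    # Bob's two final checks.
    return (s1 == c * pow(G, q, N) % N
            and s2 == pow(z, a, N) * pow(y, b + q, N) % N)


def commit(r, i):
    return hashlib.sha256(r + i.to_bytes(4, "big")).hexdigest()


def disavow(m, z, x, y, k=15):
    """Run the disavowal protocol with an honest Alice; return Bob's verdict."""
    s = secrets.randbelow(k + 1)              # Bob's secret in {0, ..., k}
    a = secrets.randbelow(P)
    v1 = pow(m, s, N) * pow(G, a, N) % N      # v1 = m^s g^a
    v2 = pow(z, s, N) * pow(y, a, N) % N      # v2 = z^s y^a
    quotient = pow(v1, x, N) * pow(v2, -1, N) % N
    base = pow(m, x, N) * pow(z, -1, N) % N   # m^x z^(-1)
    guess, acc = None, 1
    for i in range(k + 1):                    # repeated multiplication
        if acc == quotient:
            guess = i
            break
        acc = acc * base % N
    if guess is None:
        guess = secrets.randbelow(k + 1)      # no match: conjecture at random
    r = secrets.token_bytes(16)
    commitment = commit(r, guess)
    # Bob reveals a; Alice checks it before revealing r.
    assert v1 == pow(m, s, N) * pow(G, a, N) % N
    return commitment == commit(r, s)         # Bob's final check


x, y = keygen()
m = pow(123, 2, N)        # message encoded as a subgroup element
z = sign(m, x)
forged = z * G % N        # a subgroup element that is not m^x
```

With these toy values, `confirm(m, z, x, y)` succeeds for the genuine signature and fails for `forged`, while `disavow(m, forged, x, y)` succeeds. Running `disavow` on the genuine signature succeeds only when Bob happens to pick s = 0, which mirrors the 1/(k + 1) cheating probability noted in the text.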

  • Myrinet


    Myrinet, ANSI/VITA 26-1998, is a high-speed local area networking system designed by the company Myricom to be used as an interconnect between multiple machines to form computer clusters. == Description == Myrinet was promoted as having lower protocol overhead than standards such as Ethernet, and therefore better throughput, less interference, and lower latency while using the host CPU. Although it can be used as a traditional networking system, Myrinet is often used directly by programs that "know" about it, thereby bypassing a call into the operating system. Earlier versions of Myrinet used a variety of media and connectors: Generation 2 used copper media with DC-37 (Myrinet-LAN, M2L- controllers and switches) or microribbon (Myrinet-SAN, M2M-) connectors. Generation 3 used copper media with HSSDC (Myrinet-Serial, M3S-) or microribbon (Myrinet-SAN, M3M-) connectors, or fiber with LC-connectors (Myrinet-Fiber, M3F-). The later versions of Myrinet physically consist of two fibre optic cables, upstream and downstream, connected to the host computers with a single connector. Machines are connected via low-overhead routers and switches, as opposed to connecting one machine directly to another. Myrinet includes a number of fault-tolerance features, mostly backed by the switches. These include flow control, error control, and "heartbeat" monitoring on every link. The "fourth-generation" Myrinet, called Myri-10G, supported a 10 Gbit/s data rate and can use 10 Gigabit Ethernet on PHY, the physical layer (cables, connectors, distances, signaling). Myri-10G started shipping at the end of 2005. Myrinet was approved in 1998 by the American National Standards Institute for use on the VMEbus as ANSI/VITA 26-1998. One of the earliest publications on Myrinet is a 1995 IEEE article. === Performance === Myrinet is a lightweight protocol with little overhead that allows it to operate with throughput close to the basic signaling speed of the physical layer. 
For supercomputing, the low latency of Myrinet is even more important than its throughput performance, since, according to Amdahl's law, a high-performance parallel system tends to be bottlenecked by its slowest sequential process, which in all but the most embarrassingly parallel supercomputer workloads is often the latency of message transmission across the network. === Deployment === According to Myricom, 141 (28.2%) of the June 2005 TOP500 supercomputers used Myrinet technology. In the November 2005 TOP500, the number of supercomputers using Myrinet was down to 101 (20.2%); in November 2006 it was 79 (15.8%); and by November 2007 it was 18 (3.6%), a long way behind gigabit Ethernet at 54% and InfiniBand at 24.2%. In the June 2014 TOP500 list, the number of supercomputers using the Myrinet interconnect was 1 (0.2%). In November 2013, the assets of Myricom (including the Myrinet technology) were acquired by CSP Inc. In 2016, it was reported that Google had also offered to buy the company.
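The Amdahl's law argument in the Performance section can be made concrete with a small Python sketch. The formula used is the standard statement of Amdahl's law; the function name and the sample fractions are assumptions chosen for illustration and are not Myrinet measurements.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n).

    parallel_fraction (p) is the share of the work that can run in parallel;
    the remaining (1 - p) is sequential, for example time spent waiting on
    message transmission across the network.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    sequential = 1.0 - parallel_fraction
    return 1.0 / (sequential + parallel_fraction / workers)


# Hypothetical workload: 5% of the run time is sequential.
for n in (8, 64, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

However many workers are added, the speedup in this hypothetical case stays below 1 / 0.05 = 20, which is why shrinking the sequential part (here, communication latency) can matter more than raw throughput.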

  • Tweak programming environment


    Tweak is a graphical user interface (GUI) layer written by Andreas Raab for the Squeak development environment, which in turn is an integrated development environment based on the Smalltalk-80 computer programming language. Tweak is an alternative to an earlier graphic user interface layer called Morphic. Development began in 2001. Applications that use the Tweak software include Sophie (version 1), a multimedia and e-book authoring system, and a family of virtual world systems: Open Cobalt, Teleplace, OpenQwaq, 3d ICC's Immersive Terf and the Croquet Project. == Influences == An experimental version of Etoys, a programming environment for children, used Tweak instead of Morphic. Etoys was a major influence on a similar Squeak-based programming environment known as Scratch.

  • Sentiment analysis


    Sentiment analysis (also known as opinion mining) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models, such as RoBERTa, more difficult data domains can be analyzed, e.g., news texts where authors typically express their opinion/sentiment less explicitly. == Types == A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level—whether the expressed opinion in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as enjoyment, anger, disgust, sadness, fear, and surprise. Precursors to sentimental analysis include the General Inquirer, which provided hints toward quantifying patterns in text and, separately, psychological research that examined a person's psychological state based on analysis of their verbal behavior. Subsequently, the method described in a patent by Volcani and Fogel, looked specifically at sentiment and identified individual words and phrases in text with respect to different emotional scales. A current system based on their work, called EffectCheck, presents synonyms that can be used to increase or decrease the level of evoked emotion in each scale. Many other subsequent efforts were less sophisticated, using a mere polar view of sentiment, from positive to negative, such as work by Turney, and Pang who applied different methods for detecting the polarity of product reviews and movie reviews respectively. 
This work is at the document level. One can also classify a document's polarity on a multi-way scale, which was attempted by Pang and Snyder among others: Pang and Lee expanded the basic task of classifying a movie review as either positive or negative to predict star ratings on either a 3- or a 4-star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale). First steps to bringing together various approaches—learning, lexical, knowledge-based, etc.—were taken in the 2004 AAAI Spring Symposium where linguists, computer scientists, and other interested researchers first aligned interests and proposed shared tasks and benchmark data sets for the systematic computational research on affect, appeal, subjectivity, and sentiment in text. Even though in most statistical classification methods, the neutral class is ignored under the assumption that neutral texts lie near the boundary of the binary classifier, several researchers suggest that, as in every polarity problem, three categories must be identified. Moreover, it can be proven that specific classifiers such as the Max Entropy and SVMs can benefit from the introduction of a neutral class and improve the overall accuracy of the classification. There are in principle two ways for operating with a neutral class. Either, the algorithm proceeds by first identifying the neutral language, filtering it out and then assessing the rest in terms of positive and negative sentiments, or it builds a three-way classification in one step. This second approach often involves estimating a probability distribution over all categories (e.g. naive Bayes classifiers as implemented by the NLTK). 
Whether and how to use a neutral class depends on the nature of the data: if the data is clearly clustered into neutral, negative and positive language, it makes sense to filter the neutral language out and focus on the polarity between positive and negative sentiments. If, in contrast, the data are mostly neutral with small deviations towards positive and negative affect, this strategy would make it harder to clearly distinguish between the two poles. A different method for determining sentiment is the use of a scaling system whereby words commonly associated with having a negative, neutral, or positive sentiment are given an associated number on a −10 to +10 scale (most negative up to most positive) or simply from 0 to a positive upper limit such as +4. This makes it possible to adjust the sentiment of a given term relative to its environment (usually on the level of the sentence). When a piece of unstructured text is analyzed using natural language processing, each concept in the specified environment is given a score based on the way sentiment words relate to the concept and its associated score. This allows movement to a more sophisticated understanding of sentiment, because it is now possible to adjust the sentiment value of a concept relative to modifications that may surround it. Words, for example, that intensify, relax or negate the sentiment expressed by the concept can affect its score. Alternatively, texts can be given a positive and negative sentiment strength score if the goal is to determine the sentiment in a text rather than the overall polarity and strength of the text. There are various other types of sentiment analysis, such as aspect-based sentiment analysis, grading sentiment analysis (positive, negative, neutral), multilingual sentiment analysis and detection of emotions. 
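The scaling approach described above, in which sentiment words carry a score on a −10 to +10 scale and surrounding words intensify, relax or negate that score, can be sketched as follows. The word lists, the numeric weights and the function name are hypothetical and serve only to illustrate the idea; they are not taken from any published lexicon.

```python
# Hypothetical lexicon on a -10 .. +10 scale.
LEXICON = {"good": 5, "great": 8, "bad": -5, "terrible": -9}
# Hypothetical modifiers: values above 1 intensify, values below 1 relax.
MODIFIERS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}
NEGATORS = {"not", "never"}


def score_sentence(sentence):
    """Sum the modifier-adjusted scores of the sentiment words in a sentence.

    Modifiers and negators apply to the next sentiment word that follows
    them; each adjusted word score is clamped to the -10 .. +10 scale.
    """
    total = 0.0
    multiplier = 1.0
    for token in sentence.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            multiplier *= -1.0
        elif word in MODIFIERS:
            multiplier *= MODIFIERS[word]
        elif word in LEXICON:
            adjusted = LEXICON[word] * multiplier
            total += max(-10.0, min(10.0, adjusted))
            multiplier = 1.0   # modifiers are consumed by this word
    return total
```

For instance, `score_sentence("The food was very good")` scales the base score of "good" by the intensifier, while `score_sentence("not bad")` flips the sign of "bad". A real system would need a far richer treatment of scope and context than this word-by-word pass.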
=== Subjectivity/objectivity identification === This task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification. The subjectivity of words and phrases may depend on their context, and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Moreover, as mentioned by Su, results are largely dependent on the definition of subjectivity used when annotating texts. However, Pang showed that removing objective sentences from a document before classifying its polarity helped improve performance. Subjectivity and objectivity identification is an emerging subtask of sentiment analysis that uses syntactic and semantic features, together with machine learning, to identify whether a sentence or document contains facts or opinions. Awareness of the distinction between facts and opinions is not recent; it was possibly first presented by Carbonell at Yale University in 1979. The term objective refers to text carrying factual information. Example of an objective sentence: 'To be elected president of the United States, a candidate must be at least thirty-five years of age.' The term subjective describes text containing non-factual information in various forms, such as personal opinions, judgments, and predictions, also known as 'private states'. Example of a subjective sentence: 'We Americans need to elect a president who is mature and who is able to make wise decisions.' In this example, the phrase 'We Americans' reflects a private state. Moreover, the target entity commented on by the opinions can take several forms, from a tangible product to an intangible topic, as stated in Liu (2010). Furthermore, three types of attitudes were observed by Liu (2010): 1) positive opinions, 2) neutral opinions, and 3) negative opinions. This analysis is a classification problem. 
For each class, collections of indicator words or phrases are defined in order to locate the desired patterns in unannotated text. For subjective expressions, a different word list has been created. Lists of subjective indicator words and phrases have been developed by multiple researchers in linguistics and natural language processing, as stated in Riloff et al. (2003). A dictionary of extraction rules has to be created for measuring given expressions. Over the years, feature extraction in subjectivity detection has progressed from hand-curated features to automated feature learning. At the moment, automated learning methods can be further separated into supervised and unsupervised machine learning. Pattern extraction using machine learning on annotated and unannotated text has been explored extensively by academic researchers. However, researchers have recognized several challenges in developing fixed sets of rules for expressions. Many of the challenges in rule development stem from the nature of textual information. Six challenges have been recognized by several researchers: 1) metaphorical expressions, 2) discrepancies in writing, 3) context sensitivity, 4) words with few usages, 5) time sensitivity, and 6) ever-growing volume. Metaphorical expressions. Text that contains metaphorical expressions may affect the performance of the extraction. Besides, metaphors take different forms, which may have been contribu

  • WhoSay

    WhoSay

    WhoSay was an American social media service and branding platform for celebrities and their fans. Founded in Los Angeles in 2010, with financing by Creative Artists Agency (CAA), Amazon.com and other investors, it is notable for allowing its users to retain ownership rights over the content that they post to their accounts, through copyright branding, and for enabling users to post content to other social media sites like Twitter, Facebook, Instagram and Tumblr simultaneously. WhoSay describes itself as a "social celebrity magazine" whose editorial team keeps its users informed about the latest celebrity and entertainment news. Clients such as Dylan McDermott and Chris Rock lauded the service for its ability to add content to multiple social network sites easily. Rock in particular has praised its ease of use for those who are not part of a tech-savvy demographic, commenting, "It's perfect for someone that's not 25." WhoSay's competitors included theAudience, which is operated by William Morris Endeavor. == History == WhoSay was founded in March 2010 by Steve Ellis and the Los Angeles-based talent agency Creative Artists Agency (CAA). It was financed through investments from Amazon.com (which, along with CAA, holds a minority stake in the company), Comcast, Greylock Partners, and High Peak Ventures. The company's main headquarters are in The New York Times Building in Manhattan, with additional headquarters in CAA's office building in the Silicon Beach area of Los Angeles, and in London. The company was founded to protect celebrities' intellectual property and enable the celebrities themselves to profit from their own content through copyright branding. Its chief executive is co-founder Steve Ellis, who, after leaving Getty Images, was contacted by CAA, which was looking to resolve the issue of celebrities losing the rights to their own photos and videos when uploading them to social network sites. 
Ellis explained WhoSay's mission thus: "We work with people who are constantly being utilized by third parties for the wrong reasons. [The company was formed] to give celebrities and other influential people a set of tools to allow them to manage and control their presence in the digital world." In this way, WhoSay is likened by Ellis to "a People magazine by the people themselves who are in it." The company started slowly, until CAA client Tom Hanks signed onto WhoSay three months after the service's launch. The company continued to maintain a low profile for the first three years of operation, during which it accumulated a client list of 1,500 actors, musicians and artists. Clients are accepted by the service on an invitation-only basis, although they are not restricted to Creative Artists clients. Among them are Kelly Clarkson, Julia Louis-Dreyfus, Paula Patton, Kevin Spacey, Jim Carrey, John Cusack, Bill Maher, Johnny Knoxville, Chelsea Handler, Eva Longoria, Spike Lee, Enrique Iglesias and Katie Couric. Clients are not charged for the service, and are given a share of any revenue that is generated by advertisements. They are also given the ability to share in the database of e-mail addresses that come with registration, in order to communicate directly with fans. Actor Dylan McDermott was introduced to WhoSay by his agent, as a way of posting content to Facebook, Twitter, Tumblr and even China's Tencent social network with relative ease. McDermott comments, "When you put something out there, you can hit everything at one time. It makes it easy for me." Comedian Chris Rock has commented that WhoSay is ideal for people like him who have developed difficulty in keeping track of different websites as they get older, saying, "It's perfect for someone that's not 25." In September 2013 WhoSay introduced a mobile application for consumers. By October 2013, the company's website attracted 12 million monthly visitors. 
In July 2014 Rob Gregory left his role as president of Newsweek's The Daily Beast to become WhoSay's chief revenue officer. Among his responsibilities are developing ways to monetize WhoSay's web and mobile products, such as premium advertising strategies and brand partnerships. WhoSay does not allow consumers to create accounts, nor does it include search features, making it difficult to access a celebrity's account unless a user is directed there from one of their other social pages. According to Ellis, consumers have enough social media choices, saying, "Frankly they don't really need the services that we provide, and there are a lot of very specific features built into our service that really only benefit someone who is of a high profile." By February 2015, WhoSay had amassed 4.8 million unique users, and expanded its accounts to companies that employ celebrities for branded content. Such companies include Lexus, which partnered with the company to promote a campaign in which actress Rosario Dawson, during the lead up to the 87th Academy Awards, released five short videos on her social media accounts. The videos feature her driving through Los Angeles in preparation for the grand opening of her pop-up store, which sells Studio One Eighty Nine, a clothing line tied to her foundation promoting African culture and content. That April, WhoSay partnered with Chevrolet's #BestDayEver social media campaign for April Fool's Day, enlisting Olivia Wilde, Norman Reedus, Alec Baldwin, Ian Somerhalder, and Nikki Reed to surprise students in four U.S. classrooms as their substitute teachers. For example, Baldwin, dressed as Abraham Lincoln, surprised students in an Occidental College class on U.S. Culture and Society. Other companies that WhoSay has partnered with include KFC, JCPenney, Dunkin' Donuts and Crest. In January 2018, the website was acquired by Viacom (now Paramount Global).

  • Communications security

    Communications security

    Communications security is the discipline of preventing unauthorized interceptors from accessing telecommunications in an intelligible form, while still delivering content to the intended recipients. In the North Atlantic Treaty Organization culture, including United States Department of Defense culture, it is often referred to by the abbreviation COMSEC. The field includes cryptographic security, transmission security, emissions security and physical security of COMSEC equipment and associated keying material. COMSEC is used to protect both classified and unclassified traffic on military communications networks, including voice, video, and data. It is used for both analog and digital applications, and both wired and wireless links. Voice over secure internet protocol (VOSIP) has become the de facto standard for securing voice communication, replacing the need for Secure Terminal Equipment (STE) in much of NATO, including the U.S.A. USCENTCOM moved entirely to VOSIP in 2008. == Specialties == Cryptographic security: The component of communications security that results from the provision of technically sound cryptosystems and their proper use. This includes ensuring message confidentiality and authenticity. Emission security (EMSEC): The protection resulting from all measures taken to deny unauthorized persons information of value that might be derived from communications systems and cryptographic equipment intercepts and the interception and analysis of compromising emanations from cryptographic equipment, information systems, and telecommunications systems. Transmission security (TRANSEC): The component of communications security that results from the application of measures designed to protect transmissions from interception and exploitation by means other than cryptanalysis (e.g. frequency hopping and spread spectrum). 
Physical security: The component of communications security that results from all physical measures necessary to safeguard classified equipment, material, and documents from access thereto or observation thereof by unauthorized persons. == Related terms == ACES – Automated Communications Engineering Software; AEK – Algorithmic Encryption Key; AKMS – the Army Key Management System; CCI – Controlled Cryptographic Item, equipment which contains COMSEC embedded devices; CT3 – Common Tier 3; DTD – Data Transfer Device; ICOM – Integrated COMSEC, e.g. a radio with built-in encryption; KEK – Key Encryption Key; KG-30 – family of COMSEC equipment; KOI-18 – Tape Reader General Purpose; KPK – Key production key; KYK-13 – Electronic Transfer Device; KYX-15 – Electronic Transfer Device; LCMS – Local COMSEC Management Software; OTAR – Over the Air Rekeying; OWK – Over the Wire Key; SKL – Simple Key Loader; SOI – Signal operating instructions; STE – Secure Terminal Equipment (secure phone); STU-III – (obsolete secure phone, replaced by STE); TED – Trunk Encryption Device such as the WALBURN/KG family; TEK – Traffic Encryption Key; TPI – Two person integrity; TSEC – Telecommunications Security (sometimes referred to in error as transmission security or TRANSEC). Types of COMSEC equipment: Authentication equipment. Crypto equipment: Any equipment that embodies cryptographic logic or performs one or more cryptographic functions (key generation, encryption, and authentication). Crypto-ancillary equipment: Equipment designed specifically to facilitate efficient or reliable operation of crypto-equipment, without performing cryptographic functions itself. Crypto-production equipment: Equipment used to produce or load keying material. == DoD Electronic Key Management System == The Electronic Key Management System (EKMS) is a United States Department of Defense (DoD) key management, COMSEC material distribution, and logistics support system. 
The National Security Agency (NSA) established the EKMS program to supply electronic key to COMSEC devices in a secure and timely manner, and to provide COMSEC managers with an automated system capable of ordering, generation, production, distribution, storage, security accounting, and access control. The Army's platform in the four-tiered EKMS, AKMS, automates frequency management and COMSEC management operations. It eliminates paper keying material and hardcopy Signal operating instructions (SOI), and saves the time and resources required for courier distribution. It has 4 components: LCMS provides automation for the detailed accounting required for every COMSEC account, and electronic key generation and distribution capability. ACES is the frequency management portion of AKMS. ACES has been designated by the Military Communications Electronics Board as the joint standard for use by all services in development of frequency management and crypto-net planning. CT3 with DTD software is in a fielded, ruggedized hand-held device that handles, views, stores, and loads SOI, Key, and electronic protection data. DTD provides an improved net-control device to automate crypto-net control operations for communications networks employing electronically keyed COMSEC equipment. SKL is a hand-held PDA that handles, views, stores, and loads SOI, Key, and electronic protection data. == Key Management Infrastructure (KMI) Program == KMI is intended to replace the legacy Electronic Key Management System to provide a means for securely ordering, generating, producing, distributing, managing, and auditing cryptographic products (e.g., asymmetric keys, symmetric keys, manual cryptographic systems, and cryptographic applications). This system is currently being fielded by Major Commands and variants will be required for non-DoD Agencies with a COMSEC Mission.

  • Foundry VTT

    Foundry VTT

    Foundry Virtual Tabletop, commonly shortened to Foundry VTT or FVTT, is a commercial, self-hosted virtual tabletop application for role-playing games. It provides a stage for visualizing the game environment and tools allowing the game master and players to organize and track statistics and notes. The software is highly modular and depends on the community-maintained ecosystem of add-on modules that modify the software's behavior and implement different game systems. Perpetual licenses, which include updates, are offered for a one-time fee. == Features == Foundry Virtual Tabletop is a highly modular Node.js web application that is run locally by the Gamemaster or hosted on a remote server. Players connect to their gamemaster's Foundry VTT instance over the network using their web browser. It is system-agnostic in that its core feature-set is not restricted to a specific game system. Systems, specific features and game content are implemented as add-on modules, which can be individually downloaded from a public repository. The module repository contains paid, official content, as well as freely available community-made modules that enhance functionality of the software. As of May 2025, 350 individual game systems are implemented as modules. Individual settings created by the Game Master are termed Worlds in the interface and contain the list of modules that should be loaded as well as world-specific content, which can be added by the gamemaster. This content is grouped into Scenes, Actors, Items and Journals. Battle and world maps are created as Scenes, which contain the backdrop and data on placement of walls, light sources and other entities. Tokens representing Actors, which are player characters, vehicles or NPCs, can be placed on these Scenes to be moved by the user that owns them. Other entities that interact or integrate with actors are termed Items; these can be objects, but also game system-specific concepts such as character classes. 
Journals are text documents that can link to other entities present in the World or modules. Viewing and editing permissions can be set individually for each entity. The software features a custom lighting engine that determines visibility of certain areas on each battle map depending on the position of players' characters, also revealing areas covered by fog of war. It also contains tools for map creation and comes with a small asset library. == History == Foundry Gaming LLC founder Andrew Clayton, commonly known under his online nickname Atropos, began development of Foundry VTT in 2018 for personal use after becoming dissatisfied with the feature set and business models of other virtual tabletops. Foundry VTT was initially developed for Linux, which remains its primary platform, with support for other platforms having been developed later. Foundry Gaming LLC was incorporated in Spokane, Washington on October 9, 2018, with the software remaining in private beta-testing until May 2020, when it was publicly released. In November 2020, Cubicle 7 partnered with Foundry to bring official content modules for its game system Warhammer Fantasy Roleplay to Foundry VTT. Later, in 2025, Clayton would state that this first major publisher deal was of significant importance to Foundry VTT's growth and credits the community developers of the WFRP system module for making it possible in the first place. In November 2023, Paizo partnered with Foundry to bring official content modules for Pathfinder Roleplaying Game to Foundry VTT. In January 2024, Foundry publicly announced its partnership with Wizards of the Coast in bringing official Dungeons & Dragons content to Foundry VTT, with the first official module, Phandelver and Below: The Shattered Obelisk, having been released in February 2024. 
== Development == As of 2023, the Foundry VTT software itself is being developed and managed by a team of 9 people, while a content team of 12 people is working with partnered publishers to compile content into downloadable modules. The content team also develops in-house content published by Foundry Gaming LLC. Stated goals are to create a virtual tabletop software that offers a one-time purchase and content ownership, make use of modern web technologies, and provide a platform for developers to build upon. Clayton has stated that integration of Generative AI into Foundry VTT is not planned, citing ethical and legal concerns and calling its usage within the industry a "betrayal of the creative people who made the TTRPG industry what it is in the first place". == Reception == Foundry VTT is one of the most popular virtual tabletops for TTRPGs; in particular, as a self-hosted web-based VTT, it is known as a modern alternative to the software-as-a-service product Roll20. Wargamer named it one of the three "best virtual tabletops for D&D in 2023", noting its active community and high degree of technical complexity, which allows for customization not seen in other products at the cost of a much steeper learning curve. Comic Book Resources called it an "underrated gem" and "incredibly versatile" for similar reasons, while also praising its lighting engine and visual fidelity. As in these reviews, Foundry's modular ecosystem and technical implementation are often cited as strengths, but also as a source of frustration for new users. In a video interview, Clayton acknowledges this issue and affirms that the development team intends to make usage of more technical features "friction-less" and will reduce module breakage between updates in the future.

  • Backup

    Backup

    In information technology, a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event. The verb form, referring to the process of doing so, is "back up", whereas the noun and adjective form is "backup". Backups can be used to recover data after its loss from data deletion or corruption, or to recover data from an earlier time. Backups provide a simple form of IT disaster recovery; however not all backup systems are able to reconstitute a computer system or other complex configuration such as a computer cluster, active directory server, or database server. A backup system contains at least one copy of all data considered worth saving. The data storage requirements can be large. An information repository model may be used to provide structure to this storage. There are different types of data storage devices used for copying backups of data that is already in secondary storage onto archive files. There are also different ways these devices can be arranged to provide geographic dispersion, data security, and portability. Data is selected, extracted, and manipulated for storage. The process can include methods for dealing with live data, including open files, as well as compression, encryption, and de-duplication. Additional techniques apply to enterprise client-server backup. Backup schemes may include dry runs that validate the reliability of the data being backed up. There are limitations and human factors involved in any backup scheme. == Storage == A backup strategy requires an information repository, "a secondary storage space for data" that aggregates backups of data "sources". The repository could be as simple as a list of all backup media (DVDs, etc.) and the dates produced, or could include a computerized index, catalog, or relational database. 
=== 3-2-1 Backup Rule === The backup data needs to be stored, requiring a backup rotation scheme, which is a system of backing up data to computer media that limits the number of backups of different dates retained separately, by appropriate re-use of the data storage media by overwriting of backups no longer needed. The scheme determines how and when each piece of removable storage is used for a backup operation and how long it is retained once it has backup data stored on it. The 3-2-1 rule can aid in the backup process. It states that there should be at least 3 copies of the data, stored on 2 different types of storage media, and one copy should be kept offsite, in a remote location (this can include cloud storage). 2 or more different media should be used to eliminate data loss due to similar reasons (for example, optical discs may tolerate being underwater while LTO tapes may not, and SSDs cannot fail due to head crashes or damaged spindle motors since they do not have any moving parts, unlike hard drives). An offsite copy protects against fire, theft of physical media (such as tapes or discs) and natural disasters like floods and earthquakes. Physically protected hard drives are an alternative to an offsite copy, but they have limitations like only being able to resist fire for a limited period of time, so an offsite copy still remains as the ideal choice. Because there is no perfect storage, many backup experts recommend maintaining a second copy on a local physical device, even if the data is also backed up offsite. === Backup methods === ==== Unstructured ==== An unstructured repository may simply be a stack of tapes, DVD-Rs or external HDDs with minimal information about what was backed up and when. This method is the easiest to implement, but unlikely to achieve a high level of recoverability as it lacks automation. 
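As a rough illustration of the 3-2-1 rule described earlier in this section, the following Python sketch checks a list of backup copies against its three conditions. The `BackupCopy` class, its fields and the function name are hypothetical, and the sketch assumes that every copy in the list counts toward the required three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One stored copy of the data (hypothetical record layout)."""
    label: str
    media_type: str   # e.g. "hdd", "lto_tape", "optical", "cloud"
    offsite: bool     # True if kept in a remote location


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule as stated above."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media_type for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    plan = [
        BackupCopy("workstation disk", "hdd", offsite=False),
        BackupCopy("local tape", "lto_tape", offsite=False),
        BackupCopy("cloud bucket", "cloud", offsite=True),
    ]
    print(satisfies_3_2_1(plan))
```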
==== Full only/System imaging ==== A repository using this backup method contains complete source data copies taken at one or more specific points in time. Copying system images, this method is frequently used by computer technicians to record known good configurations. However, imaging is generally more useful as a way of deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems. ==== Incremental ==== An incremental backup stores data changed since a reference point in time. Duplicate copies of unchanged data are not copied. Typically a full backup of all files is made once or at infrequent intervals, serving as the reference point for an incremental repository. Subsequently, a number of incremental backups are made after successive time periods. Restores begin with the last full backup and then apply the incrementals. Some backup systems can create a synthetic full backup from a series of incrementals, thus providing the equivalent of frequently doing a full backup. When done to modify a single archive file, this speeds restores of recent versions of files. ==== Near-CDP ==== Continuous Data Protection (CDP) refers to a backup that instantly saves a copy of every change made to the data. This allows restoration of data to any point in time and is the most comprehensive and advanced data protection. Near-CDP backup applications—often marketed as "CDP"—automatically take incremental backups at a specific interval, for example every 15 minutes, one hour, or 24 hours. They can therefore only allow restores to an interval boundary. Near-CDP backup applications use journaling and are typically based on periodic "snapshots", read-only copies of the data frozen at a particular point in time. Near-CDP (except for Apple Time Machine) intent-logs every change on the host system, often by saving byte or block-level differences rather than file-level differences. 
This backup method differs from simple disk mirroring in that it enables a roll-back of the log and thus a restoration of old images of data. Intent-logging allows precautions for the consistency of live data, protecting self-consistent files but requiring applications "be quiesced and made ready for backup." Near-CDP is more practicable for ordinary personal backup applications, as opposed to true CDP, which must be run in conjunction with a virtual machine or equivalent and is therefore generally used in enterprise client-server backups. Software may create copies of individual files such as written documents, multimedia projects, or user preferences, to prevent failed write events caused by power outages, operating system crashes, or exhausted disk space, from causing data loss. A common implementation is an appended ".bak" extension to the file name. ==== Reverse incremental ==== A Reverse incremental backup method stores a recent archive file "mirror" of the source data and a series of differences between the "mirror" in its current state and its previous states. A reverse incremental backup method starts with a non-image full backup. After the full backup is performed, the system periodically synchronizes the full backup with the live copy, while storing the data necessary to reconstruct older versions. This can either be done using hard links—as Apple Time Machine does, or using binary diffs. ==== Differential ==== A differential backup saves only the data that has changed since the last full backup. This means a maximum of two backups from the repository are used to restore the data. However, as time from the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system requires starting from the most recent full backup and then applying just the last differential backup. 
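The restore rules for incremental and differential repositories can be compared in a short Python sketch. The `Backup` record and `restore_chain` function are hypothetical names, and the sketch assumes that a repository does not mix incremental and differential backups after the same full backup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    taken_at: int   # hypothetical timestamp; larger means later
    kind: str       # "full", "incremental" or "differential"


def restore_chain(backups):
    """Return the backups to apply, in order, to restore the latest state."""
    ordered = sorted(backups, key=lambda b: b.taken_at)
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("a restore needs at least one full backup")
    last_full = full_positions[-1]
    base, later = ordered[last_full], ordered[last_full + 1:]
    kinds = {b.kind for b in later}
    if kinds == {"differential"}:
        # Each differential holds all changes since the full backup,
        # so only the most recent one is needed.
        return [base, later[-1]]
    if kinds <= {"incremental"}:
        # Each incremental holds changes since the previous backup,
        # so all of them are applied in order.
        return [base, *later]
    raise ValueError("mixed or unknown backup kinds are outside this sketch")


if __name__ == "__main__":
    incr = [Backup(0, "full"), Backup(1, "incremental"), Backup(2, "incremental")]
    diff = [Backup(0, "full"), Backup(1, "differential"), Backup(2, "differential")]
    print([b.taken_at for b in restore_chain(incr)])
    print([b.taken_at for b in restore_chain(diff)])
```

The differential chain never exceeds two backups, while the incremental chain grows with every backup taken since the last full one.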
A differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Changes in files may be detected through a more recent date/time of last modification file attribute, and/or changes in file size. Other variations of incremental backup include multi-level incrementals and block-level incrementals that compare parts of files instead of just entire files. === Storage media === Regardless of the repository model that is used, the data has to be copied onto an archive file data storage medium. The medium used is also referred to as the type of backup destination. ==== Magnetic tape ==== Magnetic tape was for a long time the most commonly used medium for bulk data storage, backup, archiving, and interchange. It was previously a less expensive option, but this is no longer the case for smaller amounts of data. Tape is a sequential access medium, so the rate of continuously writing or reading data can be very fast. While tape media itself has a low cost per space, tape drives are typically dozens of times as expensive as hard disk drives and optical drives. Tape media are generally rotated on a schedule so at least one set is off-site in case something should happe

  • Content engineering

    Content engineering

    Content engineering is a term applied to an engineering specialty dealing with the complexities around the use of content in computer-facilitated environments. Content authoring and production, content management, content modeling, content conversion, and content use and repurposing are all areas involving this practice. It is not a specialty with wide industry recognition and is often performed on an ad hoc basis by members of software development or content production or marketing staff, but is beginning to be recognized as a necessary function in any complex content-centric project involving both content production as well as software system development mainly involving content management systems (CMS) or digital experience platforms (DXP). Content engineering tends to bridge the gap between groups involved in the production of content (publishing and editorial staff, marketing, sales, human resources) and more technologically oriented departments such as software development, or IT that put this content to use in web or other software-based environments, and requires an understanding of the issues and processes of both sides. Typically, content engineering involves extensive use of embedded XML technologies, XML being the most widespread language for representing structured content. Content management systems are a key technology often used in the practice of content engineering. == Definition == Content engineering is the practice of organizing the shape and structure of content by deploying content and metadata models, in authoring and publishing processes in a manner that meets the requirements of an organization's Content Strategy, and its implementation through the use of technology such as CMS, XML, schema markup, artificial intelligence, APIs and others. 
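To make the idea of a content model with metadata more concrete, here is a minimal Python sketch. The `ArticleContent` class and `to_json_ld` function are hypothetical; the output uses the schema.org `Article` type with its `headline`, `author`, `datePublished`, `keywords` and `articleBody` properties. It is one possible shape for such a model, not a prescribed one.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ArticleContent:
    """A small structured content model (hypothetical field set)."""
    headline: str
    author: str
    date_published: str            # ISO 8601 date, e.g. "2025-01-31"
    body: str
    keywords: list = field(default_factory=list)


def to_json_ld(article):
    """Map the content model onto schema.org Article markup."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": article.headline,
        "author": {"@type": "Person", "name": article.author},
        "datePublished": article.date_published,
        "keywords": ", ".join(article.keywords),
        "articleBody": article.body,
    }


if __name__ == "__main__":
    article = ArticleContent(
        headline="What is content engineering?",
        author="Jane Example",
        date_published="2025-01-31",
        body="Structured content can be reused across channels.",
        keywords=["content modeling", "metadata"],
    )
    print(json.dumps(to_json_ld(article), indent=2))
```

Keeping the model separate from its rendering means the same structured record could feed a web page, an API response or search engine markup.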
== Purpose and goal == In very general terms, content engineering practices aim to maximize the ROI of content through content reuse and improving the efficiency of content marketing, content operations, and content strategy. Content engineering can help address content challenges that fairly typical organizations face: siloed content supply chains; duplicate content in a myriad of formats; inefficient content authoring workflows; chunky, unstructured content; outdated technology; technology in place that does not match needs; inability to reuse content across channels (multi-channel content); metadata and schema that are not used; lack of standards for metadata; lack of findability of content for internal and external use; poor SEO performance; and inability to implement personalization. == Key skills == Content engineering draws on a combination of technical, strategic, and editorial competencies. Practitioners typically require proficiency across several domains: === Content modeling and information architecture === Content engineers design structured content models that define how content is created, stored, and distributed. This includes building taxonomies, ontologies, and metadata schemas that enable content reuse across channels and platforms. === Structured content and markup languages === Proficiency in XML, JSON, HTML, and schema.org markup is fundamental. Content engineers use these languages to structure content for machine readability, search engine optimization, and interoperability between systems. === Content management systems and platforms === Content engineers require working knowledge of content management systems (CMS), digital experience platforms (DXP), and headless CMS architectures. This includes configuring content types, workflows, and publishing pipelines within these systems. === Workflow design and automation === Designing and implementing content workflows - from authoring through review, approval, and distribution - is a core function. 
Increasingly, this involves configuring AI-assisted and agentic workflows that automate research, drafting, repurposing, and distribution tasks at scale. === Content strategy and editorial understanding === Unlike purely technical roles, content engineering requires a working understanding of content strategy, brand management, editorial standards, and audience analysis. Content engineers must translate strategic objectives into technical content structures and system configurations. === API integration and data interoperability === Content engineers work with APIs to connect content systems, analytics platforms, distribution channels, and third-party services. Understanding how content flows between systems is essential for enabling multi-channel publishing and content personalization. === Analytics and performance measurement === Measuring content effectiveness through web analytics, SEO performance data, and engagement metrics informs how content engineers refine structures, metadata, and distribution workflows. == The role of a content engineer == Content engineers bridge the divide between content strategists and producers and the developers and content managers who publish and distribute content. But rather than simply wedging themselves between these players, content engineers help define and facilitate the content structure during the entire content strategy, production and distribution cycle from beginning to end. As the role has evolved, content engineers are increasingly expected to build and manage AI-powered content systems, moving beyond traditional CMS configuration into agentic workflows that automate content research, production, and distribution. By integrating skills in business and technology, content engineers do not see content as static or finished. Rather, they look at the value of the content and how it can best be adapted and personalized to serve customers and emerging content platforms, technologies, and opportunities. 
=== Create customer experience === Content marketing suffers from two fundamental limitations that constrain what a strong content marketing plan can contribute to a business's bottom line. Content relevance: marketers need to make content more relevant and personalized to their audiences. The marketer and content strategist direct the customer experience itself, and the content engineer makes it happen with content structure, schema, metadata, microdata, taxonomy, and CMS topology. Content agility: marketers who are burdened with one-size-fits-all content remain stuck managing their content rather than their customers' experience. Content engineers give marketers the means to move content-powered experiences across interfaces and personalization variants. === Break down barriers === Empower content strategists: content engineers help content strategists treat content not as a fixed message but as a modular construct that can be channeled and manipulated. Enable content producers: a content engineer works with content producers to find new sources of content and new ways the content can be combined and presented. Guide and free developers: the content engineer helps translate marketing strategy into clear technical needs and functions that developers can build into content management systems. Enhance content management: content engineers develop content structures that let content writers and content managers author in a single, usable interface, even for complex content types that might contain dozens of elements. Engineer content for success: content engineers help all members of a marketing team work more smoothly, with the support and structures needed to get the most out of the content they produce. === Salary benchmarks === Content engineering roles command significantly higher salaries than traditional content marketing positions.
In the United States, individual-contributor (IC) content engineers earn between $120,000 and $165,000 annually, while senior roles reach $160,000 to $220,000. Head of content engineering positions range from $200,000 to $280,000, and VP-level roles can exceed $375,000. The emergence of dedicated content engineer job postings from companies such as Exit Five reflects the growing recognition of the role as a distinct function within marketing organizations.

  • Plaintext

    Plaintext

    In cryptography, plaintext usually means unencrypted information pending input into cryptographic algorithms, usually encryption algorithms. This usually refers to data that is transmitted or stored unencrypted. == Overview == With the advent of computing, the term plaintext expanded beyond human-readable documents to mean any data, including binary files, in a form that can be viewed or used without requiring a key or other decryption device. Information (a message, document, file, etc.) that is to be communicated or stored in unencrypted form is referred to as plaintext. Plaintext is used as input to an encryption algorithm; the output is usually termed ciphertext, particularly when the algorithm is a cipher. Codetext is less often used, and almost always only when the algorithm involved is actually a code. Some systems use multiple layers of encryption, with the output of one encryption algorithm becoming "plaintext" input for the next. == Secure handling == Insecure handling of plaintext can introduce weaknesses into a cryptosystem by letting an attacker bypass the cryptography altogether. Plaintext is vulnerable in use and in storage, whether in electronic or paper format. Physical security means securing information and its storage media against physical attack, for instance by someone entering a building to access papers, storage media, or computers. Discarded material, if not disposed of securely, may be a security risk. Even shredded documents and erased magnetic media might be reconstructed with sufficient effort. If plaintext is stored in a computer file, the storage media, the computer and its components, and all backups must be secure. Sensitive data is sometimes processed on computers whose mass storage is removable, in which case physical security of the removed disk is vital.
In the case of securing a computer, useful (as opposed to handwaving) security must be physical (e.g., against burglary, brazen removal under cover of supposed repair, installation of covert monitoring devices, etc.), as well as virtual (e.g., against operating system modification, illicit network access, Trojan programs). The wide availability of keydrives, which can plug into most modern computers and store large quantities of data, poses another severe security headache. A spy (perhaps posing as a cleaning person) could easily conceal one, and even swallow it if necessary. Discarded computers, disk drives and media are also a potential source of plaintexts. Most operating systems do not actually erase anything; they simply mark the disk space occupied by a deleted file as 'available for use', and remove its entry from the file system directory. The information in a file deleted in this way remains fully present until overwritten at some later time when the operating system reuses the disk space. Because even low-end computers are commonly sold with many gigabytes of disk space, and capacities keep rising, this 'later time' may be months later, or never. Even overwriting the portion of a disk surface occupied by a deleted file is insufficient in many cases. Peter Gutmann of the University of Auckland wrote a celebrated 1996 paper on the recovery of overwritten information from magnetic disks; areal storage densities have become much higher since then, so this sort of recovery is likely to be more difficult than it was when Gutmann wrote. Modern hard drives automatically remap failing sectors, moving data to good sectors. This process makes information on those failing, excluded sectors invisible to the file system and normal applications. Special software, however, can still extract information from them. Some government agencies (e.g., US NSA) require that personnel physically pulverize discarded disk drives and, in some cases, treat them with chemical corrosives.
This practice is not widespread outside government, however. Garfinkel and Shelat (2003) analyzed 158 second-hand hard drives they acquired at garage sales and the like, and found that fewer than 10% had been sufficiently sanitized. The others contained a wide variety of readable personal and confidential information. See data remanence. Physical loss is a serious problem. The US State Department, Department of Defense, and the British Secret Service have all had laptops containing secret information, some of it in plaintext, lost or stolen. Appropriate disk encryption techniques can safeguard data on misappropriated computers or media. On occasion, even when data on host systems is encrypted, the media that personnel use to transfer data between systems hold plaintext because of poorly designed data policy. For example, in October 2007, HM Revenue and Customs lost CDs that contained the unencrypted records of 25 million child benefit recipients in the United Kingdom. Modern cryptographic systems resist known-plaintext and even chosen-plaintext attacks, and so may not be entirely compromised when plaintext is lost or stolen. Older systems resisted the effects of plaintext data loss on security with less effective techniques, such as padding and Russian copulation, to obscure information in plaintext that could be easily guessed.
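The plaintext-to-ciphertext flow described in the overview, including layered encryption where one layer's output becomes the next layer's "plaintext", can be sketched in Python. This is a toy illustration only: it uses a XOR one-time pad as a stand-in cipher, and the function names `xor_bytes`, `encrypt_layers`, and `decrypt_layers` are hypothetical. It is not one of the modern cryptosystems the text refers to and should not be used to protect real data.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (toy one-time pad)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


def encrypt_layers(plaintext: bytes, keys: list[bytes]) -> bytes:
    """Apply one encryption layer per key; each output feeds the next layer."""
    current = plaintext
    for key in keys:
        current = xor_bytes(current, key)  # previous ciphertext is this layer's "plaintext"
    return current


def decrypt_layers(ciphertext: bytes, keys: list[bytes]) -> bytes:
    """Undo the layers in reverse order to recover the original plaintext."""
    current = ciphertext
    for key in reversed(keys):
        current = xor_bytes(current, key)
    return current


if __name__ == "__main__":
    message = b"attack at dawn"
    # Fresh random keys, one per layer, each as long as the message.
    layer_keys = [secrets.token_bytes(len(message)) for _ in range(2)]
    ciphertext = encrypt_layers(message, layer_keys)
    assert decrypt_layers(ciphertext, layer_keys) == message
    print(ciphertext.hex())
```

With XOR specifically, two layers are equivalent to a single layer under the combined key, so the sketch shows only the data flow between layers, not any added strength from layering.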

  • IBM ALP

    IBM ALP

    IBM Assembly Language Processor (ALP) is an assembler written by IBM for 32-bit OS/2 Warp (OS/2 3.0), which was released in 1994. ALP accepts source programs compatible with Microsoft Macro Assembler (MASM) version 5.1, which was originally used to build many of the device drivers included with OS/2. For OS/2 versions 3 and 4, ALP was distributed, along with other tools and documentation, as part of the Device Driver Kit (DDK). The DDK was withdrawn in 2004 as part of IBM's discontinuance of OS/2.

  • Data

    Data

    Data (DAY-tə, US also DAT-ə, India: DEE-tə) is a collection of discrete or continuous values that conveys information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally. A data point or datum is an individual value in a collection of data. Data is usually organized into structures such as tables that provide additional context and meaning, and may itself be used as data in larger structures. Data may be used as variables in a computational process. Data may represent abstract ideas or concrete measurements. Data is commonly used in scientific research, economics, and virtually every other form of human organizational activity. Examples of data sets include price indices (such as the consumer price index), unemployment rates, literacy rates, and census data. In this context, data represents the raw facts and figures from which useful information can be extracted. Data is collected using techniques such as measurement, observation, query, or analysis, and is typically represented as numbers or characters that may be further processed. Field data is data that is collected in an uncontrolled, in-situ environment. Experimental data is data that is generated in the course of a controlled scientific experiment. Data is analyzed using techniques such as calculation, reasoning, discussion, presentation, visualization, or other forms of post-analysis. Prior to analysis, raw data (or unprocessed data) is typically cleaned: Outliers are removed, and obvious instrument or data entry errors are corrected. Data can be seen as the smallest unit of factual information that can be used as a basis for calculation, reasoning, or discussion. Data can range from abstract ideas to concrete measurements, including, but not limited to, statistics. Thematically connected data presented in some relevant context can be viewed as information.
Contextually connected pieces of information can then be described as data insights or intelligence. The stock of insights and intelligence that accumulate over time, resulting from the synthesis of data into information, can then be described as knowledge. Data has been described as "the new oil of the digital economy". Data, as a general concept, refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing. Advances in computing technologies have led to the advent of big data, which generally refers to very large quantities of data, typically at the petabyte scale. If restricted to traditional data analysis methods and computing, working with such large (and growing) datasets is difficult, even impossible. In response, the relatively new field of data science uses machine learning (and other artificial intelligence) methods that allow for efficient applications of analytic methods to big data. == Etymology and terminology == The Latin word data is the plural of datum, "(thing) given," and the neuter past participle of dare, "to give". The first English use of the word "data" is from the 1640s. The word "data" was first used to mean "transmissible and storable computer information" in 1946. The expression "data processing" was first used in 1954. When "data" is used more generally as a synonym for "information", it is treated as a mass noun in singular form. This usage is common in everyday language and in technical and scientific fields such as software development and computer science. One example of this usage is the term "big data". When used more specifically to refer to the processing and analysis of sets of data, the term retains its plural form. This usage is common in the natural sciences, life sciences, social sciences, software development and computer science, and grew in popularity in the 20th and 21st centuries. 
Some style guides do not recognize the different meanings of the term and simply recommend the form that best suits the target audience of the guide. For example, APA style as of the 7th edition requires "data" to be treated as a plural form. == Meaning == Data, information, knowledge, and wisdom are closely related concepts, but each has its role concerning the other, and each term has its meaning. According to a common view, data is collected and analyzed; data only becomes information suitable for making decisions once it has been analyzed in some fashion. One can say that the extent to which a set of data is informative to someone depends on the extent to which it is unexpected by that person. The amount of information contained in a data stream may be characterized by its Shannon entropy. Knowledge is the awareness of its environment that some entity possesses, whereas data merely communicates that knowledge. For example, the entry in a database specifying the height of Mount Everest is a datum that communicates a precisely measured value. This measurement may be included in a book along with other data on Mount Everest to describe the mountain in a manner useful for those who wish to decide on the best method to climb it. Awareness of the characteristics represented by this data is knowledge. Data are often assumed to be the least abstract concept, information the next least, and knowledge the most abstract. In this view, data becomes information by interpretation; e.g., the height of Mount Everest is generally considered "data", a book on Mount Everest geological characteristics may be considered "information", and a climber's guidebook containing practical information on the best way to reach Mount Everest's peak may be considered "knowledge". "Information" bears a diversity of meanings that range from everyday usage to technical use. This view, however, has also been argued to reverse how data emerges from information, and information from knowledge. 
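The Shannon entropy mentioned above can be made concrete with a short Python sketch. It estimates entropy from the observed symbol frequencies of a finite stream, which assumes that symbols are drawn independently from a fixed distribution; the function name `shannon_entropy` is illustrative. The quantity computed is H = Σ p(x) · log2(1 / p(x)), in bits per symbol.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def shannon_entropy(stream: Iterable[Hashable]) -> float:
    """Estimate entropy in bits per symbol from observed symbol frequencies."""
    counts = Counter(stream)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty stream carries no information
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy += p * math.log2(1 / p)  # rarer symbols contribute more per occurrence
    return entropy


if __name__ == "__main__":
    for sample in ("aaaa", "abab", "abcd"):
        print(sample, shannon_entropy(sample))
```

A stream that repeats one symbol is fully expected and scores zero, while a stream whose symbols are evenly spread scores highest, which matches the idea that data is informative to the extent it is unexpected.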
Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, and representation. Beynon-Davies uses the concept of a sign to differentiate between data and information; data is a series of symbols, while information occurs when the symbols are used to refer to something. Before the development of computing devices and machines, people had to manually collect data and impose patterns on it. With the development of computing devices and machines, these devices can also collect data. In the 2010s, computers were widely used in many fields to collect data and sort or process it, in disciplines ranging from marketing and the analysis of social service usage by citizens to scientific research. These patterns in the data are seen as information that can be used to enhance knowledge. These patterns may be interpreted as "truth" (though "truth" can be a subjective concept) and may be authorized as aesthetic and ethical criteria in some disciplines or cultures. Events that leave behind perceivable physical or virtual remains can be traced back through data. Marks are no longer considered data once the link between the mark and observation is broken. Mechanical computing devices are classified according to how they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a piece of data as a sequence of symbols drawn from a fixed alphabet. The most common digital computers use a binary alphabet, that is, an alphabet of two characters typically denoted "0" and "1". More familiar representations, such as numbers or letters, are then constructed from the binary alphabet. Some special forms of data are distinguished. A computer program is a collection of data that can be interpreted as instructions.
Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata, that is, a description of other data. A similar yet earlier term for metadata is "ancillary data." The prototypical example of metadata is the library catalog, which is a description of the contents of books. == Data sources == With respect to ownership of data collected in the course of marketing or other corporate collection, data has been characterized according to party depending on how close the data is to the source or if it has been generated through additional processing. "Zero-party data" refers to data that customers "intentionally and proactively shares". This kind of data can come from a variety of sources, including: subscriptions, preference centers, quizzes, surveys, pop-up forms, and interactive digital experiences. "First-party data" may be collected by a company directly from its customers. The secure exchange of first-party data among companies can be done using data clean rooms. "S

  • G.9970

    G.9970

    G.9970 (also known as G.hnta) is a Recommendation developed by ITU-T that describes the generic transport architecture for home networks and their interfaces to a provider's access network. G.9970 was developed by Study Group 15, Question 1. G.9970 received Consent on December 12, 2008 and was Approved on January 13, 2009. == Relationship with G.hn == G.9970 (G.hnta) and G.9960 (G.hn) are two ITU-T Recommendations that address home networking in a complementary manner. While G.9970 addresses layer 3 (network layer) of the home network architecture, G.9960 addresses layers 1 (physical layer) and 2 (data link layer).
