Overcategorization

Overcategorization

Overcategorization or category clutter is a phenomenon during classification where too many categories or classes are assigned to a document, record, or item. Overcategorization is related to the library and information science (LIS) concepts of document classification and subject indexing. It is also related to online shopping where excessive product categories can overwhelm users with too many choices or make it more difficult for customers to find the products they need. Although these categories are intended to improve organization and ease of navigation when shipping online, too many categories can lower customer satisfaction, increase difficulty navigating the online store, and reduce future shopping intentions. In LIS, the ideal number of terms that should be assigned to classify an item are measured by the variables precision and recall. Assigning few category labels that are most closely related to the content of the item being classified will result in searches that have high precision, I.e., where a high proportion of the results are closely related to the query. Assigning more category labels to each item will reduce the precision of each search, but increase the recall, retrieving more relevant results. Related LIS concepts include exhaustivity of indexing and information overload. == Basic principles == If too many categories are assigned to a given document, the implications for users depend on how informative the links are. If the user is able to distinguish between useful and not useful links, the damage is limited: The user only wastes time selecting links. In many cases, however, the user cannot judge whether or not a given link will turn out to be fruitful. In that case he or she has to follow the link and to read or skim another document. The worst case scenario is, of course, that even after reading the new document the user is unable to decide whether or not it might be useful if its subject matter is not thoroughly investigated. Overcategorization also has another unpleasant implication: It makes the system (for example in Wikipedia) difficult to maintain in a consistent way. If the system is inconsistent, it means that when the user considers the links in a given category, he or she will not find all documents relevant to that category. Basically, the problem of overcategorization should be understood from the perspective of relevance and the traditional measures of recall and precision. If too few relevant categories are assigned to a document, recall may decrease. If too many non-relevant categories are assigned, precision becomes lower. The hard job is to say which categories are fruitful or relevant for future use of the document.

Telligent Community

Telligent Community is a community and collaboration software platform developed by Telligent Systems and was first released in 2004. Telligent Community is built on the Telligent Evolution platform, with a variety of core applications running on top of it such as blogs, forums, media galleries, and wikis. Additional applications from third parties using the API's and REST stack can be installed or integrated with the platform. Telligent Community is built with ASP.NET, C#, and Microsoft SQL Server. It is available as downloadable software that can be installed on a web server or via hosting providers. The current version is Verint Community 12.0 which was released February 2012. The product used to be named Community Server before being rebranded as part of the 5.0 release. == History == Telligent Systems was founded by Rob Howard in 2004, who was previously part of Microsoft's ASP.NET team. Telligent introduced its first product, Community Server, in the fall of 2004. Community Server was one of the first integrated community platforms that brought together blogs, photo galleries, wikis, forums, user profiles and more. Community Server was based on the merger of three then-widely used open source ASP.NET projects: the ASP.NET Forums, nGallery photo gallery, and .Text blog engine. The people behind those projects (Scott Watermasysk, Jason Alexander, and Rob Howard) joined together as Telligent Systems and along with several other software developers created Community Server 1.0. Between 2004 and 2009 Community Server steadily grew in scope, features, and capabilities. In 2008 Telligent Systems released a second version of Community Server that targeted as an Enterprise Social Software platform used to create and manage internal employee communities and intranets. Originally branded as Community Server Evolution this was later renamed Telligent Enterprise. Telligent also announced a new Enterprise Reporting platform at its first Community Server Developers Conference in 2008, which was later renamed Harvest. It was one of the first analytics suites for enterprise collaboration software, and provides social analytics including sentiment analysis, social fingerprints, and buzz analysis on social networking sites such as Twitter. Telligent rebranded all of its products on June 23, 2009 at the Enterprise 2.0 conference when it launched its new Evolution platform product suite. Community Server became known as Telligent Community, Community Server Evolution became known as Telligent Enterprise and the underlying platform that both run on is now referred to as Telligent Evolution. The Social Analytics suite was renamed Telligent Analytics.

GoodRx

GoodRx Holdings, Inc. is an American healthcare company that operates a telemedicine platform and free-to-use website and mobile app that track prescription drug prices in the United States and provide drug coupons for discounts on medications. GoodRx compares prescription drug prices at more than 75,000 pharmacies in the United States. The platform allows users to consult a doctor online and obtain a prescription for certain types of medications. == History == === Financial performance === GoodRx was founded in Santa Monica, California in 2011. GoodRx experienced substantial growth in net income in 2017 ($9 million), 2018 ($44 million), and 2019 ($66 million), but recorded a loss of $293.6 million in 2020 due to IPO-related expenses. In September 2020, GoodRx went public on the Nasdaq under the ticker symbol GDRX. The company priced its initial public offering at $33 per share, above the expected range of $24 to $28, raising more than $1.1 billion at an initial valuation of approximately $12.7 billion. In the first half of 2020, the company reported revenues of $257 million and net income of $55 million. GoodRx generated $745.4 million in revenue for the full year 2021, a 35.36% increase over 2020. During the first half of 2021, the company’s share price declined by 10.7%. The decline was attributed to increased competition in online pharmacy services and slower user growth. GoodRx reported full-year revenue of $766.6 million, with adjusted EBITDA reaching $213.5 million, exceeding guidance in the fourth quarter. GoodRx reported that 41% of prescriptions filled using its coupons were newly adherent, meaning they would not have been filled without the service. GoodRx reported a full-year 2023 revenue of $750.3 million, a decrease of 2.1% from 2022. However, its fourth-quarter revenue increased by 7% year-over-year. GoodRx achieved an Adjusted EBITDA of $217.4 million for the year and an Adjusted EBITDA Margin of 28.6%. In 2024, GoodRx achieved 6% revenue growth with $792.3 million for the full year and turned a net loss into a positive net income of $16.4 million. The company also demonstrated strong operational efficiency, with a 32.8% increase in full-year Adjusted EBITDA. In Q2 2025, GoodRx reported revenue of $203.1 million, a 1.2% increase from the previous year, and a net income of $12.8 million, a significant 92% jump, which resulted in a 6.3% net income margin. However, prescription transaction revenue declined by 3% due to a decrease in monthly active consumers, but this was offset by strong 32% growth in its Pharma Manufacturer Solutions business. GoodRx also saw a 7% decrease in subscription revenue. === Mergers and acquisitions === In 2019, GoodRx acquired HeyDoctor, a telemedicine company, to integrate virtual healthcare services into the platform. In 2021, a health video content producer, HealthiNation was acquired by GoodRx, which helped provide consumers with health information and offered pharmaceutical manufacturers new ways to reach relevant audiences. In April 2022, GoodRx acquired VitaCare Prescription Services from TherapeuticsMD to strengthen its pharma manufacturer solutions business. === Partnerships === In 2017, the company announced partnerships with major pharmaceutical companies to negotiate lower prescription drug costs. GoodRx has deep relationships with major pharmacy chains, including Walgreens, Walmart, CVS Caremark, and Publix, to allow customers to use GoodRx discounts and Gold benefits. GoodRx began its partnership with CVS Caremark in July 2023 to automatically apply coupons to insured CVS customers purchasing generic prescriptions at certain locations. In April 2024, GoodRx added Publix into its network, allowing GoodRx Gold members to use their cards at Publix Pharmacies. GoodRx partners with Pharmacy Benefit Management like Caremark, Express Scripts, and MedImpact to apply their savings directly to eligible insurance plans and members. GoodRx partners with companies like Affirm, Benefitfocus, and DoorDash to integrate their services that offer members discounts and financial flexibility for prescriptions. GoodRx also partners with organizations like the American Academy of Family Physicians Foundation to support broader access to care. In October 2022, GoodRx launched Provider Mode, which allows healthcare providers to use the app to compare costs of drugs for patients based on different payment methods and drug alternatives. In 2025, GoodRx partnered with Novo Nordisk to offer discounted cash-pay access to semaglutide products like Ozempic and Wegovy through its platform and participating pharmacies. == Products and services == GoodRx started its telemedicine service GoodRx Care in September 2019. It lets people talk to a licensed provider online for common issues and get prescriptions even if they don't have insurance. They also run condition-specific subscription plans that bundle online doctor visits, FDA-approved meds, and home delivery into one monthly payment. On the weight management side, GoodRx offers prescriptions for GLP-1 drugs like semaglutide through their telemedicine platform. This got a boost when the oral version of Wegovy became widely available in the US in early 2026. GoodRx works with drug makers like Novo Nordisk to make some medications (including semaglutide options) more affordable for people paying cash. The telemedicine part took off after GoodRx bought HeyDoctor in 2019 and brought their virtual care tools into the main platform. == Key people == The Santa Monica-based startup was founded in September 2011 by Trevor Bezdek and former Facebook executives Doug Hirsch and Scott Marlette. Marlette was one of the first 20 employees at Facebook and built Facebook's photo application. In 2005, Hirsch was the Vice President of Product at Facebook, working closely with Mark Zuckerberg. Bezdek and Hirsch served as co-chief executive officers until April 2023, when they stepped down from those roles and technology executive Scott Wagner was appointed interim chief executive officer. Bezdek became chair of the board, while Hirsch took on the role of chief mission officer. In December 2024, GoodRx announced that healthcare executive Wendy Barnes would become president and chief executive officer effective January 1, 2025. As of 2025, Barnes serves as the company’s CEO, while Trevor Bezdek and Scott Wagner serve as co-chairs of the board, and Doug Hirsch remains involved as a co-founder and senior executive. == Controversy == On February 25, 2020, Consumer Reports published an article stating that GoodRx shared user data—specifically, pseudonymized advertising ID numbers that companies use to track the behavior of web users across websites, the names of the drugs that users browsed, and the pharmacies where users sought to fill prescriptions—with Google, Facebook, and around twenty other Internet-based companies. A few days later, GoodRx released a statement saying that it had made changes to prevent user search data on medical conditions and pharmaceuticals from being shared with Facebook. In March 2020, GoodRx stopped sending data about user prescriptions to Facebook. On February 1, 2023, the Federal Trade Commission fined GoodRx US$1.5 million for violations of the Breach Notification Rule and the Federal Trade Commission Act for allegedly failing to obtain specific, informed, and unambiguous consent from users before disclosing health-related information to Facebook and Google. In November 2024, independent pharmacies filed at least three class action lawsuits against GoodRx and major pharmacy benefit managers. The cases, brought by independent pharmacies in California, Michigan, Pennsylvania, and Rhode Island, allege that GoodRx and the PBMs collaborated to suppress reimbursements for generic prescription drugs. They allege that agreements using GoodRx’s software suppressed reimbursements for generic drugs and violated the Sherman Antitrust Act. The suits claim the practices amount to price fixing which harms small pharmacies while benefiting PBMs and their affiliates. GoodRx settled both the 2023 FTC action and the 2025 class action lawsuit without admitting wrongdoing.

Truth discovery

Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g birthday of a person, capital city of a country). While in the second case multiple true values are allowed (e.g. cast of a movie, authors of a book). Typically, truth discovery is the last step of a data integration pipeline, when the schemas of different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes more and more probable to find that different sources provide (partially or completely) different values for the same data item. This, together with the fact that we are increasing our reliance on data to derive important decisions, motivates the need of developing good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. Nevertheless, recent studies, have shown that, if we rely only on majority voting, we could get wrong results even in 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be exploited to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is not feasible since the number of needed labeled examples should be proportional to the number of sources, and in many applications the number of sources can be prohibitive. == Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose to each other; values and sources can either be correct or erroneous. While in the multi-truth case the following properties hold: the truth is composed by a set of values; different values could provide a partial truth; claiming one value for a given data item does not imply opposing to all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences of the two methods. Knowing that in both examples the truth is provided by source 1, in the single truth case (first table) we can say that sources 2 and 3 oppose to the truth and as a result provide wrong values. On the other hand, in the second case (second table), sources 2 and 3 are neither correct nor erroneous, they instead provide a subset of the true values and at the same time they do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is select as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources, more importance is indeed given to votes coming from trusted sources. Source trustworthiness usually is not known a priori but estimated with an iterative approach. At each step of the truth discovery algorithm the trustworthiness score of each data source is refined, improving the assessment of the true values that in turn leads to a better estimation of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as accuracy of provided values, copying values from other sources and domain coverage. Detecting copying behaviors is very important, in fact, copy allows to spread false values easily making truth discovery very hard, since many sources would vote for the wrong values. Usually systems decrease the weight of votes associated to copied values or even don’t count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. Below are reported some of the characteristics of the most relevant typologies of single-truth methods and how different systems model source trustworthiness. === Majority voting === Majority voting is the simplest method, the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performances of more complex methods. === Web-link based === These methods estimate source trustworthiness exploiting a similar technique to the one used to measure authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval. Source trustworthiness is computed as the cosine similarity (or other similarity measures) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources. P ( v ∣ ψ ( o ) ) = P ( ψ ( o ) ∣ v ) ⋅ P ( v ) P ( ψ ( o ) ) {\displaystyle P(v\mid \psi (o))={\frac {P(\psi (o)\mid v)\cdot P(v)}{P(\psi (o))}}} where v {\displaystyle \textstyle v} is a value provided for a data item o {\displaystyle \textstyle o} and ψ ( o ) {\displaystyle \textstyle \psi (o)} is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that provides. Other more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of the multi-truth discovery Below are reported two typologies of multi-truth methods and their characteristics. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between set of provided values and sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness. === Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of given data item and also to assess source quality without need of any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include: healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could be also used to revolutionize the way in which web pages are ranked in search engines, going from current methods based on link analysis like PageRank, to procedures that rank web pages based on the accuracy of the information they provide.

Computer Graphics: Principles and Practice

Computer Graphics: Principles and Practice is a textbook written by James D. Foley, Andries van Dam, Steven K. Feiner, John Hughes, Morgan McGuire, David F. Sklar, and Kurt Akeley and published by Addison–Wesley. First published in 1982 as Fundamentals of Interactive Computer Graphics, it is widely considered a classic standard reference book on the topic of computer graphics. It is sometimes known as the bible of computer graphics (due to its size). == Editions == === First Edition === The first edition, published in 1982 and titled Fundamentals of Interactive Computer Graphics, discussed the SGP library, which was based on ACM's SIGGRAPH CORE 1979 graphics standard, and focused on 2D vector graphics. === Second Edition === The second edition, published 1990, was completely rewritten and covered 2D and 3D raster and vector graphics, user interfaces, geometric modeling, anti-aliasing, advanced rendering algorithms and an introduction to animation. The SGP library was replaced by SRGP (Simple Raster Graphics Package), a library for 2D raster primitives and interaction handling, and SPHIGS (Simple PHIGS), a library for 3D primitives, which were specifically written for the book. === Second Edition in C === In the second edition in C, published in 1995, all examples were converted from Pascal to C. New implementations for the SRGP and SPHIGS graphics packages in C were also provided. === Third Edition === A third edition covering modern GPU architecture was released in July 2013. Examples in the third edition are written in C++, C#, WPF, GLSL, OpenGL, G3D, or pseudocode. == Awards == The book has won a Front Line Award (Hall of Fame) in 1998.

Key–value database

A key-value database, or key-value store, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary. Dictionaries contain a collection of objects, or records, which in turn have many different fields within them. These records are stored and retrieved using a key that uniquely identifies the record, and is used to find the data within the database. Key-value databases differ from the better known relational databases (RDB). RDBs pre-define the data structure in the database as a series of tables containing fields with well-defined data types. Exposing the data types to the database program allows it to apply various optimizations. In contrast, key-value systems treat the value as opaque to the database itself, and typically support only simple operations such as storing, retrieving, updating, and deleting a value by its key. This offers considerable flexibility and makes such systems well suited to low-latency, high-throughput workloads dominated by direct key lookups, but less suitable for applications that require complex queries or explicit relationships among records. A lack of standardization, limited transaction support, and relatively simple query interfaces long restricted many key-value systems to specialized uses, but the rapid move to cloud computing after 2010 helped drive renewed interest in them as part of the broader NoSQL movement. Some graph databases, such as ArangoDB, are also key–value databases internally, adding the concept of relationships (pointers) between records as a first-class data type. == Types and examples == Key–value systems span a wide consistency spectrum, from eventually consistent designs to strongly consistent or serializable ones, and some allow the consistency level to be configured as part of the trade-off against latency and availability. Renewed interest in key–value and other NoSQL systems was driven in part by the demands of big data, distributed, and cloud applications. Their scalability and availability made them attractive for cloud data management, although limited transaction support, low-level query interfaces, and the lack of standardization remained obstacles to wider adoption. Some maintain data in memory (RAM), while others employ solid-state drives or rotating disks. Some key–value systems add additional structure to their keys. For example, Oracle NoSQL Database organizes records using composite keys with "major" and "minor" components, an arrangement that Oracle compares to a directory-path structure in a file system. More generally, however, key–value stores are defined by their use of unique keys associated with opaque values and by their emphasis on simple key-based operations. Unix included dbm (database manager), a minimal database library written by Ken Thompson for managing associative arrays with a single key and hash-based access. Later implementations and related libraries included sdbm, GNU dbm (gdbm), and Berkeley DB. A more recent example is RocksDB, a persistent key–value storage engine developed at Facebook and designed for large-scale applications. Other examples include in-memory systems such as Memcached and Redis, and persistent systems such as Berkeley DB, Riak, and Voldemort.

Connection string

In computing, a connection string is a string that specifies information about a data source and the means of connecting to it. It is passed in code to an underlying driver or provider in order to initiate the connection. Whilst commonly used for a database connection, the data source could also be a spreadsheet or text file. The connection string may include attributes such as the name of the driver, server and database, as well as security information such as user name and password. == Examples == This example shows a PostgreSQL connection string for connecting to wikipedia.com with SSL and a connection timeout of 180 seconds: DRIVER={PostgreSQL Unicode};SERVER=www.wikipedia.com;SSL=true;SSLMode=require;DATABASE=wiki;UID=wikiuser;Connect Timeout=180;PWD=ashiknoor Users of Oracle databases can specify connection strings: on the command line (as in: sqlplus scott/tiger@connection_string ) via environment variables ($TWO_TASK in Unix-like environments; %TWO_TASK% in Microsoft Windows environments) in local configuration files (such as the default $ORACLE_HOME/network/admin.tnsnames.ora) in LDAP-capable directory services