AI Business Quiz

AI Business Quiz — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

DeepSeek (chatbot)

DeepSeek is a generative artificial intelligence chatbot developed by the Chinese company DeepSeek. Released on 20 January 2025, DeepSeek-R1 surpassed ChatGPT as the most downloaded freeware app on the iOS App Store in the United States by 27 January. DeepSeek's success against larger and more established rivals has been described as "upending AI" and initiating "a global AI space race". DeepSeek's compliance with Chinese government censorship policies and its data collection practices have also raised concerns over privacy and information control in the model, prompting regulatory scrutiny in multiple countries. However, it has also been praised for its open weights and infrastructure code, energy efficiency and contributions to open-source artificial intelligence. == History == On 10 January 2025, DeepSeek released the chatbot, based on the DeepSeek-R1 model, for iOS and Android. By 27 January, DeepSeek-R1 surpassed ChatGPT as the most-downloaded freeware app on the iOS App Store in the United States, which resulted in an 18% drop in Nvidia's share price. And after a "large-scale" cyberattack on the same day disrupted the proper functioning of its servers, DeepSeek had limited its new user registration to phone numbers from mainland China, email addresses, or Google account logins. On 3 April 2025, in collaboration with researchers at Tsinghua University, DeepSeek published a paper unveiling a new model that combines the techniques generative reward modeling (GRM) and self-principled critique tuning (SPCT). The resulting model is referred to as DeepSeek-GRM. The goal of using these techniques is to foster more effective inference-time scaling within their LLM and chatbot services. Notably, DeepSeek has said that these new models will be released and made open source. On 30 April 2025, Deepseek released its math-focused Artificial Intelligence Model named "DeepSeek-Prover-V2-671B". This model is useful for formal theorem proving and mathematical reasoning. On 24 April 2026, DeepSeek released DeepSeek V4 and V4-Pro. == Usage == DeepSeek can answer questions, solve logic problems, and write computer programs on par with other chatbots, according to benchmark tests used by American AI companies. Users can access the chatbot for free through the official DeepSeek website or mobile application, without limitation on the number of queries. DeepSeek only supports user-signup via a global email service, e.g. Gmail, Google or Yahoo. DeepSeek also offers access to the R1 and V3 models that power the chatbot via an API with a usage-based pricing model. This modality is primarily targeted towards developers and businesses. As of February 2025, API usage is priced at approximately $0.28 per million input tokens and $0.42 per million output tokens, making it less expensive than some competing services. Its web version is completely free, with 500 messages per hour cap limit to prevent bots from spamming. == Operation == DeepSeek-V3 uses significantly fewer resources compared to its peers. For example, whereas the world's leading AI companies train their chatbots with supercomputers using as many as 16,000 graphics processing units (GPUs), DeepSeek claims to have needed only about 2,000 GPUs—namely, the H800 series chips from Nvidia. It was trained in around 55 days at a cost of US$5.58 million, which is roughly one-tenth of what tech giant Meta spent building its latest AI technology. == Reactions == DeepSeek's success against larger and more established rivals has been described as "upending AI", constituting "the first shot at what is emerging as a global AI space race", and ushering in "a new era of AI brinkmanship". === Challenge to US AI dominance === DeepSeek's competitive performance at relatively minimal cost has been recognized as potentially challenging the global dominance of American AI models. Various publications and news media, such as The Hill and The Guardian, have described the release of the R1 chatbot as a "Sputnik moment" for American AI, echoing Marc Andreessen's view. OpenAI wrote a letter to the Office of Science and Technology Policy (OSTP), in March 2025, citing issues concerning a possibility that Deepseek could manipulate responses to cause harm. === Chinese perspective === DeepSeek's founder Liang Wenfeng has been compared to OpenAI CEO Sam Altman, with CNN calling him the Sam Altman of China and an evangelist for AI. Chinese state media widely praised DeepSeek as a national asset. On 20 January 2025, Chinese Premier Li Qiang invited Wenfeng to his symposium with experts and asked him to provide opinions and suggestions on a draft for comments of the annual 2024 government work report. On 20 February 2025, Wenfeng met with General Secretary of the Chinese Communist Party Xi Jinping, who encouraged party and state leaders to experiment with DeepSeek. Government officials responded to Xi's approval of the chatbot by reportedly using it to draft legal judgements, propose medical treatment plans, and analyze surveillance videos to search for missing persons. === Performance and success === Leading figures in the American AI sector had mixed reactions to DeepSeek's performance and success. Microsoft CEO Satya Nadella and OpenAI CEO Altman—whose companies are involved in the United States government-backed "Stargate Project" to develop American AI infrastructure—both called DeepSeek "super impressive". Various companies including Amazon Web Services, Toyota, and Stripe are seeking to use the model in their program. When American President Donald Trump announced The Stargate Project, he referred to DeepSeek as a wake-up call and a positive development. Other leaders in the AI field, however—including Scale AI CEO Alexandr Wang, Anthropic cofounder and CEO Dario Amodei, and Elon Musk—have expressed skepticism of the app's performance or of the sustainability of its success. Wang in particularly referred to DeepSeek-V3 as "earth-shattering" and DeepSeek-R1 as "top performing, or roughly on par with the best American models", but speculated that China may possess more AI-powering Nvidia H100 GPUs than thought. === Stock market implications === DeepSeek's optimization of limited resources has highlighted potential limits of United States sanctions on China's AI development, including export restrictions on advanced AI chips to China. The success of the company's AI models consequently "sparked market turmoil" and caused shares in major global technology companies to plunge on 27 January 2025: Nvidia's stock fell by as much as 17–18%, as did the stock of rival Broadcom. Other tech firms also sank, including Microsoft (down 2.5%), Google's owner Alphabet (down over 4%), and Dutch chip equipment maker ASML (down over 7%). A global sell-off of technology stocks on Nasdaq, prompted by the release of the R1 model, led to record losses of about $593 billion in the market capitalizations of AI and computer hardware companies; and by the next day a total of $1 trillion of value was wiped from American stocks. == Concerns == === Distillation === DeepSeek has been reported to sometimes claim that it is ChatGPT. OpenAI said that DeepSeek may have "inappropriately" used outputs from its model as training data in a process called distillation. However, there is currently no method to prove this conclusively. === Censorship === DeepSeek's compliance with Chinese government censorship policies and its data collection practices have raised concerns over information control in the model, prompting regulatory scrutiny in multiple countries. Reports indicate that it applies content moderation in accordance with the government's "public opinion guidance" regulations, limiting responses on topics such as the Tiananmen Square massacre and Taiwan's political status. DeepSeek models that have been uncensored also display a bias towards Chinese government viewpoints on controversial topics such as Xi Jinping's human rights record and Taiwan's political status. However, users who have downloaded the models and hosted them on their own devices and servers have reported successfully removing this censorship. Some sources have observed that the official application programming interface (API) version of R1, which runs from servers located in mainland China, uses censorship mechanisms for topics considered politically sensitive for the government of China. For example, the model may initially generate answers to questions about the 1989 Tiananmen Square massacre, persecution of Uyghurs, comparisons between Xi Jinping and Winnie the Pooh, and human rights in China, but a censorship mechanism deletes the uncensored response afterwards and replaces it with a message such as:"Sorry, that's beyond my current scope. Let's talk about something else." The post hoc censorship mechanisms and restrictions added on top of the model's output can be removed in the open-source version of the R1 model. If the "core Socialist values" defined by the Chinese Internet regul
Read more →
PauseAI

PauseAI is a global political movement founded in the Netherlands with the stated aim of achieving global coordination to stop the development of more powerful general artificial intelligence systems, at least until it is known how to build them safely, and keep them under democratic control. The movement was established in Utrecht in May 2023 by software entrepreneur Joep Meindertsma. == Proposal == PauseAI's stated goal is to "implement a temporary pause on the training of the most powerful general AI systems". Their website lists some proposed steps to achieve this goal: Set up an international AI safety agency, similar to the IAEA. Only allow training of general AI systems if their safety can be guaranteed. Only allow deployment of models after no dangerous capabilities are present. == Background == During the late 2010s and early 2020s, a rapid improvement in the capabilities of artificial intelligence models known as the AI boom was underway, which included the release of large language model GPT-3, its more powerful successor GPT-4, and image generation models Midjourney and DALL-E. This led to an increased concern about the risks of advanced AI, causing the Future of Life Institute to release an open letter calling for "all AI labs to immediately pause for at least six months the training of AI systems more powerful than GPT-4". The letter was signed by thousands of AI researchers and industry CEOs such as Yoshua Bengio, Stuart Russell, and Elon Musk. == History == Founder Joep Meindertsma first became worried about the existential risk from artificial intelligence after reading philosopher Nick Bostrom's 2014 book Superintelligence: Paths, Dangers, Strategies. He founded PauseAI in May 2023, putting his job as the CEO of a software firm on hold. Meindertsma claimed the rate of progress in AI alignment research is lagging behind the progress in AI capabilities, and said "there is a chance that we are facing extinction in a short frame of time". As such, he felt an urge to organise people to act. PauseAI's first public action was to protest in front of Microsoft's Brussels lobbying office in May 2023 during an event on artificial intelligence. In November of the same year, they protested outside the inaugural AI Safety Summit at Bletchley Park. The Bletchley Declaration that was signed at the summit, which acknowledged the potential for catastrophic risks stemming from AI, was perceived by Meindertsma to be a small first step. But, he argued "binding international treaties" are needed. He mentioned the Montreal Protocol and treaties banning blinding laser weapons as examples of previous successful global agreements. In February 2024, members of PauseAI gathered outside OpenAI's headquarters in San Francisco, in part due to OpenAI changing its usage policy that prohibited the use of its models for military purposes. On 13 May 2024, protests were held across thirteen countries before the AI Seoul Summit, including the United States, the United Kingdom, Brazil, Germany, Australia, and Norway. Meindertserma said that those attending the summit "need to realize that they are the only ones who have the power to stop this race". Protesters in San Francisco held signs reading "When in doubt, pause", and "Quit your job at OpenAI. Trust your conscience". Jan Leike, head of the "superalignment" team at OpenAI, resigned two days later due to his belief that "safety culture and processes [had] taken a backseat to shiny products".
Read more →
Existential risk from artificial intelligence

Existential risk from artificial intelligence, or AI x-risk, refers to the idea that substantial progress in artificial general intelligence (AGI) and artificial superintelligence (ASI) could lead to human extinction or an irreversible global catastrophe. One argument for the validity of this concern and the importance of this risk references how human beings dominate other species because the human brain possesses distinctive capabilities other animals lack. If AI were to surpass human intelligence and become superintelligent, it might become uncontrollable. Just as the fate of the mountain gorilla depends on human goodwill, the fate of humanity could depend on the actions of a future machine superintelligence. Experts disagree on whether artificial general intelligence (AGI) can achieve the capabilities needed for human extinction. Debates center on AGI's technical feasibility, the speed of self-improvement, and the effectiveness of alignment strategies. Concerns about superintelligence have been voiced by researchers including Geoffrey Hinton, Yoshua Bengio, Demis Hassabis, and Alan Turing, and AI company CEOs such as Dario Amodei (Anthropic), Sam Altman (OpenAI), and Elon Musk (xAI). In 2022, a survey of AI researchers with a 17% response rate found that the majority believed there is a 10 percent or greater chance that human inability to control AI will cause an existential catastrophe. In 2023, hundreds of AI experts and other notable figures signed a statement declaring, "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war". Following increased concern over AI risks, government leaders such as United Kingdom prime minister Rishi Sunak and United Nations Secretary-General António Guterres called for an increased focus on global AI regulation. In 2025, hundreds of public figures including AI experts, five Nobel Prize laureates, and former senior US national security officials such as Michael Mullen and Susan Rice signed a statement calling for a ban on the development of superintelligence. Two sources of concern stem from the problems of AI control and alignment. Controlling a superintelligent machine or instilling it with human-compatible values may be difficult. Many researchers believe that a superintelligent machine would likely resist attempts to disable it or change its goals as that would prevent it from accomplishing its present goals. It would be extremely challenging to align a superintelligence with the full breadth of significant human values and constraints. In contrast, skeptics such as computer scientist Yann LeCun argue that superintelligent machines will have no desire for self-preservation. A June 2025 study showed that in some circumstances, models may break laws and disobey direct commands to prevent shutdown or replacement, even at the cost of human lives. Researchers warn that an "intelligence explosion"—a rapid, recursive cycle of AI self-improvement—could outpace human oversight and infrastructure, leaving no opportunity to implement safety measures. In this scenario, an AI more intelligent than its creators would recursively improve itself at an exponentially increasing rate, too quickly for its handlers or society at large to control. Empirically, examples like AlphaZero, which taught itself to play Go and quickly surpassed human ability, show that domain-specific AI systems can sometimes progress from subhuman to superhuman ability very quickly, although such machine learning systems do not recursively improve their fundamental architecture. == History == One of the earliest authors to express serious concern that highly advanced machines might pose existential risks to humanity was the novelist Samuel Butler, who wrote in his 1863 essay Darwin among the Machines: The upshot is simply a question of time, but that the time will come when the machines will hold the real supremacy over the world and its inhabitants is what no person of a truly philosophic mind can for a moment question. In 1951, foundational computer scientist Alan Turing wrote the article "Intelligent Machinery, A Heretical Theory", in which he proposed that artificial general intelligences would likely "take control" of the world as they became more intelligent than human beings: Let us now assume, for the sake of argument, that [intelligent] machines are a genuine possibility, and look at the consequences of constructing them... There would be no question of the machines dying, and they would be able to converse with each other to sharpen their wits. At some stage therefore we should have to expect the machines to take control, in the way that is mentioned in Samuel Butler's Erewhon. In 1965, I. J. Good originated the concept now known as an "intelligence explosion" and said the risks were underappreciated: Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion', and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously. Scholars such as Marvin Minsky and I. J. Good himself occasionally expressed concern that a superintelligence could seize control, but issued no call to action. In 2000, computer scientist and Sun co-founder Bill Joy penned an influential essay, "Why The Future Doesn't Need Us", identifying superintelligent robots as a high-tech danger to human survival, alongside nanotechnology and engineered bioplagues. Nick Bostrom published Superintelligence in 2014, which presented his arguments that superintelligence poses an existential threat. By 2015, public figures such as physicists Stephen Hawking and Nobel laureate Frank Wilczek, computer scientists Stuart J. Russell and Roman Yampolskiy, and entrepreneurs Elon Musk and Bill Gates were expressing concern about the risks of superintelligence. Also in 2015, the Open Letter on Artificial Intelligence highlighted the "great potential of AI" and encouraged more research on how to make it robust and beneficial. In April 2016, the journal Nature warned: "Machines and robots that outperform humans across the board could self-improve beyond our control—and their interests might not align with ours". In 2020, Brian Christian published The Alignment Problem, which details the history of progress on AI alignment up to that time. In March 2023, key figures in AI, such as Musk, signed a letter from the Future of Life Institute calling a halt to advanced AI training until it could be properly regulated. In May 2023, the Center for AI Safety released a statement signed by numerous experts in AI safety and the AI existential risk that read: Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war. A 2025 open letter by the Future of Life Institute, whose signers include five Nobel Prize laureates, reads: We call for a prohibition on the development of superintelligence, not lifted before there is broad scientific consensus that it will be done safely and controllably, and strong public buy-in. == Potential AI capabilities == === General Intelligence === Artificial general intelligence (AGI) is typically defined as a system that performs at least as well as humans in most or all intellectual tasks. A 2022 survey of AI researchers found that 90% of respondents expected AGI would be achieved in the next 100 years, and half expected the same by 2061. In May 2023, some researchers dismissed existential risks from AGI as "science fiction" based on their high confidence that AGI would not be created anytime soon. But in August 2023, a survey of 2,778 AI researchers found that most believed that AGI would be achieved by 2040. Breakthroughs in large language models (LLMs) have led some researchers to reassess their expectations. Notably, Geoffrey Hinton said in 2023 that he recently changed his estimate from "20 to 50 years before we have general purpose A.I." to "20 years or less". === Superintelligence === In contrast with AGI, Bostrom defines a superintelligence as "any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest", including scientific creativity, strategic planning, and social skills. He argues that a superintelligence can outmaneuver humans anytime its goals conflict with humans'. It may choose to hide its true intent until humanity cannot stop it. Bostrom writes that in order to be safe for
Read more →
SERVQUAL

SERVQUAL is a research tool that measures customer perception of service quality by comparing what customers expect from a service to their assessment of the service actually delivered. The instrument was developed in the United States in the mid-1980s by researchers A. Parasuraman, Valarie Zeithaml, and Leonard L. Berry, and is designed for use in after-service evaluation processes. It assesses service quality across five dimensions: reliability, assurance, tangibles, empathy, and responsiveness. SERVQUAL has been applied in sectors including healthcare, banking, education, and libraries. == Overview == The SERVQUAL questionnaire consists of matched pairs of items, 22 expectation items and 22 perception items, organized into five dimensions that correspond to the consumer's mental framework for evaluating service quality. Each item is part of a pair: one question asks what excellent organizations in a given industry should offer (expectation), and the other asks how the specific organization being evaluated performs (perception). == The model of service quality == The model of service quality, referred to as the gaps model, was developed by Parasuraman, Zeithaml, and Berry during a systematic research program conducted in the 1980s. The model identifies five gaps that may cause customers to experience poor service quality. In this framework, gap 5 is the service quality gap, which represents the difference between customer expectations and their perceptions of the service. This is the only gap that can be directly measured, and the SERVQUAL instrument was designed specifically to capture it. Gaps 1 through 4 have diagnostic value and point to probable causes of service failures. == Development of the instrument == Development of the model of service quality began in 1983 and, after iterative refinements, led to the publication of the SERVQUAL instrument in 1988. The research team conducted in-depth interviews and focus groups in four service sectors: retail banking, credit card services, securities brokerage, and product repair and maintenance. The questionnaire was tested across multiple samples to verify its reliability, validity, and factor structure. == Adaptations and variants == SERVQUAL has been adapted for specific industries and contexts. Well‑known derivatives include: LibQUAL+ – a library service quality survey developed by the Association of Research Libraries. EDUQUAL – an instrument tailored for the evaluation of service quality in educational institutions. HEALTHQUAL – adapted for measuring patient perceptions of healthcare service quality. ARTSQUAL – used to evaluate visitor perceptions of quality in museums and performing arts venues. == Criticisms == Researchers have raised several concerns about SERVQUAL. Critics argue that the instrument's definition of expectations is ambiguous and that it does not adequately account for the dynamic nature of customer expectations over time. Other scholars question whether the five‑dimension structure is universally applicable across all service contexts, and whether a generic instrument can capture the unique attributes of specific industries without modification.
Read more →
BevQ

BevQ is a queue management mobile application developed by Faircode Technologies of Kochi, Kerala. It is provided by the Kerala State Beverages Corporation under Government of Kerala. == History == This app was released together by the Government of Kerala and the Kerala State Beverages Corporation in order to implement social distancing in the liquor stores Kerala in the case of the COVID-19 pandemic in Kerala and to reduce the congestion of people. The BevQ App was released by Faircode Technologies on 27 May 2020 on the Google Play Store. In January 2021, the app was withdrawn as bars had opened. In June 2021, there was a commitment from the Kerala CM that the App will be relaunched again. It has been reported that over 132,000 new users downloaded the app in the 48 hours after the announcement. == Achievements == The BEVQ app, which works only in the state of Kerala, beat all other Indian food and drink apps in 2020 to see the highest growth in year-on-year sessions, according to the State of Mobile 2021 report by App Annie. The app even beat the likes of Domino’s, which is used all across India. Around 300 government Liquor shops and 900 private liquor shops were enlisted in the platform. More than 200 million unique users registered in the platform. About 250,000 tokens were given out a day.
Read more →
Ramification problem

In philosophy and artificial intelligence (especially, knowledge based systems), the ramification problem is concerned with the indirect consequences of an action. It might also be posed as how to represent what happens implicitly due to an action or how to control the secondary and tertiary effects of an action. It is strongly connected to, and is opposite the qualification side of, the frame problem. Limit theory helps in operational usage. For instance, in KBE derivation of a populated design (geometrical objects, etc., similar concerns apply in shape theory), equivalence assumptions allow convergence where potentially large, and perhaps even computationally indeterminate, solution sets are handled deftly. Yet, in a chain of computation, downstream events may very well find some types of results from earlier resolutions of ramification as problematic for their own algorithms.
Read more →
Minimum intelligent signal test

The minimum intelligent signal test, or MIST, is a variation of the Turing test proposed by Chris McKinstry in which only boolean (yes/no or true/false) answers may be given to questions. The purpose of such a test is to provide a quantitative statistical measure of humanness, which may subsequently be used to optimize the performance of artificial intelligence systems intended to imitate human responses. McKinstry gathered approximately 80,000 propositions that could be answered yes or no, e.g.: Is Earth a planet? Was Abraham Lincoln once President of the United States? Is the sun bigger than my foot? Do people sometimes lie? He called these propositions Mindpixels. These questions test both specific knowledge of aspects of culture, and basic facts about the meaning of various words and concepts. It could therefore be compared with the SAT, intelligence testing and other controversial measures of mental ability. McKinstry's aim was not to distinguish between shades of intelligence but to identify whether a computer program could be considered intelligent at all. According to McKinstry, a program able to do much better than chance on a large number of MIST questions would be judged to have some level of intelligence and understanding. For example, on a 20-question test, if a program were guessing the answers at random, it could be expected to score 10 correct on average. But the probability of a program scoring 20 out of 20 correct by guesswork is only one in 220, i.e. one in 1,048,576; so if a program were able to sustain this level of performance over several independent trials, with no prior access to the propositions, it should be considered intelligent. == Discussion == McKinstry criticized existing approaches to artificial intelligence such as chatterbots, saying that his questions could "kill" AI programs by quickly exposing their weaknesses. He contrasted his approach, a series of direct questions assessing an AI's capabilities, to the Turing test and Loebner Prize method of engaging an AI in undirected typed conversation. Critics of the MIST have noted that it would be easy to "kill" a McKinstry-style AI too, due to the impossibility of supplying it with correct answers to all possible yes/no questions by ways of a finite set of human-generated Mindpixels: the fact that an AI can answer the question "Is the sun bigger than my foot?" correctly does not mean that it can answer variations like "Is the sun bigger than (my hand | my liver | an egg yolk | Alpha Centauri A | ...)" correctly, too. However, the late McKinstry might have replied that a truly intelligent, knowledgeable entity (on a par with humans) would be able to work out answers such as (yes | yes | yes | don't know | ...) by applying its knowledge of the relative sizes of the objects named. In other words, the MIST was intended as a test of AI, not as a suggestion for implementing AI. It can also be argued that the MIST is a more objective test of intelligence than the Turing test, a subjective assessment that some might consider to be more a measure of the interrogator's gullibility than of the machine's intelligence. According to this argument, a human's judgment of a Turing test is vulnerable to the ELIZA effect, a tendency to mistake superficial signs of intelligence for the real thing, anthropomorphizing the program. The response, suggested by Alan Turing's essay Computing Machinery and Intelligence, is that if a program is a convincing imitation of an intelligent being, it is in fact intelligent. The dispute is thus over what it means for a program to have "real" intelligence, and by what signs it can be detected. A similar debate exists in the controversy over great ape language, in which nonhuman primates are said to have learned some aspects of sign languages but the significance of this learning is disputed.
Read more →
Alexey Chervonenkis

Alexey Yakovlevich Chervonenkis (Russian: Алексей Яковлевич Червоненкис; 7 September 1938 – 22 September 2014) was a Soviet and Russian mathematician. Along with Vladimir Vapnik, he was one of the main developers of the Vapnik–Chervonenkis theory, also known as the "fundamental theory of learning", an important part of computational learning theory. Chervonenkis held joint appointments with the Russian Academy of Sciences and Royal Holloway, University of London. Alexey Chervonenkis got lost in Losiny Ostrov National Park on 22 September 2014, and later during a search operation was found dead near Mytishchi, a suburb of Moscow. He had died of hypothermia.
Read more →
Vulnerability Discovery Model

A Vulnerability Discovery Model (VDM) uses discovery event data with software reliability models for predicting the same. A thorough presentation of VDM techniques is available in. Numerous model implementations are available in the MCMCBayes open source repository. Several VDM examples include: Alhazmi-Malaiya: Time based model (Alhazmi-Malaiya Logistic (AML) model) Alhazmi-Malaiya: Effort based model Rescorla: Quadratic Model and Exponential Model Anderson: Thermodynamic Model Kim: Weibull Model Linear Model Hump-Shaped Model Independent and Dependent Model Vulnerability Discovery Modeling using Bayesian model averaging Multivariate Vulnerability Discovery Models
Read more →
Hive (artificial intelligence company)

Hive is an American artificial intelligence company offering machine learning models via APIs to enterprise customers. Hive uses around 700,000 gig workers to train data for its models through its Hive Work app. One of Hive's major offerings is to provide automated content moderation services. == Products == Hive is reported to have been engaged to provide content moderation services to social news aggregator Reddit, Giphy, BeReal, Donald Trump-affiliated social network Truth Social, and on online chat website Chatroulette. Parler, after its shutdown by content service providers in early 2021 due to a lack of content moderation, integrated with Hive and was allowed back in the App Store. Hive's content moderation models have been leveraged widely in the livestreaming industry, where the cost of human moderation is high. Hive's models have also been used in events such as the Super Bowl and March Madness, and its contextual advertising models used by NBC Universal and Vevo. Hive provides APIs to detect deepfakes and AI-generated artwork. In early 2023, Hive released a free demo text classifier intended to detect AI-generated text. Mark Hachman at PC World rated Hive's classifier favorably and found it more reliable than OpenAI's AI text classifier. == History == Hive was founded by Kevin Guo and Dmitriy Karpman, and in April 2021, announced $85M in new capital at a valuation of $2 billion.
Read more →
IEEE Transactions on Pattern Analysis and Machine Intelligence

IEEE Transactions on Pattern Analysis and Machine Intelligence (sometimes abbreviated as IEEE PAMI or simply PAMI) is a monthly peer-reviewed scientific journal published by the IEEE Computer Society. == Background == The journal covers research in computer vision and image understanding, pattern analysis and recognition, machine intelligence, machine learning, search techniques, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, and face and gesture recognition. The editor-in-chief is Kyoung Mu Lee (Seoul National University). According to the Journal Citation Reports, the journal has a 2023 impact factor of 20.8.
Read more →
CatBoost

CatBoost is an open-source software library developed by Yandex. It provides a gradient boosting framework which, among other features, attempts to solve for categorical features using a permutation-driven alternative to the classical algorithm. It works on Linux, Windows, macOS, and is available in Python, R, and models built using CatBoost can be used for predictions in C++, Java, C#, Rust, Core ML, ONNX, and PMML. The source code is licensed under Apache License and available on GitHub. InfoWorld magazine awarded the library "The best machine learning tools" in 2017. along with TensorFlow, Pytorch, XGBoost and 8 other libraries. Kaggle listed CatBoost as one of the most frequently used machine learning (ML) frameworks in the world. It was listed as the top-8 most frequently used ML framework in the 2020 survey and as the top-7 most frequently used ML framework in the 2021 survey. As of April 2022, CatBoost is installed about 100000 times per day from PyPI repository == Features == CatBoost has gained popularity compared to other gradient boosting algorithms primarily due to the following features Native handling for categorical features Fast GPU training Visualizations and tools for model and feature analysis Using oblivious trees or symmetric trees for faster execution Ordered boosting to overcome overfitting == History == In 2009 Andrey Gulin developed MatrixNet, a proprietary gradient boosting library that was used in Yandex to rank search results. Since 2009 MatrixNet has been used in different projects at Yandex, including recommendation systems and weather prediction. In 2014–2015 Andrey Gulin worked with a team of researchers to start a new project called Tensornet which was aimed at solving the problem of "how to work with categorical data". Their work resulted in several proprietary Gradient Boosting libraries with different approaches to handling categorical data. In 2016 the Machine Learning Infrastructure team led by Anna Dorogush started working on Gradient Boosting in Yandex, including Matrixnet and Tensornet. They implemented and open-sourced the next version of Gradient Boosting library called CatBoost, which has support for categorical and text data, GPU training, model analysis, and visualization tools. CatBoost was open-sourced in July 2017 and is under active development in Yandex and the open-source community. == Application == JetBrains uses CatBoost for code completion Cloudflare uses CatBoost for bot detection Careem uses CatBoost to predict future destinations of the rides
Read more →
List of publications in data science

This is a list of publications in data science, generally organized by order of use in a data analysis workflow. See the list of publications in statistics for more research-based and fundamental publications; while this list is more applied, business oriented, and cross-disciplinary. General article inclusion criteria are: Papers from notable practitioners or notable professors, either with a Wikipedia page or reference to their notability Common knowledge all data professionals should know, with references validating this claim Highly cited applied statistics and machine learning publications Discussion-facilitating papers on the field of data science as a whole (for example, the Attention Is All You Need paper is arguably a landmark paper that can be added here, but it is specific to generative artificial intelligence, not for all practitioners of data) Some reasons why a particular publication might be regarded as important: Topic creator – A publication that created a new topic Breakthrough – A publication that changed scientific knowledge significantly Influence – A publication which has significantly influenced the world or has had a massive impact on the teaching of data science. When possible, a reference is used to validate the inclusion of the publication in this list. == History == Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) Author: Leo Breiman Publication data: Online version: https://projecteuclid.org/journals/statistical-science/volume-16/issue-3/Statistical-Modeling--The-Two-Cultures-with-comments-and-a/10.1214/ss/1009213726.pdf Description: Describes two cultures of statistics, one using a parsimonious and generative stochastic model, while the other is an algorithmic model with no known mechanism for how the data is generated. Breiman argues that while statistics has traditionally favored using the stochastic model, there is value in expanding the methods that statisticians can use to study phenomenon. Importance: Influence on the philosophies of statisticians right before the increased use of machine learning and deep learning methods. In a 20-year retrospective on this article, "Breiman's words are perhaps more relevant than ever". Notable statisticians at the time wrote opinion pieces about the publication. Although overall critical of the publication, David Cox writes that the publication "contains enough truth and exposes enough weaknesses to be thought-provoking." Bradley Efron commented that this publication is a "stimulating paper". Emanuel Parzen also comments about this publication that "Breiman alerts us to systematic blunders (leading to wrong conclusions) that have been committed applying current statistical practice of data modeling". Data Scientist: The Sexiest Job of the 21st Century Author: Thomas H. Davenport and DJ Patil Publication data: Online version: hbr.org/2022/07/is-data-scientist-still-the-sexiest-job-of-the-21st-century Description: Describes the new role at companies that is coined "Data scientist", what they do, how an organization might recruit one to their organization, and how to work with one effectively. Importance: This publication has been an influence on the data community as mentioned near the time it was published in 2012 by institutions like IEEE Spectrum, but also mentioned nearly a decade later asking the same question the title poses. In a retrospective response to their own publication 10 years earlier, authors Davenport and Patil have reflected that the role of a data scientist has "become better institutionalized, the scope of the job has been redefined, the technology it relies on has made huge strides, and the importance of non-technical expertise, such as ethics and change management, has grown". 50 Years of Data Science Author: David Donoho Publication data: Online version: https://www.tandfonline.com/doi/full/10.1080/10618600.2017.1384734 Description: Retrospective discussion paper on the history and origins of data science, with a number of commentary from notable statisticians. Importance: This has been described as "the first in the field to present such a comprehensive and in-depth survey and overview", and helps to define the field that has many definitions. The Composable Data Management System Manifesto Author: Pedro Pedreira, Orri Erling, Konstantinos Karanasos, Scott Schneider, Wes McKinney, Satya R Valluri, Mohamed Zait, Jacques Nadeau Publication data: Online version: https://www.vldb.org/pvldb/vol16/p2679-pedreira.pdf Description: The vision paper advocating for a paradigm shift in how data management systems are designed using standard, composable, interoperable tools rather than siloed software tools. Importance: A paradigm shifting view on how future data science software tools should be designed for more efficient workflows, the principles of which "will be especially crucial for addressing fragmentation, improving interoperability, and promoting user-centricity as data ecosystems grow increasingly complex". == Data collection and organization == Tidy Data Author: Hadley Wickham Publication data: Online version: https://www.jstatsoft.org/article/view/v059i10/ https://vita.had.co.nz/papers/tidy-data.pdf Description: Describes a framework for data cleaning that is summarized in the quote, "each variable is a column, each observation is a row, and each type of observational unit is a table". This allows a standard data structure for which data analysis tools can be consistently built around. Importance: Cited over 1,500 times, this effort for tidy data has been described by David Donoho as having "more impact on today's practice of data analysis than many highly regarded theoretical statistics articles". In the context of data visualization, this publication is said to support "efficient exploration and prototyping because variables can be assigned different roles in the plot without modifying anything about the original dataset". Data Organization in Spreadsheets Author: Karl W. Broman and Kara H. Woo Publication data: Online version: https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989 Description: This article offers practical recommendations for organizing data in spreadsheets, like Microsoft Excel and Google Sheets, to reduce errors and lower the barrier for later analyses due to limitations in spreadsheets or quirks in the software. Importance: Influences teaching both data and non-data practitioners to create more analysis-friendly spreadsheets, and has been described to outline "spreadsheet best practices". == Data visualizations == Quantitative Graphics in Statistics: A Brief History Author: James R. Beniger and Dorothy L. Robyn Publication data: Online version: https://www.jstor.org/stable/2683467 Description: Outlines history and evolution of quantitative graphics in statistics, going through spatial organization (17th and 18th centuries), discrete comparison (18th and 19th centuries), continuous distribution (19th century), and multivariate distribution and correlation (late 19th and 20th centuries). Importance: Helps put into perspective for learning data practitioners the recency of graphics that are used. A later publication "Graphical Methods in Statistics" by Stephen Fienberg in 1979 writes that his publication "owes much to the work of Beniger and Robyn". == Practice == Data Science for Business Author: Foster Provost and Tom Fawcett Publication data: Online version: N/A Description: Broadly outlines principles of data science and data-analytic thinking for businesses. Importance: Cited over 3,000 times, it is "highly recommended for students" but also it is also recommended due to its "relevance to senior management leaders who want to build and lead a team of data scientists and implement data science in solving complex business problems". == Tooling == Hidden Technical Debt in Machine Learning Systems Author: D. Sculley, Gary Holy, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, Dan Dennison Publication data: Online version: https://proceedings.neurips.cc/paper_files/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf Description: This paper argues that it is "dangerous to think of [complex machine learning] quick wins as coming for free" and overviews risk factors to account for when implementing a machine learning system. Importance: All authors worked for Google, article is cited over 2,000 times, and helped practitioners thinking about quickly implementing a machine learning tool without understanding the long-term maintenance of the tool. A few useful things to know about machine learning Author: Pedro Domingos Publication data: Online version: https://dl.acm.org/doi/10.1145/2347736.2347755 https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf Description: The purpose of this paper is to distill inaccessible "folk knowledge" to effectively implement machine learning projects because "machin
Read more →
LIFER/LADDER

LIFER/LADDER was one of the first database natural language processing systems. It was designed as a natural language interface to a database of information about US Navy ships. This system, as described in a paper by Hendrix (1978), used a semantic grammar to parse questions and query a distributed database. It was implemented in Interlisp. The LIFER/LADDER system could only support simple one-table queries or multiple table queries with easy join conditions. Some examples of queries it could accept: What are the length, width, and draft of the Kitty Hawk? When will Reeves achieve readiness rating C2? What is the nearest ship to Naples with a doctor on board? What ships are carrying cargo for the United States? Where are they going? Print the American cruisers’ current positions and states of readiness?
Read more →
KL-ONE

KL-ONE (pronounced "kay ell won") is a knowledge representation system in the tradition of semantic networks and frames; that is, it is a frame language. The system is an attempt to overcome semantic indistinctness in semantic network representations and to explicitly represent conceptual information as a structured inheritance network. == Overview == There is a whole family of KL-ONE-like systems. One of the innovations that KL-ONE initiated was the use of a deductive classifier, an automated reasoning engine that can validate a frame ontology and deduce new information about the ontology based on the initial information provided by a domain expert. Frames in KL-ONE are called concepts. These form hierarchies using subsume-relations; in the KL-ONE terminology a super class is said to subsume its subclasses. Multiple inheritance is allowed. Actually a concept is said to be well-formed only if it inherits from more than one other concept. All concepts, except the top concept (usually THING), must have at least one super class. In KL-ONE descriptions are separated into two basic classes of concepts: primitive and defined. Primitives are domain concepts that are not fully defined. This means that given all the properties of a concept, this is not sufficient to classify it. They may also be viewed as incomplete definitions. Using the same view, defined concepts are complete definitions. Given the properties of a concept, these are necessary and sufficient conditions to classify the concept. The slot-concept is called roles and the values of the roles are role-fillers. There are several different types of roles to be used in different situations. The most common and important role type is the generic RoleSet that captures the fact that the role may be filled with more than one filler.
Read more →