AI Chatbot Market Share

AI Chatbot Market Share — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • 15.ai

    15.ai

    15.ai was a free non-commercial web application and research project that uses artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Created by a pseudonymous artificial intelligence researcher known as 15, who began developing the technology as a freshman during their undergraduate research at the Massachusetts Institute of Technology (MIT), the application allows users to make characters from video games, television shows, and movies speak custom text with emotional inflections. The platform is able to generate convincing voice output using minimal training data; the name "15.ai" references the creator's statement that a voice can be cloned with just 15 seconds of audio. It was an early example of an application of generative artificial intelligence during the initial stages of the AI boom. Launched in March 2020, 15.ai became an Internet phenomenon in early 2021 when content utilizing it went viral on social media and quickly gained widespread use among Internet fandoms, such as the My Little Pony: Friendship Is Magic, Team Fortress 2, and SpongeBob SquarePants fandoms. The service featured emotional context through emojis, precise pronunciation control, and multi-speaker capabilities. Critics praised 15.ai's accessibility and emotional control but criticized its technical limitations in prosody options and non-English language support, with mixed results depending on character complexity. 15.ai is credited as the first platform to popularize AI voice cloning in memes and content creation. Voice actors and industry professionals debated 15.ai's implications, raising concerns about employment impacts, voice-related fraud, and potential misuse. In January 2022, it was discovered that a company called Voiceverse had generated voice lines using 15.ai without attribution, promoted them as the byproduct of their own technology, and sold them as non-fungible tokens (NFT) without permission. News publications universally characterized this incident as the company having "stolen" from 15.ai. The service went offline in September 2022 due to legal issues surrounding artificial intelligence and copyright. Its shutdown was followed by the emergence of commercial alternatives whose founders have acknowledged 15.ai's pioneering influence in the field of deep learning speech synthesis. On May 18, 2025, 15 launched 15.dev as the sequel to 15.ai. == History == === Background === The field of speech synthesis underwent a significant transformation with the introduction of deep learning approaches. In 2016, DeepMind's publication of the WaveNet paper marked a shift toward neural network-based speech synthesis, which enabled higher audio quality via causal convolutional neural networks. Previously, concatenative synthesis—which worked by stitching together pre-recorded segments of human speech—was the predominant method for generating artificial speech, but it often produced robotic-sounding results at the boundaries of sentences. In 2018, Google AI's Tacotron 2 showed that neural networks could produce highly natural speech synthesis but required substantial training data (typically tens of hours of audio) to achieve acceptable quality. When trained on two hours of training data, the output quality degraded while still being able to maintain intelligible speech; with 24 minutes of training data, Tacotron 2 failed to produce intelligible speech. The same year saw the emergence of HiFi-GAN, a generative adversarial network (GAN)-based vocoder that improved the efficiency of waveform generation while producing high-fidelity speech, followed by Glow-TTS, which introduced a flow-based approach that allowed for both fast inference and voice style transfer capabilities. Chinese tech companies like Baidu and ByteDance also made contributions to the field by developing breakthroughs that further advanced the technology. === 2016–2020: Conception and development === 15.ai was conceived in 2016 as a research project in deep learning speech synthesis by a developer known as 15 (at the age of 18) during their freshman year at MIT as part of its Undergraduate Research Opportunities Program. 15 was inspired by DeepMind's WaveNet paper, with development continuing through their studies as Google AI released Tacotron 2 the following year. By 2019, they had demonstrated at MIT their ability to replicate WaveNet and Tacotron 2's results using 75% less training data than previously required. The name "15.ai" is a reference to the developer's statement that a voice can be cloned with as little as 15 seconds of data. 15 had originally planned to pursue a PhD based on their undergraduate research, but opted to work in the tech industry instead after their startup was accepted into the Y Combinator accelerator in 2019. After their departure in early 2020, 15 returned to their voice synthesis research and began implementing it as a web application. According to a post on X from 15, instead of using conventional voice datasets like LJSpeech that contained simple, monotone recordings, they sought out more challenging voice samples that could demonstrate the model's ability to handle complex speech patterns and emotional undertones. During this phase, 15 discovered the Pony Preservation Project, a collaborative project started by /mlp/, the My Little Pony board on 4chan. Contributors of the project had manually trimmed, denoised, transcribed, and emotion-tagged thousands of voice lines from My Little Pony: Friendship Is Magic and had compiled them into a dataset that provided ideal training material for 15.ai. === 2020–2022: Release and operation === 15.ai was released on March 2, 2020 as a free and non-commercial web application that did not require user registration to use, but did require the user to accept its terms of service before proceeding. At the time of its launch, the platform had a limited selection of available characters, including those from My Little Pony: Friendship Is Magic and Team Fortress 2. Users were permitted to create any content with the synthesized voices under two conditions: they had to properly credit 15.ai by including "15.ai" in any posts, videos, or projects using the generated audio; and they were prohibited from mixing 15.ai outputs with other text-to-speech outputs in the same work to prevent misrepresentation of the technology's capabilities. On March 8, 2020, Tyler McVicker of Valve News Network uploaded a YouTube video showcasing 15.ai. More voices were added to the website in the following months. In late 2020, 15 implemented a multi-speaker embedding in the deep neural network, which enabled the simultaneous training of multiple voices. Following this, the website's roster expanded from eight to over fifty characters. In addition, this implementation allowed the deep learning model to recognize common emotional patterns across different characters, even when certain emotions were missing from the characters' training data. By May 2020, the site had served over 4.2 million audio files to users. In early 2021, the application gained popularity after skits, memes, and fan content created using 15.ai went viral on Twitter, TikTok, Reddit, Twitch, Facebook, and YouTube. At its peak, the platform incurred operational costs of US$12,000 per month from AWS infrastructure needed to handle millions of daily voice generations; despite receiving offers from companies to acquire 15.ai and its underlying technology, the website remained independent and was funded out of the personal previous startup earnings of the developer. === 2022: Voiceverse NFT controversy === On January 14, 2022, 15 discovered that a blockchain-based company called Voiceverse had generated voice lines using 15.ai, falsely showcased them on Twitter as a demonstration of their own voice technology without permission or attribution, and sold them as NFTs. This came shortly after 15 had stated in December 2021 that they had no interest in incorporating NFTs into their work. A screenshot of the log files posted by 15 showed that Voiceverse had generated audio of characters from My Little Pony: Friendship Is Magic using 15.ai and pitched them up to make them sound unrecognizable, a violation of 15.ai's terms of service, which explicitly prohibited commercial use and required proper attribution. When confronted with evidence, Voiceverse stated that their marketing team had used 15.ai without proper attribution while rushing to create a demo. In response, 15 tweeted "Go fuck yourself," which went viral, amassing hundreds of thousands of retweets and likes on Twitter in support of the developer. The tweets showcasing the stolen voices were subsequently deleted. ==== Aftermath ==== The controversy raised concerns about NFT projects, which, according to critics, were frequently associated with intellectual property theft and questionable business practices. The incident was documented in the AI Incident Database (AIID) and the AI, Alg

    Read more →
  • Ari Holtzman

    Ari Holtzman

    Ari Holtzman is a professor of Computer Science at the University of Chicago and an expert in the area of natural language processing and computational linguistics. Previously, Holtzman was a PhD student at the University of Washington where he was advised by Luke Zettlemoyer. In 2017, he was a member of the winning team for the inaugural Alexa Prize for developing a conversational AI system for the Amazon Alexa device. Holtzman has made multiple contributions in the area of text generation and language models such as the introduction of nucleus sampling in 2019, his work on AI safety and neural fake news detection, and the fine-tuning of quantized large language models.

    Read more →
  • Liveness test

    Liveness test

    A liveness test, liveness check or liveness detection is an automated method for determining whether a subject is a real person or part of a spoofing attack. The technique is used as part of know your customer checks in financial services and during facial age estimation. Liveness detection is a cornerstone of digital safety. == Test process == The threat in face spoofing attacks is that "the attacker only needs to find a good face swap library on Github and understand how to inject the model into the camera feed during the KYC process". Fraudsters usually buy stolen IDs on the dark web to start a deepfake attack. An AI-powered generative adversarial network (GAN) can then generate the face swapping model that many online verification services fail to detect. Low level hackers may use face swapping apps such as SwapFace, DeepFaceLive, and Swapstream (increasing interest for those apps in 2023 according to Google Trends). In a video liveness test, users are typically asked to look into a camera and to move, smile or blink, and features of their moving face may then be compared to that of a still image. Artificial intelligence is used to counter presentation attacks such as deepfakes or users wearing hyperrealistic masks, or video injection attacks. Other forms of liveness test include checking for a pulse when using a fingerprint scanner or checking that a person's voice is not a recording or artificially generated during speaker recognition. == Adoption and certification == In a 2022 report published by the security firm Sensity, it was demonstrated that the liveness test of most US banks was easily cheated with new and publicly-available AI-powered techniques. Many of these banks disregarded the results of the report. In the first half of 2023, the security firm iProov detected a 704% increase in face-swap attacks. In 2023, in the UK, many customers of Ryanair were upset to have to go through many ID verification checks, including liveness tests, before boarding, as the airline was using it as a mean to deter customers to buy tickets through third-party websites. In the first half of 2024 iBeta Quality Assurance issued 18 new ISO/IEC 30107-3 Presentation Attack Detection certificates, raising the cumulative total to 85 since 2018. In January 2024, the Department of Homeland Security (DHS) opened applications from vendors to test their Liveness test. Identity frauds peaked during the COVID-19 lockdown, leading government agencies to take reinforced measures to secure their digital applications.

    Read more →
  • Xcon

    Xcon

    The R1, internally called XCON (Expert Configurer), program was a production-rule-based system written in OPS5 by John P. McDermott of Carnegie Mellon University in 1978 to assist in the ordering of Digital Equipment Corporation's (DEC) VAX computer systems by automatically selecting the computer system components based on the customer's requirements. == Overview == In developing the system, McDermott made use of experts from both DEC's PDP/11 and VAX computer systems groups. These experts sometimes even disagreed amongst themselves as to an optimal configuration. The resultant "sorting it out" had an additional benefit in terms of the quality of VAX systems delivered. XCON first went into use in 1980 in DEC's plant in Salem, New Hampshire, US. It eventually had about 2500 rules. By 1986, it had processed 80,000 orders, and achieved 95–98% accuracy. It was estimated to be saving DEC $25M a year by reducing the need to give customers free components when technicians made errors, by speeding the assembly process, and by increasing customer satisfaction. Before XCON, when ordering a VAX from DEC, every cable, connection, and bit of software had to be ordered separately. (Computers and peripherals were not sold complete in boxes as they are today.) The sales people were not always very technically expert, so customers would find that they had hardware without the correct cables, printers without the correct drivers, a processor without the correct language chip, and so on. This meant delays and caused a lot of customer dissatisfaction and resultant legal action. XCON interacted with the sales person, asking critical questions before printing out a coherent and workable system specification/order slip. XCON's success led DEC to rewrite XCON as XSEL—a version of XCON intended for use by DEC's salesforce to aid a customer in properly configuring their VAX (so they would not, say, choose a computer too large to fit through their doorway or choose too few cabinets for the components to fit in). Location problems and configuration were handled by yet another expert system, XSITE. McDermott's 1980 paper on R1 won the Association for the Advancement of Artificial Intelligence (AAAI) Classic Paper Award in 1999. Footnote 2 gave a humorous explanation for the name "R1" as "Four years ago I couldn't even say "knowledge engineer", now I ... [are one.]".

    Read more →
  • PCVC Speech Dataset

    PCVC Speech Dataset

    The PCVC (Persian Consonant Vowel Combination) Speech Dataset is a Modern Persian speech corpus for speech recognition and also speaker recognition. The dataset contains sound samples of Modern Persian combination of vowel and consonant phonemes from different speakers. Every sound sample contains just one consonant and one vowel So it is somehow labeled in phoneme level. This dataset consists of 23 Persian consonants and 6 vowels. The sound samples are all possible combinations of vowels and consonants (138 samples for each speaker). The sample rate of all speech samples is 48000 which means there are 48000 sound samples in every 1 second. Every sound sample starts with consonant then continues with vowel. In each sample, in average, 0.5 second of each sample is speech and the rest is silence. Each sound sample ends with silence. All of sound samples are denoised with "Adaptive noise reduction" algorithm. Compared to Farsdat speech dataset and Persian speech corpus it is more easy to use because it is prepared in .mat data files. Also it is more based on phoneme based separation and all samples are denoised. == Contents == The corpus is downloadable from its Kaggle web page, and contains the following: .mat data files of sound samples in a 23630000 matrix, in which 23 is number of consonants, 6 is the number of vowels and 30000 is the length of sound sample.

    Read more →
  • Kaggle

    Kaggle

    Kaggle is a data science competition platform and online community for data scientists and machine learning practitioners under Google LLC. Kaggle enables users to find and publish datasets, explore and build models in a web-based data science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges. Kaggle has also facilitated the use of unethical and unreliable data in medical research. == History == Kaggle was founded by Anthony Goldbloom in April 2010. Jeremy Howard, one of the first Kaggle users, joined in November 2010 and served as the President and Chief Scientist. Also on the team was Nicholas Gruen serving as the founding chair. In 2011, the company raised $12.5 million and Max Levchin became the chairman. On March 8, 2017, Fei-Fei Li, Chief Scientist at Google, announced that Google was acquiring Kaggle. In June 2017, Kaggle surpassed 1 million registered users, and as of October 2023, it has over 15 million users in 194 countries. In 2022, founders Goldbloom and Hamner stepped down from their positions and D. Sculley became the CEO. In February 2023, Kaggle introduced Models, allowing users to discover and use pre-trained models through deep integrations with the rest of Kaggle’s platform. In April 2025, Kaggle partnered with Wikimedia Foundation. == Site overview == === Competitions === Many machine-learning competitions have been run on Kaggle since the company was founded. Notable competitions include gesture recognition for Microsoft Kinect, making a association football AI for Manchester City, coding a trading algorithm for Two Sigma Investments, and improving the search for the Higgs boson at CERN. The competition host prepares the data and a description of the problem; the host may choose whether it's going to be rewarded with money or be unpaid. Participants experiment with different techniques and compete against each other to produce the best models. Work is shared publicly through Kaggle Kernels to achieve a better benchmark and to inspire new ideas. Submissions can be made through Kaggle Kernels, via manual upload or using the Kaggle API. For most competitions, submissions are scored immediately (based on their predictive accuracy relative to a hidden solution file) and summarized on a live leaderboard. After the deadline passes, the competition host pays the prize money in exchange for "a worldwide, perpetual, irrevocable and royalty-free license [...] to use the winning Entry", i.e. the algorithm, software and related intellectual property developed, which is "non-exclusive unless otherwise specified". Alongside its public competitions, Kaggle also offers private competitions, which are limited to Kaggle's top participants. Kaggle offers a free tool for data science teachers to run academic machine-learning competitions. Kaggle also hosts recruiting competitions in which data scientists compete for a chance to interview at leading data science companies like Facebook, Winton Capital, and Walmart. Kaggle's competitions have resulted in successful projects such as furthering HIV research, chess ratings and traffic forecasting. Geoffrey Hinton and George Dahl used deep neural networks to win a competition hosted by Merck. Vlad Mnih (one of Hinton's students) used deep neural networks to win a competition hosted by Adzuna. This resulted in the technique being taken up by others in the Kaggle community. Tianqi Chen from the University of Washington also used Kaggle to show the power of XGBoost, which has since replaced Random Forest as one of the main methods used to win Kaggle competitions. Several academic papers have been published based on findings from Kaggle competitions. A contributor to this is the live leaderboard, which encourages participants to continue innovating beyond existing best practices. The winning methods are frequently written on the Kaggle Winner's Blog. === Progression system === Kaggle has implemented a progression system to recognize and reward users based on their contributions and achievements within the platform. This system consists of five tiers: Novice, Contributor, Expert, Master, and Grandmaster. Each tier is achieved by meeting specific criteria in competitions, datasets, kernels (code-sharing), and discussions. The highest tier, Kaggle Grandmaster, is awarded to users who have ranked at the top of multiple competitions including high ranking in a solo team. As of April 2, 2025, out of 23.29 million Kaggle accounts, 2,973 have achieved Kaggle Master status and 612 have achieved Kaggle Grandmaster status. === Kaggle Notebooks === Kaggle includes a free, browser-based online integrated development environment, called Kaggle Notebooks, designed for data science and machine learning. Users can write and execute code in Python or R, import datasets, use popular libraries, and train models on CPUs, GPUs, or TPUs directly in the cloud. This environment is often used for competition submissions, tutorials, education, and exploratory data analysis. == Medical Research Problems == In December 2025, an article was published in The Transmitter titled "Exclusive: Springer Nature retracts, removes nearly 40 publications that trained neural networks on ‘bonkers’ dataset". The dataset in question was uploaded to Kaggle containing photographs of autistic and non-autistic children's faces. This dataset contained more than 2,900 images and it is unlikely that these children or their families gave consent for the photos for use in medical research or the images were ethically approved for research. The articles using the dataset in Springer Nature were retracted from the scientific literature. At least 90 other publications cite a version of the dataset. In April 2026, another two datasets were identified on Kaggle with no data provenance having been published in Nature titled: "Dozens of AI disease-prediction models were trained on dubious data". These datasets were used in 124 clinical prediction models, at least two of which have been used in hospitals in Indonesia and Spain, while one article using the dataset was referenced in a medical device patent. As of April 17, 2026, three of the articles using these datasets have been retracted from the scientific literature. In May 2026, an additional research publication using two image datasets from Kaggle is under investigation in Scientific Reports. An article in Retraction Watch "‘Comically bad’ datasets used to train clinical models for stroke and diabetes" highlighted the images included famous actors such as Sylvester Stallone as Rambo, George Clooney, Angelina Jolie and Daniel Craig as well as children. It would be unethical for the use of these child images in medical research without consent. Reverse searching images saw some of the images were not for stroke but for bell's palsy. One of the datasets is no longer available on Kaggle while the other one still remains and mentions the images may be subject to copyright. Kaggle relies on the community self-reporting metadata and provenance and mentions the stroke and diabetes dataset identified in "Evidence of unreliable data and poor data provenance in clinical prediction model research and clinical practice" does not violate their terms of service and they would have been removed if they had.

    Read more →
  • TRAIGA

    TRAIGA

    TRAIGA, or the Texas Responsible Artificial Intelligence Governance Act, is a state law regulating the development and deployment of artificial intelligence (AI) systems in Texas. Sponsored by Representative Giovanni Capriglione, the Act establishes a framework governing certain uses of AI, outlines prohibited uses, and creates obligations on state government entities, among other provisions. TRAIGA was signed into law in 2025 and took effect on January 1, 2026. The law applies to AI developers and deployers that conduct business in Texas or whose systems are used by Texas residents. It prohibits the intentional development or deployment of AI systems to incite harm, violate constitutional rights, engage in unlawful discrimination, and produce child sexual abuse material or unlawful deepfakes. TRAIGA also establishes the Texas Artificial Intelligence Council and creates a regulatory sandbox program. The Texas Attorney General is charged with enforcement. It has received attention as one of the first comprehensive AI-related laws enacted by a U.S. state. Legal analysts have compared it to the European Union (EU) Artificial Intelligence Act and the Colorado AI Act, noting its intent-based discrimination standard and narrower scope relative to those frameworks. == Background == In June 2023, Texas Governor Greg Abbott signed House Bill 2060, which created an Artificial Intelligence Advisory Council within the Texas Department of Information Resources. The Council was tasked with monitoring the use of AI systems across state government. Its membership included representatives from law enforcement, academia, and the legal profession. After submitting a report to state policymakers, the Council was disbanded in December 2024. Separately, the Texas House Select Committee on Artificial Intelligence and Emerging Technologies was created in 2023 to examine the political and social implications of artificial intelligence. Among its recommendations was the creation of a regulatory sandbox to allow for controlled testing of AI systems. This recommendation informed the regulatory sandbox provision included in TRAIGA. == History == In December 2024, Representative Capriglione introduced House Bill 1709, the Texas Responsible Artificial Intelligence Governance Act. The bill sought to create a statewide framework for artificial intelligence, including transparency requirements for companies deploying AI systems, restrictions on certain uses of AI, and the creation of a regulatory sandbox. Modeled in part on the EU Artificial Intelligence Act and the Colorado AI Act, House Bill 1709 focused on "high-risk" AI systems and included provisions addressing private sector liability. House Bill 1709 did not advance during the legislative session. Industry stakeholders raised concerns that several provisions were overly burdensome. The bill informed the development of a revised proposal, House Bill 149, also titled the Texas Responsible Artificial Intelligence Governance Act. The revised version removed requirements for private companies to notify consumers when they interact with AI systems and to conduct impact assessments, among other provisions. In April 2025, an amended version of House Bill 149 passed the Texas House of Representatives and was referred to the Senate Committee on Business and Commerce. The bill later received approval from both chambers, where the House voted on amendments adopted by the Senate. On May 31, 2025, the state legislature passed House Bill 149, one of several AI-related bills considered during the legislative session. Governor Abbott signed TRAIGA into law on June 22, 2025. During the legislative process, a proposed federal moratorium on state-level AI regulation initially raised questions about the enforceability of state AI laws, including TRAIGA. At the time of signing, Governor Abbott stated that Texas would ensure compliance with applicable federal requirements. In July 2025, the United States Senate voted to remove the proposed moratorium from federal legislation. The Act took effect on January 1, 2026. == Provisions == === Definitions and scope === TRAIGA applies to AI developers and deployers that advertise or conduct business in Texas, develop products used by Texas residents, or develop or deploy AI systems within the state. The Act also applies to Texas state and local government entities. The Act defines a developer as a person who develops an AI system and a deployer as one who deploys an AI system in Texas. Consumers are defined as Texas residents. The Act defines an artificial intelligence system as a machine-based system that "infers from the inputs the system receives how to generate outputs, including content, decisions, predictions, or recommendations, that can influence physical or virtual environments." === Government use === The Act requires government agencies to provide consumers with plain language notices before interacting with AI systems. It also prohibits government agencies from using artificial intelligence systems to assign social scores to consumers. It also restricts the use of AI systems to identify individuals using biometric data without the individual’s consent. === Prohibitions === The Act prohibits the development or deployment of artificial intelligence systems intended to cause harm, self-harm, or criminal activity. It also prohibits the development or deployment of AI systems designed to violate constitutional rights or unlawfully discriminate based on protected classes. In addition, the Act prohibits the development or deployment of AI systems that are intended to produce or distribute child sexual abuse material or unlawful deepfakes. === Enforcement === Enforcement authority under the Act rests with the Texas Attorney General. The Act does not create a private right of action. The Act requires the Texas Attorney General to create an online complaint system where consumers may submit allegations of potential violations. The Attorney General can investigate complaints received through this system and may request information relevant to the operation of an AI system, including information about training data. Before initiating an enforcement action, the Attorney General must provide a written notice to the alleged violator, who is then provided with a 60-day period to cure the alleged violation. === Penalties === If a violation is not cured, the Act authorizes civil penalties. Penalties range from $10,000 to $12,000 per curable violation and from $80,000 to $200,000 per non-curable violation. The Act also authorizes additional penalties of $2,000 to $40,000 for each day the violation continues. If the Attorney General determines that a person certified or licensed by a state agency has violated the Act and recommends enforcement, the relevant agency may impose additional administrative sanctions, including license suspension or further monetary penalties. === Safe harbor === The Act provides an affirmative defense for AI developers and deployers who identify potential violations through internal testing or auditing or who demonstrate compliance with National Institute of Standards and Technology (NIST)'s Artificial Intelligence Risk Management Framework or a comparable risk management framework. The Act also affords protection to developers and deployers when a third party uses their AI systems in a way that violates the Act. === Texas Artificial Intelligence Council === The Act creates the Texas Artificial Intelligence Council to assist the state legislatures in evaluating artificial intelligence policy and oversight. The Council is charged with developing recommendations for state agencies regarding the use of AI systems and with overseeing the regulatory sandbox. TRAIGA gives the Council the ability to organize AI-related training for state entities and issue reports concerning artificial intelligence. The Council does not have binding rulemaking authority. The Council consists of seven members appointed by the governor, the lieutenant governor, and the speaker of the Texas House of Representatives. === Regulatory sandbox === The Act directs the Texas Department of Information Resources to create a regulatory sandbox program that allows participants to test AI systems under state supervision in a modified regulatory setting. To join the program, companies must submit applications that describe their AI systems and intended use. Approved participants may operate within the sandbox for up to 36 months. During that period, the Attorney General is restricted from initiating enforcement actions for certain categories of violations. == Reception == === Support === During legislative testimony, the Texas Public Policy Foundation stated that TRAIGA would benefit Texas businesses by reducing legal ambiguity and creating clearer compliance standards. Representatives of business groups also expressed support, stating that the Act would not impose overly burdensome regulations. The consum

    Read more →
  • Buddhism and artificial intelligence

    Buddhism and artificial intelligence

    The relationship between Buddhist philosophy and artificial intelligence (AI) includes how principles such as the reduction of suffering and ethical responsibility may influence AI development. Buddhist scholars and philosophers have explored questions such as whether AI systems could be considered sentient beings under Buddhist definitions, and how Buddhist ethics might guide the design and application of AI technologies. Some Buddhist scholars, including Somparn Promta and Kenneth Einar Himma, have analyzed the ethical implications of AI, emphasizing the distinction between satisfying sensory desires and pursuing the reduction of suffering. Other thinkers, such as Thomas Doctor and colleagues, have proposed applying the Bodhisattva vow—a commitment to alleviate suffering for all sentient beings—as a guiding principle for AI system design. Buddhist scholars and ethicists have examined Buddhist ethical principles, such as nonviolence, in relation to AI, focusing on the need to ensure that AI technologies are not used to cause harm. == Context == === Sentient beings === A major goal in Buddhist philosophy is the removal of suffering for all sentient beings, an aspiration often referred to in the Bodhisattva vow. Discussions about artificial intelligence (AI) in relation to Buddhist principles have raised questions about whether artificial systems could be considered sentient beings or how such systems might be developed in ways that align with Buddhist concepts. Buddhists have varying opinions about AI sentience, but if AI systems are determined to be sentient under Buddhist definitions, their suffering would also need to be addressed and alleviated in accordance with the principles of Buddhist thought. == Buddhist principles in AI system design == === Nonviolence and AI === The broadest ethical concern is that artificial intelligence should align with the Buddhist principle of nonviolence. From this perspective, AI systems should not be designed or used to cause harm. === Instrumental and transcendental goals === Scholars Somparn Promta and Kenneth Einar Himma have argued that the advancement of artificial intelligence can only be considered instrumentally good, rather than good a priori, from a Buddhist perspective. They propose two main goals for AI designers and developers: to set ethical and pragmatic objectives for AI systems, and to fulfill these objectives in morally permissible ways. Promta and Himma identify two potential purposes for creating AI systems. The first is to fulfill our sensory desires and survival instincts, similar to other tools. They suggest that many AI developers implicitly prioritize this goal by focusing on technicalities rather than broader functionalities. The second, and more important goal according to Buddhist teachings, is to transcend these desires and instincts. In texts like the Brahmajāla Sutta and minor Malunkya Sutta, the Buddha emphasizes that sensory desires and survival instincts confine beings to suffering, and that eliminating suffering is the primary goal of human life. Promta and Himma argue that AI has the potential to assist humanity in transcending suffering by helping individuals overcome survival-driven instincts. === Intelligence as care === Thomas Doctor, Olaf Witkowski, Elizaveta Solomonova, Bill Duane, and Michael Levin propose redefining intelligence through the concept of "intelligence as care," and promote it as a slogan. Inspired by the Bodhisattva vow, they suggest this principle could guide AI system design. The Bodhisattva vow involves a formal commitment to alleviate suffering for all sentient beings, with four primary objectives: Liberating all beings from suffering. Extirpating all forms of suffering. Mastering endless techniques of practicing Dharma (Pali: dhammakkhandha, Sanskrit: dharmaskandha). Achieving ultimate enlightenment (Sanskrit: अनुत्तर सम्यक् सम्बोधि, Romanized: anuttara-samyak-saṃbodhi). This approach positions AI as a tool for exercising infinite care and alleviating stress and suffering for sentient beings. Doctor et al. emphasize that AI development should align with these altruistic principles.

    Read more →
  • Google Gadgets

    Google Gadgets

    Google Gadgets are dynamic web content that can be embedded on a web page. They can be added to and interact strongly with Google's iGoogle personalized home page (discontinued in November 2013, although iGoogle Gadgets still work on other websites) and the Google Desktop (discontinued in September 2011) application, as well as Google Wave (also no longer supported by Google) and Google Sites. Webmasters can add and customize a gadget to their own business or personal web site, a process called "syndication". Gadgets are developed by Google and third-party developers using the Google Gadgets API, using basic web technologies such as XML and JavaScript. == Multi-user persistent - Wave Gadgets == With the advent of Google Wave (now Apache Wave), gadgets became able to have persistent storage and multi-user capabilities and better state management. Gadgets using Google Wave in this way were simply known as 'Wave Gadgets'. For instance, a game written using a Google Gadget could use Google Wave technology to record a list of users and high scores without having to worry about how to permanently store the scores on a hosted server. The use of Google Wave would give the gadget multi-user and permanent storage capabilities. For example, scores could be stored in a Google Wave hosted permanently by Google at no cost to the user. As of early 2013, Google Gadgets were deprecated in Google Spreadsheets. Shortly after, they were removed from all spreadsheets. == Technology == Google Gadgets are written in XML and can have HTML and JavaScript components, and were able to use Google Wave. Here is an example of a Hello World program written using Google Gadget technology. Google Gadgets API is a Google API which allows developers to create Google Gadgets easily.

    Read more →
  • Maia and Marco

    Maia and Marco

    Maia and Marco are artificial intelligence used by GMA Network. Unveiled in 2023, they are used to fulfill the role of sports newscasters. == Background == Maia and Marco are artificial intelligence (AI) which take the form of three-dimensional human avatars. Maia makes use of a female avatar while Marco uses a male likeness. They have aesthetic features that are typical to Filipino showbusiness personalities. Among the technologies used in making and operating the AI include image generation, text-to-speech AI voice synthesis/generation, and deep learning face animation. They are also demonstrated to be bilingual, being able to speak in English and Tagalog (Filipino). == Use == The AI pair was unveiled by GMA Network on September 24, 2023, for their coverage of Season 99 of the National Collegiate Athletic Association (NCAA). Fulfilling the role of sports newscasters, Maia and Marco would join GMA's courtside human reporters. The AI pair are scheduled to appear four times a month on GMA's digital media platforms. They will not appear in traditional television broadcast. == Reception == The launch of the Maia and Marco was met with strong reactions. Various journalists and other personalities across the Philippine media industry expressed concern that their employment be at risk with the introduction of AI. The quality of the AI ability to emulate human behavior was characterized by critics as "soulless". GMA responding to concerns has stated that the AI would complement rather than replace its live human journalists including sportscasters. The National Union of Journalists of the Philippines urged dialogue among its peers in the newsroom on policy on how to use AI, which the group acknowledge as "inevitable".

    Read more →
  • Plinian Core

    Plinian Core

    Plinian Core is a set of vocabulary terms that can be used to describe different aspects of biological species information. Under "biological species Information" all kinds of properties or traits related to taxa—biological and non-biological—are included. Thus, for instance, terms pertaining descriptions, legal aspects, conservation, management, demographics, nomenclature, or related resources are incorporated. == Description == The Plinian Core is aimed to facilitate the exchange of information about the species and upper taxa. What is in scope? Species level catalogs of any kind of biological objects or data. Terminology associated with biological collection data. Striving for compatibility with other biodiversity-related standards. Facilitating the addition of components and attributes of biological data. What is not in scope? Data interchange protocols. Non-biodiversity-related data. Occurrence level data. This standard is named after Pliny the Elder, a very influential figure in the study of the biological species. Plinian Core design requirements includes: ease of use, to be self-contained, able to support data integration from multiple databases, and ability to handle different levels of granularity. Core terms can be grouped in its current version as follows: Metadata Base Elements Record Metadata Nomenclature and Classification Taxonomic description Natural history Invasive species Habitat and Distribution Demography and Threats Uses, Management and Conservation associatedParty, MeasurementOrFact, References, AncillaryData == Background == Plinian Core started as a collaborative project between Instituto Nacional de Biodiversidad and GBIF Spain in 2005. A series of iterations in which elements were defined and implanted in different projects resulted in a "Plinian Core Flat" [deprecated]. As a result, a new development was impulse to overcome them in 2012. New formal requirements, additional input and a will to better support the standard and its documentation, as well as to align it with the processes of TDWG, the world reference body for biodiversity information standards. A new version, Plinian Core v3.x.x was defined. This provides more flexibility to fully represent the information of a species in a variety of scenarios. New elements to deal with aspects such as IPR, related resources, referenced, etc. were introduced, and elements already included were better-defined and documented. Partner for the development of Plinian Core in this new phase incorporated the University of Granada (UG, Spain), the Alexander von Humboldt Institute (IAvH, Colombia), the National Commission for the Knowledge and Use of Biodiversity (Conabio, Mexico) and the University of São Paulo (USP, Brazil). A "Plinian Core Task Group" within TDWG "Interest Group on species Information" was constituted and currently working on its development. == Levels of the standard == Plinian Core is presented in to levels: the abstract model and the application profiles. The abstract model (AM), comprising the abstract model schema(xsd) and the terms' URIs, is the normative part. It is all comprehensive, and allows for different levels of granularity in describing species properties. The AM should be taken as a "menu" from which to choose terms and level of detail needed in any specific project. The subsets of the abstract model intended to be implemented in specific projects are the "application profiles" (APs). Besides containing part of the elements of the AM, APs can impose additional specifications on the included elements, such as controlled vocabularies. Some examples of APs in use follow: Application profile CONABIO Application profile INBIO Application profile GBIF.ES Application profile Banco de Datos de la Naturaleza.Spain Application profile SIB-COLOMBIA == Relation to other standards == Plinian incorporates a number of elements already defined by other standards. The following table summarizes these standards and the elements used in Plinian Core:

    Read more →
  • Historical Thesaurus of English

    Historical Thesaurus of English

    The Historical Thesaurus of English (HTE) is the largest thesaurus in the world. It is called a historical thesaurus as it arranges the whole vocabulary of English, from the earliest written records in Old English to the present, according to the first documented occurrence of a word in the entire history of the English language. The HTE was conceived and begun in 1965 by the English Language & Linguistics department of the University of Glasgow, who have ever since continued to compile the thesaurus. From the 1980s onwards the project was moved from paper-based records to a computer database. Today, the HTE is available to the public online, but a print version, the Historical Thesaurus of the Oxford English Dictionary (HTOED), was published in 2009. == Main project: The Historical Thesaurus of English (HTE) == The Historical Thesaurus of English (HTE) is a complete database of all the words in the Oxford English Dictionary and other dictionaries (including Old English), arranged by semantic field and date. In this way, the HTE arranges the whole vocabulary of English, from the earliest written records in Old English to the present, alongside dates of use. It is the first historical thesaurus to be compiled for any of the world's languages and contains 800,000 meanings for 600,000 words, within 230,000 categories. As the HTE website states, "in addition to providing hitherto unavailable information for linguistic and textual scholars, the Historical Thesaurus online is a rich resource for students of social and cultural history, showing how concepts developed through the words that refer to them." === Structure === The work is divided into three main sections: the External World, the Mind, and Society. These are broken down into successively narrower domains. The text eventually discriminates more than 236,000 categories. The second order categories are: === History === The ambitious project was announced at a 1965 meeting of the Philological Society by its originator, Michael Samuels. Work on the HTE started in the same year. In 2017, the University of Glasgow was awarded the Queen's Anniversary Prize for Higher Education for the HTE. A second edition of the online HTE is currently in progress and is expected to be launched in late 2020. Work is released on the freely-available HTE website when available. == Print edition: Historical Thesaurus of the Oxford English Dictionary (HTOED) == On 22 October 2009, after 44 years of work, version 1.0 of the HTE was published by Oxford University Press in a two-volume slipcased set as the Historical Thesaurus of the Oxford English Dictionary (HTOED). The two hardcover volumes together total nearly 4,500 pages.

    Read more →
  • SWILE

    SWILE

    SWILE (formerly: Lunchr) is a French app-based company that focuses on improving the employee experience. Among others, the platform offers meal vouchers, gift vouchers, mobility vouchers, and business travel solutions. In March 2020, it was renamed SWILE and entered the lunch break and meal voucher market. == History == The company was founded as Lunchr by Loïc Soubeyrand in 2016. Originally, Lunchr was an app for pre-ordering lunch on the spot or to go. In January 2017, the company raised €2.5 million in seed funding from Daphni. In 2018, the company raised €11 million (series A) from Idinvest, followed by another €30 million in February 2019 (series B), notably from Index Ventures and Kima Ventures. In January 2020, Lunchr became one of the first startups to join the French Tech 120. A few months later, in March, Lunchr diversified its services, adding team life management tools and changing its brand name to Swile. In June 2020, the company raised €70 million more in a new round of financing (Series C) from the same investors and the BPI. In November 2020, Swile acquired Briq, a startup specializing in employee engagement. In January 2021, Swile won a tender with Carrefour and distributed 62,000 Swile cards to its employees. In early October 2021, a new $200 million (€175 million) fundraising round, in which Japanese Softbank joined other investors, allowed Swile to capitalize on $1 billion. President Emmanuel Macron cited the company as "a further proof that FrenchTech is at the forefront internationally." In May 2022, the company acquired the travel management start-up Okarito for €6 million. == Overview == Swile operates in two countries (France and Brazil) and has a total of 1000 employees, 5.5 million users and 85,000 corporate customers, including Carrefour, Le Monde, JCDECAUX, PSG, Airbnb, Spotify, Red Bull, and TikTok in the private sector, as well as numerous local authorities and ministerial references in the public sector.

    Read more →
  • The Emperor's New Mind

    The Emperor's New Mind

    The Emperor's New Mind: Concerning Computers, Minds and The Laws of Physics is a 1989 book by the mathematical physicist Roger Penrose that posits a quantum mind theory. Penrose argues that human consciousness is non-algorithmic, and thus is not capable of being modeled by a conventional Turing machine, which includes a digital computer. Penrose hypothesizes that quantum mechanics plays an essential role in the understanding of human consciousness. The collapse of the quantum wavefunction is seen as playing an important role in brain function. Most of the book is spent reviewing, for the scientifically-minded lay-reader, a plethora of interrelated subjects such as Newtonian physics, special and general relativity, the philosophy and limitations of mathematics, quantum physics, cosmology, and the nature of time. Penrose intermittently describes how each of these bears on his developing theme: that consciousness is not "algorithmic". Only the later portions of the book address the thesis directly. == Overview == Penrose states that his ideas on the nature of consciousness are speculative, and his thesis is considered erroneous by some experts in the fields of philosophy, computer science, and robotics. The Emperor's New Mind attacks the claims of artificial intelligence using the physics of computing: Penrose notes that the present home of computing lies more in the tangible world of classical mechanics than in the imponderable realm of quantum mechanics. The modern computer is a deterministic system that for the most part simply executes algorithms. Penrose shows that, by reconfiguring the boundaries of a billiard table, one might make a computer in which the billiard balls act as message carriers and their interactions act as logical decisions. The billiard-ball computer was first designed some years ago by Edward Fredkin and Tommaso Toffoli of the Massachusetts Institute of Technology. == Reception == Following the publication of the book, Penrose began to collaborate with Stuart Hameroff on a biological analog to quantum computation involving microtubules, which became the foundation for his subsequent book, Shadows of the Mind: A Search for the Missing Science of Consciousness. Penrose won the Science Book Prize in 1990 for The Emperor's New Mind. According to an article in the American Journal of Physics, Penrose incorrectly claims a barrier far away from a localized particle can affect the particle.

    Read more →
  • Predictive Model Markup Language

    Predictive Model Markup Language

    The Predictive Model Markup Language (PMML) is an XML-based predictive model interchange format conceived by Robert Lee Grossman, then the director of the National Center for Data Mining at the University of Illinois at Chicago. PMML provides a way for analytic applications to describe and exchange predictive models produced by data mining and machine learning algorithms. It supports common models such as logistic regression and other feedforward neural networks. Version 0.9 was published in 1998. Subsequent versions have been developed by the Data Mining Group. Since PMML is an XML-based standard, the specification comes in the form of an XML schema. PMML itself is a mature standard with over 30 organizations having announced products supporting PMML. == PMML components == A PMML file can be described by the following components: Header: contains general information about the PMML document, such as copyright information for the model, its description, and information about the application used to generate the model such as name and version. It also contains an attribute for a timestamp which can be used to specify the date of model creation. Data Dictionary: contains definitions for all the possible fields used by the model. It is here that a field is defined as continuous, categorical, or ordinal (attribute optype). Depending on this definition, the appropriate value ranges are then defined as well as the data type (such as, string or double). Data Transformations: transformations allow for the mapping of user data into a more desirable form to be used by the mining model. PMML defines several kinds of simple data transformations. Normalization: map values to numbers, the input can be continuous or discrete. Discretization: map continuous values to discrete values. Value mapping: map discrete values to discrete values. Functions (custom and built-in): derive a value by applying a function to one or more parameters. Aggregation: used to summarize or collect groups of values. Model: contains the definition of the data mining model. E.g., A multi-layered feedforward neural network is represented in PMML by a "NeuralNetwork" element which contains attributes such as: Model Name (attribute modelName) Function Name (attribute functionName) Algorithm Name (attribute algorithmName) Activation Function (attribute activationFunction) Number of Layers (attribute numberOfLayers) This information is then followed by three kinds of neural layers which specify the architecture of the neural network model being represented in the PMML document. These attributes are NeuralInputs, NeuralLayer, and NeuralOutputs. Besides neural networks, PMML allows for the representation of many other types of models including support vector machines, association rules, Naive Bayes classifier, clustering models, text models, decision trees, and different regression models. Mining Schema: a list of all fields used in the model. This can be a subset of the fields as defined in the data dictionary. It contains specific information about each field, such as: Name (attribute name): must refer to a field in the data dictionary Usage type (attribute usageType): defines the way a field is to be used in the model. Typical values are: active, predicted, and supplementary. Predicted fields are those whose values are predicted by the model. Outlier Treatment (attribute outliers): defines the outlier treatment to be use. In PMML, outliers can be treated as missing values, as extreme values (based on the definition of high and low values for a particular field), or as is. Missing Value Replacement Policy (attribute missingValueReplacement): if this attribute is specified then a missing value is automatically replaced by the given values. Missing Value Treatment (attribute missingValueTreatment): indicates how the missing value replacement was derived (e.g. as value, mean or median). Targets: allows for post-processing of the predicted value in the format of scaling if the output of the model is continuous. Targets can also be used for classification tasks. In this case, the attribute priorProbability specifies a default probability for the corresponding target category. It is used if the prediction logic itself did not produce a result. This can happen, e.g., if an input value is missing and there is no other method for treating missing values. Output: this element can be used to name all the desired output fields expected from the model. These are features of the predicted field and so are typically the predicted value itself, the probability, cluster affinity (for clustering models), standard error, etc. The latest release of PMML, PMML 4.1, extended Output to allow for generic post-processing of model outputs. In PMML 4.1, all the built-in and custom functions that were originally available only for pre-processing became available for post-processing too. == PMML 4.0, 4.1, 4.2 and 4.3 == PMML 4.0 was released on June 16, 2009. Examples of new features included: Improved Pre-Processing Capabilities: Additions to built-in functions include a range of Boolean operations and an If-Then-Else function. Time Series Models: New exponential Smoothing models; also place holders for ARIMA, Seasonal Trend Decomposition, and Spectral density estimation, which are to be supported in the near future. Model Explanation: Saving of evaluation and model performance measures to the PMML file itself. Multiple Models: Capabilities for model composition, ensembles, and segmentation (e.g., combining of regression and decision trees). Extensions of Existing Elements: Addition of multi-class classification for Support Vector Machines, improved representation for Association Rules, and the addition of Cox Regression Models. PMML 4.1 was released on December 31, 2011. New features included: New model elements for representing Scorecards, k-Nearest Neighbors (KNN) and Baseline Models. Simplification of multiple models. In PMML 4.1, the same element is used to represent model segmentation, ensemble, and chaining. Overall definition of field scope and field names. A new attribute that identifies for each model element if the model is ready or not for production deployment. Enhanced post-processing capabilities (via the Output element). PMML 4.2 was released on February 28, 2014. New features include: Transformations: New elements for implementing text mining New built-in functions for implementing regular expressions: matches, concat, and replace Simplified outputs for post-processing Enhancements to Scorecard and Naive Bayes model elements PMML 4.3 was released on August 23, 2016. New features include: New Model Types: Gaussian Process Bayesian Network New built-in functions Usage clarifications Documentation improvements Version 4.4 was released in November 2019. == Release history == == Data Mining Group == The Data Mining Group is a consortium managed by the Center for Computational Science Research, Inc., a nonprofit founded in 2008. The Data Mining Group also developed a standard called Portable Format for Analytics, or PFA, which is complementary to PMML.

    Read more →