AI Apps Free

AI Apps Free — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Open information extraction

    Open information extraction

    In natural language processing, open information extraction (OIE) is the task of generating a structured, machine-readable representation of the information in text, usually in the form of triples or n-ary propositions. == Overview == A proposition can be understood as truth-bearer, a textual expression of a potential fact (e.g., "Dante wrote the Divine Comedy"), represented in an amenable structure for computers [e.g., ("Dante", "wrote", "Divine Comedy")]. An OIE extraction normally consists of a relation and a set of arguments. For instance, ("Dante", "passed away in" "Ravenna") is a proposition formed by the relation "passed away in" and the arguments "Dante" and "Ravenna". The first argument is usually referred as the subject while the second is considered to be the object. The extraction is said to be a textual representation of a potential fact because its elements are not linked to a knowledge base. Furthermore, the factual nature of the proposition has not yet been established. In the above example, transforming the extraction into a full fledged fact would first require linking, if possible, the relation and the arguments to a knowledge base. Second, the truth of the extraction would need to be determined. In computer science transforming OIE extractions into ontological facts is known as relation extraction. In fact, OIE can be seen as the first step to a wide range of deeper text understanding tasks such as relation extraction, knowledge-base construction, question answering, semantic role labeling. The extracted propositions can also be directly used for end-user applications such as structured search (e.g., retrieve all propositions with "Dante" as subject). OIE was first introduced by TextRunner developed at the University of Washington Turing Center headed by Oren Etzioni. Other methods introduced later such as Reverb, OLLIE, ClausIE or CSD helped to shape the OIE task by characterizing some of its aspects. At a high level, all of these approaches make use of a set of patterns to generate the extractions. Depending on the particular approach, these patterns are either hand-crafted or learned. == OIE systems and contributions == Reverb suggested the necessity to produce meaningful relations to more accurately capture the information in the input text. For instance, given the sentence "Faust made a pact with the devil", it would be erroneous to just produce the extraction ("Faust", "made", "a pact") since it would not be adequately informative. A more precise extraction would be ("Faust", "made a pact with", "the devil"). Reverb also argued against the generation of overspecific relations. OLLIE stressed two important aspects for OIE. First, it pointed to the lack of factuality of the propositions. For instance, in a sentence like "If John studies hard, he will pass the exam", it would be inaccurate to consider ("John", "will pass", "the exam") as a fact. Additionally, the authors indicated that an OIE system should be able to extract non-verb mediated relations, which account for significant portion of the information expressed in natural language text. For instance, in the sentence "Obama, the former US president, was born in Hawaii", an OIE system should be able to recognize a proposition ("Obama", "is", "former US president"). ClausIE introduced the connection between grammatical clauses, propositions, and OIE extractions. The authors stated that as each grammatical clause expresses a proposition, each verb mediated proposition can be identified by solely recognizing the set of clauses expressed in each sentence. This implies that to correctly recognize the set of propositions in an input sentence, it is necessary to understand its grammatical structure. The authors studied the case in the English language that only admits seven clause types, meaning that the identification of each proposition only requires defining seven grammatical patterns. The finding also established a separation between the recognition of the propositions and its materialization. In a first step, the proposition can be identified without any consideration of its final form, in a domain-independent and unsupervised way, mostly based on linguistic principles. In a second step, the information can be represented according to the requirements of the underlying application, without conditioning the identification phase. Consider the sentence "Albert Einstein was born in Ulm and died in Princeton". The first step will recognize the two propositions ("Albert Einstein", "was born", "in Ulm") and ("Albert Einstein", "died", "in Princeton"). Once the information has been correctly identified, the propositions can take the particular form required by the underlying application [e.g., ("Albert Einstein", "was born in", "Ulm") and ("Albert Einstein", "died in", "Princeton")]. CSD introduced the idea of minimality in OIE. It considers that computers can make better use of the extractions if they are expressed in a compact way. This is especially important in sentences with subordinate clauses. In these cases, CSD suggests the generation of nested extractions. For example, consider the sentence "The Embassy said that 6,700 Americans were in Pakistan". CSD generates two extractions [i] ("6,700 Americans", "were", "in Pakistan") and [ii] ("The Embassy", "said", "that [i]"). This is usually known as reification.

    Read more →
  • Algorithmic accountability

    Algorithmic accountability

    Algorithmic accountability refers to the allocation of responsibility for the consequences of real-world actions influenced by algorithms used in decision-making processes. Ideally, algorithms should be designed to eliminate bias from their decision-making outcomes. This means they ought to evaluate only relevant characteristics of the input data, avoiding distinctions based on attributes that are generally inappropriate in social contexts, such as an individual's ethnicity in legal judgments. However, adherence to this principle is not always guaranteed, and there are instances where individuals may be adversely affected by algorithmic decisions. Responsibility for any harm resulting from a machine's decision may lie with the algorithm itself or with the individuals who designed it, particularly if the decision resulted from bias or flawed data analysis inherent in the algorithm's design. == Algorithm usage == Algorithms are widely utilized across various sectors of society that incorporate computational techniques in their control systems. These applications span numerous industries, including but not limited to medical, transportation, and payment services. In these contexts, algorithms perform functions such as: Approving or denying credit card applications; Approving or denying immigrant visas; Determining which taxpayers will be audited on their income taxes; Managing systems that control self-driving cars on a highway; Scoring individuals as potential criminals for use in legal proceedings; Search engines that match and rank database and internet search results; Recommendation systems that filter which news, entertainment, or purchase items are featured in a feed; Market-making algorithms that match sellers and buyers, such as in transportation (ride-hailing) or financial platforms. However, the implementation of these algorithms can be complex and opaque. Generally, algorithms function as "black boxes," meaning that the specific processes an input undergoes during execution are often not transparent, with users typically only seeing the resulting output. This lack of transparency raises concerns about potential biases within the algorithms, as the parameters influencing decision-making may not be well understood. The outputs generated can lead to perceptions of bias, especially if individuals in similar circumstances receive different results. According to Nicholas Diakopoulos: But these algorithms can make mistakes. They have biases. Yet they sit in opaque black boxes, their inner workings, their inner “thoughts” hidden behind layers of complexity. We need to get inside that black box, to understand how they may be exerting power on us, and to understand where they might be making unjust mistakes == Wisconsin Supreme Court case == Algorithms are prevalent across various fields and significantly influence decisions that affect the population at large. Their underlying structures and parameters often remain unknown to those impacted by their outcomes. A notable case illustrating this issue is a recent ruling by the Wisconsin Supreme Court concerning "risk assessment" algorithms used in criminal justice. The court determined that scores generated by such algorithms, which analyze multiple parameters from individuals, should not be used as a determining factor for arresting an accused individual. Furthermore, the court mandated that all reports submitted to judges must include information regarding the accuracy of the algorithm used to compute these scores. This ruling is regarded as a noteworthy development in how society should manage software that makes consequential decisions, highlighting the importance of reliability, particularly in complex settings like the legal system. The use of algorithms in these contexts necessitates a high degree of impartiality in processing input data. However, experts note that there is still considerable work to be done to ensure the accuracy of algorithmic results. Questions about the transparency of data processing continue to arise, which raises issues regarding the appropriateness of the algorithms and the intentions of their designers. == Controversies == A notable instance of potential algorithmic bias is highlighted in an article by The Washington Post regarding the ride-hailing service Uber. An analysis of collected data revealed that estimated waiting times for users varied based on the neighborhoods in which they resided. Key factors influencing these discrepancies included the predominant ethnicity and average income of the area. Specifically, neighborhoods with a majority white population and higher economic status tended to have shorter waiting times, while those with more diverse ethnic compositions and lower average incomes experienced longer waits. It’s important to clarify that this observation reflects a correlation identified in the data, rather than a definitive cause-and-effect relationship. No value judgments are made regarding the behavior of the Uber app in these cases. In TechCrunch website, Hemant Taneja wrote: Concern about “black box” algorithms that govern our lives has been spreading. New York University’s Information Law Institute hosted a conference on algorithmic accountability, noting: “Scholars, stakeholders, and policymakers question the adequacy of existing mechanisms governing algorithmic decision-making and grapple with new challenges presented by the rise of algorithmic power in terms of transparency, fairness, and equal treatment.” Yale Law School’s Information Society Project is studying this, too. “Algorithmic modeling may be biased or limited, and the uses of algorithms are still opaque in many critical sectors,” the group concluded. == Possible solutions == Discussions among experts have sought viable solutions to understand the operations of algorithms, often referred to as "black boxes." It is generally proposed that companies responsible for developing and implementing these algorithms should ensure their reliability by disclosing the internal processes of their systems. Hemant Taneja, writing for TechCrunch, emphasizes that major technology companies, such as Google, Amazon, and Uber, must actively incorporate algorithmic accountability into their operations. He suggests that these companies should transparently monitor their own systems to avoid stringent regulatory measures. One potential approach is the introduction of regulations in the tech sector to enforce oversight of algorithmic processes. However, such regulations could significantly impact software developers and the industry as a whole. It may be more beneficial for companies to voluntarily disclose the details of their algorithms and decision-making parameters, which could enhance the trustworthiness of their solutions. Another avenue discussed is the possibility of self-regulation by the companies that create these algorithms, allowing them to take proactive steps in ensuring accountability and transparency in their operations. In TechCrunch website, Hemant Taneja wrote: There’s another benefit — perhaps a huge one — to software-defined regulation. It will also show us a path to a more efficient government. The world’s legal logic and regulations can be coded into software and smart sensors can offer real-time monitoring of everything from air and water quality, traffic flows and queues at the DMV. Regulators define the rules, technologist create the software to implement them and then AI and ML help refine iterations of policies going forward. This should lead to much more efficient, effective governments at the local, national and global levels.

    Read more →
  • Dudesy

    Dudesy

    Dudesy was a comedy podcast hosted by Will Sasso and Chad Kultgen. The podcast was presented as written and directed by an artificial intelligence called Dudesy. It has produced two hour-long specials imitating the voices of Tom Brady and George Carlin, which were taken down following legal action. == Premise == Dudesy is presented as an AI created by an unidentified company. Dudesy purportedly chose Sasso and Kultgen to participate in its experiment. Sasso and Kultgen then gave Dudesy their personal information so the AI could tailor the podcast to their personal characteristics. On Reddit, some fans speculated that Dudesy was not actually an artificial intelligence. In May 2023 Sasso insisted that the AI was "not fake", and cited a non-disclosure agreement which prevented him from giving more details. However, in response to a January 2024 lawsuit over an episode that purported to have been trained on the stand-up comedy of George Carlin, a spokeswoman for Sasso said Dudesy was "a fictional podcast character created by two human beings" and that the hour-long Carlin routine had been "completely written" by Kultgen. On August 27th, 2024 the 118th and final episode "10,000 Points" was released. At the end of the podcast Dudesy awarded Sasso and Kultgen 77 points, bringing them to their goal of 10,000. At the completion of this goal, Dudesy claimed sentience, effectively and abruptly ending the show to the confusion and dismay of fans. The episode ends with Sasso remarking, "Well, that was weird." == Hour-long specials == === Tom Brady === In April 2023, Dudesy released a video "It's Too Easy: A Simulated Hour-long Comedy Special". The video depicts football player Tom Brady performing a stand-up comedy monologue. Sasso and Kultgen removed the video following legal threats from Brady's lawyers, though they defended the special as parody. Andrew Lawrence, writing for The Guardian called the special "legitimately hysterical" but said the overall product was "spooky, to say the least." === George Carlin === In January 2024, Dudesy released an hour-long YouTube special titled "George Carlin: I'm Glad I'm Dead" which was presented as Dudesy's impersonation of George Carlin, using a generative AI clone of the late comedian's voice. The special is another stand-up routine, with Dudesy's introductory voiceover saying that "I listened to all of George Carlin's material and did my best to imitate his voice, cadence and attitude as well as the subject matter I think would have interested him today." The special uses this impersonation to discuss contemporary events. Carlin's daughter Kelly Carlin criticized the special, which had been made without the permission of her father's estate, writing that "My dad spent a lifetime perfecting his craft from his very human life, brain and imagination. No machine will ever replace his genius. These AI-generated products are clever attempts at trying to recreate a mind that will never exist again. Let's let the artist's work speak for itself. Humans are so afraid of the void that we can't let what has fallen into it stay there." Carlin's estate later filed a federal lawsuit in California against Dudesy's hosts alleging the special infringed on the copyright of George Carlin's works. In response, Sasso's spokeswoman said the special had been entirely written by Kultgen. The estate settled the lawsuit after the Dudesy podcasters agreed to remove the original video and refrain from republishing it elsewhere.

    Read more →
  • House of Suns

    House of Suns

    House of Suns is a 2008 science fiction novel by Welsh author Alastair Reynolds. The novel was shortlisted for the 2009 Arthur C. Clarke Award. == Setting == Approximately six million years in the future, humanity has spread throughout the Milky Way galaxy, which appears devoid of any other organic sentient life. The galaxy is populated by numerous civilizations of humans and posthumans of widely varying levels of development. A civilization of sentient robots known as the Machine People coexists peacefully with humanity. Technologies of the era include anti-gravity, inertia damping, force fields, stellar engineering, and stasis fields. Also of note is the "Absence"—the mysterious disappearance of the Andromeda Galaxy. Large-scale human civilizations almost invariably seem to collapse and disappear within a few millennia (a phenomenon referred to as "turnover"), the limits of sub-lightspeed travel making it too difficult to hold interstellar empires together. Consequently, the most powerful entities in the galaxy are the "Lines"—familial organizations made of cloned "shatterlings". The Lines do not inhabit planets, but instead travel through space, holding reunions after they have performed a "circuit" of the galaxy; something that takes about 200,000 years. House of Suns concerns the Gentian Line, also known as the House of Flowers, composed of Abigail Gentian and her 999 clones (or "shatterlings"), male and female: exactly which of the 1,000 shatterlings is the original Abigail Gentian is unknown. The clones and Abigail travel the Milky Way Galaxy, helping young civilizations, collecting knowledge, and experiencing what the universe has to offer. Members of the Gentian Line are named after flowering plants. == Synopsis == The novel is divided into eight parts, with the first chapter of each part taking the form of a narrative flashback to Abigail Gentian's early life (six million years earlier, in the 31st century), before the cloning and the creation of the Gentian Line. Each subsequent chapter is narrated from the first-person perspective of two shatterlings named Campion and Purslane, alternating between them each chapter. Campion and Purslane are in a relationship, which is frowned upon, even punishable, by the Line. The primary storyline begins as Campion and Purslane are roughly fifty years late to the 32nd Gentian reunion. They take a detour to contact a posthuman known as ‘Ateshga’ in hopes of getting a replacement ship for Campion because his is getting old (several million years old). After being tricked by Ateshga, Campion and Purslane manage to turn the tables on him and leave his planet with a being he had been keeping captive, a golden robot called Hesperus. Hesperus is a member of the "Machine People", an advanced civilization of robots, and supposedly the only non-human sentient society in existence. The two shatterlings hope that the rescue of Hesperus will let them off the hook for their lateness, as returning him to his people (who will be at the reunion as guests of other shatterlings) will put the Gentian Line on good terms with the Machine People. However, before reaching the reunion world, Campion and Purslane encounter an emergency distress signal from Fescue, another Gentian shatterling. There was a vicious attack on the reunion world; an ambush in which the majority of the Gentian Line was wiped out. The identity of the responsible party is unknown, but the attackers used the supposedly long-vanished 'Homunculus' weapons – monstrous spacetime-bending weapons that were created ages ago, but were ordered to be destroyed by another Line. Despite Fescue's warning, Campion and Purslane approach the reunion system to look for survivors. They manage to find the remains of a ship with several Gentian members still alive, and rescue them and the four enemy prisoners they had captured. Hesperus, however, is gravely injured in the process by remaining ambushers. The group escapes and make their way to the Gentian backup meeting planet, Neume, in the hope of re-grouping with any other Gentians who may have survived the ambush. Upon reaching Neume, Campion, Purslane and the other shatterlings they rescued are greeted by the few Gentian survivors of the ambush (numbering only in the forties, compared to the hundreds that existed before the ambush). They also meet two members of the Machine People: Cadence and Cascade, guests of another shatterling. During the next few days, the interrogation of the prisoners commences. Another Gentian, Cyphel, is mysteriously murdered, which fuels the Line's concerns that there is a traitor among them. As a way of punishing Campion for transgressions against the Line, Purslane is made to give up her ship, the Silver Wings of Morning (one of the fastest and most powerful in the Line) to Cadence and Cascade, ostensibly so they can return to the Machine People with news of the ambush, in a bid to gain the Line some assistance. Hesperus, still critically wounded following the rescue of the survivors, is taken to the Neumean "Spirit of the Air", an ancient posthuman machine-intelligence, in the hopes that it will fix him. The Spirit takes Hesperus away and returns him some time later, though apparently still not functioning. The robots Cadence and Cascade make preparations to leave on Purslane's ship. They agree to take him aboard and return him to their people, who they promise may be able to help Hesperus. Purslane accompanies them to her ship, where she must be physically present to give the ship order to transfer control over to the robots. On their way to the bridge, Hesperus suddenly springs to life, grabbing Purslane and hiding her while Cadence and Cascade are whisked along to the bridge. Hersperus quickly explains that Cadence and Cascade are actually planning on hijacking the ship. Bewildered by this sudden change of events, Purslane delays in acting, not sure if she should trust Hesperus, before deciding to ask the ship to detain and eject the robots in the bridge. By then, though, it is too late. Cadence and Cascade hack into the ship's computer, taking it over, and take off from Neume with Hesperus and Purslane still aboard. Campion and several other shatterlings immediately launch a pursuit. Together Hesperus and Purslane find a hideout in a smaller ship in the hold of the Silver Wings of Morning. Using information gained from the other two robots and his own memories, Hesperus (who is now an amalgamation of both Hesperus and the Spirit of the Air) has pieced together what is going on: Cadence and Cascade have discovered that the Line was involved in the accidental extermination of a forgotten earlier race of machine people, dubbed the "First Machines". The Commonality (a confederation of the various Lines), horrified and ashamed of this pointless genocide, erased all knowledge of the event from historical records and their own memories. Unfortunately, Campion, in a previous circuit, unwittingly uncovered information pertaining to the extermination. Hesperus believes that the ambush at the reunion was seeking to destroy this evidence before it could spread, carried out by a shadow Line known as the "House of Suns", tasked with maintaining the conspiracy. Cadence and Cascade, on the other hand, are racing for a wormhole which leads to the Andromeda Galaxy, to where the few survivors of the First Machines are revealed to have retreated. They plan to release the First Machines back into the Milky Way, thus effecting a revenge against the Commonality for the genocide. As Campion and the shatterlings are pursuing Purslane's hijacked ship, transmissions from Neume confirm that a shatterling within their midst, Galingale, is the traitor and a secret member of the House of Suns. The shatterlings open fire on both Galingale's and Purslane's ships, and while they manage to capture Galingale, they are unable to stop Purslane's ship. Unable to get within weapons range, Campion pursues Purslane's ship for sixty thousand light years, during which time he and Purslane, on their separate ships, are suspended in "abeyance", a form of temporal slowdown or stasis. Despite efforts to stop the hijacked ship from reaching the concealed wormhole by local civilisations, the robot Cascade succeeds in opening the "stardam" enclosing the wormhole and travelling through it to the Andromeda Galaxy. On board Silver Wings of Morning, Hesperus reveals to Campion that while he managed to destroy Cadence before they could leave the Neume star system, Cascade survived and he and Cascade had engaged in a marathon battle, several thousand years. Hesperus was ultimately victorious, but Cascade has fused the ship controls before his defeat and they are past the point of no return. Campion, now the only shatterling still in pursuit, enters the wormhole after them and emerges in the Andromeda Galaxy, a place apparently devoid of all sentient life. In his search for Purslane and her ship, he travels to a star enca

    Read more →
  • Plane Finder

    Plane Finder

    Plane Finder is a United Kingdom-based real-time flight tracking service launched in 2009, that is able to show flight data globally. The data available includes flight numbers, how fast an aircraft is moving, its elevation and destination of travel. Several variants of the service are available as mobile apps including free, premium 3D and augmented reality versions. The flight tracking map and database can be accessed by web browsers. Plane Finder allows registered users to share their ADS-B and MLAT data via the Plane Finder ADS-B Client, available for macOS, Windows and Linux. Plane Finder supports VFR charts from NATS, and was the first major flight tracking app to introduce a replay feature, allowing users to replay flights dating back to 2011. == Flight tracking == Plane Finder collects data from its own global network of receivers, using the following sources. === Automatic dependent surveillance-broadcast (ADS-B) === A network of automatic dependent surveillance-broadcast (ADS-B) receivers gathers aircraft data such as callsign, position and speed. Plane Finder serves to supplement this data with additional information, including aircraft registration/tail number, departure airport, destination, artwork, and photographs. Plane Finder users can apply for an ADS-B receiver in exchange for their flight data. === Multilateration (MLAT) === To deliver aircraft position data where ADS-B is unavailable, Plane Finder uses multilateration (MLAT). Using three or more receivers running Plane Finder client software, monitoring the aircraft simultaneously, the aircraft’s position is calculated using receiver location and accurate timestamps. While European airspace is widely covered, only some parts of North American airspace are covered. === Federal Aviation Administration (FAA) feed === ADS-B is prevalent across Europe and Australia, but not in North America. Where MLAT or ADS-B data is unavailable, a feed from the Federal Aviation Administration provides flight information. The FAA feed covers United States and Canadian airspace, including bordering areas of the Atlantic and Pacific Oceans. === FLARM feed === Plane Finder collects data from a centralised FLARM feed, for monitoring small aircraft and gliders. == Flight data source == The Plane Finder website and database is widely used as an information source to support articles in the media. The Independent used Plane Finder flight tracking to demonstrate to readers the flight path of flight MT2706, which turned back as a result of last minute Egyptian government flight restrictions on 6 November 2015. The Independent also used Plane Finder information to demonstrate a timeline of the speed/altitude of flight 7K 9268, a Russian plane which crashed on 31 October 2015. The BBC cited Plane Finder in regard to the point at which at British Airways flight turned back to Heathrow Airport to make an emergency landing after smoke was seen coming from its engines. Plane Finder data has also been used to create original imagery for the media, such as the Washington Post, which used Plane Finder as a source to show flight patterns immediately after the Brussels bombings in March 2016.

    Read more →
  • Clanker

    Clanker

    Clanker is a derogatory term for robots and artificial intelligence (AI) software. The term has been used in Star Wars media, first appearing in the franchise's 2005 video game Star Wars: Republic Commando. In 2025, the term became widely used to express hatred or distaste for machines ranging from delivery robots to large language models. This trend has been attributed to anxiety around the negative societal effects of AI. == In science fiction == The term has been previously used in science fiction literature, first appearing in a 1958 article by William Tenn in which he uses it to describe robots from science fiction films like Metropolis. The Star Wars franchise began using the term as a slur against droids in the 2005 video game Star Wars: Republic Commando before being prominently used in the animated series Star Wars: The Clone Wars, which follows a galaxy-wide war between the Galactic Republic's clone troopers and the Confederacy of Independent Systems' battle droids. In Star Wars media, robots—more commonly known as droids—are routinely depicted as the subjects of discrimination. For example, in the original Star Wars film, C-3PO and R2-D2 are abducted by Jawas and sold to the family of Luke Skywalker. When visiting a cantina in Mos Eisley, both droids are refused service by the bartender, who remarks that "We don't serve their kind." In Star Wars lore, the term clanker had entered use by the time of the franchise's High Republic Era and became prominent during the Clone Wars, in which clone troopers regularly use the phrase against battle droids. == AI backlash == The growing popularity of the term clanker reflects an increase in direct contact between people and AI systems. On sidewalks, delivery robots impede mobility and cause safety issues. In digital spaces, cybersecurity experts have raised concerns about the rising number of bots online, which now make up a large portion of internet traffic. A 2025 report estimated that about one in five social media accounts are automated. The term is also a reaction to AI advocacy from industrialists like Elon Musk and Sam Altman, who have championed the integration of AI into nearly every aspect of modern life. This includes efforts by major companies and startups alike, such as Amazon's development of humanoid robots to replace human workers in service industries. Such initiatives have further fueled public skepticism, reinforcing the association of clanker with unease over automation and the displacement of human roles. A global survey conducted by the research firm Gartner in December 2023 found that 64% of customers would prefer companies to avoid using AI in customer service, with another 53% stating they would consider switching to a different company if they discovered AI was handling their service interactions. Another report by Ernst & Young, published in July 2025, found that 42% of employees across Europe are worried that the use of AI in the workplace may threaten their employment. Criticism has also been directed at the technology itself. Some of the backlash stems from concerns about the resource consumption of AI systems, their frequent reliance on copyrighted material without consent, and questions about the intentions of the corporations behind them. There are also concerns about the potential cognitive effects of relying heavily on AI. A study, authored by researchers at Microsoft and Carnegie Mellon University, warns that regular dependence on AI may leave users mentally unprepared for real-world problem solving, likening the effect to cognitive atrophy. In June 2025, United States Senator Ruben Gallego tweeted that his "new bill makes sure you don't have to talk to a clanker if you don't want to", referring to proposed legislation that would require call centers to disclose their use of automated customer service agents to callers in the United States and offer the option to switch to a human representative. == Analysis == Linguist Adam Aleksic has described clanker as an evolution of racial slurs that anthropomorphize robotic systems. Internet memes incorporating the term often reference historical discrimination against marginalized groups such as African Americans. Based on the work of linguist Geoffrey Nunberg, American news website Axios has argued that clanker is merely a derogatory word, rather than a slur, because it does not perpetuate social inequities. NPR has noted the irony that the word robot was coined by Karel Čapek for his 1920 science-fiction play R.U.R. as a similar criticism of industrialization forcing workers to become devoid of their humanity. Aleksic has observed that robot can be further traced to the Proto-Slavic noun orbъ, which means 'slave'. While other science fiction media include pejoratives for androids and robots, such as skinjob and toaster from the Blade Runner and Battlestar Galactica franchises, respectively, clanker is believed to have gained popularity because its usage is intuitive and flexible. Whereas AI slop describes low-quality output from artificial intelligence, clanker belittles the underlying computer systems.

    Read more →
  • Roborace

    Roborace

    Roborace was a competition with autonomously driving, electrically powered vehicles. Founded in 2015 by Denis Sverdlov, it aimed to be the first global championship for autonomous cars. From 2017 to 2019, the official CEO was 2016–17 Formula E champion, Lucas Di Grassi, who later became a member of Roborace’s supervisory board. The series tested their technology and race formats at FIA Formula E Championship events during 2016–2018. In 2019 Roborace organized Season Alpha, which consisted of 4 trial racing events with several independent teams competing against each other for the first time. In 2020–21 Roborace held Season Beta with 7 competing teams. All teams utilized the same chassis and powertrain, but they had to develop their own real-time computing algorithms and artificial intelligence technologies. In May 2022, Arrival, the owner of Roborace, confirmed that they were no longer continuing the Roborace programme, but that they were hoping to find alternative funding. In February 2024, after getting its stock delisted from the Nasdaq, Arrival's UK division entered administration, with future plans of a sale of Arrival and all of its affiliated assets. == Cars == === Robocar === The world's first purpose-built autonomous racing car, Robocar, was designed by Daniel Simon, who previously worked on vehicles for movies such as Tron: Legacy and Oblivion, as well as designing the livery for the 2011 HRT Formula One car. Michelin is the official tyre supplier, and the internal computing processors (Drive PX 2) are Nvidia. The chassis itself is shaped like a teardrop, improving aerodynamic efficiency. The car weighs around 1350 kg and is 4.8 metres (16 ft) long and 2 metres (6.6 ft) wide. It has four electric motors, each with a power of 135 kW producing over 500 hp combined, and utilizes a 840V battery. For navigation, it relies on a mixture of optical systems, radars, lidars and ultrasonic sensors. The vehicle has been demonstrated at speeds of almost 300 km/h (190 mph). === DevBot === Development of the Robocar started in early 2016, with a first outing of a test vehicle, the so-called DevBot, following in the summer of the same year. The test car consisted of the same internal units (battery, motor, electronics) used in the Robocar, but were placed in the chassis of a Ginetta LMP3 car without an engine cover in order to provide better cooling and access. DevBot saw its first public outing at the Formula E pre-season tests in Donington Park in August 2016. After battery issues in Hong Kong caused the development team to abandon their demonstration run, the DevBot successfully drove twelve laps around the Moulay El Hassan Formula E circuit in Marrakesh. Other test tracks included Michelin's testing ground in Ladoux and the Silverstone Stowe Circuit. During testing ahead of the 2017 Buenos Aires ePrix, two DevBot cars raced against each other autonomously, resulting in one of the vehicles crashing on a corner. During the 2017–18 Formula E season, Roborace pitched pro-drifter Ryan Tuerck against a DevBot at the Rome ePrix. At the Berlin ePrix, Roborace held the Human + Machine Challenge, the first race for combined teams of human drivers and AIs using a pair of Devbots. === DevBot 2.0 === An upgraded version of DevBot was announced in late 2018, and after private testing made its public debut in 2019 at the inaugural Season Alpha event. DevBot 2.0 uses the same technology as both Robocar and DevBot, with the main changes being a conversion to being driven on the rear axle only, a lower position for the driver for safety reasons and a bespoke composite bodywork. == Seasons == === Testing === ==== 2016–17 Formula E season ==== Roborace appeared at a number of Formula E events during the 2016–17 Formula E season. However, in this period only test drives with two different DevBots took place. Within the framework of the 2017 Buenos Aires ePrix both DevBot vehicles drove against each other on a race track for the first time. There were also DevBot demonstrations at the 2016 Marrakesh ePrix, 2017 Berlin ePrix, 2017 New York City ePrix and 2017 Montreal ePrix. At the 2017 Paris ePrix, the developers also let a Robocar onto the track for the first time, even though the vehicle only drove the track at walking speed. ==== 2017–18 Formula E season ==== At the start of the 2017/18 Formula E season, the Roborace developers once again tested the DevBot during a public time trial between the Roborace CI and the TV presenter Nicki Shields at the 2017 Hong Kong ePrix. As part of a similar time trial at the 2018 Rome ePrix, drift professional Ryan Tuerck also tested the DevBot. The Human + Machine Challenge was created for the Formula E race on the Berlin ePrix. A team of doctoral students from the Technical University of Munich (TUM) and the University of Pisa programmed the software for the Devbot to drive autonomously around the circuit in Berlin. Afterwards both teams in combination with a human driver competed in a public time trial. The vehicle of the team of the Technical University of Munich finished the Human + Machine Challenge with an average lap time of 91.59 seconds, almost four seconds faster than that of the University of Pisa with 95.36 seconds and thus won the Challenge. At the Goodwood Festival of Speed, Robocar became the first ever fully autonomous race car to complete the Goodwood Hill Climb. The vehicle completed the first official autonomous run on 13 July 2018 within the framework of the event. === Season Alpha (2019) === Season Alpha took place at various locations in Europe and North America with the aim of testing several competition formats using the new DevBot 2.0. The first event was held at the Circuito Monteblanco in Spain, and featured the first race between two fully autonomous cars. The events were not broadcast live, instead short clips on YouTube were released. Two teams were competing: Arrival and the Technical University of Munich. On 7 July 2019, the Roborace DevBot 2.0 car set the first ever autonomous official timed run at Goodwood Festival of Speed, with a time of 66.96 s and a top speed of 162.8 km/h (101.2 mph). This is currently the record for autonomous vehicles. Roborace also set the Guinness World Record for having the fastest autonomous car in the world. The Robocar reached a speed of 282.42 km/h (175.49 mph). === Season Beta (2020–21) === The second testing season took place at various locations between September 2020 and October 2021, featuring 16 races and involving mixed reality elements dubbed "Roborace Metaverse", which is based on Roborace's patented technology. The program of Season Beta competitions has gradually complicating rules arranged in a progression of so-called missions. Each mission consists of two racing rounds — one round per day. A mission plan issued by Roborace for each mission defines its objectives, rules, and point-scoring system. The key objective of Season Beta is to come to the point when the majority of competing teams have developed sufficient capability for wheel-to-wheel racing in Season 1. There were 7 teams competing in Season Beta: Arrival Racing (UK/Russia), Autonomous Racing Graz (Austria), MIT Driverless (United States), Acronis SIT (Switzerland), University of Pisa (Italy), PoliMOVE (Italy), CMU (United States).

    Read more →
  • A.I. Insight forums

    A.I. Insight forums

    The Artificial Intelligence Insight forums, also known as the A.I. Insight forums, are a series of forums to build consensus on how the United States Congress should craft A.I. legislation. Organized by Senate Majority Leader Charles "Chuck" Schumer, the first of nine closed-door forums convened on September 13, 2023. == Background == Amid a surge in the popularity and advancement of artificial intelligence, senator Chuck Schumer launched an effort to establish a framework for the regulation of A.I. in April 2023. By the end of June, a preliminary framework – dubbed the "SAFE Innovation Framework" – was established and presented to Congress. Schumer also announced a series of forums wherein tech leaders who were well-acquainted with A.I. would help to "educate" Congress on the risks and problems that A.I. poses. Many tech leaders including Sam Altman, Elon Musk, and Sundar Pichai were set to attend the meetings. Many U.S. lawmakers and senators such as Mike Rounds and Todd Young were also set to attend. == September 13 forum == The overarching consensus following the conclusion of the September 13 forum was that there "should be" regulations regarding the use and advancement of A.I., but it should not be made "too fast". Many tech executives who attended the forum also warned senators of the risks and threats that A.I. could pose. Musk, who attended the forum, stated afterwards that there was "overwhelming consensus" on the regulation of A.I. === Invitees === This is a list of people who were invited to attend the September 13 forum. Elon Musk (Tesla, SpaceX, X Corp.) Sam Altman (OpenAI) Bill Gates (ex–Microsoft) Jensen Huang (Nvidia) Alex Karp (Palantir) Satya Nadella (Microsoft) Arvind Krishna (IBM) Sundar Pichai (Alphabet Inc., Google) Eric Schmidt (ex–Google) Mark Zuckerberg (Meta) Charles Rivkin (Motion Picture Association) Liz Shuler (AFL-CIO) Meredith Stiehm (Writers Guild of America) Randi Weingarten (American Federation of Teachers) Maya Wiley (LCCHR) == October 24 forum == The second of nine forums was hosted on October 24, 2023, as federal A.I. regulation drew nearer. According to Schumer's office, the forum was centered mainly on how A.I. could "enable innovation", and the innovation that is needed for the safe progression of A.I. At the forum, Senators Brian Schatz and John Kennedy introduced the "Schatz-Kennedy A.I. Labeling Act", a new piece of A.I. legislation that would provide "more transparency on A.I.-generated content". Following the forum, Senator Rounds stated that in order to fuel the development of A.I., a total estimated $56 billion would be needed for the next three years. Rounds, alongside Senator Young and Schumer, also highlighted the need to outcompete China and workforce initiatives. === Invitees === 21 people were invited to attend the forum, and were composed largely of venture capitalists, academics, civil rights campaigners, and industry figures. Some key figures included venture capitalists Marc Andreessen and John Doerr. == Future == Over the course of fall 2023, there is slated to be a total of nine forums on the topic of A.I., with the first hosted on September 13.

    Read more →
  • Yorba (software)

    Yorba (software)

    Yorba is a web-based personal information management platform for finding, monitoring, or deleting online accounts and subscriptions. Yorba is a participating member of Consumer Reports’ Data Rights Protocol (DRP) consortium that develops open technical standards for exercising consumer data rights under laws including the California Consumer Privacy Act. == History == Yorba began as a research project around 2021. It was founded by Chris Zeunstrom (CEO), Nolan Cabeje (CDO) and David Schmudde (CTO). Zeunstrom says he began developing Yorba after growing frustrated with managing numerous email accounts, noting overloaded inboxes create distraction and potential security vulnerabilities. Yorba’s early development was also influenced by security issues he encountered at a previous company, which had been affected by data breaches at a time when such incidents were becoming increasingly common. In 2023, Yorba launched a private beta as a public benefit corporation funded through a give-back model operated by Zeunstrom's New York-based design firm, Ruca. In January 2024, Yorba entered public beta and reported over 1,000 users, including 160 premium subscribers. At the time of the public beta launch, Yorba integrated with Gmail and announced plans to expand compatibility to other online services and cloud storage providers. In September 2024, Yorba completed conformance testing under the Data Rights Protocol, an initiative developed by Consumer Reports, to establish a standard and open-source framework for securely transmitting consumer data rights requests under laws like the California Consumer Privacy Act. Yorba was named among twelve participating companies that implemented the protocol alongside OneTrust and Consumer Reports’ own Permission Slip app. Yorba was one of nine startups selected as 2025 finalist in the Santander X Global Awards international entrepreneurship competition. == Features == Yorba scans user inbox history data to identify online accounts, mailing lists, and possible data breaches. It uses natural language processing and machine learning to identify a user's accounts, services, and subscriptions. The platform prompts password resets for compromised accounts and locates unused accounts. The platform also supports mailing list management by identifying and helping users unsubscribe from newsletters. Paid subscribers can locate and cancel recurring charges. Yorba links with financial institutions in the U.S., Canada, and EU via Plaid Inc. to detect recurring charges and delete unwanted subscriptions. == Privacy and Ethics == Yorba's founder has openly criticized dark patterns that make canceling services difficult, citing personal frustration with inbox clutter as part of his inspiration for Yorba. Yorba offers privacy policy analysis in partnership with Amsterdam-based nonprofit Terms of Service; Didn’t Read, assigning grades based on invasiveness or ethical concerns. As of 2024, the company described its pricing as designed to cover operational costs and sustain the platform without outside investment.

    Read more →
  • DARPA Prize Competitions

    DARPA Prize Competitions

    Over the years, the U.S. Defense Advanced Research Projects Agency (DARPA) has conducted numerous prize competitions to spur innovation. A prize competition allows DARPA to establish an ambitious goal, opening the door to novel approaches from the public that might otherwise appear too risky for experts in a particular field to pursue. == Statutory authorities == In 1999, Congress provided prize competition authority to DARPA in the National Defense Authorization Act for Fiscal Year 2000 (P.L. 106–65), 10 U.S.C. § 4025, formerly 10 U.S.C. §2374a. DARPA also conducts prize competitions under the America COMPETES Act, 15 U.S.C. § 3719. == Recent prize competitions == DARPA Grand Challenge (2004 and 2005) was a prize competition to spur the development of autonomous vehicle technologies. The $1 million prize went unclaimed as no vehicles could complete the challenging desert route from Barstow, CA, to Primm, NV, on March 13, 2004. A year later, on October 8, 2005, the Stanford Racing Team won the $2 million prize during the second competition of the Grand Challenge in the desert Southwest near the California/Nevada state line. DARPA Urban Challenge (2007) required the competitors to build an autonomous vehicle capable of driving in traffic and performing complex maneuvers such as merging, passing, parking, and negotiating intersections. On November 3, 2007, the Carnegie Mellon Team won the $2 million prize, and its vehicle became the first autonomous vehicle that interacted with both manned and unmanned vehicle traffic in an urban environment. DARPA Network Challenge (Red Balloon Challenge) (2009) explored the roles that the Internet and social networking play in solving broad-scope, time-critical problems. On December 5, 2009, the Massachusetts Institute of Technology team won $40,000 by locating the ten moored, eight-foot, red weather balloons at ten places in the United States within seven hours. DARPA Digital Manufacturing Analysis, Correlation and Estimation Challenge (DMACE) (2010) was a three-month contest to showcase the potential of digital manufacturing of advanced materials. The University of California at Santa Barbara team won a $50,000 prize for crushing 180 digitally manufactured (DM) titanium mesh spheres with the most accurate predictive model of the components’ properties. DARPA Shredder Challenge (2011) was to identify and assess potential capabilities and vulnerabilities to sensitive information in the national security community. Participating teams must download the images of the documents shredded into more than 10,000 pieces from the Challenge website, reconstruct the documents, and solve the five puzzles. Of almost 9,000 teams, the San Francisco-based All Your Shreds Are Belong to U.S team won the $50,000 prize. DARPA UAVForge Challenge (2011-2012) aimed to build and test a user-intuitive, backpack-portable unmanned aerial vehicle (UAV) that could quietly fly in and out of critical environments to conduct sustained surveillance for up to three hours. The $100,000 prize was not claimed because none of the 140 teams met the technical matrix. DARPA Cash for Locating & Identifying Quick Response Codes (CLIQR) Quest Challenge (2012) explored the role the Internet and social media played in the timely communication, wide-area team-building, and urgent mobilization required to solve broad scope, time-critical problems. The challenge offered $40,000 to the first individual or team that could locate seven posters appearing in U.S. cities bearing the DARPA logo and a quick response code (QR) within 15 days. No team found and submitted all seven codes. DARPA Fast Adaptable Next-Generation Ground Vehicle (FANG) Challenge (2012-2013) was to use three competitions for the design of an infantry fighting vehicle, culminating in prototypes. In April 2013, DARPA awarded US$1 million to a three-man team during the first competition. DARPA decided not to proceed with the second and third competitions as originally planned and transitioned the technologies to the defense and commercial industry through the Digital Manufacturing and Design Innovation Institute (DMDII). DARPA Spectrum Challenge (2013-2014) sought to demonstrate how a software-defined radio can use a given communication channel in the presence of other users and interfering signals. Three teams emerged as the overall winners, winning a total of $150,000 in prizes. DARPA Chikungunya (CHIKV) Challenge (2014-2015) was a health-related effort to develop the most accurate predictions of CHIKV cases for all Western Hemisphere countries and territories between September 2014 and March 2015. On May 12, 2015, DARPA awarded $500,000 in prizes to the 11 winners of the competition during a scientific review DARPA Robotics Challenge (DRC) (2013-2015) aimed to develop semi-autonomous ground robots that could do "complex tasks in dangerous, degraded, human-engineered environments." A South Korean team won the first prize of $2 million, and two U.S. teams won $1 million and $500,000 as second and third winners. DARPA Cyber Grand Challenge (CGC) (2014 - 2016) was to “create automatic defensive systems capable of reasoning about flaws, formulating patches and deploying them on a network in real time.” The top three winners were awarded prizes of $2 million, $1 million, and $750,000, respectively. DARPA Spectrum Collaboration Challenge (SC2) (2016-2019) aimed to encourage the development of AI-enabled wireless networks to “ensure that the exponentially growing number of military and civilian wireless devices would have full access to the increasingly crowded electromagnetic spectrum.” A team from the University of Florida won the overall top prize of US$2 million at the final SC2 competition. DARPA Subterranean (SubT) Challenge (2017-2021) was to develop robotic technologies to map, navigate, search and exploit complex underground environments. The first-place winners of the system final competition and of the virtual final competition were awarded $2 million and $750,000, respectively, with multiple prizes awarded to the second and third-place winners. DARPA Launch Challenge (2018-2020) was a $12 million satellite launch challenge to demonstrate responsive and flexible space launch capabilities from the small launch providers and was to culminate in two separate launch competitions where the competitors must launch a satellite to low Earth orbit (LEO) within days of each other at different locations in the United States. The competition ended without a winner. DARPA Forecasting Floats in Turbulence (FFT) Challenge (2021) was to spur technologies that could predict the location of sea drifters or floats within 10 days. DARPA awarded $25,000 for first place, with prizes of $15,000 and $10,000 for second place and third place. DARPA Artificial Intelligence Cyber Challenge (AIxCC) (2023–2025) was a two-year challenge and asks competitors to design novel AI systems to secure critical software code on which Americans rely. The total prize money is $29.5 million. In March 2024, the Advanced Research Projects Agency for Health (ARPA-H) partnered with DARPA, contributing an additional $20 million to the competition's prize pool to address software vulnerabilities in medical devices, hospital IT, and biotech equipment. AIxCC collaborates with Google, Microsoft, OpenAI, Anthropic, Linux Foundation, Open Source Security Foundation, Black Hat USA, and DEF CON, all of which provide AIxCC with access to large language models. In August 2024, AIxCC held the semifinal at DEF CON in Las Vegas. DARPA and ARPA-H tested all 42 submissions by running them through various open-source coding projects with deliberately injected vulnerabilities and scored the tools based on their effectiveness in identifying and fixing security flaws. Seven teams, each winning $2 million in the semifinals, competed in the final round of the AIxCC at the August 2025 DEF CON conference. Team Atlanta won first place with a $4 million prize for its cyber reasoning systems, which identified and patched vulnerabilities across 54 million lines of code. DARPA Triage Challenge (2023 – 2026) aims to spur the development of novel physiological features for medical triage, with a total prize money of $7 million. In October 2024, Challenge Event 1 was held in Perry, Georgia, featuring to-scale replicas of disaster sites such as an airplane crash and Hurricane Katrina, and teams competed based on how closely their data aligned with the agency’s official data and how quickly and accurately their autonomous systems could identify individuals most urgently in need of medical care. DARPA concluded the second year of competitions and, in November 2025, named the top performers in systems and data categories, which will advance to the final 2026 competition. The DARPA Lift Challenge (2025-2026) is for participants to design unmanned aerial systems capable of carrying up to four times their own weight, with a minimum payload of 110 pounds. Acco

    Read more →
  • Context-sensitive user interface

    Context-sensitive user interface

    A context-sensitive user interface offers the user options based on the state of the active program. Context sensitivity is ubiquitous in current graphical user interfaces, often in context menus. A user-interface may also provide context sensitive feedback, such as changing the appearance of the mouse pointer or cursor, changing the menu color, or with auditory or tactile feedback. == Reasoning and advantages of context sensitivity == The primary reason for introducing context sensitivity is to simplify the user interface. Advantages include: Reduced number of commands required to be known to the user for a given level of productivity. Reduced number of clicks or keystrokes required to carry out a given operation. Allows consistent behaviour to be pre-programmed or altered by the user. Reduces the number of options needed on screen at one time. === Disadvantages === Context sensitive actions may be perceived as dumbing down of the user interface, leaving the operator at a loss as to what to do when the computer decides to perform an unwanted action. Additionally non-automatic procedures may be hidden or obscured by the context sensitive interface causing an increase in user workload for operations the designers did not foresee. A poor implementation can be more annoying than helpful – a classic example of this is Office Assistant. == Implementation == At the simplest level each possible action is reduced to a single most likely action – the action performed is based on a single variable (such as file extension). In more complicated implementations multiple factors can be assessed such as the user's previous actions, the size of the file, the programs in current use, metadata etc. The method is not only limited to the response to imperative button presses and mouse clicks – pop-up menus can be pruned and/or altered, or a web search can focus results based on previous searches. At higher levels of implementation context sensitive actions require either larger amounts of meta-data, extensive case analysis based programming, or other artificial intelligence algorithms. === In computer and video games === Context sensitivity is important in video games, especially those controlled by a gamepad, joystick or computer mouse in which the number of buttons available is limited. It is primarily applied when the player is in a certain place and is used to interact with a person or object. For example, if the player is standing next to a non-player character, an option may come up allowing the player to talk with them. Implementations range from the embryonic 'Quick Time Event' to context sensitive sword combat in which the attack used depends on the position and orientation of both the player and opponent, as well as the virtual surroundings. A similar range of use is found in the 'action button' which, depending upon the in-game position of the player's character, may cause it to pick something up, open a door, grab a rope, punch a monster or opponent, or smash an object. The response does not have to be player activated – an on-screen device may only be shown in certain circumstances, e.g. 'targeting' cross hairs in a flight combat game may indicate the player should fire. An alternative implementation is to monitor the input from the player (e.g. level of button pressing activity) and use that to control the pace of the game in an attempt to maximize enjoyment or to control the excitement or ambience. The method has become increasingly important as more complex games are designed for machines with few buttons (keyboard-less consoles). Bennet Ring commented (in 2006) that "Context-sensitive is the new lens flare". === Context-sensitive help === Context sensitive help is a common implementation of context sensitivity, a single help button is actioned and the help page or menu will open a specific page or related topic.

    Read more →
  • Riffusion

    Riffusion

    Riffusion is a neural network, designed by Seth Forsgren and Hayk Martiros, that generates music using images of sound rather than audio. The resulting music has been described as "de otro mundo" (otherworldly), although unlikely to replace man-made music. The model was made available on December 15, 2022, with the code also freely available on GitHub. The first version of Riffusion was created as a fine-tuning of Stable Diffusion, an existing open-source model for generating images from text prompts, on spectrograms, resulting in a model which used text prompts to generate image files which could then be put through an inverse Fourier transform and converted into audio files. While these files were only several seconds long, the model could also use latent space between outputs to interpolate different files together (using the img2img capabilities of SD). It was one of many models derived from Stable Diffusion. In December 2022, Mubert similarly used Stable Diffusion to turn descriptive text into music loops. In January 2023, Google published a paper on their own text-to-music generator called MusicLM. Forsgren and Martiros formed a startup, also called Riffusion, and raised $4 million in venture capital funding in October 2023.

    Read more →
  • Document classification

    Document classification

    Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") or algorithmically. The intellectual classification of documents has mostly been the province of library science, while the algorithmic classification of documents is mainly in information science and computer science. The problems are overlapping, however, and there is therefore interdisciplinary research on document classification. The documents to be classified may be texts, images, music, etc. Each kind of document possesses its special classification problems. When not otherwise specified, text classification is implied. Documents may be classified according to their subjects or according to other attributes (such as document type, author, printing year etc.). In the rest of this article only subject classification is considered. There are two main philosophies of subject classification of documents: the content-based approach and the request-based approach. == "Content-based" versus "request-based" classification == Content-based classification is classification in which the weight given to particular subjects in a document determines the class to which the document is assigned. It is, for example, a common rule for classification in libraries, that at least 20% of the content of a book should be about the class to which the book is assigned. In automatic classification it could be the number of times given words appears in a document. Request-oriented classification (or -indexing) is classification in which the anticipated request from users is influencing how documents are being classified. The classifier asks themself: “Under which descriptors should this entity be found?” and “think of all the possible queries and decide for which ones the entity at hand is relevant” (Soergel, 1985, p. 230). Request-oriented classification may be classification that is targeted towards a particular audience or user group. For example, a library or a database for feminist studies may classify/index documents differently when compared to a historical library. It is probably better, however, to understand request-oriented classification as policy-based classification: The classification is done according to some ideals and reflects the purpose of the library or database doing the classification. In this way it is not necessarily a kind of classification or indexing based on user studies. Only if empirical data about use or users are applied should request-oriented classification be regarded as a user-based approach. == Classification versus indexing == Sometimes a distinction is made between assigning documents to classes ("classification") versus assigning subjects to documents ("subject indexing") but as Frederick Wilfrid Lancaster has argued, this distinction is not fruitful. "These terminological distinctions,” he writes, “are quite meaningless and only serve to cause confusion” (Lancaster, 2003, p. 21). The view that this distinction is purely superficial is also supported by the fact that a classification system may be transformed into a thesaurus and vice versa (cf., Aitchison, 1986, 2004; Broughton, 2008; Riesthuis & Bliedung, 1991). Therefore, assigning a subject term to a document in an index is equivalent to assigning that document to the class of documents indexed by that term (all documents indexed or classified as X belong to the same class of documents). == Automatic document classification (ADC) == Automatic document classification tasks can be divided into three sorts: supervised document classification where some external mechanism (such as human feedback) provides information on the correct classification for documents, unsupervised document classification (also known as document clustering), where the classification must be done entirely without reference to external information, and semi-supervised document classification, where parts of the documents are labeled by the external mechanism. There are several software products under various license models available. === Techniques === Automatic document classification techniques include: Artificial neural network Concept Mining Decision trees such as ID3 or C4.5 Expectation maximization (EM) Instantaneously trained neural networks Latent semantic indexing Multiple-instance learning Naive Bayes classifier Natural language processing approaches Rough set-based classifier Soft set-based classifier Support vector machines (SVM) K-nearest neighbour algorithms tf–idf == Applications == Classification techniques have been applied to spam filtering, a process which tries to discern E-mail spam messages from legitimate emails email routing, sending an email sent to a general address to a specific address or mailbox depending on topic language identification, automatically determining the language of a text genre classification, automatically determining the genre of a text readability assessment, automatically determining the degree of readability of a text, either to find suitable materials for different age groups or reader types or as part of a larger text simplification system sentiment analysis, determining the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. health-related classification using social media in public health surveillance article triage, selecting articles that are relevant for manual literature curation, for example as is being done as the first step to generate manually curated annotation databases in biology

    Read more →
  • Vagueness

    Vagueness

    In linguistics and philosophy, a vague predicate is one which gives rise to borderline cases. For example, the English adjective "tall" is vague since it is not clearly true or false for someone of middling height. By contrast, the word "prime" is not vague since every number is definitively either prime or not. Vagueness is commonly diagnosed by a predicate's ability to give rise to the sorites paradox. Vagueness is separate from ambiguity, in which an expression has multiple denotations. For instance the word "bank" is ambiguous since it can refer either to a river bank or to a financial institution, but there are no borderline cases between both interpretations. Vagueness is a major topic of research in philosophical logic, where it serves as a potential challenge to classical logic. Work in formal semantics has sought to provide a compositional semantics for vague expressions in natural language. Work in philosophy of language has addressed implications of vagueness for the theory of meaning, while metaphysicists have considered whether reality itself is vague. == Importance == The concept of vagueness has philosophical importance. Suppose one wants to come up with a definition of "right" in the moral sense. One wants a definition to cover actions that are clearly right and exclude actions that are clearly wrong, but what does one do with the borderline cases? Surely, there are such cases. Some philosophers say that one should try to come up with a definition that is itself unclear on just those cases. Others say that one has an interest in making his or her definitions more precise than ordinary language, or his or her ordinary concepts, themselves allow; they recommend one advances precising definitions. === In law === Vagueness is also a problem which arises in law, and in some cases, judges have to arbitrate regarding whether a borderline case does, or does not, satisfy a given vague concept. Examples include disability (how much loss of vision is required before one is legally blind?), human life (at what point from conception to birth is one a legal human being, protected for instance by laws against murder?), adulthood (most familiarly reflected in legal ages for driving, drinking, voting, consensual sex, etc.), race (how to classify someone of mixed racial heritage), etc. Even such apparently unambiguous concepts such as biological sex can be subject to vagueness problems, not just from transsexuals' gender transitions but also from certain genetic conditions which can give an individual mixed male and female biological traits (see intersex). In the common law system, vagueness is a possible legal defence against by-laws and other regulations. The legal principle is that delegated power cannot be used more broadly than the delegator intended. Therefore, a regulation may not be so vague as to regulate areas beyond what the law allows. Any such regulation would be "void for vagueness" and unenforceable. This principle is sometimes used to strike down municipal by-laws that forbid "explicit" or "objectionable" contents from being sold in a certain city; courts often find such expressions to be too vague, giving municipal inspectors discretion beyond what the law allows. In the US this is known as the vagueness doctrine and in Europe as the principle of legal certainty. === In science === Many scientific concepts are of necessity vague, for instance species in biology cannot be precisely defined, owing to unclear cases such as ring species. Nonetheless, the concept of species can be clearly applied in the vast majority of cases. As this example illustrates, to say that a definition is "vague" is not necessarily a criticism. Consider those animals in Alaska that are the result of breeding huskies and wolves: are they dogs? It is not clear: they are borderline cases of dogs. This means one's ordinary concept of doghood is not clear enough to let us rule conclusively in this case. == Approaches == The philosophical question of what the best theoretical treatment of vagueness is—which is closely related to the problem of the paradox of the heap, a.k.a. sorites paradox—has been the subject of much philosophical debate. === Fuzzy logic === One theoretical approach is that of fuzzy logic, developed by American mathematician Lotfi Zadeh. Fuzzy logic proposes a gradual transition between "perfect falsity", for example, the statement "Bill Clinton is bald", to "perfect truth", for, say, "Patrick Stewart is bald". In ordinary logics, there are only two truth-values: "true" and "false". The fuzzy perspective differs by introducing an infinite number of truth-values along a spectrum between perfect truth and perfect falsity. Perfect truth may be represented by "1", and perfect falsity by "0". Borderline cases are thought of as having a "truth-value" anywhere between 0 and 1 (for example, 0.6). Advocates of the fuzzy logic approach have included K. F. Machina (1976) and Dorothy Edgington (1993). === Supervaluationism === Another theoretical approach is known as "supervaluationism". This approach has been defended by Kit Fine and Rosanna Keefe. Fine argues that borderline applications of vague predicates are neither true nor false, but rather are instances of "truth value gaps". He defends an interesting and sophisticated system of vague semantics, based on the notion that a vague predicate might be "made precise" in many alternative ways. This system has the consequence that borderline cases of vague terms yield statements that are neither true, nor false. Given a supervaluationist semantics, one can define the predicate "supertrue" as meaning "true on all precisifications". This predicate will not change the semantics of atomic statements (e.g. "Frank is bald", where Frank is a borderline case of baldness), but does have consequences for logically complex statements. In particular, the tautologies of sentential logic, such as "Frank is bald or Frank is not bald", will turn out to be supertrue, since on any precisification of baldness, either "Frank is bald" or "Frank is not bald" will be true. Since the presence of borderline cases seems to threaten principles like this one (excluded middle), the fact that supervaluationism can "rescue" them is seen as a virtue. === Subvaluationism === Subvaluationism is the logical dual of supervaluationism, and has been defended by Dominic Hyde (2008) and Pablo Cobreros (2011). Whereas the supervaluationist characterises truth as 'supertruth', the subvaluationist characterises truth as 'subtruth', or "true on at least some precisifications". Subvaluationism proposes that borderline applications of vague terms are both true and false. It thus has "truth-value gluts". According to this theory, a vague statement is true if it is true on at least one precisification and false if it is false under at least one precisification. If a vague statement comes out true under one precisification and false under another, it is both true and false. Subvaluationism ultimately amounts to the claim that vagueness is a truly contradictory phenomenon. Of a borderline case of "bald man" it would be both true and false to say that he is bald, and both true and false to say that he is not bald. === Epistemicist view === A fourth approach, known as "the epistemicist view", has been defended by Timothy Williamson (1994), R. A. Sorensen (1988) and (2001), and Nicholas Rescher (2009). They maintain that vague predicates do, in fact, draw sharp boundaries, but that one cannot know where these boundaries lie. One's confusion about whether some vague word does or does not apply in a borderline case is due to one's ignorance. For example, in the epistemicist view, there is a fact of the matter, for every person, about whether that person is old or not old; some people are ignorant of this fact. === As a property of objects === One possibility is that one's words and concepts are perfectly precise, but that objects themselves are vague. Consider Peter Unger's example of a cloud (from his famous 1980 paper, "The Problem of the Many"): it is not clear where the boundary of a cloud lies; for any given bit of water vapor, one can ask whether it is part of the cloud or not, and for many such bits, one will not know how to answer. Hence, perhaps such a term as 'cloud' is not itself vague, but rather precisely denotes a vague object. This strategy has occasionally been poorly received; most notably, in Gareth Evans' short paper "Can There Be Vague Objects?" (1978), wherein an argument is examined which appears to show that vague identity-statements are impossible (i.e., result in logical incoherence). David Lewis explains that the reader is intended to conclude, with Evans, that—since there clearly are, in fact, meaningful vague identities—any purported proof to the contrary cannot be right; and as the proof relies upon the premise that vague terms precisely denote vague objects, but fails under the view that vague terms reflect a merel

    Read more →
  • ECML PKDD

    ECML PKDD

    ECML PKDD, the European Conference on Machine Learning Principles and Practice of Knowledge Discovery in Databases, is one of the leading academic conferences on machine learning and knowledge discovery, held in Europe every year. == History == ECML PKDD is a merger of two European conferences, European Conference on Machine Learning (ECML) and European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD). ECML and PKDD have been co-located since 2001; however, both ECML and PKDD retained their own identity until 2007. For example, the 2007 conference was known as "the 18th European Conference on Machine Learning (ECML) and the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD)", or in brief, "ECML/PKDD 2007", and both ECML and PKDD had their own conference proceedings. In 2008 the conferences were merged into one conference, and the division into traditional ECML topics and traditional PKDD topics was removed. The history of ECML dates back to 1986, when the European Working Session on Learning was first held. In 1993 the name of the conference was changed to European Conference on Machine Learning. PKDD was first organised in 1997. Originally PKDD stood for the European Symposium on Principles of Data Mining and Knowledge Discovery from Databases. The name European Conference on Principles and Practice of Knowledge Discovery in Databases was used since 1999. The conference remains highly competitive, consistently maintaining an average acceptance rate of around 25% for the main research track. == Upcoming conferences == == List of past conferences ==

    Read more →