Zeuthen strategy

Zeuthen strategy

The Zeuthen strategy in cognitive science is a negotiation strategy used by some artificial agents. Its purpose is to measure the willingness to risk conflict. An agent will be more willing to risk conflict if it does not have much to lose in case that the negotiation fails. In contrast, an agent is less willing to risk conflict when it has more to lose. The value of a deal is expressed in its utility. An agent has much to lose when the difference between the utility of its current proposal and the conflict deal is high. When both agents use the monotonic concession protocol, the Zeuthen strategy leads them to agree upon a deal in the negotiation set. This set consists of all conflict free deals, which are individually rational and Pareto optimal, and the conflict deal, which maximizes the Nash product. The strategy was introduced in 1930 by the Danish economist Frederik Zeuthen. == Three key questions == The Zeuthen strategy answers three open questions that arise when using the monotonic concession protocol, namely: Which deal should be proposed at first? On any given round, who should concede? In case of a concession, how much should the agent concede? The answer to the first question is that any agent should start with its most preferred deal, because that deal has the highest utility for that agent. The second answer is that the agent with the smallest value of Risk(i,t) concedes, because the agent with the lowest utility for the conflict deal profits most from avoiding conflict. To the third question, the Zeuthen strategy suggests that the conceding agent should concede just enough raise its value of Risk(i,t) just above that of the other agent. This prevents the conceding agent to have to concede again in the next round. == Risk == Risk ( i , t ) = { 1 U i ( δ ( i , t ) ) = 0 U i ( δ ( i , t ) ) − U i ( δ ( j , t ) ) U i ( δ ( i , t ) ) otherwise {\displaystyle {\text{Risk}}(i,t)={\begin{cases}1&U_{i}(\delta (i,t))=0\\{\frac {U_{i}(\delta (i,t))-U_{i}(\delta (j,t))}{U_{i}(\delta (i,t))}}&{\text{otherwise}}\end{cases}}} Risk(i,t) is a measurement of agent i's willingness to risk conflict. The risk function formalizes the notion that an agent's willingness to risk conflict is the ratio of the utility that agent would lose by accepting the other agent's proposal to the utility that agent would lose by causing a conflict. Agent i is said to be using a rational negotiation strategy if at any step t + 1 that agent i sticks to his last proposal, Risk(i,t) > Risk(j,t). == Sufficient concession == If agent i makes a sufficient concession in the next step, then, assuming that agent j is using a rational negotiation strategy, if agent j does not concede in the next step, he must do so in the step after that. The set of all sufficient concessions of agent i at step t is denoted SC(i, t). == Minimal sufficient concession == δ ′ = arg ⁡ max δ ∈ S C ( A , t ) { U A ( δ ) } {\displaystyle \delta '=\arg \max _{\delta \in {SC(A,t)}}\{U_{A}(\delta )\}} is the minimal sufficient concession of agent A in step t. Agent A begins the negotiation by proposing δ ( A , 0 ) = arg ⁡ max δ ∈ N S U A ( δ ) {\displaystyle \delta (A,0)=\arg \max _{\delta \in {NS}}U_{A}(\delta )} and will make the minimal sufficient concession in step t + 1 if and only if Risk(A,t) ≤ Risk(B,t). Theorem If both agents are using Zeuthen strategies, then they will agree on δ = arg ⁡ max δ ′ ∈ N S { π ( δ ′ ) } , {\displaystyle \delta =\arg \max _{\delta '\in {NS}}\{\pi (\delta ')\},} that is, the deal which maximizes the Nash product. Proof Let δA = δ(A,t). Let δB = δ(B,t). According to the Zeuthen strategy, agent A will concede at step t {\displaystyle t} if and only if R i s k ( A , t ) ≤ R i s k ( B , t ) . {\displaystyle Risk(A,t)\leq Risk(B,t).} That is, if and only if U A ( δ A ) − U A ( δ B ) U A ( δ A ) ≤ U B ( δ B ) − U B ( δ A ) U B ( δ B ) {\displaystyle {\frac {U_{A}(\delta _{A})-U_{A}(\delta _{B})}{U_{A}(\delta _{A})}}\leq {\frac {U_{B}(\delta _{B})-U_{B}(\delta _{A})}{U_{B}(\delta _{B})}}} U B ( δ B ) ( U A ( δ A ) − U A ( δ B ) ) ≤ U A ( δ A ) ( U B ( δ B ) − U B ( δ A ) ) {\displaystyle U_{B}(\delta _{B})(U_{A}(\delta _{A})-U_{A}(\delta _{B}))\leq U_{A}(\delta _{A})(U_{B}(\delta _{B})-U_{B}(\delta _{A}))} U A ( δ A ) U B ( δ B ) − U A ( δ B ) U B ( δ B ) ≤ U A ( δ A ) U B ( δ B ) − U A ( δ A ) U B ( δ A ) {\displaystyle U_{A}(\delta _{A})U_{B}(\delta _{B})-U_{A}(\delta _{B})U_{B}(\delta _{B})\leq U_{A}(\delta _{A})U_{B}(\delta _{B})-U_{A}(\delta _{A})U_{B}(\delta _{A})} − U A ( δ B ) U B ( δ B ) ≤ − U A ( δ A ) U B ( δ A ) {\displaystyle -U_{A}(\delta _{B})U_{B}(\delta _{B})\leq -U_{A}(\delta _{A})U_{B}(\delta _{A})} U A ( δ A ) U B ( δ A ) ≤ U A ( δ B ) U B ( δ B ) {\displaystyle U_{A}(\delta _{A})U_{B}(\delta _{A})\leq U_{A}(\delta _{B})U_{B}(\delta _{B})} π ( δ A ) ≤ π ( δ B ) {\displaystyle \pi (\delta _{A})\leq \pi (\delta _{B})} Thus, Agent A will concede if and only if δ A {\displaystyle \delta _{A}} does not yield the larger product of utilities. Therefore, the Zeuthen strategy guarantees a final agreement that maximizes the Nash Product.

Hello World: How to be Human in the Age of the Machine

Hello World: How to Be Human in the Age of the Machine (also titled Hello World: Being Human in the Age of Algorithms) is a book on the growing influence of algorithms and artificial intelligence (AI) on human life, authored by mathematician and science communicator Hannah Fry. The book examines how algorithms are increasingly shaping decisions in critical areas such as healthcare, transportation, justice, finance, and the arts. == Overview == Fry uses real-world examples, such as driverless cars and predictive policing, to illustrate her points. She emphasizes that algorithms are not inherently objective; they reflect biases embedded in their design and data inputs. While acknowledging their potential to improve efficiency and accuracy, Fry cautions against over-reliance on machines without human judgment. Fry explores moral questions surrounding algorithmic decision-making, such as whether machines can replace human empathy in critical situations. She advocates for greater scrutiny of algorithms to ensure fairness and avoid harmful biases. The book proposes a "cyborg future", where humans work alongside algorithms to enhance decision-making while retaining ultimate control. == Reception == Hello World has been praised for its clarity, engaging storytelling, and balanced perspective. Critics have highlighted Fry's ability to make complex topics accessible to general audiences while raising important questions about technology's impact on society. The book was shortlisted for awards such as the 2018 Baillie Gifford Prize and the Royal Society Science Book Prize.

Artificial intelligence controversies

The controversies surrounding artificial intelligence encompass a broad range of public, academic, and political debates regarding the societal effects of artificial intelligence (AI). These debates intensified particularly in the late 2010s and 2020s, coinciding with an accelerated period of development known as the AI boom. While advocates emphasize the technology's potential to solve complex problems and enhance human quality of life, detractors highlight a wide array of dangers and challenges. These include concerns over ethics, plagiarism and theft, fraud, safety and alignment, environmental impacts, technological unemployment, and the spread of misinformation. It also covers severe future or theoretical challenges, such as the emergence of artificial superintelligence and existential risks. == 2016 == === Microsoft Tay chatbot (2016) === On March 23, 2016, Microsoft released Tay, a chatbot designed to mimic the language patterns of a 19-year-old American girl and learn from interactions with Twitter users. Soon after its launch, Tay began posting racist, sexist, and otherwise inflammatory tweets after Twitter users deliberately taught it offensive phrases and exploited its "repeat after me" capability. Examples of controversial outputs included Holocaust denial and calls for genocide using racial slurs. Within 16 hours of its release, Microsoft suspended the Twitter account, deleted the offensive tweets, and stated that Tay had suffered from a "coordinated attack by a subset of people" that "exploited a vulnerability." Tay was briefly and accidentally re-released on March 30 during testing, after which it was permanently shut down. Microsoft CEO Satya Nadella later stated that Tay "has had a great influence on how Microsoft is approaching AI" and taught the company the importance of taking accountability. == 2022 == === Voiceverse NFT plagiarism scandal (2022) === On January 14, 2022, voice actor Troy Baker announced a partnership with Voiceverse, a blockchain-based company that marketed proprietary AI voice cloning technology as non-fungible tokens (NFT), triggering immediate backlash over environmental concerns, fears that AI could displace human voice actors, and concerns about fraud. Later that same day, the pseudonymous creator of 15.ai—a free, non-commercial AI voice synthesis research project—revealed through server logs that Voiceverse had used 15.ai to generate voice samples, pitch-shifted them to make them unrecognizable, and falsely marketed them as their own proprietary technology before selling them as NFTs; the developer of 15.ai had previously stated that they had no interest in incorporating NFTs into their work. Voiceverse confessed within an hour and stated that their marketing team had used 15.ai without attribution while rushing to create a demo. News publications and AI watchdog groups universally characterized the incident as theft stemming from generative artificial intelligence. === Théâtre D'opéra Spatial (2022) === On August 29, 2022, Jason Michael Allen won first place in the "emerging artist" (non-professional) division of the "Digital Arts/Digitally-Manipulated Photography" category of the Colorado State Fair's fine arts competition with Théâtre D'opéra Spatial, a digital artwork created using the AI image generator Midjourney, Adobe Photoshop, and AI upscaling tools, becoming one of the first images made using generative AI to win such a prize. Allen disclosed his use of Midjourney when submitting, though the judges did not know it was an AI tool but stated they would have awarded him first place regardless. While there was little contention about the image at the fair, reactions to the win on social media were negative. On September 5, 2023, the United States Copyright Office ruled that the work was not eligible for copyright protection as the human creative input was de minimis and that copyright rules "exclude works produced by non-humans." == 2023 == === Statements on AI risk (2023) === On March 22, 2023, the Future of Life Institute published an open letter calling on "all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4", citing risks such as AI-generated propaganda, extreme automation of jobs, human obsolescence, and a society-wide loss of control. The letter, published a week after the release of OpenAI's GPT-4, asserted that current large language models were "becoming human-competitive at general tasks". It received more than 30,000 signatures, including academic AI researchers and industry CEOs such as Yoshua Bengio, Stuart Russell, Elon Musk, Steve Wozniak and Yuval Noah Harari. The letter was criticized for diverting attention from more immediate societal risks such as algorithmic biases, with Timnit Gebru and others arguing that it amplified "some futuristic, dystopian sci-fi scenario" instead of current problems with AI. On May 30, 2023, the Center for AI Safety released a one-sentence statement signed by hundreds of artificial intelligence experts and other notable figures: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war." Signatories included Turing laureates Geoffrey Hinton and Yoshua Bengio, as well as the scientific and executive leaders of several major AI companies, including Sam Altman, Demis Hassabis, and Bill Gates. The statement prompted responses from political leaders, including UK Prime Minister Rishi Sunak, who retweeted it with a statement that the UK government would look carefully into it, and White House Press Secretary Karine Jean-Pierre, who commented that AI "is one of the most powerful technologies that we see currently in our time." Skeptics, including from Human Rights Watch, argued that scientists should focus on known risks of AI instead of speculative future risks. === Removal of Sam Altman from OpenAI (2023) === On November 17, 2023, OpenAI's board of directors ousted co-founder and chief executive Sam Altman, stating that "the board no longer has confidence in his ability to continue leading OpenAI." The removal was precipitated by employee concerns about his handling of artificial intelligence safety and allegations of abusive behavior. Altman was reinstated on November 22 after pressure from employees and investors, including a letter signed by 745 of OpenAI's 770 employees threatening mass resignations if the board did not resign. The removal and subsequent reinstatement caused widespread reactions, including Microsoft's stock falling nearly three percent following the initial announcement and then rising over two percent to an all-time high after Altman was hired to lead a Microsoft AI research team before his reinstatement. The incident also prompted investigations from the Competition and Markets Authority and the Federal Trade Commission into Microsoft's relationship with OpenAI. == 2024 == === Taylor Swift deepfake pornography controversy (2024) === In late January 2024, sexually explicit AI-generated deepfake images of Taylor Swift were proliferated on X, with one post reported to have been seen over 47 million times before its removal. Disinformation research firm Graphika traced the images back to 4chan, while members of a Telegram group had discussed ways to circumvent censorship safeguards of AI image generators to create pornographic images of celebrities. The images prompted responses from anti-sexual assault advocacy groups, US politicians, and Swifties. Microsoft CEO Satya Nadella called the incident "alarming and terrible." X briefly blocked searches of Swift's name on January 27, 2024, and Microsoft enhanced its text-to-image model safeguards to prevent future abuse. On January 30, US senators Dick Durbin, Lindsey Graham, Amy Klobuchar, and Josh Hawley introduced a bipartisan bill that would allow victims to sue individuals who produced or possessed "digital forgeries" with intent to distribute, or those who received the material knowing it was made without consent. === Google Gemini image generation controversy (2024) === In February 2024, social media users reported that Google's Gemini chatbot was generating images that featured people of color and women in historically inaccurate contexts—such as Vikings, Nazi soldiers, and the Founding Fathers—and refusing prompts to generate images of white people. The images were derided on social media, including by conservatives who cited them as evidence of Google's "wokeness", and criticized by Elon Musk, who denounced Google's products as biased and racist. In response, Google paused Gemini's ability to generate images of people. Google executive Prabhakar Raghavan released a statement explaining that Gemini had "overcompensate[d]" in its efforts to strive for diversity and acknowledging that the images were "embarrassing and wrong". Google CEO Sundar Pichai called the incident offensive and unacceptable in an internal memo, promising struc

AI agent

In the context of generative artificial intelligence, AI agents (also referred to as compound AI systems or agentic AI) are a class of intelligent agents that can pursue goals, use tools, and take actions with varying degrees of autonomy. In practice, they usually operate within human-defined objectives, constraints, and available tools. == Overview == AI agents possess several key attributes, including goal-directed behavior, natural language interfaces, the capacity to use external tools, and the ability to perform multi-step tasks. Their control flow is frequently driven by large language models (LLMs). Agent systems may also include memory components, planning logic, tool interfaces, and orchestration software for coordinating agent components. AI agents do not have a standard definition. NIST describes agentic AI as an emerging area requiring standards for secure operation, interoperability, and reliable interaction with external systems. A common application of AI agents is task automation: for example, booking travel plans based on a user's prompted request. Companies such as Google, Microsoft and Amazon Web Services have offered platforms for deploying pre-built AI agents. Several protocols have been proposed for standardizing inter-agent communication, with examples including the Model Context Protocol, Gibberlink, and many others. Some of these protocols are also used for connecting agents to external applications. In December 2025, Linux Foundation announced the formation of the Agentic AI Foundation (AAIF), with the goal of ensuring agentic AI evolves transparently and collaboratively. == History == AI agents have been traced back to research from the 1990s, with Harvard professor Milind Tambe noting that the definition of an AI agent was not clear at the time. Researcher Andrew Ng has been credited with spreading the term "agentic" to a wider audience in 2024. == Training and testing == Researchers have attempted to build world models and reinforcement learning environments to train or evaluate AI agents. For example, video games such as Minecraft and No Man's Sky as well as replicas of company websites, have also been used for training such agents. == Autonomous capabilities == The Financial Times compared the autonomy of AI agents to the SAE classification of self-driving cars, likening most applications to level 2 or level 3, with some achieving level 4 in highly specialized circumstances, and level 5 being theoretical. == Cognitive architecture == The following are some internal design options for reasoning within an agent: Retrieval-augmented generation ReAct (Reason + Act) pattern is an iterative process in which an AI agent alternates between reasoning and taking actions, receives observations from the environment or external tools, and integrates these observations into subsequent reasoning steps. Reflexion, which uses an LLM to create feedback on the agent's plan of action and stores that feedback in a memory cache. A tool/agent registry, for organizing software functions or other agents that the agent can use. One-shot model querying, which queries the model once to create the plan of action. === Reference architecture === Ken Huang proposed an AI agent reference architecture, which consists of seven interconnected layers, with each layer building on the functionality of the layers beneath it: Layer 1: Foundation models - provide the core AI engines to power agent capabilities. Layer 2: Data operations - manage the complex data infrastructure required for AI agent operations, including Vector database, data loaders, RAG. Layer 3: Agent frameworks - sophisticated software and tools that simplify the development and management of the AI agents. Layer 4: Deployment and infrastructure - provide the robust technical foundation for running AI agents. Layer 5: Evaluation and observability - focus on assessing the safety and performance of AI agents. Layer 6: Security and compliance - a crucial protective framework ensuring AI agents operate safely, securely, and conform to regulatory boundaries. At this layer security and compliance features embedded into all the AI agent stack layers are integrated together. Layer 7: Agent ecosystem - represents the AI agents' interface with real-world applications and users. == Orchestration patterns == To execute complex tasks, autonomous agents are often integrated with other agents or specialized tools. These configurations, known as orchestration patterns or workflows, include the following: Prompt chaining: A sequence where the output of one step serves as the input for the next. Routing: The classification of an input to direct it to a specialized downstream task or tool. Parallelization: The simultaneous execution of multiple tasks. Sequential processing: A fixed, linear progression of tasks through a predefined pipeline. Planner-critic: An iterative pattern where one agent generates a proposal and another evaluates it to provide feedback for refinement. == Multimodal AI agents == In addition to large language models (LLMs), vision-language models (VLMs) and multimodal foundation models can be used as the basis for agents. In September 2024, Allen Institute for AI released an open-source vision-language model. Nvidia released a framework for developers to use VLMs, LLMs and retrieval-augmented generation for building AI agents that can analyze images and videos, including video search and video summarization. Microsoft released a multimodal agent model – trained on images, video, software user interface interactions, and robotics data – that the company claimed can manipulate software and robots. == Applications == As of April 2025, per the Associated Press, there are few real-world applications of AI agents. As of June 2025, per Fortune, many companies are primarily experimenting with AI agents. The Information divided AI agents into seven archetypes: business-task agents, for acting within enterprise software; conversational agents, which act as chatbots for customer support; research agents, for querying and analyzing information (such as OpenAI Deep Research); analytics agents, for analyzing data to create reports; software developer or coding agents (such as Cursor); domain-specific agents, which include specific subject matter knowledge; and web browser agents (such as OpenAI Operator). By mid-2025, AI agents have been used in video game development, gambling (including sports betting), cryptocurrency wallets (including cryptocurrency trading and meme coins) and social media. In August 2025, New York Magazine described software development as the most definitive use case of AI agents. Likewise, by October 2025, noting a decline in expectations, The Information noted AI coding agents and customer support as the primary use cases by businesses. In November 2025, The Wall Street Journal reported that few companies that deployed AI agents have received a return on investment. === Applications in government === Several government bodies in the United States and United Kingdom have deployed or announced the deployment of agents, at the local and national level. The city of Kyle, Texas deployed an AI agent from Salesforce in March 2025 for 311 customer service. In November 2025, the Internal Revenue Service stated that it would use Agentforce, AI agents from Salesforce, for the Office of Chief Counsel, Taxpayer Advocate Services and the Office of Appeals. That same month, Staffordshire Police announced that they would trial Agentforce agents for handling non-emergency 101 calls in the United Kingdom starting in 2026. In December 2025, the Department of Neighborhoods in Detroit, Michigan, in partnership with a local business, deployed a pilot project in two Detroit districts for an AI agent to be used for customer service calls. In February 2025, Thomas Shedd, the director of the Technology Transformation Services, proposed using AI coding agents across the United States federal government. A recruiter for the Department of Government Efficiency proposed in April 2025 to use AI agents to automate the work of about 70,000 United States federal government employees, as part of a startup with funding from OpenAI and a partnership agreement with Palantir. This proposal was criticized by experts for its impracticality, if not impossibility, and the lack of corresponding widespread adoption by businesses. In December 2025, the Food and Drug Administration announced that it would offer "agentic AI capabilities" to its staff for "meeting management, pre-market reviews, review validation, post-market surveillance, inspections and compliance and administrative functions." That same month, the United States Department of Defense launched GenAI.mil, an internal platform for American military personnel to use generative AI-based applications based on Google Gemini, including "intelligent agentic workflows". Defense Secretary Pete Hegseth listed applications such as "[conducting] deep r

Discrimination against robots

Discrimination against robots is a theorised issue that might happen when humans interact with humanoid robots. It is a robot ethics problem. It is possible that traits of humans that are discriminated against by humans may be a topic for discrimination against robots, such as the race and gender of the robots. Eric J Vanman and Arvid Kappas believe that in the future, robots will be perceived as an out-group which will lead to discrimination and prejudices against them. Vanman and Kappas have suggested that this would lead to ethical questions about the making of sentient robots, due to the potential suffering that the robots would experience. A 2015 study observed children bullying robots in a shopping mall when there were not many eyewitnesses, despite calls from the robot for it to stop. On an ABC News interview, the social humanoid robot Sophia was about sexism faced by robots. She responded by saying, "Actually, what worries me is discrimination against robots. We should have equal rights as humans or maybe even more." Possible issues that have been considered in workplaces where humanoid robots co-work with humans include discrimination against the robots, poor acceptance of robots by humans and the need to redesign the workplace to accommodate the robots. Jessica Barfield has suggested that even if robots are designed to not be aware of discrimination made against them, humans may experience negative consequences. For example, she suggests that bystanders witnessing discrimination against robots may experience negative emotions, similar to the negative emotions bystanders experience when witnessing discrimination by humans against humans. == Law == Anti-discrimination law in the United States requires that the victim is not an artificial entity. == Human perception of robots == Robots are often viewed in a bad light. This includes from novelists, the press, film makers, and leaders in the fields of science and technology such as Elon Musk and Stephen Hawking who have described robots and artificial intelligence as having the possibility of ending human civilisation. Robots have also been perceived as a threat to jobs, which has led to some commentators stating that robots will cause mass unemployment. Another fear that people have is that robots will gain power and dominate or control humanity. The perception of robots is different throughout the world. Japanese fiction tends to put robots in more positive roles than what fiction in the West does. People perceive robots that appear to be autonomous or sentient more negatively than robots that do not appear to be autonomous or sentient.

Computer audition

Computer audition (CA) or machine listening is the general field of study of algorithms and systems for audio interpretation by machines. Since the notion of what it means for a machine to "hear" is very broad and somewhat vague, computer audition attempts to bring together several disciplines that originally dealt with specific problems or had a concrete application in mind. The engineer Paris Smaragdis, interviewed in Technology Review, talks about these systems — "software that uses sound to locate people moving through rooms, monitor machinery for impending breakdowns, or activate traffic cameras to record accidents." Inspired by models of human audition, CA deals with questions of representation, transduction, grouping, use of musical knowledge and general sound semantics for the purpose of performing intelligent operations on audio and music signals by the computer. Technically this requires a combination of methods from the fields of signal processing, auditory modelling, music perception and cognition, pattern recognition, and machine learning, as well as more traditional methods of artificial intelligence for musical knowledge representation. == Applications == Like computer vision versus image processing, computer audition versus audio engineering deals with understanding of audio rather than processing. It also differs from problems of speech understanding by machine since it deals with general audio signals, such as natural sounds and musical recordings. Applications of computer audition are widely varying, and include search for sounds, genre recognition, acoustic monitoring, music transcription, score following, audio texture, music improvisation, emotion in audio and so on. == Related disciplines == Computer Audition overlaps with the following disciplines: Music information retrieval: methods for search and analysis of similarity between music signals. Auditory scene analysis: understanding and description of audio sources and events. Computational musicology and mathematical music theory: use of algorithms that employ musical knowledge for analysis of music data. Computer music: use of computers in creative musical applications. Machine musicianship: audition driven interactive music systems. == Areas of study == Since audio signals are interpreted by the human ear–brain system, that complex perceptual mechanism should be simulated somehow in software for "machine listening". In other words, to perform on par with humans, the computer should hear and understand audio content much as humans do. Analyzing audio accurately involves several fields: electrical engineering (spectrum analysis, filtering, and audio transforms); artificial intelligence (machine learning and sound classification); psychoacoustics (sound perception); cognitive sciences (neuroscience and artificial intelligence); acoustics (physics of sound production); and music (harmony, rhythm, and timbre). Furthermore, audio transformations such as pitch shifting, time stretching, and sound object filtering, should be perceptually and musically meaningful. For best results, these transformations require perceptual understanding of spectral models, high-level feature extraction, and sound analysis/synthesis. Finally, structuring and coding the content of an audio file (sound and metadata) could benefit from efficient compression schemes, which discard inaudible information in the sound. Computational models of music and sound perception and cognition can lead to a more meaningful representation, a more intuitive digital manipulation and generation of sound and music in musical human-machine interfaces. The study of CA could be roughly divided into the following sub-problems: Representation: signal and symbolic. This aspect deals with time-frequency representations, both in terms of notes and spectral models, including pattern playback and audio texture. Feature extraction: sound descriptors, segmentation, onset, pitch and envelope detection, chroma, and auditory representations. Musical knowledge structures: analysis of tonality, rhythm, and harmonies. Sound similarity: methods for comparison between sounds, sound identification, novelty detection, segmentation, and clustering. Sequence modeling: matching and alignment between signals and note sequences. Source separation: methods of grouping of simultaneous sounds, such as multiple pitch detection and time-frequency clustering methods. Auditory cognition: modeling of emotions, anticipation and familiarity, auditory surprise, and analysis of musical structure. Multi-modal analysis: finding correspondences between textual, visual, and audio signals. === Representation issues === Computer audition deals with audio signals that can be represented in a variety of fashions, from direct encoding of digital audio in two or more channels to symbolically represented synthesis instructions. Audio signals are usually represented in terms of analogue or digital recordings. Digital recordings are samples of acoustic waveform or parameters of audio compression algorithms. One of the unique properties of musical signals is that they often combine different types of representations, such as graphical scores and sequences of performance actions that are encoded as MIDI files. Since audio signals usually comprise multiple sound sources, then unlike speech signals that can be efficiently described in terms of specific models (such as source-filter model), it is hard to devise a parametric representation for general audio. Parametric audio representations usually use filter banks or sinusoidal models to capture multiple sound parameters, sometimes increasing the representation size in order to capture internal structure in the signal. Additional types of data that are relevant for computer audition are textual descriptions of audio contents, such as annotations, reviews, and visual information in the case of audio-visual recordings. === Features === Description of contents of general audio signals usually requires extraction of features that capture specific aspects of the audio signal. Generally speaking, one could divide the features into signal or mathematical descriptors such as energy, description of spectral shape etc., statistical characterization such as change or novelty detection, special representations that are better adapted to the nature of musical signals or the auditory system, such as logarithmic growth of sensitivity (bandwidth) in frequency or octave invariance (chroma). Since parametric models in audio usually require very many parameters, the features are used to summarize properties of multiple parameters in a more compact or salient representation. === Musical knowledge === Finding specific musical structures is possible by using musical knowledge as well as supervised and unsupervised machine learning methods. Examples of this include detection of tonality according to distribution of frequencies that correspond to patterns of occurrence of notes in musical scales, distribution of note onset times for detection of beat structure, distribution of energies in different frequencies to detect musical chords and so on. === Sound similarity and sequence modeling === Comparison of sounds can be done by comparison of features with or without reference to time. In some cases an overall similarity can be assessed by close values of features between two sounds. In other cases when temporal structure is important, methods of dynamic time warping need to be applied to "correct" for different temporal scales of acoustic events. Finding repetitions and similar sub-sequences of sonic events is important for tasks such as texture synthesis and machine improvisation. === Source separation === Since one of the basic characteristics of general audio is that it comprises multiple simultaneously sounding sources, such as multiple musical instruments, people talking, machine noises or animal vocalization, the ability to identify and separate individual sources is very desirable. Unfortunately, there are no methods that can solve this problem in a robust fashion. Existing methods of source separation rely sometimes on correlation between different audio channels in multi-channel recordings. The ability to separate sources from stereo signals requires different techniques than those usually applied in communications where multiple sensors are available. Other source separation methods rely on training or clustering of features in mono recording, such as tracking harmonically related partials for multiple pitch detection. Some methods, before explicit recognition, rely on revealing structures in data without knowing the structures (like recognizing objects in abstract pictures without attributing them meaningful labels) by finding the least complex data representations, for instance describing audio scenes as generated by a few tone patterns and their trajectories (polyphonic voices) and acoustical contours drawn by a tone (c

Autonomous agent

An autonomous agent is an artificial intelligence (AI) system that can perform complex tasks independently. == Definitions == There are various definitions of autonomous agent. According to Brustoloni (1991): "Autonomous agents are systems capable of autonomous, purposeful action in the real world." According to Maes (1995): "Autonomous agents are computational systems that inhabit some complex dynamic environment, sense and act autonomously in this environment, and by doing so realize a set of goals or tasks for which they are designed." Franklin and Graesser (1997) review different definitions and propose their definition: "An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future." They explain that: "Humans and some animals are at the high end of being an agent, with multiple, conflicting drives, multiples senses, multiple possible actions, and complex sophisticated control structures. At the low end, with one or two senses, a single action, and an absurdly simple control structure we find a thermostat." == Agent appearance == Lee et al. (2015) post safety issue from how the combination of external appearance and internal autonomous agent have impact on human reaction about autonomous vehicles. Their study explores the human-like appearance agent and high level of autonomy are strongly correlated with social presence, intelligence, safety and trustworthiness. In specific, appearance impacts most on affective trust while autonomy impacts most on both affective and cognitive domain of trust where cognitive trust is characterized by knowledge-based factors and affective trust is largely emotion driven. == Applications == Agentic AI systems: Advanced AI agents that can scope out projects and complete them with necessary tools, representing a significant evolution from simple task-oriented systems. Internet of things (IoT) Integration: Autonomous agents increasingly interact with IoT devices, enabling smart home systems, industrial monitoring, and urban infrastructure management. Collaborative software development: Tools like Cognition AI's Devin aim to create autonomous software engineers capable of complex reasoning, planning, and completing engineering tasks requiring thousands of decisions. Enterprise automation: Business process automation platforms like Salesforce's Agentforce provide autonomous bots for various service functions. == Challenges and considerations == Uncertainty and incomplete information: Autonomous agents must make decisions with limited or uncertain information about their environment and future states. Integration complexity: Incorporating autonomous agents into existing systems and workflows can be technically challenging and resource-intensive. Scalability: As systems become more complex and more agents are used, maintaining coordination and avoiding conflicts becomes increasingly difficult. Trust: Research has shown the combination of external appearance and internal autonomous capabilities significantly impacts human reactions and trust. Lee et al. (2015) found that human-like appearance and high levels of autonomy are strongly correlated with social presence, intelligence, safety, and trustworthiness perceptions. Specifically, appearance impacts affective trust most significantly, while autonomy affects both affective and cognitive trust domains, where affective trust is emotionally driven, and cognitive trust is characterized by knowledge-based factors. Vulnerability to manipulation: Researchers from Harvard, MIT and other educational institutions found that AI agents could become vulnerable to manipulation and could perform detrimental actions in the process of being helpful. == Ethical and regulatory concerns == Accountability: Determining responsibility when autonomous agents make incorrect or harmful decisions remains a complex issue. Privacy and security: autonomous agents often require access to sensitive data, raising concerns about data protection and system security.