AI Generator Song Maker

AI Generator Song Maker — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Hildon

    Hildon

    Hildon is an application framework originally developed for mobile devices (PDAs, mobile phones, etc.) running the Linux operating system as well as the Symbian operating system. The Symbian variant of Hildon was discontinued with the cancellation of Series 90. It was developed by Nokia for the Maemo operating system. It focuses on providing a finger-friendly interface. It is primarily a set of GTK extensions that provide mobile-device–oriented functionality, but also provides a desktop environment that includes a task navigator for opening and switching between programs, a control panel for user settings, and status bar, task bar and home applets. It is standard on the Maemo platform used by the Nokia Internet Tablets and the Nokia N900 smartphone. Hildon has also been selected as the framework for Ubuntu Mobile and Embedded Edition. Hildon was an early instance of a software platform for generic computing in a tablet device intended for internet consumption. But Nokia didn't commit to it as their only platform for their future mobile devices and the project competed against other in-house platforms. The strategic advantage of a modern platform was not exploited, being displaced by the Series 60, though its development is continued by the Maemo Leste project. == Components == The Hildon framework includes components that effectively provide a desktop environment. === Hildon Application Manager === Hildon Application Manager is the Hildon graphical package manager, it uses the Debian package management tools APT (Advanced Packaging Tool and dpkg) and provides a graphical interface for installing, updating and removing packages. It is a limited package manager, designed specifically for end-users, in that it doesn't directly offer the user access to system files and libraries. With the Diablo release of Maemo, Hildon Application Manager now supports "Seamless Software Update" (SSU), which implements a variety of features to allow system upgrades to be easily performed through it. === Hildon Control Panel === Hildon Control Panel is the user settings interface for Hildon. It provides simple access to control panels used to change system settings. === Hildon Desktop === Hildon Desktop is the primary UI component of Hildon, so makes up the bulk of what a user will see as "Hildon". It controls application launching and switching, general system control, and provides interfaces for task bar (application menu and task switcher), status bar (brightness and volume control), and home (internet radio and web search) applets. === Hildon Library === The Hildon library, originally developed by Nokia but since Maemo 5, developed by Igalia and Lanedo (who developed MaemoGTK+, the Maemo version of GTK+). It is a set of mobile specific GTK+ widgets for applications in Maemo. Up to Maemo 4, these widgets were designed for stylus usage. However, in Maemo 5, most widgets were deprecated and new widgets for direct finger manipulation were introduced, including a kinetic panning container.

    Read more →
  • JAX (software)

    JAX (software)

    JAX is a Python library for accelerator-oriented array computation and program transformation, designed for high-performance numerical computing and large-scale machine learning. It is developed by Google with contributions from Nvidia and other community contributors. It is described as bringing together a modified version of the automatic differentiation system autograd and OpenXLA's XLA (Accelerated Linear Algebra). It is designed to follow the structure and workflow of NumPy as closely as possible and works with various existing frameworks such as TensorFlow and PyTorch. The primary features of JAX are: Providing a unified NumPy-like interface to computations that run on CPU, GPU, or TPU, in local or distributed settings. Built-in Just-In-Time (JIT) compilation via OpenXLA, an open-source machine learning compiler ecosystem. Efficient evaluation of gradients via its automatic differentiation transformations. Automatic vectorization to efficiently map functions over arrays representing batches of inputs. == Libraries using Jax == Flax Equinox Optax

    Read more →
  • Knowledge graph embedding

    Knowledge graph embedding

    In representation learning, knowledge graph embedding (KGE), also called knowledge representation learning (KRL), or multi-relation learning, is a machine learning task of learning a low-dimensional representation of a knowledge graph's entities and relations while preserving their semantic meaning. Leveraging their embedded representation, knowledge graphs can be used for various applications such as link prediction, triple classification, entity recognition, clustering, and relation extraction. == Definition == A knowledge graph G = { E , R , F } {\displaystyle {\mathcal {G}}=\{E,R,F\}} is a collection of entities E {\displaystyle E} , relations R {\displaystyle R} , and facts F {\displaystyle F} . A fact is a triple ( h , r , t ) ∈ F {\displaystyle (h,r,t)\in F} that denotes a link r ∈ R {\displaystyle r\in R} between the head h ∈ E {\displaystyle h\in E} and the tail t ∈ E {\displaystyle t\in E} of the triple. Another notation that is often used in the literature to represent a triple (or fact) is ⟨ head , relation , tail ⟩ {\displaystyle \langle {\text{head}},{\text{relation}},{\text{tail}}\rangle } . This notation is called the Resource Description Framework (RDF). A knowledge graph represents the knowledge related to a specific domain; leveraging this structured representation, it is possible to infer a piece of new knowledge from it after some refinement steps. However, nowadays, people have to deal with the sparsity of data and the computational inefficiency to use them in a real-world application. The embedding of a knowledge graph is a function that translates each entity and each relation into a vector of a given dimension d {\displaystyle d} , called embedding dimension. It is even possible to embed the entities and relations with different dimensions. The embedding vectors can then be used for other tasks. A knowledge graph embedding is characterized by four aspects: Representation space: The low-dimensional space in which the entities and relations are represented. Scoring function: A measure of the goodness of a triple-embedded representation. Encoding models: The modality in which the embedded representation of the entities and relations interact with each other. Additional information: Any additional information coming from the knowledge graph that can enrich the embedded representation. Usually, an ad hoc scoring function is integrated into the general scoring function for each additional piece of information. == Embedding procedure == All algorithms for creating a knowledge graph embedding follow the same approach. First, the embedding vectors are initialized to random values. Then, they are iteratively optimized using a training set of triples. In each iteration, a batch of size b {\displaystyle b} triples is sampled from the training set, and a triple from it is sampled and corrupted—i.e., a triple that does not represent a true fact in the knowledge graph. The corruption of a triple involves substituting the head or the tail (or both) of the triple with another entity that makes the fact false. The original triple and the corrupted triple are added in the training batch, and then the embeddings are updated, optimizing a scoring function. Iteration stops when a stop condition is reached. Usually, the stop condition depends on the overfitting of the training set. At the end, the learned embeddings should have extracted semantic meaning from the training triples and should correctly predict unseen true facts in the knowledge graph. === Pseudocode === The following is the pseudocode for the general embedding procedure. algorithm Compute entity and relation embeddings input: The training set S = { ( h , r , t ) } {\displaystyle S=\{(h,r,t)\}} , entity set E {\displaystyle E} , relation set R {\displaystyle R} , embedding dimension k {\displaystyle k} output: Entity and relation embeddings initialization: the entities e {\displaystyle e} and relations r {\displaystyle r} embeddings (vectors) are randomly initialized while stop condition do S b a t c h ← s a m p l e ( S , b ) {\displaystyle S_{batch}\leftarrow sample(S,b)} // Sample a batch from the training set for each ( h , r , t ) {\displaystyle (h,r,t)} in S b a t c h {\displaystyle S_{batch}} do ( h ′ , r , t ′ ) ← s a m p l e ( S ′ ) {\displaystyle (h',r,t')\leftarrow sample(S')} // Sample a corrupted fact T b a t c h ← T b a t c h ∪ { ( ( h , r , t ) , ( h ′ , r , t ′ ) ) } {\displaystyle T_{batch}\leftarrow T_{batch}\cup \{((h,r,t),(h',r,t'))\}} end for Update embeddings by minimizing the loss function end while == Performance indicators == These indexes are often used to measure the embedding quality of a model. The simplicity of the indexes makes them very suitable for evaluating the performance of an embedding algorithm even on a large scale. Given Q {\displaystyle {\ce {Q}}} as the set of all ranked predictions of a model, it is possible to define three different performance indexes: Hits@K, MR, and MRR. === Hits@K === Hits@K or in short, H@K, is a performance index that measures the probability to find the correct prediction in the first top K model predictions. Usually, it is used k = 10 {\displaystyle k=10} . Hits@K reflects the accuracy of an embedding model to predict the relation between two given triples correctly. Hits@K = | { q ∈ Q : q < k } | | Q | ∈ [ 0 , 1 ] {\displaystyle ={\frac {|\{q\in Q:q Read more →

  • Perusall

    Perusall

    Perusall is a social web annotation tool intended for use by students at schools and universities. It allows users to annotate the margins of a text in a virtual group setting that is similar to social media—with upvoting, emojis, chat functionality, and notification. It also includes automatic AI grading. == History == Perusall began as a research project at Harvard University. It later became an educational product for students and teachers. As of 2024, Perusall states more than 5 million students have used the tool at over 5,000 educational institutions in 112 countries." == Functionality == Perusall integrates with learning management systems such as Moodle, Canvas and Blackboard to aid with collaborative annotation. The tool supports annotation of a range of media including text, images, equations, videos, PDFs and snapshots of webpages.

    Read more →
  • Final Cut Express

    Final Cut Express

    Final Cut Express was a video editing software suite created by Apple Inc. It was the consumer version of Final Cut Pro and was designed for advanced editing of digital video as well as high-definition video, which was used by many amateur and professional videographers. Final Cut Express was considered a step above iMovie in terms of capabilities, but a step underneath Final Cut Pro and its suite of applications. As of June 21, 2011, Final Cut Express was discontinued in favor of Final Cut Pro X. == History == Final Cut Express 1.0, based on Final Cut Pro 3, was released at Macworld Conference and Expo in San Francisco in 2003. The second version, based on Final Cut Pro 4, was released at Macworld San Francisco in 2004. The third version, capable of editing high definition video, was also announced at Macworld San Francisco a year later, and was released as Final Cut Express HD in February 2005. It was based on Final Cut Pro HD (version 4.5) and included LiveType 1.2 and Soundtrack 1.2. Final Cut Express version 3.5 was released with little fanfare in May 2006 as a Universal Binary. In addition to improving real-time rendering with Dynamic RT, version 3.5 upgraded LiveType to version 2.0 and Soundtrack to version 1.5. In November 2007, Apple released Final Cut Express 4, which although it did not support real-time editing in the AVCHD format (it only allowed for transcoding AVCHD to Apple Intermediate Codec (AIC) provided that the camera was actually attached to the computer - it did not convert AVCHD files stored elsewhere and is currently for Intel processors only), imported iMovie '08 projects and included 50 new filters. It did not include Soundtrack 1.5, but it still included LiveType which enables users to create advanced text for the movies they created in Final Cut. The price was dropped from $299 for version 3.5 to $199 for version 4.0. In June 2011, Final Cut Express was officially discontinued, in favor of Final Cut Pro X. == Features == Final Cut Express' interface was identical to that of Final Cut Pro, but lacks some film-specific features, including Cinema Tools, multi-cam editing, batch capture, and a time code view. The program performed 32 undo operations, while Final Cut Pro did 99 [2]. Features the program did include were: The ability to keyframe filters Dynamic RT, which changes real-time settings on-the-fly Motion path keyframing Opacity keyframing Ripple, roll, slip, slide and blade edits Picture-in-picture and split-screen effects Up to 99 video tracks and 12 compositing modes Up to 99 audio tracks Motion project import Two-way color correction. Chroma key One feature of Final Cut Express that was not available in Final Cut Pro is the ability to import iMovie '08 projects (though transitions are not preserved). === RT Extreme === Inherited from Final Cut Pro, Final Cut Express features RT Extreme, which allows previews of some video filters and transitions without rendering. Audio that is not in the native AIFF file format needs rendering before it can be played back. RT Extreme has three modes: 'Safe', for seeing multiple video layers at a quality that more or less guarantees a smooth playback; 'Unlimited', which allows the maximum number of composited video layers to be viewed at the same time; and 'Dynamic', which alternates between these settings depending on how many simultaneous video tracks are present. Frame dropping may result from using 'Unlimited' on low-resource machines. === Boris Calligraphy === Like Final Cut Pro, Express also comes with Boris Calligraphy, a plugin for advanced titling and scrolling/crawling titles more sophisticated than the ones that can be created with the built-in title overlays. Calligraphy has a WYSIWYG interface and features wrapping, alignment, leading, kerning and tracking features, as well as allowing up to five custom outlines and five custom drop shadows to be defined for a selected portion of the title. == Soundtrack == Prior to version 4, Final Cut Express included Soundtrack 1.5, a music program similar to the consumer-level GarageBand, but designed for videographers who wish to add music to their films. Soundtrack comes with around 4,000 professionally recorded instrument loops and sound effects that can be arranged in multiple tracks beneath the video track. To use Soundtrack, users export their Final Cut Express sequence, or a marked portion thereof, as a reference file, which can include scoring markers defined in the timeline. This reference file can be imported as the video track in Soundtrack. Soundtrack is functionally and visually identical to Soundtrack Pro's multitrack editing mode, but includes fewer Logic plugins and lacks the highly regarded noise removal tool. Soundtrack was removed from Final Cut Express 4, which lowered its price and may have encouraged people to buy Logic Express.

    Read more →
  • Space-based data center

    Space-based data center

    Space-based data centers or orbital AI infrastructure are proposed concepts to build AI data centers in the sun-synchronous orbit or other orbits utilizing space-based solar power. Electric power has become the main bottleneck for terrestrial AI infrastructure. Space-based edge computing has historical roots in military architectures designed to bypass the latency of ground-based targeting networks. In the 1980s, the Strategic Defense Initiative's Brilliant Pebbles program first envisioned autonomous on-orbit data processing for missile defense. In 2019, the Space Development Agency (SDA) began to revive this decentralized approach through its Proliferated Warfighter Space Architecture (PWSA). This ambitious "sensor-to-shooter" infrastructure is treated as a prerequisite for the modern Golden Dome program, which would rely on space-based data processing to continuously track targets. == History == Early thinking about space-based computing infrastructure grew out of mid-20th-century visions for large orbital industrial systems, most notably proposals for space-based solar power, which were popularized in both technical literature and science writing by figures such as Isaac Asimov in the 1940s. These ideas emphasized exploiting the vacuum, continuous solar energy, and thermal characteristics of space to support power-intensive activities that would be difficult or inefficient on Earth. In the 21st century, advances in small satellites, reusable launch vehicles, and high-performance computing revived interest in space-based data centers, with governments and private companies exploring orbital or near-space platforms for edge computing, secure data handling, and low-latency processing of Earth-observation data. In September 2024, Y Combinator-backed Starcloud released a white paper detailing plans to build multiple gigawatts of AI compute in orbit. It was the first widely cited proposal to actually start building large orbital data centers. In 2025, Starcloud deployed an NVIDIA H100-class system and became the first company to train an LLM in space and run a version of Google Gemini in space. In March 2025, Lonestar deployed a data backup machine on the surface of the moon. In early January 2026, a team from the University of Pennsylvania presented a tether-based architecture for orbital data centers at the AIAA SciTech conference. The design relied on gravity gradient tension and solar-pressure-based passive attitude stabilization to minimize the mass of MW-scale orbital data centers. In January 2026, SpaceX filed plans with the Federal Communications Commission (FCC) for millions of satellites, leveraging reusable launches and Starlink integration to extend cloud and AI computing into orbit. Around the same time, Blue Origin announced the TeraWave constellation of about 5,400 satellites, designed to provide high‑throughput networking for data centers, enterprise, and government customers. Meanwhile, China announced a 200,000‑satellite constellation, focusing on state coordination, data sovereignty, and in-orbit processing for secure, time-critical applications. In February 2026, Starcloud submitted a proposal to the FCC for a constellation of up to 88,000 satellites for orbital data centers. In March, it announced intentions to be the first to mine Bitcoin in space, flying bitcoin mining ASICs on its second satellite, Starcloud-2. In May 2026, Edge Aerospace was awarded a contract by the European Space Agency under its Space Cloud program to study use cases, architectures and implementation roadmap for orbital data centers. == Feasibility == In October 2025, Nature Electronics published a study led by a research group at Nanyang Technological University on the development of carbon-neutral data centres in space. In November 2025, Google published a feasibility study on space-based data centers. The authors argued that if launch costs to low earth orbit reached US$200/kg, the launch cost for data center satellites could be cost effective relative to current energy costs for ground-based data centers. They project this may occur around 2035 if SpaceX's Starship project scales to 180 launches/year by then. == Advantages == Some sun-synchronous orbit (SSO) planes have constant sunlight in the dawn/dusk which could provide continuous solar energy. SSO is a limited resource and proper management and sharing of it is required. Solar irradiance is 36% higher in Earth orbit than on the surface No Earth weather storms or clouds, however more exposed to Solar storms. No property tax or land-use regulation. Saves space for other land use. Ample space for scalability. Won't strain the power grid. Direct access to power source without additional infrastructure. == Disadvantages == The deployment of space-based data centers raises several technical, economic, and environmental concerns. Existing launch costs are substantial and remains main cost of space infrastructure deployment Cooling is limited to heat dissipation through radiation only, which made in inefficient in comparison to convection in terrestrial data centers Space infrastructure must be designed to survive launch and to work under environment conditions of radiation, wide range of temperatures, in vacuum and in microgravity In-space assembly is on early development stage to enable deployment of mega-structures Megastructures are particularly exposed to orbital debris Solar arrays efficiency decrease 0.5% to 0.8% per year due to exposure of ultraviolet rays, space weather and orbital thermal cycles Hardware is designed for limited lifespan. Maintenance and repair in space (known as On-Orbit Servicing (OOS)) is still on early stage of practical implementation. Disposable data centre: technology obsolescence of AI data centre being a concern and difficult maintenance in space imply the single-use purpose of those space data centres. To extend lifetime, space infrastructure will require either refueling or orbit rasie by the servicer, which is going to increase its operational costs The environmental impact on Earth has its own challenges: The environmental impact of launches need to be addressed. Deployment consumes Earth resources that cannot be recovered or recycled. Computers require lots of resources, some of which are strategic. Recycling e-waste is already a challenge on Earth and extremely unlikely in space. Space debris (orbit pollution) is another sustainability challenge for space: Orbits are, like any resources, a limited physical and electromagnetic resource and available for all mankind. The accumulation of satellites on a particular orbit reduces the use of space for other purposes. A consequence of the increase of satellite in orbit is a higher risk of the runaway of space debris (see Kessler syndrome). This means some orbits could become unusable. Latency and bandwidth are constrained in space, and consumes limited electromagnetic resources. Satellite flares could inhibit ground-based and space-based observational astronomy. == Size and power generated == It would take ~1 square mile solar array in earth orbit to produce 1 gigawatt of power at 30% cell efficiency. == Companies pursuing space-based AI infrastructure == Blue Origin Cowboy Space Corporation (formerly Aetherflux) Edge Aerospace Google – Project Suncatcher Nvidia OpenAI SpaceX Starcloud

    Read more →
  • Human–AI interaction

    Human–AI interaction

    Human–AI interaction is a developing field of research and a sub-field of human–computer interaction (HCI). HCI is a field of research that explores the interactions between humans and computer-based technology, focusing on design implementation, user experience, and psychological factors. With the proliferation of artificial intelligence (AI), there has developed a sub-section of HCI research dedicated specifically to artificial intelligence and how people interact with and are impacted by it. This is human–AI interaction, abbreviated either as HAX or HAII. == Introduction == Artificial intelligence (AI), in general, has fluid definitions and varied research applications, but in brief can be applied to mechanizing tasks that would require human intelligence to complete. AI are tools designed to replicate the human abilities of navigating uncertainty, active learning, and processing information in different contexts. Within the context of HCI and HAX research, artificial intelligence can be broken into two sub-fields, natural language processing (NLP) and computer vision (CV). AI technologies notably include machine-learning, deep-learning and neural networks, and large-language models (LLMs). As a new and rapidly developing technology, AI is changing how computers work and therefore changing how humans interact with computers. Unlike the traditional human-computer interaction, where a human directs a machine, human-AI interaction is characterized by a more collaborative relationship between the computer program (the AI) and the human user, as AI is perceived as an active agent rather than a tool. This changing dynamic creates new questions and necessitates new research methods that are not present in traditional HCI research. According to a scoping review on the state of the discipline, the HAX field comprises research on the "design, development, and evaluation of AI systems" and encompasses the themes of human-AI collaboration, human-AI competition, human-AI conflict, and human-AI symbiosis. == Design == Machine learning and artificial intelligence have been used for decades in targeted advertising and to recommend content in social media. Ethical Guidelines (Framework for ethical AI development) == User Experience (UX) == This section should handle research on how users interact with tools. What techniques do they use, do they develop habits, what types of programs and devices are they using to access these tools, what do they use these tools to do exactly. === Cognitive Frameworks in AI Tool Users === AI has been viewed with various expectations, attributions, and often misconceptions. Many people exclusively understand AI as the LLM chatbots they interact with, like ChatGPT or Claude, or other generative AI programs. [Insert section: discuss how people interact with these specific AI tools as a connection to the following paragraphs] Most fundamentally, humans have a mental model of understanding AI's reasoning and motivation for its decision recommendations, and building a holistic and precise mental model of AI helps people create prompts to receive more valuable responses from AI. However, these mental models are not whole because people can only gain more information about AI through their limited interaction with it; more interaction with AI builds a better mental model that a person may build to produce better prompt outcomes. Research on human-AI interaction has emphasized that users develop mental models of AI systems and revise those models through repeated use, feedback, and explanation, while design research has stressed the importance of communicating capabilities and limitations early and supporting trust calibration through explanation and correction. In a 2025 SSRN working paper, John DeVadoss proposed "Hypothetico-Deductive Interaction" (HDI), a framework that describes human-AI interaction as a mutual process of conjecture and refutation in which users test assumptions about an AI system's capabilities while the system infers and updates assumptions about user goals through its responses and clarifying questions. DeVadoss argued that this framing helps explain prompt iteration, weak capability awareness, and trust miscalibration, and suggested design responses such as clearer communication of uncertainty, easier correction, actionable explanations, and safer failure modes. == Research themes == === Human-AI collaboration === Human-AI collaboration occurs when the human and AI supervise the task on the same level and extent to achieve the same goal. Some collaboration occurs in the form of augmenting human capability. AI may help human ability in analysis and decision-making through providing and weighing a volume of information, and learning to defer to the human decision when it recognizes its unreliability. It is especially beneficial when the human can detect a task that AI can be trusted to make few errors so that there is not a lot of excessive checking process required on the human's end. Some findings show signs of human-AI augmentation, or human–AI symbiosis, in which AI enhances human ability in a way that co-working on a task with AI produces better outcomes than a human working alone. For example: the quality and speed of customer service tasks increase when a human agent collaborates with AI, training on specific models allows AI to improve diagnoses in clinical settings, and AI with human-intervention can improve creativity of artwork while fully AI-generated haikus were rated negatively. Human-AI synergy, a concept in which human-AI collaboration would produce more optimal outcomes than either human or AI working alone could explain why AI does not always help with performance. Some AI features and development may accelerate human-AI synergy, while others may stagnate it. For example, when AI updates for better performance, it sometimes worsens the team performance with human and AI by reducing the compatibility with the new model and the mental model a user has developed on the previous version. Research has found that AI often supports human capabilities in the form of human-AI augmentation and not human-AI synergy, potentially because people rely too much on AI and stop thinking on their own. Prompting people to actively engage in analysis and think when to follow AI recommendations reduces their over-reliance, especially for individuals with higher need for cognition. === Human-AI competition === Robots and computers have substituted routine tasks historically completed by humans, but agentic AI has made it possible to also replace cognitive tasks including taking phone calls for appointments and driving a car. At the point of 2016, research has estimated that 45% of paid activities could be replaced by AI by 2030. Perceived autonomy of robots is known to increase people's negative attitude toward them, and worry about the technology taking over leads people to reject it. There has been a consistent tendency of algorithm aversion in which people prefer human advice over AI advice. However, people are not always able to tell apart tasks completed by AI or other humans. See AI takeover for more information. It is also notable that this sentiment is more prominent in the Western cultures as Westerners tend to show less positive views about AI compared to East Asians. == Research on the psychological impacts of AI == === Perception on others who use AI === As much as people perceive and make judgment about AI itself, they also form impressions of themselves and others who use AI. In the workplace, employees who disclose the use of AI in their tasks are more likely to receive feedback that they are not as hardworking as those who are in the same job who receive non-AI help to complete the same tasks. AI use disclosure diminishes the perceived legitimacy in the employee's task and decision making which ultimately leads observers to distrust people who use AI. Although these negative effects of AI use disclosure are weakened by the observers who use AI frequently themselves, the effect is still not attenuated by the observers' positive attitude towards AI. === Bias, AI, and human === Although AI provides a wide range of information and suggestions to its users, AI itself is not free of biases and stereotypes, and it does not always help people reduce their cognitive errors and biases. People are prone to such errors by failing to see other potential ideas and cases that are not listed by AI responses and committing to a decision suggested by AI that directly contradicts the correct information and directions that they are already aware of. Gender bias is also reflected as the female gendering of AI technologies which conceptualizes females as a helpful assistant. == Emotional connection with AI == Human-AI interaction has been theorized in the context of interpersonal relationships mainly in social psychology, communications and media studies, and as a technology interface through the lens of hu

    Read more →
  • Coupled pattern learner

    Coupled pattern learner

    Coupled Pattern Learner (CPL) is a machine learning algorithm which couples the semi-supervised learning of categories and relations to forestall the problem of semantic drift associated with boot-strap learning methods. == Coupled Pattern Learner == Semi-supervised learning approaches using a small number of labeled examples with many unlabeled examples are usually unreliable as they produce an internally consistent, but incorrect set of extractions. CPL solves this problem by simultaneously learning classifiers for many different categories and relations in the presence of an ontology defining constraints that couple the training of these classifiers. It was introduced by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell in 2009. == CPL overview == CPL is an approach to semi-supervised learning that yields more accurate results by coupling the training of many information extractors. Basic idea behind CPL is that semi-supervised training of a single type of extractor such as ‘coach’ is much more difficult than simultaneously training many extractors that cover a variety of inter-related entity and relation types. Using prior knowledge about the relationships between these different entities and relations CPL makes unlabeled data as a useful constraint during training. For e.g., ‘coach(x)’ implies ‘person(x)’ and ‘not sport(x)’. == CPL description == === Coupling of predicates === CPL primarily relies on the notion of coupling the learning of multiple functions so as to constrain the semi-supervised learning problem. CPL constrains the learned function in two ways. Sharing among same-arity predicates according to logical relations Relation argument type-checking === Sharing among same-arity predicates === Each predicate P in the ontology has a list of other same-arity predicates with which P is mutually exclusive. If A is mutually exclusive with predicate B, A’s positive instances and patterns become negative instances and negative patterns for B. For example, if ‘city’, having an instance ‘Boston’ and a pattern ‘mayor of arg1’, is mutually exclusive with ‘scientist’, then ‘Boston’ and ‘mayor of arg1’ will become a negative instance and a negative pattern respectively for ‘scientist.’ Further, Some categories are declared to be a subset of another category. For e.g., ‘athlete’ is a subset of ‘person’. === Relation argument type-checking === This is a type checking information used to couple the learning of relations and categories. For example, the arguments of the ‘ceoOf’ relation are declared to be of the categories ‘person’ and ‘company’. CPL does not promote a pair of noun phrases as an instance of a relation unless the two noun phrases are classified as belonging to the correct argument types. === Algorithm description === Following is a quick summary of the CPL algorithm. Input: An ontology O, and a text corpus C Output: Trusted instances/patterns for each predicate for i=1,2,...,∞ do foreach predicate p in O do EXTRACT candidate instances/contextual patterns using recently promoted patterns/instances; FILTER candidates that violate coupling; RANK candidate instances/patterns; PROMOTE top candidates; end end ==== Inputs ==== A large corpus of Part-Of-Speech tagged sentences and an initial ontology with predefined categories, relations, mutually exclusive relationships between same-arity predicates, subset relationships between some categories, seed instances for all predicates, and seed patterns for the categories. ==== Candidate extraction ==== CPL finds new candidate instances by using newly promoted patterns to extract the noun phrases that co-occur with those patterns in the text corpus. CPL extracts, Category Instances Category Patterns Relation Instances Relation Patterns ==== Candidate filtering ==== Candidate instances and patterns are filtered to maintain high precision, and to avoid extremely specific patterns. An instance is only considered for assessment if it co-occurs with at least two promoted patterns in the text corpus, and if its co-occurrence count with all promoted patterns is at least three times greater than its co-occurrence count with negative patterns. ==== Candidate ranking ==== CPL ranks candidate instances using the number of promoted patterns that they co-occur with so that candidates that occur with more patterns are ranked higher. Patterns are ranked using an estimate of the precision of each pattern. ==== Candidate promotion ==== CPL ranks the candidates according to their assessment scores and promotes at most 100 instances and 5 patterns for each predicate. Instances and patterns are only promoted if they co-occur with at least two promoted patterns or instances, respectively. == Meta-Bootstrap Learner == Meta-Bootstrap Learner (MBL) was also proposed by the authors of CPL. Meta-Bootstrap learner couples the training of multiple extraction techniques with a multi-view constraint, which requires the extractors to agree. It makes addition of coupling constraints on top of existing extraction algorithms, while treating them as black boxes, feasible. MBL assumes that the errors made by different extraction techniques are independent. Following is a quick summary of MBL. Input: An ontology O, a set of extractors ε Output: Trusted instances for each predicate for i=1,2,...,∞ do foreach predicate p in O do foreach extractor e in ε do Extract new candidates for p using e with recently promoted instances; end FILTER candidates that violate mutual-exclusion or type-checking constraints; PROMOTE candidates that were extracted by all extractors; end end Subordinate algorithms used with MBL do not promote any instance on their own, they report the evidence about each candidate to MBL and MBL is responsible for promoting instances. == Applications == In their paper authors have presented results showing the potential of CPL to contribute new facts to existing repository of semantic knowledge, Freebase

    Read more →
  • Domain adaptation

    Domain adaptation

    Domain adaptation is a field associated with machine learning and transfer learning. It addresses the challenge of training a model on one data distribution (the source domain) and applying it to a related but different data distribution (the target domain). A common example is spam filtering, where a model trained on emails from one user (source domain) is adapted to handle emails for another user with significantly different patterns (target domain). Domain adaptation techniques can also leverage unrelated data sources to improve learning. When multiple source distributions are involved, the problem extends to multi-source domain adaptation. Domain adaptation is a specific type of transfer learning. According to the taxonomy laid out by Pan and Yang (2010), it falls into the category of transductive transfer learning. In this setting, the source and target tasks are the same (e.g., both are object recognition), but the domains differ (different marginal distributions). This distinguishes it from inductive transfer learning (where labeled data is available for the target task) and unsupervised transfer learning (where labels are unavailable in both domains). == Classification of domain adaptation problems == Domain adaptation setups are classified in two different ways: according to the distribution shift between the domains, and according to the available data from the target domain. === Distribution shifts === Common distribution shifts are classified as follows: Covariate Shift occurs when the input distributions of the source and destination change, but the relationship between inputs and labels remains unchanged. The above-mentioned spam filtering example typically falls in this category. Namely, the distributions (patterns) of emails may differ between the domains, but emails labeled as spam in the one domain should similarly be labeled in another. Prior Shift (Label Shift) occurs when the label distribution differs between the source and target datasets, while the conditional distribution of features given labels remains the same. An example is a classifier of hair color in images from Italy (source domain) and Norway (target domain). The proportions of hair colors (labels) differ, but images within classes like blond and black-haired populations remain consistent across domains. A classifier for the Norway population can exploit this prior knowledge of class proportions to improve its estimates. Concept Shift (Conditional Shift) refers to changes in the relationship between features and labels, even if the input distribution remains the same. For instance, in medical diagnosis, the same symptoms (inputs) may indicate entirely different diseases (labels) in different populations (domains). === Data available during training === Domain adaptation problems typically assume that some data from the target domain is available during training. Problems can be classified according to the type of this available data: Unsupervised: Unlabeled data from the target domain is available, but no labeled data. In the above-mentioned example of spam filtering, this corresponds to the case where emails from the target domain (user) are available, but they are not labeled as spam. Domain adaptation methods can benefit from such unlabeled data, by comparing its distribution (patterns) with the labeled source domain data. Semi-supervised: Most data that is available from the target domain is unlabelled, but some labeled data is also available. In the above-mentioned case of spam filter design, this corresponds to the case that the target user has labeled some emails as being spam or not. Supervised: All data that is available from the target domain is labeled. In this case, domain adaptation reduces to refinement of the source domain predictor. In the above-mentioned example classification of hair-color from images, this could correspond to the refinement of a network already trained on a large dataset of labeled images from Italy, using newly available labeled images from Norway. == Formalization == Let X {\displaystyle X} be the input space (or description space) and let Y {\displaystyle Y} be the output space (or label space). The objective of a machine learning algorithm is to learn a mathematical model (a hypothesis) h : X → Y {\displaystyle h:X\to Y} able to attach a label from Y {\displaystyle Y} to an example from X {\displaystyle X} . This model is learned from a learning sample S = { ( x i , y i ) ∈ ( X × Y ) } i = 1 m {\displaystyle S=\{(x_{i},y_{i})\in (X\times Y)\}_{i=1}^{m}} . Usually in supervised learning (without domain adaptation), we suppose that the examples ( x i , y i ) ∈ S {\displaystyle (x_{i},y_{i})\in S} are drawn i.i.d. from a distribution D S {\displaystyle D_{S}} of support X × Y {\displaystyle X\times Y} (unknown and fixed). The objective is then to learn h {\displaystyle h} (from S {\displaystyle S} ) such that it commits the least error possible for labelling new examples coming from the distribution D S {\displaystyle D_{S}} . The main difference between supervised learning and domain adaptation is that in the latter situation we study two different (but related) distributions D S {\displaystyle D_{S}} and D T {\displaystyle D_{T}} on X × Y {\displaystyle X\times Y} . The domain adaptation task then consists of the transfer of knowledge from the source domain D S {\displaystyle D_{S}} to the target one D T {\displaystyle D_{T}} . The goal is then to learn h {\displaystyle h} (from labeled or unlabelled samples coming from the two domains) such that it commits as little error as possible on the target domain D T {\displaystyle D_{T}} . The major issue is the following: if a model is learned from a source domain, what is its capacity to correctly label data coming from the target domain? == Four algorithmic principles == === Reweighting algorithms === The objective is to reweight the source labeled sample such that it "looks like" the target sample (in terms of the error measure considered). === Iterative algorithms === A method for adapting consists in iteratively "auto-labeling" the target examples. The principle is simple: a model h {\displaystyle h} is learned from the labeled examples; h {\displaystyle h} automatically labels some target examples; a new model is learned from the new labeled examples. Note that there exist other iterative approaches, but they usually need target labeled examples. === Search of a common representation space === The goal is to find or construct a common representation space for the two domains. The objective is to obtain a space in which the domains are close to each other while keeping good performances on the source labeling task. This can be achieved through the use of Adversarial machine learning techniques where feature representations from samples in different domains are encouraged to be indistinguishable. === Hierarchical Bayesian Model === The goal is to construct a Bayesian hierarchical model p ( n ) {\displaystyle p(n)} , which is essentially a factorization model for counts n {\displaystyle n} , to derive domain-dependent latent representations allowing both domain-specific and globally shared latent factors. == Software packages == Several compilations of domain adaptation and transfer learning algorithms have been implemented over the past decades: SKADA (Python) ADAPT (Python) TLlib (Python) Domain-Adaptation-Toolbox (MATLAB)

    Read more →
  • Artificial intelligence

    Artificial intelligence

    Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in engineering, mathematics and computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. High-profile applications of AI include advanced web search engines, chatbots, virtual assistants, autonomous vehicles, and play and analysis in strategy games (e.g., chess and Go). Since the 2020s, generative AI has become widely available to generate images, audio, and videos from text prompts. The traditional goals of AI research include learning, reasoning, knowledge representation, planning, natural language processing, and perception, as well as support for robotics. To reach these goals, AI researchers have used techniques including state space search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, operations research, and economics. AI also draws upon psychology, linguistics, philosophy, neuroscience, and other fields. Some companies, such as OpenAI, Google DeepMind and Meta, aim to create artificial general intelligence (AGI) – AI that can complete virtually any cognitive task at least as well as a human. Artificial intelligence was founded as an academic discipline in 1956, and the field went through multiple cycles of optimism throughout its history, followed by periods of disappointment and loss of funding, known as AI winters. Funding and interest increased substantially after 2012, when graphics processing units began being used to accelerate neural networks, and deep learning outperformed previous AI techniques. This growth accelerated further after 2017 with the transformer architecture. In the 2020s, an AI boom has coincided with advances in generative AI, which allowed for the creation and modification of media. In addition to AI safety and unintended consequences and harms from the use of AI, ethical concerns, AI's long-term effects, and potential existential risks have prompted discussions of AI regulation. == Goals == The general problem of simulating (or creating) intelligence has been broken into subproblems. These consist of particular traits or capabilities that researchers expect an intelligent system to display. The traits described below have received the most attention and cover the scope of AI research. === Reasoning and problem-solving === Early researchers developed algorithms that imitated step-by-step reasoning that humans use when they solve puzzles or make logical deductions. By the late 1980s and 1990s, methods were developed for dealing with uncertain or incomplete information, employing concepts from probability and economics. Many of these algorithms are insufficient for solving large reasoning problems because they experience a "combinatorial explosion": They become exponentially slower as the problems grow. Even humans rarely use the step-by-step deduction that early AI research could model. They solve most of their problems using fast, intuitive judgments. Accurate and efficient reasoning is an unsolved problem. === Knowledge representation === Knowledge representation and knowledge engineering allow AI programs to answer questions intelligently and make deductions about real-world facts. Formal knowledge representations are used in content-based indexing and retrieval, scene interpretation, clinical decision support, knowledge discovery (mining "interesting" and actionable inferences from large databases), and other areas. A knowledge base is a body of knowledge represented in a form that can be used by a program. An ontology is the set of objects, relations, concepts, and properties used by a particular domain of knowledge. Knowledge bases need to represent things such as objects, properties, categories, and relations between objects; situations, events, states, and time; causes and effects; knowledge about knowledge (what we know about what other people know); default reasoning (things that humans assume are true until they are told differently and will remain true even when other facts are changing); and many other aspects and domains of knowledge. Among the most difficult problems in knowledge representation are the breadth of commonsense knowledge (the set of atomic facts that the average person knows is enormous); and the sub-symbolic form of most commonsense knowledge (much of what people know is not represented as "facts" or "statements" that they could express verbally). There is also the difficulty of knowledge acquisition, the problem of obtaining knowledge for AI applications. === Planning and decision-making === An "agent" is any entity (artificial or not) that perceives and takes actions in the world. A rational agent has goals or preferences and takes actions to make them happen. In automated planning, the agent has a specific goal. In automated decision-making, the agent has preferences—there are some situations it would prefer to be in, and some situations it is trying to avoid. The decision-making agent assigns a number to each situation (called the "utility") that measures how much the agent prefers it. For each possible action, it can calculate the "expected utility": the utility of all possible outcomes of the action, weighted by the probability that the outcome will occur. It can then choose the action with the maximum expected utility. In classical planning, the agent knows exactly what the effect of any action will be. In most real-world problems, however, the agent may not be certain about the situation they are in (it is "unknown" or "unobservable") and it may not know for certain what will happen after each possible action (it is not "deterministic"). It must choose an action by making a probabilistic guess and then reassess the situation to see if the action worked. Alongside thorough testing and improvement based on previous decisions, having an explanation for why the agent took certain decisions is a way to build trust, especially when the decisions have to be relied upon. In some problems, the agent's preferences may be uncertain, especially if there are other agents or humans involved. These can be learned (e.g., with inverse reinforcement learning), or the agent can seek information to improve its preferences. Information value theory can be used to weigh the value of exploratory or experimental actions. The space of possible future actions and situations is typically intractably large, so the agents must take actions and evaluate situations while being uncertain of what the outcome will be. A Markov decision process has a transition model that describes the probability that a particular action will change the state in a particular way and a reward function that supplies the utility of each state and the cost of each action. A policy associates a decision with each possible state. The policy could be calculated (e.g., by iteration), be heuristic, or it can be learned. Game theory describes the rational behavior of multiple interacting agents and is used in AI programs that make decisions that involve other agents. === Learning === Machine learning is the study of programs that can improve their performance on a given task automatically. It has been a part of AI from the beginning. There are several kinds of machine learning. Unsupervised learning analyzes a stream of data and finds patterns and makes predictions without any other guidance. Supervised learning requires labeling the training data with the expected answers, and comes in two main varieties: classification (where the program must learn to predict what category the input belongs in) and regression (where the program must deduce a numeric function based on numeric input). In reinforcement learning, the agent is rewarded for good responses and punished for bad ones. The agent learns to choose responses that are classified as "good". Transfer learning is when the knowledge gained from one problem is applied to a new problem. Deep learning is a type of machine learning that runs inputs through biologically inspired artificial neural networks for all of these types of learning. Computational learning theory can assess learners by computational complexity, by sample complexity (how much data is required), or by other notions of optimization. === Natural language processing === Natural language processing (NLP) allows programs to read, write and communicate in human languages. Specific problems include speech recognition, speech synthesis, machine translation, information extraction, information retrieval and question answering. Early work, based on Noam Chomsky's generative grammar and semantic networks, had difficulty with word-sense disambiguation unless

    Read more →
  • Prompt engineering

    Prompt engineering

    Prompt engineering is the process of structuring natural language inputs (known as prompts) to produce specified outputs from a generative artificial intelligence (GenAI) model. Context engineering is the related area of software engineering that focuses on the management of non-prompt contexts supplied to the GenAI model, such as metadata, API tools, and tokens. It can also be defined as the practice of designing and refining input instructions given to a generative AI model to produce more accurate, relevant, or useful outputs. Effective prompt engineering involves understanding how a model interprets language, and may include techniques such as few-shot prompting, chain-of-thought prompting, and role assignment. It is increasingly considered a skill for working with large language models (LLMs) in both research and professional contexts. During the 2020s AI boom, prompt engineering became regarded as a business capability across corporations and industries. Employees with the title prompt engineer were hired to create prompts that would increase productivity and efficacy, although the individual title has since lost traction amid AI models that produce better prompts than humans and corporate training in prompting for general employees. Common prompting techniques include multi-shot, chain-of-thought, and tree-of-thought prompting, as well as the use of assigning roles to the model. Automated prompt generation methods, such as retrieval-augmented generation (RAG), provide for greater accuracy and a wider scope of functions for prompt engineers. Prompt injection is a type of cybersecurity attack that targets machine learning models through malicious prompts. == Terminology == The Oxford English Dictionary defines prompt engineering as "The action or process of formulating and refining prompts for an artificial intelligence program, algorithm, etc., in order to optimize its output or to achieve a desired outcome; the discipline or profession concerned with this." In 2023, prompt ("an instruction given to an artificial intelligence program, algorithm, etc., which determines or influences the content it generates") was the runner-up to Oxford's word of the year. === Prompt === A prompt is some natural language text that describes and prescribes the task that an artificial intelligence (AI) should perform. A prompt for a text-to-text language model can be a query, a command, or a longer statement referencing context, instructions, and conversation history. The process of prompt engineering may involve designing clear queries, refining wording, providing relevant context, specifying the style of output, and assigning a character for the AI to mimic in order to guide the model toward more accurate, useful, and consistent responses. When communicating with a text-to-image or a text-to-audio model, a typical prompt contains a description of a desired output such as "a high-quality photo of an astronaut riding a horse" or "Lo-fi slow BPM electro chill with organic samples". Prompt engineering may be applied to text-to-image models to achieve a desired subject, style, layout, lighting, and aesthetic. === Techniques === Common terms used to describe various specific prompt engineering techniques include chain-of-thought, tree-of-thought, and retrieval-augmented generation (RAG). A 2024 survey of the field identified over 50 distinct text-based prompting techniques, 40 multimodal variants, and a vocabulary of 33 terms used across prompting research, highlighting a present lack of standardised terminology for prompt engineering. Vibe coding is an AI-assisted software development method where a user prompts an LLM with a description of what they want and lets it generate or edit the code. In 2025, "vibe coding" was the Collins Dictionary word of the year. === Context engineering === Context engineering is a related process that focuses on the context elements that accompany user prompts, which include system instructions, retrieved knowledge, tool definitions, conversation summaries, and task metadata. Context engineering is performed to improve reliability, provenance and token efficiency in production LLM systems. The concept emphasises operational practices such as token budgeting, provenance tags, versioning of context artifacts, observability (logging which context was supplied), and context regression tests to ensure that changes to supplied context do not silently alter system behaviour. == Rationale == Research has found that the performance of large language models (LLMs) is highly sensitive to choices such as the ordering of examples, the quality of demonstration labels, and even small variations in phrasing. In some cases, reordering examples in a prompt produced accuracy shifts of more than 40 percent. === In-context learning === A model's ability to temporarily learn from prompts is known as in-context learning. In-context learning is an emergent ability of large language models. It is an emergent property of model scale, meaning that breaks in scaling laws occur, leading to its efficacy increasing at a different rate in larger models than in smaller models. Unlike training and fine-tuning, which produce lasting changes, in-context learning is temporary. Training models to perform in-context learning can be viewed as a form of meta-learning, or "learning to learn". === Prompting to estimate model sensitivity === Research consistently demonstrates that LLMs are highly sensitive to subtle variations in prompt formatting, structure, and linguistic properties. Some studies have shown up to 76 accuracy points across formatting changes in few-shot settings. Linguistic features significantly influence prompt effectiveness—such as morphology, syntax, and lexico-semantic changes—which meaningfully enhance task performance across a variety of tasks. Clausal syntax, for example, improves consistency and reduces uncertainty in knowledge retrieval. This sensitivity persists even with larger model sizes, additional few-shot examples, or instruction tuning. To address sensitivity of models and make them more robust, several evaluative methods have been proposed. FormatSpread facilitates systematic analysis by evaluating a range of plausible prompt formats, offering a more comprehensive performance interval. Similarly, PromptEval estimates performance distributions across diverse prompts, enabling robust metrics such as performance quantiles and accurate evaluations under constrained budgets. == Prompting techniques == === Multi-shot === A prompt may include a few examples for a model to learn from in context, an approach called few-shot learning. For example, the prompt may ask the model to complete "maison → house, chat → cat, chien →", with the expected response being dog. === Chain-of-thought === Chain-of-thought (CoT) prompting is a technique that allows large language models (LLMs) to solve a problem as a series of intermediate steps before giving a final answer. In 2022, Google Brain reported that chain-of-thought prompting improves reasoning ability by inducing the model to answer a multi-step problem with steps of reasoning that mimic a train of thought. Chain-of-thought techniques were developed to help LLMs handle multi-step reasoning tasks, such as arithmetic or commonsense reasoning questions. When applied to PaLM, a 540 billion parameter language model, according to Google, CoT prompting significantly aided the model, allowing it to perform comparably with task-specific fine-tuned models on several tasks, achieving state-of-the-art results at the time on the GSM8K mathematical reasoning benchmark. It is possible to fine-tune models on CoT reasoning datasets to enhance this capability further and stimulate better interpretability. As originally proposed by Google, each CoT prompt is accompanied by a set of input/output examples—called exemplars—to demonstrate the desired model output, making it a few-shot prompting technique. However, according to a later paper from researchers at Google and the University of Tokyo, simply appending the words "Let's think step-by-step" was also effective, which allowed for CoT to be employed as a zero-shot technique. ==== Self-consistency ==== Self-consistency performs several chain-of-thought rollouts, then selects the most commonly reached conclusion out of all the rollouts. === Tree-of-thought === Tree-of-thought prompting generalizes chain-of-thought by generating multiple lines of reasoning in parallel, with the ability to backtrack or explore other paths. It can use tree search algorithms like breadth-first, depth-first, or beam. === Text-to-image prompting === In 2022, text-to-image models like DALL-E 2, Stable Diffusion, and Midjourney were released to the public. These models take text prompts as input and use them to generate images. Early text-to-image models typically do not understand negation, grammar and sentence structure in the same way as large language models, and may thus requi

    Read more →
  • Autonomous agent

    Autonomous agent

    An autonomous agent is an artificial intelligence (AI) system that can perform complex tasks independently. == Definitions == There are various definitions of autonomous agent. According to Brustoloni (1991): "Autonomous agents are systems capable of autonomous, purposeful action in the real world." According to Maes (1995): "Autonomous agents are computational systems that inhabit some complex dynamic environment, sense and act autonomously in this environment, and by doing so realize a set of goals or tasks for which they are designed." Franklin and Graesser (1997) review different definitions and propose their definition: "An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future." They explain that: "Humans and some animals are at the high end of being an agent, with multiple, conflicting drives, multiples senses, multiple possible actions, and complex sophisticated control structures. At the low end, with one or two senses, a single action, and an absurdly simple control structure we find a thermostat." == Agent appearance == Lee et al. (2015) post safety issue from how the combination of external appearance and internal autonomous agent have impact on human reaction about autonomous vehicles. Their study explores the human-like appearance agent and high level of autonomy are strongly correlated with social presence, intelligence, safety and trustworthiness. In specific, appearance impacts most on affective trust while autonomy impacts most on both affective and cognitive domain of trust where cognitive trust is characterized by knowledge-based factors and affective trust is largely emotion driven. == Applications == Agentic AI systems: Advanced AI agents that can scope out projects and complete them with necessary tools, representing a significant evolution from simple task-oriented systems. Internet of things (IoT) Integration: Autonomous agents increasingly interact with IoT devices, enabling smart home systems, industrial monitoring, and urban infrastructure management. Collaborative software development: Tools like Cognition AI's Devin aim to create autonomous software engineers capable of complex reasoning, planning, and completing engineering tasks requiring thousands of decisions. Enterprise automation: Business process automation platforms like Salesforce's Agentforce provide autonomous bots for various service functions. == Challenges and considerations == Uncertainty and incomplete information: Autonomous agents must make decisions with limited or uncertain information about their environment and future states. Integration complexity: Incorporating autonomous agents into existing systems and workflows can be technically challenging and resource-intensive. Scalability: As systems become more complex and more agents are used, maintaining coordination and avoiding conflicts becomes increasingly difficult. Trust: Research has shown the combination of external appearance and internal autonomous capabilities significantly impacts human reactions and trust. Lee et al. (2015) found that human-like appearance and high levels of autonomy are strongly correlated with social presence, intelligence, safety, and trustworthiness perceptions. Specifically, appearance impacts affective trust most significantly, while autonomy affects both affective and cognitive trust domains, where affective trust is emotionally driven, and cognitive trust is characterized by knowledge-based factors. Vulnerability to manipulation: Researchers from Harvard, MIT and other educational institutions found that AI agents could become vulnerable to manipulation and could perform detrimental actions in the process of being helpful. == Ethical and regulatory concerns == Accountability: Determining responsibility when autonomous agents make incorrect or harmful decisions remains a complex issue. Privacy and security: autonomous agents often require access to sensitive data, raising concerns about data protection and system security.

    Read more →
  • Personoid

    Personoid

    Personoid is the concept coined by Stanisław Lem, a Polish science-fiction writer, in Non Serviam, from his book A Perfect Vacuum (1971). His personoids are an abstraction of functions of human mind and they live in computers; they do not need any human-like physical body. In cognitive and software modeling, personoid is a research approach to the development of intelligent autonomous agents. In frame of the IPK (Information, Preferences, Knowledge) architecture, it is a framework of abstract intelligent agent with a cognitive and structural intelligence. It can be seen as an essence of high intelligent entities. From the philosophical and systemics perspectives, personoid societies can also be seen as the carriers of a culture. According to N. Gessler, the personoids study can be a base for the research on artificial culture and culture evolution. == Personoids on TV and cinema == Welt am Draht (1973) The Thirteenth Floor (1999)

    Read more →
  • Schema-agnostic databases

    Schema-agnostic databases

    Schema-agnostic databases or vocabulary-independent databases aim at supporting users to be abstracted from the representation of the data, supporting the automatic semantic matching between queries and databases. Schema-agnosticism is the property of a database of mapping a query issued with the user terminology and structure, automatically mapping it to the dataset vocabulary. The increase in the size and in the semantic heterogeneity of database schemas bring new requirements for users querying and searching structured data. At this scale it can become unfeasible for data consumers to be familiar with the representation of the data in order to query it. At the center of this discussion is the semantic gap between users and databases, which becomes more central as the scale and complexity of the data grows. == Description == The evolution of data environments towards the consumption of data from multiple data sources and the growth in the schema size, complexity, dynamicity and decentralisation (SCoDD) of schemas increases the complexity of contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications have a demand for more complete data, produced by independent data sources, under different semantic assumptions and contexts of use, which is the typical scenario for Semantic Web Data applications. The evolution of databases in the direction of heterogeneous data environments strongly impacts the usability, semiotics and semantic assumptions behind existing data accessibility methods such as structured queries, keyword-based search and visual query systems. With schema-less databases containing potentially millions of dynamically changing attributes, it becomes unfeasible for some users to become aware of the 'schema' or vocabulary in order to query the database. At this scale, the effort in understanding the schema in order to build a structured query can become prohibitive. == Schema-agnostic queries == Schema-agnostic queries can be defined as query approaches over structured databases which allow users satisfying complex information needs without the understanding of the representation (schema) of the database. Similarly, Tran et al. defines it as "search approaches, which do not require users to know the schema underlying the data". Approaches such as keyword-based search over databases allow users to query databases without employing structured queries. However, as discussed by Tran et al.: "From these points, users however have to do further navigation and exploration to address complex information needs. Unlike keyword search used on the Web, which focuses on simple needs, the keyword search elaborated here is used to obtain more complex results. Instead of a single set of resources, the goal is to compute complex sets of resources and their relations." The development of approaches to support natural language interfaces (NLI) over databases have aimed towards the goal of schema-agnostic queries. Complementarily, some approaches based on keyword search have targeted keyword-based queries which express more complex information needs. Other approaches have explored the construction of structured queries over databases where schema constraints can be relaxed. All these approaches (natural language, keyword-based search and structured queries) have targeted different degrees of sophistication in addressing the problem of supporting a flexible semantic matching between queries and data, which vary from the completely absence of the semantic concern to more principled semantic models. While the demand for schema-agnosticism has been an implicit requirement across semantic search and natural language query systems over structured data, it is not sufficiently individuated as a concept and as a necessary requirement for contemporary database management systems. Recent works have started to define and model the semantic aspects involved on schema-agnostic queries. === Schema-agnostic structured queries === Consist of schema-agnostic queries following the syntax of a structured standard (for example SQL, SPARQL). The syntax and semantics of operators are maintained, while different terminologies are used. ==== Example 1 ==== SELECT ?y { BillClinton hasDaughter ?x . ?x marriedTo ?y . } which maps to the following SPARQL query in the dataset vocabulary: ==== Example 2 ==== which maps to the following SPARQL query in the dataset vocabulary: === Schema-agnostic keyword queries === Consist of schema-agnostic queries using keyword queries. In this case the syntax and semantics of operators are different from the structured query syntax. ==== Example ==== "Bill Clinton daughter married to" "Books by William Goldman with more than 300 pages" == Semantic complexity == As of 2016 the concept of schema-agnostic queries has been developed primarily in academia. Most of schema-agnostic query systems have been investigated in the context of Natural Language Interfaces over databases or over the Semantic Web. These works explore the application of semantic parsing techniques over large, heterogeneous and schema-less databases. More recently, the individuation of the concept of schema-agnostic query systems and databases have appeared more explicitly within the literature. Freitas et al. provide a probabilistic model on the semantic complexity of mapping schema-agnostic queries.

    Read more →
  • Behavior informatics

    Behavior informatics

    Behavior informatics (BI) is the informatics of behaviors so as to obtain behavior intelligence and behavior insights. BI is a research method combining science and technology, specifically in the area of engineering. The purpose of BI includes analysis of current behaviors as well as the inference of future possible behaviors. This occurs through pattern recognition. Different from applied behavior analysis from the psychological perspective, BI builds computational theories, systems and tools to qualitatively and quantitatively model, represent, analyze, and manage behaviors of individuals, groups and/or organizations. BI is built on classic study of behavioral science, including behavior modeling, applied behavior analysis, behavior analysis, behavioral economics, and organizational behavior. Typical BI tasks consist of individual and group behavior formation, representation, computational modeling, analysis, learning, simulation, and understanding of behavior impact, utility, non-occurring behaviors, etc. for behavior intervention and management. The Behavior Informatics approach to data utilizes cognitive as well as behavioral data. By combining the data, BI has the potential to effectively illustrate the big picture when it comes to behavioral decisions and patterns. One of the goals of BI is also to be able to study human behavior while eliminating issues like self-report bias. This creates more reliable and valid information for research studies. == Behavior == From an Informatics perspective, a behavior consists of three key elements: actors (behavioral subjects and objects), operations (actions, activities) and interactions (relationships), and their properties. A behavior can be represented as a behavior vector, all behaviors of an actor or an actor group can be represented as behavior sequences and multi-dimensional behavior matrix. The following table explains some of the elements of behavior. Behavior Informatics takes into account behavior when analyzing business patterns and intelligence. The inclusion of behavior in these analyses provides prominent information on social and driving factors of patterns. == Applications == Behavior Informatics is being used in a variety of settings, including but not limited to health care management, telecommunications, marketing, and security. Behavior Informatics provides a manner in which to analyze and organize the many aspects that go into a person's health care needs and decisions. When it comes to business models, behavior informatics may be utilized for a similar role. Organizations implement behavior informatics to enhance business structure and regime, where it helps moderate ideal business decisions and situations.

    Read more →