AI App Q

AI App Q — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Scan line

    Scan line

    A scan line (also scanline) is one line, or row, in a raster scanning pattern, such as a line of video on a cathode-ray tube (CRT) display of a television set or computer monitor. On CRT screens the horizontal scan lines are visually discernible, even when viewed from a distance, as alternating colored lines and black lines, especially when a progressive scan signal with below maximum vertical resolution is displayed. This is sometimes used today as a visual effect in computer graphics. The term is used, by analogy, for a single row of pixels in a raster graphics image. Scan lines are important in representations of image data, because many image file formats have special rules for data at the end of a scan line. For example, there may be a rule that each scan line starts on a particular boundary (such as a byte or word; see for example BMP file format). This means that even otherwise compatible raster data may need to be analyzed at the level of scan lines in order to convert between formats.

    Read more →
  • Universal psychometrics

    Universal psychometrics

    Universal psychometrics encompasses psychometrics instruments that could measure the psychological properties of any intelligent agent. Up until the early 21st century, psychometrics relied heavily on psychological tests that require the subject to cooperate and answer questions, the most famous example being an intelligence test. Such methods are only applicable to the measurement of human psychological properties. As a result, some researchers have proposed the idea of universal psychometrics - they suggest developing testing methods that allow for the measurement of non-human entities' psychological properties. For example, it has been suggested that the Turing test is a form of universal psychometrics. This test involves having testers (without any foreknowledge) attempt to distinguish a human from a machine by interacting with both (while not being to see either individuals). It is supposed that if the machine is equally intelligent to a human, the testers will not be able to distinguish between the two, i.e., their guesses will not be better than chance. Thus, Turing test could measure the intelligence (a psychological variable) of an AI. Other instruments proposed for universal psychometrics include reinforcement learning and measuring the ability to predict complexity.

    Read more →
  • Multimodal representation learning

    Multimodal representation learning

    Multimodal representation learning is a subfield of representation learning focused on integrating and interpreting information from different modalities, such as text, images, audio, or video, by projecting them into a shared latent space. This allows for semantically similar content across modalities to be mapped to nearby points within that space, facilitating a unified understanding of diverse data types. By automatically learning meaningful features from each modality and capturing their inter-modal relationships, multimodal representation learning enables a unified representation that enhances performance in cross-media analysis tasks such as video classification, event detection, and sentiment analysis. It also supports cross-modal retrieval and translation, including image captioning, video description, and text-to-image synthesis. == Motivation == The primary motivations for multimodal representation learning arise from the inherent nature of real-world data and the limitations of unimodal approaches. Since multimodal data offers complementary and supplementary information about an object or event from different perspectives, it is more informative than relying on a single modality. A key motivation is to narrow the heterogeneity gap that exists between different modalities by projecting their features into a shared semantic subspace. This allows semantically similar content across modalities to be represented by similar vectors, facilitating the understanding of relationships and correlations between them. Multimodal representation learning aims to leverage the unique information provided by each modality to achieve a more comprehensive and accurate understanding of concepts. These unified representations are crucial for improving performance in various cross-media analysis tasks such as video classification, event detection, and sentiment analysis. They also enable cross-modal retrieval, allowing users to search and retrieve content across different modalities. Additionally, it facilitates cross-modal translation, where information can be converted from one modality to another, as seen in applications like image captioning and text-to-image synthesis. The abundance of ubiquitous multimodal data in real-world applications, including understudied areas like healthcare, finance, and human-computer interaction (HCI), further motivates the development of effective multimodal representation learning techniques. == Approaches and methods == === Canonical-correlation analysis based methods === Canonical-correlation analysis (CCA) was first introduced in 1936 by Harold Hotelling and is a fundamental approach for multimodal learning. CCA aims to find linear relationships between two sets of variables. Given two data matrices X ∈ R n × p {\displaystyle X\in \mathbb {R} ^{n\times p}} and Y ∈ R n × q {\displaystyle Y\in \mathbb {R} ^{n\times q}} representing different modalities, CCA finds projection vectors w x ∈ R p {\displaystyle w_{x}\in \mathbb {R} ^{p}} and w y ∈ R q {\displaystyle w_{y}\in \mathbb {R} ^{q}} that maximizes the correlation between the projected variables: ρ = max w x , w y w x ⊤ Σ x y w y w x ⊤ Σ x x w x w y ⊤ Σ y y w y {\displaystyle \rho =\max _{w_{x},w_{y}}{\frac {w_{x}^{\top }\Sigma _{xy}w_{y}}{{\sqrt {w_{x}^{\top }\Sigma _{xx}w_{x}}}{\sqrt {w_{y}^{\top }\Sigma _{yy}w_{y}}}}}} such that Σ x x {\displaystyle \Sigma _{xx}} and Σ y y {\displaystyle \Sigma _{yy}} are the within-modality covariance matrices, and Σ x y {\displaystyle \Sigma _{xy}} is the between-modality covariance matrix. However, standard CCA is limited by its linearity, which led to the development of nonlinear extensions, such as kernel CCA and deep CCA. ==== Kernel CCA ==== Kernel canonical correlation analysis (KCCA) extends traditional CCA to capture nonlinear relationships between modalities by implicitly mapping the data into high dimensional feature spaces using kernel functions. Given kernel functions K x {\displaystyle K_{x}} and K y {\displaystyle K_{y}} with corresponding Gram matrices K x ∈ R n × n {\displaystyle K_{x}\in \mathbb {R} ^{n\times n}} and K y ∈ R n × n {\displaystyle K_{y}\in \mathbb {R} ^{n\times n}} , KCCA seeks coefficients α {\displaystyle \alpha } and β {\displaystyle \beta } that maximize: ρ = max α , β α ⊤ K x K y β α ⊤ K x 2 α β ⊤ K y 2 β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{\top }K_{x}Ky\beta }{{\sqrt {\alpha ^{\top }K_{x}^{2}\alpha }}{\sqrt {\beta ^{\top }K_{y}^{2}\beta }}}}} To prevent overfitting, regularization terms are typically added, resulting in: ρ = max α , β α T K x K y β α T ( K x 2 + λ x K x ) α β T ( K y 2 + λ y K y ) β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{T}K_{x}K_{y}\beta }{{\sqrt {\alpha ^{T}\left(K_{x}^{2}+\lambda _{x}K_{x}\right)\alpha }}{\sqrt {\;\beta ^{T}\left(K_{y}^{2}+\lambda _{y}K_{y}\right)\beta }}}}} where λ x {\displaystyle \lambda _{x}} and λ y {\displaystyle \lambda _{y}} are regularization parameters. KCCA has proven effective for tasks such as cross-modal retrieval and semantic analysis, though it faces computational challenges with large datasets due to its O ( n 2 ) {\displaystyle O(n^{2})} memory requirement for sorting kernel matrices. KCCA was proposed independently by several researchers. ==== Deep CCA ==== Deep canonical correlation analysis (DCCA), introduced in 2013, employs neural networks to learn nonlinear transformations for maximizing the correlation between modalities. DCCA uses separate neural networks f x {\displaystyle f_{x}} and f y {\displaystyle f_{y}} for each modality to transform the original data before applying CCA: max W x , W y , θ x , θ y corr ⁡ ( f x ( X ; θ x ) , f y ( Y ; θ y ) ) {\displaystyle \max _{W_{x},W_{y},\theta _{x},\theta _{y}}\operatorname {corr} \left(f_{x}(X;\theta _{x}),f_{y}(Y;\theta _{y})\right)} where θ x {\displaystyle \theta _{x}} and θ y {\displaystyle \theta _{y}} represent the parameters of the neural networks, and W x {\displaystyle W_{x}} and W y {\displaystyle W_{y}} are the CCA projection matrices. The correlation objective is computed as: corr ⁡ ( H x , H y ) = tr ⁡ ( T − 1 / 2 H x T H y S − 1 / 2 ) {\displaystyle \operatorname {corr} (H_{x},H_{y})=\operatorname {tr} \left(T^{-1/2}H_{x}^{T}H_{y}S^{-1/2}\right)} where H x = f x ( X ) {\displaystyle H_{x}=f_{x}(X)} and H y = f y ( Y ) {\displaystyle H_{y}=f_{y}(Y)} are the network outputs, T = H x T H x + r x I {\displaystyle T=H_{x}^{T}H_{x}+r_{x}I} , S = H y T H y + r y I {\displaystyle S=H_{y}^{T}H_{y}+r_{y}I} and r x , r y {\displaystyle r_{x},r_{y}} are the regularization parameters. DCCA overcomes the limitations of linear CCA and kernel CCA by learning complex nonlinear relationships while maintaining computational efficiency for large datasets through mini-batch optimization. === Graph-based methods === Graph-based approaches for multimodal representation learning leverage graph structure to model relationships between entities across different modalities. These methods typically represent each modality as a graph and then learn embedding that preserve cross-modal similarities, enabling more effective joint representation of heterogeneous data. One such method is cross-modal graph neural networks (CMGNNs) that extend traditional graph neural networks (GNNs) to handle data from multiple modalities by constructing graphs that capture both intra-modal and inter-modal relationships. These networks model interactions across modalities by representing them as nodes and their relationships as edges. Other graph-based methods include Probabilistic Graphical Models (PGMs) such as deep belief networks (DBN) and deep Boltzmann machines (DBM). These models can learn a joint representation across modalities, for instance, a multimodal DBN achieves this by adding a shared restricted Boltzmann Machine (RBM) hidden layer on top of modality-specific DBNs. Additionally, the structure of data in some domains like Human-Computer Interaction (HCI), such as the view hierarchy of app screens, can potentially be modeled using graph-like structures. The field of graph representation learning is also relevant, with ongoing progress in developing evaluation benchmarks. === Diffusion maps === Another set of methods relevant to multimodal representation learning are based on diffusion maps and their extensions to handle multiple modalities. ==== Multi-view diffusion maps ==== Multi-view diffusion maps address the challenge of achieving multi-view dimensionality reduction by effectively utilizing the availability of multiple views to extract a coherent low-dimensional representation of the data. The core idea is to exploit both the intrinsic relations within each view and the mutual relations between the different views, defining a cross-view model where a random walk process implicitly hops between objects in different views. A multi-view kernel matrix is constructed by combining these relations, defining a cross-view diffusion process and associ

    Read more →
  • Highway network

    Highway network

    In machine learning, the Highway Network was the first working very deep feedforward neural network with hundreds of layers, much deeper than previous neural networks. It uses skip connections modulated by learned gating mechanisms to regulate information flow, inspired by long short-term memory (LSTM) recurrent neural networks. The advantage of the Highway Network over other deep learning architectures is its ability to overcome or partially prevent the vanishing gradient problem, thus improving its optimization. Gating mechanisms are used to facilitate information flow across the many layers ("information highways"). Highway Networks have found use in text sequence labeling and speech recognition tasks. In 2014, the state of the art was training deep neural networks with 20 to 30 layers. Stacking too many layers led to a steep reduction in training accuracy, known as the "degradation" problem. In 2015, two techniques were developed to train such networks: the Highway Network (published in May), and the residual neural network, or ResNet (December). ResNet behaves like an open-gated Highway Net. == Model == The model has two gates in addition to the H ( W H , x ) {\displaystyle H(W_{H},x)} gate: the transform gate T ( W T , x ) {\displaystyle T(W_{T},x)} and the carry gate C ( W C , x ) {\displaystyle C(W_{C},x)} . The latter two gates are non-linear transfer functions (specifically sigmoid by convention). The function H {\displaystyle H} can be any desired transfer function. The carry gate is defined as: C ( W C , x ) = 1 − T ( W T , x ) {\displaystyle C(W_{C},x)=1-T(W_{T},x)} while the transform gate is just a gate with a sigmoid transfer function. == Structure == The structure of a hidden layer in the Highway Network follows the equation: y = H ( x , W H ) ⋅ T ( x , W T ) + x ⋅ C ( x , W C ) = H ( x , W H ) ⋅ T ( x , W T ) + x ⋅ ( 1 − T ( x , W T ) ) {\displaystyle {\begin{aligned}y=H(x,W_{H})\cdot T(x,W_{T})+x\cdot C(x,W_{C})\\=H(x,W_{H})\cdot T(x,W_{T})+x\cdot (1-T(x,W_{T}))\end{aligned}}} == Related work == Sepp Hochreiter analyzed the vanishing gradient problem in 1991 and attributed to it the reason why deep learning did not work well. To overcome this problem, Long Short-Term Memory (LSTM) recurrent neural networks have residual connections with a weight of 1.0 in every LSTM cell (called the constant error carrousel) to compute y t + 1 = F ( x t ) + x t {\textstyle y_{t+1}=F(x_{t})+x_{t}} . During backpropagation through time, this becomes the residual formula y = F ( x ) + x {\textstyle y=F(x)+x} for feedforward neural networks. This enables training very deep recurrent neural networks with a very long time span t. A later LSTM version published in 2000 modulates the identity LSTM connections by so-called "forget gates" such that their weights are not fixed to 1.0 but can be learned. In experiments, the forget gates were initialized with positive bias weights, thus being opened, addressing the vanishing gradient problem. As long as the forget gates of the 2000 LSTM are open, it behaves like the 1997 LSTM. The Highway Network of May 2015 applies these principles to feedforward neural networks. It was reported to be "the first very deep feedforward network with hundreds of layers". It is like a 2000 LSTM with forget gates unfolded in time, while the later Residual Nets have no equivalent of forget gates and are like the unfolded original 1997 LSTM. If the skip connections in Highway Networks are "without gates," or if their gates are kept open (activation 1.0), they become Residual Networks. The residual connection is a special case of the "short-cut connection" or "skip connection" by Rosenblatt (1961) and Lang & Witbrock (1988) which has the form x ↦ F ( x ) + A x {\displaystyle x\mapsto F(x)+Ax} . Here the randomly initialized weight matrix A does not have to be the identity mapping. Every residual connection is a skip connection, but almost all skip connections are not residual connections. The original Highway Network paper not only introduced the basic principle for very deep feedforward networks, but also included experimental results with 20, 50, and 100 layers networks, and mentioned ongoing experiments with up to 900 layers. Networks with 50 or 100 layers had lower training error than their plain network counterparts, but no lower training error than their 20 layers counterpart (on the MNIST dataset, Figure 1 in ). No improvement on test accuracy was reported with networks deeper than 19 layers (on the CIFAR-10 dataset; Table 1 in ). The ResNet paper, however, provided strong experimental evidence of the benefits of going deeper than 20 layers. It argued that the identity mapping without modulation is crucial and mentioned that modulation in the skip connection can still lead to vanishing signals in forward and backward propagation (Section 3 in ). This is also why the forget gates of the 2000 LSTM were initially opened through positive bias weights: as long as the gates are open, it behaves like the 1997 LSTM. Similarly, a Highway Net whose gates are opened through strongly positive bias weights behaves like a ResNet. The skip connections used in modern neural networks (e.g., Transformers) are dominantly identity mappings.

    Read more →
  • Mobile cloud computing

    Mobile cloud computing

    Mobile Cloud Computing (MCC) is the combination of cloud computing and mobile computing to bring rich computational resources to mobile users, network operators, as well as cloud computing providers. The ultimate goal of MCC is to enable execution of rich mobile applications on a plethora of mobile devices, with a rich user experience. MCC provides business opportunities for mobile network operators as well as cloud providers. More comprehensively, MCC can be defined as "a rich mobile computing technology that leverages unified elastic resources of varied clouds and network technologies toward unrestricted functionality, storage, and mobility to serve a multitude of mobile devices anywhere, anytime through the channel of Ethernet or Internet regardless of heterogeneous environments and platforms based on the pay-as-you-use principle." == Architecture == MCC uses computational augmentation approaches (computations are executed remotely instead of on the device) by which resource-constraint mobile devices can utilize computational resources of varied cloud-based resources. In MCC, there are four types of cloud-based resources, namely distant immobile clouds, proximate immobile computing entities, proximate mobile computing entities, and hybrid (combination of the other three model). Giant clouds such as Amazon EC2 are in the distant immobile groups whereas cloudlet or surrogates are member of proximate immobile computing entities. Smartphones, tablets, handheld devices, and wearable computing devices are part of the third group of cloud-based resources which is proximate mobile computing entities. Vodafone, Orange and Verizon have started to offer cloud computing services for companies. == Challenges == In the MCC landscape, an amalgam of mobile computing, cloud computing, and communication networks (to augment smartphones) creates several complex challenges such as Mobile Computation Offloading, Seamless Connectivity, Long WAN Latency, Mobility Management, Context-Processing, Energy Constraint, Vendor/data Lock-in, Security and Privacy, Elasticity that hinder MCC success and adoption. === Open research issues === Although significant research and development in MCC is available in the literature, efforts in the following domains is still lacking: Architectural issues: A reference architecture for heterogeneous MCC environment is a crucial requirement for unleashing the power of mobile computing towards unrestricted ubiquitous computing. Energy-efficient transmission: MCC requires frequent transmissions between cloud platform and mobile devices, due to the stochastic nature of wireless networks, the transmission protocol should be carefully designed. Context-awareness issues: Context-aware and socially-aware computing are inseparable traits of contemporary handheld computers. To achieve the vision of mobile computing among heterogeneous converged networks and computing devices, designing resource-efficient environment-aware applications is an essential need. Live VM migration issues: Executing resource-intensive mobile application via Virtual Machine (VM) migration-based application offloading involves encapsulation of application in VM instance and migrating it to the cloud, which is a challenging task due to additional overhead of deploying and managing VM on mobile devices. Mobile communication congestion issues: Mobile data traffic is tremendously hiking by ever increasing mobile user demands for exploiting cloud resources which impact on mobile network operators and demand future efforts to enable smooth communication between mobile and cloud endpoints. Trust, security, and privacy issues: Trust is an essential factor for the success of the burgeoning MCC paradigm. It is because the data along with code/component/application/complete VM is offloaded to the cloud for execution. Moreover, just like software and mobile application piracy, the MCC application development models are also affected by the piracy issue. Pirax is known to be the first specialized framework for controlling application piracy in MCC requirements == MCC research groups and activities == Several academic and industrial research groups in MCC have been emerging since last few years. Some of the MCC research groups in academia with large number of researchers and publications include: MDC, Mobile and Distributed Computing research group is at Faculty of Computer and Information Science, King Saud University. MDC research group focuses on architectures, platforms, and protocols for mobile and distributed computing. The group has developed algorithms, tools, and technologies which offer energy efficient, fault tolerant, scalable, secure, and high performance computing on mobile devices. MobCC lab, Faculty of Computer Science and Information Technology, University Malaya. The lab was established in 2010 under the High Impact Research Grant, Ministry of Higher Education, Malaysia. It has 17 researchers and has track of 22 published articles in international conference and peer-reviewed CS journals. ICCLAB, Zürich University of Applied Sciences has a segment working on MCC. The InIT Cloud Computing Lab is a research lab within the Institute of Applied Information Technology (InIT) of Zürich University of Applied Sciences (ZHAW). It covers topic areas across the entire cloud computing technology stack. Mobile & Cloud Lab, Institute of Computer Science, University of Tartu. Mobile & Cloud Lab conducts research and teaching in the mobile computing and cloud computing domains. The research topics of the group include cloud computing, mobile application development, mobile cloud, mobile web services and migrating scientific computing and enterprise applications to the cloud. SmartLab, Data Management Systems Laboratory, Department of Computer Science, University of Cyprus. SmartLab is a first-of-a-kind open cloud of smartphones that enables a new line of systems-oriented mobile computing research. Mobile Cloud Networking: Mobile Cloud Networking (MCN) was an EU FP7 Large-scale Integrating Project (IP, 15m Euro) funded by the European Commission. The MCN project was launched in November 2012 for the period of 36 month. The project was coordinated by SAP Research and the ICCLab at the Zurich University of Applied Science. In total 19 partners from industry and academia established the first vision of Mobile Cloud Computing. The project was primarily motivated by an ongoing transformation that drives the convergence between the Mobile Communications and Cloud Computing industry enabled by the Internet and is considered the first pioneer in the area of Network Function Virtualization.

    Read more →
  • Belief–desire–intention model

    Belief–desire–intention model

    For popular psychology, the belief–desire–intention (BDI) model of human practical reasoning was developed by Michael Bratman as a way of explaining future-directed intention. BDI is fundamentally reliant on folk psychology (the 'theory theory'), which is the notion that our mental models of the world are theories. It was used as a basis for developing the belief–desire–intention software model. == Applications == BDI was part of the inspiration behind the BDI software architecture, which Bratman was also involved in developing. Here, the notion of intention was seen as a way of limiting time spent on deliberating about what to do, by eliminating choices inconsistent with current intentions. BDI has also aroused some interest in psychology. BDI formed the basis for a computational model of childlike reasoning CRIBB.

    Read more →
  • Cognitive philology

    Cognitive philology

    Cognitive philology is the science that studies written and oral texts as the product of human mental processes. Studies in cognitive philology compare documentary evidence emerging from textual investigations with results of experimental research, especially in the fields of cognitive and ecological psychology, neurosciences and artificial intelligence. "The point is not the text, but the mind that made it". Cognitive Philology aims to foster communication between literary, textual, philological disciplines on the one hand and researches across the whole range of the cognitive, evolutionary, ecological and human sciences on the other. Cognitive philology: investigates transmission of oral and written text, and categorization processes which lead to classification of knowledge, mostly relying on the information theory; studies how narratives emerge in so called natural conversation and selective process which lead to the rise of literary standards for storytelling, mostly relying on embodied semantics; explores the evolutive and evolutionary role played by rhythm and metre in human ontogenetic and phylogenetic development and the pertinence of the semantic association during processing of cognitive maps; Provides the scientific ground for multimedia critical editions of literary texts. Among the founding thinkers and noteworthy scholars devoted to such investigations are: Alan Richardson: Studies Theory of Mind in early-modern and contemporary literature. Anatole Pierre Fuksas Benoît de Cornulier David Herman: Professor of English at North Carolina State University and an adjunct professor of linguistics at Duke University. He is the author of "Universal Grammar and Narrative Form" and the editor of "Narratologies: New Perspectives on Narrative Analysis". Domenico Fiormonte François Recanati Gilles Fauconnier, a professor in Cognitive science at the University of California, San Diego. He was one of the founders of cognitive linguistics in the 1970s through his work on pragmatic scales and mental spaces. His research explores the areas of conceptual integration and compressions of conceptual mappings in terms of the emergent structure in language. Julián Santano Moreno Luca Nobile Manfred Jahn in Germany Mark Turner Paolo Canettieri

    Read more →
  • Workplace impact of artificial intelligence

    Workplace impact of artificial intelligence

    The impact of artificial intelligence on workers includes both applications to improve worker safety and health, and potential hazards that must be controlled. One potential application is using AI to eliminate hazards by removing humans from hazardous situations that involve risk of stress, overwork, or musculoskeletal injuries. Predictive analytics may also be used to identify conditions that may lead to hazards such as fatigue, repetitive strain injuries, or toxic substance exposure, leading to earlier interventions. Another is to streamline workplace safety and health workflows through automating repetitive tasks, enhancing safety training programs through virtual reality, or detecting and reporting near misses. When used in the workplace, AI also presents the possibility of new hazards. These may arise from machine learning techniques leading to unpredictable behavior and inscrutability in their decision-making, or from cybersecurity and information privacy issues. Many hazards of AI are psychosocial due to its potential to cause changes in work organization. These include increased monitoring leading to micromanagement, algorithms unintentionally or intentionally mimicking undesirable human biases, and assigning blame for machine errors to the human operator instead. AI may also lead to physical hazards in the form of human–robot collisions, and ergonomic risks of control interfaces and human–machine interactions. Hazard controls include cybersecurity and information privacy measures, communication and transparency with workers about data usage, and limitations on collaborative robots. From a workplace safety and health perspective, only "weak" or "narrow" AI that is tailored to a specific task is relevant, as there are many examples that are currently in use or expected to come into use in the near future. Certain digital technologies are predicted to result in job losses. Starting in the 2020s, the adoption of modern robotics has led to net employment growth. However, many businesses anticipate that automation, or employing robots would result in job losses in the future. This is especially true for companies in Central and Eastern Europe. Other digital technologies, such as platforms or big data, are projected to have a more neutral impact on employment. A large number of tech workers have been laid off starting in 2023; many such job cuts have been attributed to artificial intelligence. == Health and safety applications == In order for any potential AI health and safety application to be adopted, it requires acceptance by both managers and workers. For example, worker acceptance may be diminished by concerns about information privacy, or from a lack of trust and acceptance of the new technology, which may arise from inadequate transparency or training. Alternatively, managers may emphasize increases in economic productivity rather than gains in worker safety and health when implementing AI-based systems. === Eliminating hazardous tasks === AI may increase the scope of work tasks where a worker can be removed from a situation that carries risk. In a sense, while traditional automation can replace the functions of a worker's body with a robot, AI effectively replaces the functions of their brain with a computer. Hazards that can be avoided include stress, overwork, musculoskeletal injuries, and boredom. This can expand the range of affected job sectors into white-collar and service sector jobs such as in medicine, finance, and information technology. === Analytics to reduce risk === Machine learning is used for people analytics to make predictions about worker behavior to assist management decision-making, such as hiring and performance assessment. These could also be used to improve worker health. The analytics may be based on inputs such as online activities, monitoring of communications, location tracking, and voice analysis and body language analysis of filmed interviews. For example, sentiment analysis may be used to spot fatigue to prevent overwork. Decision support systems have a similar ability to be used to, for example, prevent industrial disasters or make disaster response more efficient. For manual material handling workers, predictive analytics and artificial intelligence may be used to reduce musculoskeletal injury. Traditional guidelines are based on statistical averages and are geared towards anthropometrically typical humans. The analysis of large amounts of data from wearable sensors may allow real-time, personalized calculation of ergonomic risk and fatigue management, as well as better analysis of the risk associated with specific job roles. Wearable sensors may also enable earlier intervention against exposure to toxic substances than is possible with area or breathing zone testing on a periodic basis. Furthermore, the large data sets generated could improve workplace health surveillance, risk assessment, and research. === Streamlining safety and health workflows === AI has also been used to attempt to make the workplace safety and health workflow more efficient. One example is coding of workers' compensation claims, which are submitted in a prose narrative form and must manually be assigned standardized codes. AI is being investigated to perform this task faster, more cheaply, and with fewer errors. == Hazards == There are several broad aspects of AI that may give rise to specific hazards. The risks depend on implementation rather than the mere presence of AI. Systems using sub-symbolic AI such as machine learning may behave unpredictably and are more prone to inscrutability in their decision-making. This is especially true if a situation is encountered that was not part of the AI's training dataset, and is exacerbated in environments that are less structured. Undesired behavior may also arise from flaws in the system's perception (arising either from within the software or from sensor degradation), knowledge representation and reasoning, or from software bugs. They may arise from improper training, such as a user applying the same algorithm to two problems that do not have the same requirements. Machine learning applied during the design phase may have different implications than that applied at runtime. Systems using symbolic AI are less prone to unpredictable behavior. The use of AI also increases cybersecurity risks relative to platforms that do not use AI, and information privacy concerns about collected data may pose a hazard to workers. === Psychosocial === Psychosocial hazards are those that arise from the way work is designed, organized, and managed, or its economic and social contexts, rather than arising from a physical substance or object. They cause not only psychiatric and psychological outcomes such as occupational burnout, anxiety disorders, and depression, but they can also cause physical injury or illness such as cardiovascular disease or musculoskeletal injury. Many hazards of AI are psychosocial in nature due to its potential to cause changes in work organization, in terms of increasing complexity and interaction between different organizational factors. However, psychosocial risks are often overlooked by designers of advanced manufacturing systems. Einola and Khoreva explore how different organizational groups perceive and interact with AI technologies. Their research shows that successful AI integration depends on human ownership and contextual understanding. They caution against blind technological optimism and stress the importance of tailoring AI use to specific workplace ecosystems. This perspective reinforces the need for inclusive design and transparent implementation strategies. ==== Changes in work practices ==== Over-reliance on AI tools may lead to deskilling of some professions. When AI becomes a substitute for traditional peer collaboration and mentorship, there is a risk of diminishing opportunities for interpersonal skill development and team-based learning. Increased monitoring may lead to micromanagement and thus to stress and anxiety. A perception of surveillance may also lead to stress. Controls for these include consultation with worker groups, extensive testing, and attention to introduced bias. Wearable sensors, activity trackers, and augmented reality may also lead to stress from micromanagement, both for assembly line workers and gig workers. Gig workers also lack the legal protections and rights of formal workers. Newell & Marabelli argue that AI alters power dynamics and employee autonomy, requiring a more nuanced understanding of its social and organizational implications. There is also the risk of people being forced to work at a robot's pace, or to monitor robot performance at nonstandard hours. A 2025 preprint paper based on users' interactions with the AI chatbot Microsoft Copilot identified forty jobs that the author's claimed had high overlaps with the capabilities of AI. Some media outlets used this paper to report on jobs becoming obsolete. Cri

    Read more →
  • Inception (deep learning architecture)

    Inception (deep learning architecture)

    Inception is a family of convolutional neural network (CNN) for computer vision, introduced by researchers at Google in 2014 as GoogLeNet (later renamed Inception v1). The series was historically important as an early CNN that separates the stem (data ingest), body (data processing), and head (prediction), an architectural design that persists in all modern CNN. == Version history == === Inception v1 === In 2014, a team at Google developed the GoogLeNet architecture, an instance of which won the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The name came from the LeNet of 1998, since both LeNet and GoogLeNet are CNNs. They also called it "Inception" after a "we need to go deeper" internet meme, a phrase from Inception (2010) the film. Because later, more versions were released, the original Inception architecture was renamed again as "Inception v1". The models and the code were released under Apache 2.0 license on GitHub. The Inception v1 architecture is a deep CNN composed of 22 layers. Most of these layers were "Inception modules". The original paper stated that Inception modules are a "logical culmination" of Network in Network and (Arora et al, 2014). Since Inception v1 is deep, it suffered from the vanishing gradient problem. The team solved it by using two "auxiliary classifiers", which are linear-softmax classifiers inserted at 1/3-deep and 2/3-deep within the network, and the loss function is a weighted sum of all three: L = 0.3 L a u x , 1 + 0.3 L a u x , 2 + L r e a l {\displaystyle L=0.3L_{aux,1}+0.3L_{aux,2}+L_{real}} These were removed after training was complete. This was later solved by the ResNet architecture. The architecture consists of three parts stacked on top of one another: The stem (data ingestion): The first few convolutional layers perform data preprocessing to downscale images to a smaller size. The body (data processing): The next many Inception modules perform the bulk of data processing. The head (prediction): The final fully-connected layer and softmax produces a probability distribution for image classification. This structure is used in most modern CNN architectures. === Inception v2 === Inception v2 was released in 2015, in a paper that is more famous for proposing batch normalization. It had 13.6 million parameters. It improves on Inception v1 by adding batch normalization, and removing dropout and local response normalization which they found became unnecessary when batch normalization is used. === Inception v3 === Inception v3 was released in 2016. It improves on Inception v2 by using factorized convolutions. As an example, a single 5×5 convolution can be factored into 3×3 stacked on top of another 3×3. Both has a receptive field of size 5×5. The 5×5 convolution kernel has 25 parameters, compared to just 18 in the factorized version. Thus, the 5×5 convolution is strictly more powerful than the factorized version. However, this power is not necessarily needed. Empirically, the research team found that factorized convolutions help. It also uses a form of dimension-reduction by concatenating the output from a convolutional layer and a pooling layer. As an example, a tensor of size 35 × 35 × 320 {\displaystyle 35\times 35\times 320} can be downscaled by a convolution with stride 2 to 17 × 17 × 320 {\displaystyle 17\times 17\times 320} , and by maxpooling with pool size 2 × 2 {\displaystyle 2\times 2} to 17 × 17 × 320 {\displaystyle 17\times 17\times 320} . These are then concatenated to 17 × 17 × 640 {\displaystyle 17\times 17\times 640} . Other than this, it also removed the lowest auxiliary classifier during training. They found that the auxiliary head worked as a form of regularization. They also proposed label-smoothing regularization in classification. For an image with label c {\displaystyle c} , instead of making the model to predict the probability distribution δ c = ( 0 , 0 , … , 0 , 1 ⏟ c -th entry , 0 , … , 0 ) {\displaystyle \delta _{c}=(0,0,\dots ,0,\underbrace {1} _{c{\text{-th entry}}},0,\dots ,0)} , they made the model predict the smoothed distribution ( 1 − ϵ ) δ c + ϵ / K {\displaystyle (1-\epsilon )\delta _{c}+\epsilon /K} where K {\displaystyle K} is the total number of classes. === Inception v4 === In 2017, the team released Inception v4, Inception ResNet v1, and Inception ResNet v2. Inception v4 is an incremental update with even more factorized convolutions, and other complications that were empirically found to improve benchmarks. Inception ResNet v1 and v2 are both modifications of Inception v4, where residual connections are added to each Inception module, inspired by the ResNet architecture. === Xception === Xception ("Extreme Inception") was published in 2017. It is a linear stack of depthwise separable convolution layers with residual connections. The design was proposed on the hypothesis that in a CNN, the cross-channels correlations and spatial correlations in the feature maps can be entirely decoupled. Training each network took 3 days on 60 K80 GPUs, or approximately 0.5 petaFLOP-days.

    Read more →
  • Double descent

    Double descent

    Double descent in statistics and machine learning is the phenomenon where a model's error rate on the test set initially decreases with the number of parameters, then peaks, then decreases again. This phenomenon has been considered surprising, as it contradicts assumptions about overfitting in classical machine learning. The increase usually occurs near the interpolation threshold, where the number of parameters is the same as the number of training data points (the model is just large enough to fit the training data). Or, more precisely, it is the maximum number of samples on which the model/training procedure achieves approximately on average 0 training error. == History == Early observations of what would later be called double descent in specific models date back to 1989. The term "double descent" was coined by Belkin et. al. in 2019, when the phenomenon gained popularity as a broader concept exhibited by many models. The latter development was prompted by a perceived contradiction between the conventional wisdom that too many parameters in the model result in a significant overfitting error (an extrapolation of the bias–variance tradeoff), and the empirical observations in the 2010s that some modern machine learning techniques tend to perform better with larger models. == Theoretical models == Double descent occurs in linear regression with isotropic Gaussian covariates and isotropic Gaussian noise. A model of double descent at the thermodynamic limit has been analyzed using the replica trick, and the result has been confirmed numerically. A number of works have suggested that double descent can be explained using the concept of effective dimension: While a network may have a large number of parameters, in practice only a subset of those parameters are relevant for generalization performance, as measured by the local Hessian curvature. This explanation is formalized through PAC-Bayes compression-based generalization bounds, which show that less complex models are expected to generalize better under a Solomonoff prior.

    Read more →
  • Artificial wisdom

    Artificial wisdom

    Artificial wisdom (AW) is an artificial intelligence (AI) system which is able to display the human traits of wisdom and morals while being able to contemplate its own “endpoint”. Artificial wisdom can be described as artificial intelligence reaching the top-level of decision-making when confronted with the most complex challenging situations. The term artificial wisdom is used when the "intelligence" is based on more than by chance collecting and interpreting data, but by design enriched with smart and conscience strategies that wise people would use. == Overview == The goal of artificial wisdom is to create artificial intelligence that can successfully replicate the “uniquely human trait[s]” of having wisdom and morals as closely as possible. Thus, artificial wisdom, must “incorporate [the] ethical and moral considerations” of the data it uses. There are also many significant ethical and legal implications of AW which are compounded by the rapid advances in AI and related technologies alongside the lack of the development of ethics, guidelines, and regulations without the oversight of any kind of overarching advisory board. Additionally, there are challenges in how to develop, test, and implement AW in real world scenarios. Existing tests do not test the internal thought process by which a computer system reaches its conclusion, only the result of said process. When examining computer-aided wisdom; the partnership of artificial intelligence and contemplative neuroscience, concerns regarding the future of artificial intelligence shift to a more optimistic viewpoint. This artificial wisdom forms the basis of Louis Molnar's monographic article on artificial philosophy, where he coined the term and proposes how artificial intelligence might view its place in the grand scheme of things. == Definitions == There are no universal or standardized definitions for human intelligence, artificial intelligence, human wisdom, or artificial wisdom. However, the DIKW pyramid, describes the continuum of relationship between data, information, knowledge, and wisdom, puts wisdom at the highest level in its hierarchy. Gottfredson defines intelligence as “the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience”. Definitions for wisdom typically include requiring: The ability for emotional regulation, Pro-social behaviors (e.g., empathy, compassion, and altruism), Self-reflection, “A balance between decisiveness and acceptance of uncertainty and diversity of perspectives, and social advising.” As previously defined, Artificial Wisdom would then be an AI system which is able to solve problems via “an understanding of…context, ethics and moral principles,” rather than simple pre-defined inputs or “learned patterns.” Some scientists have also considered the field of artificial consciousness. However, Jeste states that “…it is generally agreed that only humans can have consciousness, autonomy, will, and theory of mind.” An artificially wise system must also be able to contemplate its end goal and recognize its own ignorance. Additionally, to contemplate its end goal, a wise system must have a “correct conception of worthwhile goals (broadly speaking) or well-being (narrowly speaking)”. "Stephen Grimm further suggests that the following three types of knowledge are individually necessary for wisdom: first, "knowledge of what is good or important for well-being", second, "knowledge of one’s standing, relative to what is good or important for well-being", and third, "knowledge of a strategy for obtaining what is good or important for wellbeing."" == Problems == There are notable problems with attempting to create an artificially wise system. Consciousness, autonomy, and will are considered strictly human features. === Values === There are significant ethical and philosophical issues when attempting to create an intelligent or a wise system. Notably, whose moral values will be used to train the system to be wise. Differing moral values and prejudice can already be seen from various organizations and governments in artificial intelligence. Deployment strategies and values of Artificial Wisdom will conflict between leaders, companies, and countries. Nusbaum states, “When values are in conflict, leaders often make choices that are clever or smart about their own needs, but are often not wise.” === Ethics === Science fiction author Isaac Asimov realized the need to control the technology in the 1940s when he wrote the three laws of robotics as follows: A robot may not injure a human directly or indirectly. A robot must obey human’s orders. A robot should seek to protect its own existence. Additionally, the pace at which technology is rapidly advancing artificial intelligence and thus the need for artificial wisdom may “have outpaced the development of societal guidelines have raised serious questions about the ethics and morality of AI, and called for international oversight and regulations to ensure safety.” === Principal impossibility === One argument, coined by Tsai as the “argument against AW,” or AAAW, postulates the principal impossibility of Artificial Wisdom. The argument is based on the philosophical differences between practical wisdom, also called phronesis, and practical intelligence. Said difference isn’t in “selecting the correct means, but reasoning correctly about what ends to follow”. Tsai puts the argument into a logical proposition as follows: “(P1) An agent is genuinely wise only if the agent can deliberate about the final goal of the domain in which the agent is situated.” “(P2) An intelligent agent cannot deliberate about the final goal of the domain in which the agent is situated.” “(C1) An intelligent agent cannot be genuinely wise.” “(P3) An AW is, at its core, intelligent.” “(C2) An AW cannot be genuinely wise.”

    Read more →
  • CHAOS (chess)

    CHAOS (chess)

    CHAOS (Chess Heuristics and Other Stuff) is a chess playing program that was developed by programmers working at the RCA Systems Programming division in the late 1960s. It played competitively in computer chess competitions in the 1970s and 1980s. It differed from other programs of that era in its look-ahead philosophy, choosing to use chess knowledge to evaluate fewer positions and continuations as opposed to simple evaluations that relied on deep look-ahead to avoid bad moves. == Introduction == CHAOS was originally developed by Ira Ruben, Fred Swartz, Victor Berman, Joe Winograd and William Toikka while working at RCA in Cinnaminson, NJ. Its name is an acronym for 'Chess Heuristics and Other Stuff.' Program development moved to the Computing Center of the University of Michigan when Swartz changed jobs, and Mike Alexander joined the development group. Swartz, Alexander and Berman were continuously group members from that point onward in CHAOS' evolution, as others of the original authors left and new members contributed episodically. Chess Senior Master Jack O'Keefe contributed to CHAOS' development from about 1980 onwards. CHAOS was written in Fortran, except for low-level board representation manipulations written in assembly language or C. Due to this portability, it ran on RCA, Univac and IBM-compatible mainframes in its lifetime. CHAOS heralds from the mainframe computing era when only machines of that capacity were able to play at a high level. Consequently, development and testing could only take place at off-peak times for production use of the machine. In a competition, CHAOS had to run on a dedicated mainframe with a telephone link to the match venue. In its later years, CHAOS ran on computers on the machine assembly floor of Amdahl Corporation on MTS. == Background == === Chess and artificial intelligence === Mathematicians Claude Shannon and Alan Turing, working separately, were the first to view playing chess as a challenge to machines. Working for AT&T / Bell Labs with its access to telephone switching equipment, Shannon built a relay-based machine that learned how to work its way through a two-dimensional, 5x5 cell maze in 1949. Shannon viewed this as an analogue of the way that organisms learn things about their natural environment. There is a random element to searching it, a memory element to benefit from the search outcome, and a reward element that reinforces learning when the global outcome is favorable to the organism. Soon afterward, Shannon wrote a mathematical analysis of the game of chess, published in 1950. Like with the maze, he broke down game play into the necessary elements for reinforcement learning. Associated with each board configuration a move will be made from, there is a numerical score. To decide what move to make, a player wants to maximize their own position's score after the move and to minimize their opponent's score (a minimax view). Since there are about 32 possible moves at each of the early stages of the game, and about 40 moves and responses in each game, then there are about 32 80 {\displaystyle 32^{80}} or about 10 120 {\displaystyle 10^{120}} possible games - an impossibly large set to evaluate completely. Therefore, there must be a way to limit the number of moves to look ahead for to find the best one. Reducing the game to these few key elements provided a way to think about human intelligence in general. Shannon became part of a wider group using computing machines to mimic aspects of human intelligence that grew into the general idea of artificial intelligence. (Other members of this group were John McCarthy, Herbert Simon, Allen Newell, Alan Kotok, Alex Bernstein and Richard Greenblatt.) The paradigm that evolved was that there was a quantification of the position on the board into a score, an evaluation method to find favorable outcomes (minimax, later alpha-beta pruning), and a strategy to manage the combinatorial explosion of the look-ahead possibilities. By the early 1960s, there were computer programs that played chess at a rudimentary level. They used very simple evaluation functions for each position and tried to search as far forward as was practical given the time constraints and available compute power. Naturally, programmers optimized their code to use the available computing resources. This led to a major philosophical divide among chess programs: those that tried to evaluate as many positions as possible, and those that tried to evaluate the most promising move sequences as deeply as possible. CHAOS was firmly in the camp believing only the most promising moves should be evaluated in depth. Said Swartz, "The 'brute force people' ... look at every (possible move) no matter what garbage it is. Most moves are just terrible, terrible moves, and most computing time is being spent on pure garbage." The program spent more time evaluating each board position in the expectation that it would find the most promising lines of play to explore in depth. In 1983, the then-fastest chess program (Belle) evaluated 110,000 positions per second, and typical programs 1000–50,000 per second, whereas CHAOS evaluated about 50-100 per second. === Machine learning and strategies to manage search === From about 1949 onward, Arthur Samuel began work for IBM on machine learning, culminating in a checkers-playing program in 1952 and publications on the topic. Concurrently, Christopher Strachey created Checkers, a program to play the board game of checkers in 1951, but it had no capacity to learn from its play. Checkers was chosen by both authors because it was simpler than chess yet contained the basic characteristics of an intellectual activity, and, in Samuel's view, was a test-bed in which heuristic procedures and learning processes could be evaluated quickly. Checker playing programs introduced the notion of the game tree and evaluating play to various depths to choose the best move. The complexity of chess, however, promoted it to the status of an analogue for human intelligence, and it attracted computer scientists' attention, who referred to it as research into artificial intelligence (AI). Like checkers, it required a numerical assessment of each arrangement of chess pieces on a board. It also required looking ahead to future moves to decide how to play the present position. Due to the enormous number of possible moves, there had to be a way to confine the look-ahead search to the most promising lines of play. From these factors, the notion of minimax score evaluation developed and, later, alpha-beta tree pruning to abandon looking at positions worse than any that have already been examined. === Chess search strategies === The AI community viewed artificial intelligence as comprising two parts: a way to symbolically quantify the knowledge in hand (a chess board position), and a set of heuristics to limit look-ahead to the consequences of a move. The early chess playing programs attempted to look forward as far as possible, perhaps to 3 moves ahead by each player, and to choose the best outcome. This led to the horizon effect, whereby a key move 4 or more moves ahead would be unexamined and therefore missed. Consequently, the programs were quite weak and heuristics to manage the search became important in their development. CHAOS used a selective search strategy with iterative widening. As chess programs evolved, they incorporated books of opening lines of play from historic sources. Nowadays, book moves are catalogued in machine-readable form, but originally programmers had to type them in. CHAOS had an extensive book for its time of around 10,000 moves that O'Keefe helped to develop. A problem with play from an opening book is the behavior of the program when the play leaves the book: the positional advantage may be so subtle that the evaluation scheme may be unable to understand it, leading to very wide and shallow searches to establish a line of play. The horizon effect again plagues move selection after leaving the book. CHAOS mitigated these problems by only using book lines that it could understand, and by relying on cached analyses of continuations out of the book made while the opponent's clock was running. == Game Play History == CHAOS played in twelve ACM computer chess tournaments and four World Computer Chess Championships (WCCC). Its debut was the ACM computer chess tournament in 1973, taking 2nd place. In 1974, it again won 2nd place in the WCCC, defeating the tournament favorite Chess 4.0 but losing to Kaissa. CHAOS was close to winning the 1980 WCCC, but lost to Belle in a playoff. The 1985 ACM computer chess tournament was CHAOS' last competition. One of CHAOS' notable victories was over Chess 4.0 at the 1974 WCCC tournament. Chess 4.0 was unbeaten by any other program up until then. Playing as white, CHAOS made a knight sacrifice (16 Nd4-e6!!) that traded material for open lines of attack and eventually won the game. CHAOS’ authors thought the move was due to a

    Read more →
  • TimeTiger

    TimeTiger

    TimeTiger is a time and project tracking app developed by Indigo Technologies Ltd. in Toronto, Ontario, Canada. Indigo was founded in 1997 and initially released TimeTiger in 1998. == Company == The company was incorporated in 1997 and began operations as a custom software developer. TimeTiger (internally called TaskMaster) was developed as a tool to help with Indigo's own project planning and estimating. After releasing TimeTiger as a commercial product in 1998, Indigo shifted its focus to time and project management solutions. TimeTiger first introduced support for web-based time logging in 2000, to appeal to workers who were not already tracking their time for billing reasons. Subsequent development emphasized project analysis tools. == Features == Web-based electronic time log "To Do" list to monitor project and non-project activities Pivot table report designer Role-based access control == Software integration == Reports can be exported to Microsoft Excel or saved as Excel-compatible HTML files. Microsoft Project files can be imported and exported. A Software Development Kit is available.

    Read more →
  • STIT logic

    STIT logic

    STIT logic (from seeing to it that) is a family of modal and branching-time logics for reasoning about agency and choice. A typical STIT operator has the form [ i s t i t : φ ] {\displaystyle [i\ {\mathsf {stit}}:\varphi ]} , usually read as "agent i {\displaystyle i} sees to it that φ {\displaystyle \varphi } ", and is interpreted in models where agents choose between alternative possible futures. STIT logics are used in action theory, deontic logic, epistemic logic, and the theory of intelligent agents to formalise notions such as "could have done otherwise", responsibility, joint action, and strategic ability in an indeterministic world. == Etymology == The acronym STIT comes from the English phrase "seeing to it that", introduced in influential work by Nuel Belnap and Michael Perloff on the logical analysis of agentive expressions. In this tradition, "to see to it that φ {\displaystyle \varphi } " is treated as a primitive agency operator, rather than being reduced to ordinary modal necessity. == History == Modern STIT logic arose in the 1980s in the context of branching-time semantics and formal theories of agency. Belnap and Perloff's article "Seeing to it that: A canonical form for agentives" introduced the idea of treating expressions of the form "agent i sees to it that φ" as a primitive modal operator, and analysed such sentences using a branching tree of moments and histories. This approach was further developed in a series of papers on indeterminism and agency and provided the conceptual core for later STIT formalisms. In the 1990s the basic formal systems of STIT logic were worked out. Horty and Belnap's influential paper on the deliberative STIT operator distinguished between a "Chellas" STIT that merely records the result of an agent's present choice and a "deliberative" STIT that requires the agent's choice to make a difference, and connected STIT with issues of action, omission, ability and obligation. Around the same time, Ming Xu proved completeness and decidability results for basic STIT systems, including a single-agent logic with Kripke-style semantics and axiomatizations for multi-agent deliberative STIT, thereby establishing STIT as a well-behaved normal modal framework. This early work was systematised in Belnap, Perloff and Xu's monograph Facing the Future: Agents and Choices in Our Indeterminist World, which presents a general branching-time semantics for individual and group STIT operators, discusses independence-of-agents conditions and articulates the metaphysical picture of an indeterministic "tree" of moments. At roughly the same time, Horty's book Agency and Deontic Logic developed deontic STIT logics in which obligations are tied to agents' available choices rather than to static states of affairs, and used the resulting systems to analyse "ought implies can", contrary-to-duty obligations and deontic paradoxes. These works helped to position STIT at the intersection of action theory, temporal logic and deontic logic. From the late 1990s and 2000s onward, STIT logics were combined with epistemic, temporal and strategic modalities. Broersen introduced complete STIT logics for knowledge and action and deontic-epistemic STIT systems that distinguish different modes of mens rea, with applications to responsibility and the specification of multi-agent systems. Work on group and coalitional agency investigated axiomatisations and complexity results for group STIT logics, and related STIT-based analyses of agency to coalition logic and alternating-time temporal logic (ATL) by exhibiting formal embeddings between the frameworks. Explicit temporal operators were added to STIT in so-called temporal STIT logics. Lorini proposed a temporal STIT with "next" and "until" operators along histories and showed how it can be applied to normative reasoning about ongoing behaviour and commitments. Ciuni and Lorini compared different semantics for temporal STIT, clarifying the relationships between branching-time, game-based and epistemic approaches, while Boudou and Lorini gave a semantics for temporal STIT based on concurrent game structures, thus strengthening links with standard models of multi-agent interaction used for ATL and strategy logic. In parallel, complexity-theoretic work by Balbiani, Herzig and Troquard and by Schwarzentruber and co-authors investigated the satisfiability and model-checking problems for various STIT fragments, showing for instance that many expressive group STIT logics are undecidable or of high computational complexity. In the 2010s, STIT ideas were combined with justification logic, imagination operators and refined deontic notions. Justification STIT logics, developed by Olkhovikov and others, merge explicit justifications with STIT-style agency so that producing a proof can itself be treated as an action that brings about knowledge, and they come with completeness and decidability results. Olkhovikov and Wansing introduced STIT imagination logics, together with axiomatic systems and tableau calculi, to model acts of voluntary imagining and their role in doxastic control. Other authors have proposed STIT-based logics of responsibility, blameworthiness and intentionality for use in philosophical and AI settings. Xu's survey article "Combinations of STIT with Ought and Know" (2015) reviews many of these developments and emphasises the interplay between deontic and epistemic STIT logics. Current research on STIT focuses on proof theory, automated reasoning and richer expressive resources. Lyon and van Berkel, building on earlier work on labelled calculi for STIT, have developed cut-free sequent systems and proof-search algorithms that yield syntactic decision procedures for a range of deontic and non-deontic multi-agent STIT logics and support applications such as duty checking and compliance checking in autonomous systems. Sawasaki has proposed first-order cstit-based STIT logics that can distinguish de re and de dicto readings of agency statements and has proved strong completeness results for Hilbert systems over finite models, moving the STIT programme beyond the purely propositional level. Further work investigates interpreted-system and computationally grounded semantics for STIT and its extensions in order to model the behaviour of autonomous agents in multi-agent settings, and proposes STIT-based semantics for epistemic notions based on patterns of information disclosure in interactive systems. == Branching-time semantics == STIT logics are usually interpreted over branching-time models. A standard STIT frame consists of: a non-empty set of moments T {\displaystyle T} , partially ordered by < {\displaystyle <} so that ( T , < ) {\displaystyle (T,<)} forms a tree (every pair of moments with a common predecessor has a greatest lower bound); a set of histories, each history being a maximal linearly ordered subset of T {\displaystyle T} ; a non-empty set of agents A g {\displaystyle Ag} ; for each agent i ∈ A g {\displaystyle i\in Ag} and moment m {\displaystyle m} , a choice function c h o i c e i m {\displaystyle {\mathsf {choice}}_{i}^{m}} that partitions the set of histories passing through m {\displaystyle m} into choice cells. The idea is that a moment represents a time at which choices are made, and histories represent complete possible future courses of events. At each moment, each agent's choice corresponds to selecting one of the available cells of histories determined by their choice function. Formulas are evaluated at pairs ( m , h ) {\displaystyle (m,h)} of a moment and a history through that moment (sometimes written m / h {\displaystyle m/h} ). A valuation assigns truth-values to atomic propositions at such indices; Boolean connectives are interpreted pointwise as in Kripke-style modal logic. == Chellas and deliberative STIT operators == Several STIT operators have been distinguished in the literature. A common approach uses two closely related operators, often called Chellas STIT and deliberative STIT. Let H m {\displaystyle H_{m}} be the set of histories passing through a moment m {\displaystyle m} , and write H m {\displaystyle H_{m}} ⟦ φ ⟧ m = { h ∈ H m ∣ M , m / h ⊨ φ } {\displaystyle {\text{⟦}}\varphi {\text{⟧}}_{m}=\{h\in H_{m}\mid M,m/h\models \varphi \}} for the set of histories at m {\displaystyle m} where φ {\displaystyle \varphi } holds. The Chellas STIT operator, often written [ i c s t i t : φ ] {\displaystyle [i\ {\mathsf {cstit}}:\varphi ]} , is given by M , m / h ⊨ [ i c s t i t : φ ] iff c h o i c e i m ( h ) ⊆ ⟦ φ ⟧ m . {\displaystyle M,m/h\models [i\ {\mathsf {cstit}}:\varphi ]\quad {\text{iff}}\quad {\mathsf {choice}}_{i}^{m}(h)\subseteq {\text{⟦}}\varphi {\text{⟧}}_{m}.} Intuitively, agent i {\displaystyle i} sees to it that φ {\displaystyle \varphi } if φ {\displaystyle \varphi } holds at all histories compatible with their present choice. The deliberative STIT operator, [ i d s t i t : φ ] {\displaystyle [i\ {\mathsf {dstit}}:\varphi ]} , adds

    Read more →
  • The AI Con

    The AI Con

    The AI Con: How to Fight Big Tech's Hype and Create the Future We Want is a 2025 non-fiction book by linguist Emily M. Bender and sociologist Alex Hanna. It argues that much of what is labeled "artificial intelligence" is a misleading term that obscures ordinary automation while concentrating power in a small number of technology firms. The book was published in May 2025 by Harper in the United States and Bodley Head in the United Kingdom. It was developed alongside the authors' long-running podcast Mystery AI Hype Theater 3000, which critiques exaggerated claims about AI. == Synopsis == The authors present AI as a marketing umbrella that encourages audiences to infer understanding and agency where none exist. They argue readers should treat such language skeptically and to separate specific automated tasks from broad claims of intelligence. The book describes a recurring hype cycle in which corporate narratives justify data and labor extraction, the replacement of human services with cheaper substitutes, and the diversion of attention from present harms to speculative futures. While acknowledging limited uses such as pattern recognition, the authors argue that contemporary systems are best understood as text and media generators shaped by training data and human labor, not as thinking or reasoning entities. A central theme is the social and environmental cost of scaling these systems, including increased energy and water use, the appropriation of creative work for training, and the outsourcing of ghost work to low-paid data workers worldwide. These costs are linked to workplace effects, with the authors arguing that automation rarely eliminates jobs outright and more often degrades them through surveillance, work intensification, and unpaid oversight. As alternatives to passive adoption, the authors propose concrete responses: asking precise questions about what is being automated and why, demanding transparency about data and evaluation, and practicing what they call strategic refusal when deployment conflicts with evidence or values. The book also develops a vocabulary for public debate, rejecting both boosterish and doomerish narratives as grounded in the same assumption that AI is a singular, autonomous force. The authors recommend reading strategies such as favoring trusted human sources over automated summaries and using humor to deflate inflated claims. They describe a link between language to policy and power, arguing that precise terminology can help policymakers and the public resist austerity-driven automation and demand accountability for errors and harms. == Reception == The Guardian praised the book's myth-busting approach and its analysis of how hype erodes cultural and civic life by normalizing synthetic media as a substitute for human judgment. Kirkus Reviews described it as a contrarian account that catalogs concrete risks while cutting through speculative predictions. An interview in Business Insider highlighted the authors' accessible frameworks, including their proposal to describe chatbots as conversation simulators and to evaluate systems in terms of values, labor, and evidence. Coverage in GeekWire emphasized the book's call for resistance through collective bargaining, stronger data rights, and a norm of rejecting deployments that fail basic standards of necessity and evaluation. Some reviews were more critical. A review in LLRX argued that the book's tone could be overly polemical and that it gave limited attention to potential benefits claimed for generative systems. Coverage in the Financial Times, focused on Bender's broader public scholarship, situated the book within her long-standing critique of anthropomorphic narratives about large language models and her advocacy for more democratic oversight of automated systems.

    Read more →