Fuzzy measure theory

Fuzzy measure theory

In mathematics, fuzzy measure theory considers generalized measures in which the additive property is replaced by the weaker property of monotonicity. The central concept of fuzzy measure theory is the fuzzy measure (also capacity, see ), which was introduced by Choquet in 1953 and independently defined by Sugeno in 1974 in the context of fuzzy integrals. There exists a number of different classes of fuzzy measures including plausibility/belief measures, possibility/necessity measures, and probability measures, which are a subset of classical measures. == Definitions == Let X {\displaystyle \mathbf {X} } be a universe of discourse, C {\displaystyle {\mathcal {C}}} be a class of subsets of X {\displaystyle \mathbf {X} } , and E , F ∈ C {\displaystyle E,F\in {\mathcal {C}}} . A function g : C → R {\displaystyle g:{\mathcal {C}}\to \mathbb {R} } where ∅ ∈ C ⇒ g ( ∅ ) = 0 {\displaystyle \emptyset \in {\mathcal {C}}\Rightarrow g(\emptyset )=0} E ⊆ F ⇒ g ( E ) ≤ g ( F ) {\displaystyle E\subseteq F\Rightarrow g(E)\leq g(F)} is called a fuzzy measure. A fuzzy measure is called normalized or regular if g ( X ) = 1 {\displaystyle g(\mathbf {X} )=1} . == Properties of fuzzy measures == A fuzzy measure is: additive if for any E , F ∈ C {\displaystyle E,F\in {\mathcal {C}}} such that E ∩ F = ∅ {\displaystyle E\cap F=\emptyset } , we have g ( E ∪ F ) = g ( E ) + g ( F ) . {\displaystyle g(E\cup F)=g(E)+g(F).} ; supermodular if for any E , F ∈ C {\displaystyle E,F\in {\mathcal {C}}} , we have g ( E ∪ F ) + g ( E ∩ F ) ≥ g ( E ) + g ( F ) {\displaystyle g(E\cup F)+g(E\cap F)\geq g(E)+g(F)} ; submodular if for any E , F ∈ C {\displaystyle E,F\in {\mathcal {C}}} , we have g ( E ∪ F ) + g ( E ∩ F ) ≤ g ( E ) + g ( F ) {\displaystyle g(E\cup F)+g(E\cap F)\leq g(E)+g(F)} ; superadditive if for any E , F ∈ C {\displaystyle E,F\in {\mathcal {C}}} such that E ∩ F = ∅ {\displaystyle E\cap F=\emptyset } , we have g ( E ∪ F ) ≥ g ( E ) + g ( F ) {\displaystyle g(E\cup F)\geq g(E)+g(F)} ; subadditive if for any E , F ∈ C {\displaystyle E,F\in {\mathcal {C}}} such that E ∩ F = ∅ {\displaystyle E\cap F=\emptyset } , we have g ( E ∪ F ) ≤ g ( E ) + g ( F ) {\displaystyle g(E\cup F)\leq g(E)+g(F)} ; symmetric if for any E , F ∈ C {\displaystyle E,F\in {\mathcal {C}}} , we have | E | = | F | {\displaystyle |E|=|F|} implies g ( E ) = g ( F ) {\displaystyle g(E)=g(F)} ; Boolean if for any E ∈ C {\displaystyle E\in {\mathcal {C}}} , we have g ( E ) = 0 {\displaystyle g(E)=0} or g ( E ) = 1 {\displaystyle g(E)=1} . Understanding the properties of fuzzy measures is useful in application. When a fuzzy measure is used to define a function such as the Sugeno integral or Choquet integral, these properties will be crucial in understanding the function's behavior. For instance, the Choquet integral with respect to an additive fuzzy measure reduces to the Lebesgue integral. In discrete cases, a symmetric fuzzy measure will result in the ordered weighted averaging (OWA) operator. Submodular fuzzy measures result in convex functions, while supermodular fuzzy measures result in concave functions when used to define a Choquet integral. == Möbius representation == Let g be a fuzzy measure. The Möbius representation of g is given by the set function M, where for every E , F ⊆ X {\displaystyle E,F\subseteq X} , M ( E ) = ∑ F ⊆ E ( − 1 ) | E ∖ F | g ( F ) . {\displaystyle M(E)=\sum _{F\subseteq E}(-1)^{|E\backslash F|}g(F).} The equivalent axioms in Möbius representation are: M ( ∅ ) = 0 {\displaystyle M(\emptyset )=0} . ∑ F ⊆ E | i ∈ F M ( F ) ≥ 0 {\displaystyle \sum _{F\subseteq E|i\in F}M(F)\geq 0} , for all E ⊆ X {\displaystyle E\subseteq \mathbf {X} } and all i ∈ E {\displaystyle i\in E} A fuzzy measure in Möbius representation M is called normalized if ∑ E ⊆ X M ( E ) = 1. {\displaystyle \sum _{E\subseteq \mathbf {X} }M(E)=1.} Möbius representation can be used to give an indication of which subsets of X interact with one another. For instance, an additive fuzzy measure has Möbius values all equal to zero except for singletons. The fuzzy measure g in standard representation can be recovered from the Möbius form using the Zeta transform: g ( E ) = ∑ F ⊆ E M ( F ) , ∀ E ⊆ X . {\displaystyle g(E)=\sum _{F\subseteq E}M(F),\forall E\subseteq \mathbf {X} .} == Simplification assumptions for fuzzy measures == Fuzzy measures are defined on a semiring of sets or monotone class, which may be as granular as the power set of X, and even in discrete cases the number of variables can be as large as 2|X|. For this reason, in the context of multi-criteria decision analysis and other disciplines, simplification assumptions on the fuzzy measure have been introduced so that it is less computationally expensive to determine and use. For instance, when it is assumed the fuzzy measure is additive, it will hold that g ( E ) = ∑ i ∈ E g ( { i } ) {\displaystyle g(E)=\sum _{i\in E}g(\{i\})} and the values of the fuzzy measure can be evaluated from the values on X. Similarly, a symmetric fuzzy measure is defined uniquely by |X| values. Two important fuzzy measures that can be used are the Sugeno- or λ {\displaystyle \lambda } -fuzzy measure and k-additive measures, introduced by Sugeno and Grabisch respectively. === Sugeno λ-measure === The Sugeno λ {\displaystyle \lambda } -measure is a special case of fuzzy measures defined iteratively. It has the following definition: ==== Definition ==== Let X = { x 1 , … , x n } {\displaystyle \mathbf {X} =\left\lbrace x_{1},\dots ,x_{n}\right\rbrace } be a finite set and let λ ∈ ( − 1 , + ∞ ) {\displaystyle \lambda \in (-1,+\infty )} . A Sugeno λ {\displaystyle \lambda } -measure is a function g : 2 X → [ 0 , 1 ] {\displaystyle g:2^{X}\to [0,1]} such that g ( X ) = 1 {\displaystyle g(X)=1} . if A , B ⊆ X {\displaystyle A,B\subseteq \mathbf {X} } (alternatively A , B ∈ 2 X {\displaystyle A,B\in 2^{\mathbf {X} }} ) with A ∩ B = ∅ {\displaystyle A\cap B=\emptyset } then g ( A ∪ B ) = g ( A ) + g ( B ) + λ g ( A ) g ( B ) {\displaystyle g(A\cup B)=g(A)+g(B)+\lambda g(A)g(B)} . As a convention, the value of g at a singleton set { x i } {\displaystyle \left\lbrace x_{i}\right\rbrace } is called a density and is denoted by g i = g ( { x i } ) {\displaystyle g_{i}=g(\left\lbrace x_{i}\right\rbrace )} . In addition, we have that λ {\displaystyle \lambda } satisfies the property λ + 1 = ∏ i = 1 n ( 1 + λ g i ) {\displaystyle \lambda +1=\prod _{i=1}^{n}(1+\lambda g_{i})} . Tahani and Keller as well as Wang and Klir have shown that once the densities are known, it is possible to use the previous polynomial to obtain the values of λ {\displaystyle \lambda } uniquely. === k-additive fuzzy measure === The k-additive fuzzy measure limits the interaction between the subsets E ⊆ X {\displaystyle E\subseteq X} to size | E | = k {\displaystyle |E|=k} . This drastically reduces the number of variables needed to define the fuzzy measure, and as k can be anything from 1 (in which case the fuzzy measure is additive) to X, it allows for a compromise between modelling ability and simplicity. ==== Definition ==== A discrete fuzzy measure g on a set X is called k-additive ( 1 ≤ k ≤ | X | {\displaystyle 1\leq k\leq |\mathbf {X} |} ) if its Möbius representation verifies M ( E ) = 0 {\displaystyle M(E)=0} , whenever | E | > k {\displaystyle |E|>k} for any E ⊆ X {\displaystyle E\subseteq \mathbf {X} } , and there exists a subset F with k elements such that M ( F ) ≠ 0 {\displaystyle M(F)\neq 0} . == Shapley and interaction indices == In game theory, the Shapley value or Shapley index is used to indicate the weight of a game. Shapley values can be calculated for fuzzy measures in order to give some indication of the importance of each singleton. In the case of additive fuzzy measures, the Shapley value will be the same as each singleton. For a given fuzzy measure g, and | X | = n {\displaystyle |\mathbf {X} |=n} , the Shapley index for every i , … , n ∈ X {\displaystyle i,\dots ,n\in X} is: ϕ ( i ) = ∑ E ⊆ X ∖ { i } ( n − | E | − 1 ) ! | E | ! n ! [ g ( E ∪ { i } ) − g ( E ) ] . {\displaystyle \phi (i)=\sum _{E\subseteq \mathbf {X} \backslash \{i\}}{\frac {(n-|E|-1)!|E|!}{n!}}[g(E\cup \{i\})-g(E)].} The Shapley value is the vector ϕ ( g ) = ( ψ ( 1 ) , … , ψ ( n ) ) . {\displaystyle \mathbf {\phi } (g)=(\psi (1),\dots ,\psi (n)).}

Automated decision-making

Automated decision-making (ADM) is the use of data, machines and algorithms to make decisions in a range of contexts, including public administration, business, health, education, law, employment, transport, media and entertainment, with varying degrees of human oversight or intervention. ADM may involve large-scale data from a range of sources, such as databases, text, social media, sensors, images or speech, that is processed using various technologies including computer software, algorithms, machine learning, natural language processing, artificial intelligence, augmented intelligence and robotics. The increasing use of automated decision-making systems (ADMS) across a range of contexts presents many benefits and challenges to human society requiring consideration of the technical, legal, ethical, societal, educational, economic and health consequences. == Overview == There are different definitions of ADM based on the level of automation involved. Some definitions suggests ADM involves decisions made through purely technological means without human input, such as the EU's General Data Protection Regulation (Article 22). However, ADM technologies and applications can take many forms ranging from decision-support systems that make recommendations for human decision-makers to act on, sometimes known as augmented intelligence or 'shared decision-making', to fully automated decision-making processes that make decisions on behalf of individuals or organizations without human involvement. Models used in automated decision-making systems can be as simple as checklists and decision trees through to artificial intelligence and deep neural networks (DNN). Since the 1950s computers have gone from being able to do basic processing to having the capacity to undertake complex, ambiguous and highly skilled tasks such as image and speech recognition, gameplay, scientific and medical analysis and inferencing across multiple data sources. ADM is now being increasingly deployed across all sectors of society and many diverse domains from entertainment to transport. An ADM system (ADMS) may involve multiple decision points, data sets, and technologies (ADMT) and may sit within a larger administrative or technical system such as a criminal justice system or business process. == Data == Automated decision-making involves using data as input to be analyzed within a process, model, or algorithm or for learning and generating new models. ADM systems may use and connect a wide range of data types and sources depending on the goals and contexts of the system, for example, sensor data for self-driving cars and robotics, identity data for security systems, demographic and financial data for public administration, medical records in health, criminal records in law. This can sometimes involve vast amounts of data and computing power. === Data quality === The quality of the available data and its ability to be used in ADM systems is fundamental to the outcomes. It is often highly problematic for many reasons. Datasets are often highly variable; corporations or governments may control large-scale data, restricted for privacy or security reasons, incomplete, biased, limited in terms of time or coverage, measuring and describing terms in different ways, and many other issues. For machines to learn from data, large corpora are often required, which can be challenging to obtain or compute; however, where available, they have provided significant breakthroughs, for example, in diagnosing chest X-rays. == ADM technologies == Automated decision-making technologies (ADMT) are software-coded digital tools that automate the translation of input data to output data, contributing to the function of automated decision-making systems. There are a wide range of technologies in use across ADM applications and systems. ADMTs involving basic computational operations Search (includes 1-2-1, 1-2-many, data matching/merge) Matching (two different things) Mathematical Calculation (formula) ADMTs for assessment and grouping: User profiling Recommender systems Clustering Classification Feature learning Predictive analytics (includes forecasting) ADMTs relating to space and flows: Social network analysis (includes link prediction) Mapping Routing ADMTs for processing of complex data formats Image processing Audio processing Natural Language Processing (NLP) Other ADMT Business rules management systems Time series analysis Anomaly detection Modelling/Simulation === Machine learning === Machine learning (ML) involves training computer programs through exposure to large data sets and examples to learn from experience and solve problems. Machine learning can be used to generate and analyse data as well as make algorithmic calculations and has been applied to image and speech recognition, translations, text, data and simulations. While machine learning has been around for some time, it is becoming increasingly powerful due to recent breakthroughs in training deep neural networks (DNNs), and dramatic increases in data storage capacity and computational power with GPU coprocessors and cloud computing. Machine learning systems based on foundation models run on deep neural networks and use pattern matching to train a single huge system on large amounts of general data such as text and images. Early models tended to start from scratch for each new problem however since the early 2020s many are able to be adapted to new problems. Examples of these technologies include Open AI's DALL-E (an image creation program) and their various GPT language models, and Google's PaLM language model program. == Applications == ADM is being used to replace or augment human decision-making by both public and private-sector organisations for a range of reasons including to help increase consistency, improve efficiency, reduce costs and enable new solutions to complex problems. === Debate === Research and development are underway into uses of technology to assess argument quality, assess argumentative essays and judge debates. Potential applications of these argument technologies span education and society. Scenarios to consider, in these regards, include those involving the assessment and evaluation of conversational, mathematical, scientific, interpretive, legal, and political argumentation and debate. === Law === In legal systems around the world, algorithmic tools such as risk assessment instruments (RAI), are being used to supplement or replace the human judgment of judges, civil servants and police officers in many contexts. In the United States RAI are being used to generate scores to predict the risk of recidivism in pre-trial detention and sentencing decisions, evaluate parole for prisoners and to predict "hot spots" for future crime. These scores may result in automatic effects or may be used to inform decisions made by officials within the justice system. In Canada ADM has been used since 2014 to automate certain activities conducted by immigration officials and to support the evaluation of some immigrant and visitor applications. === Economics === Automated decision-making systems are used in certain computer programs to create buy and sell orders related to specific financial transactions and automatically submit the orders in the international markets. Computer programs can automatically generate orders based on predefined set of rules using trading strategies which are based on technical analyses, advanced statistical and mathematical computations, or inputs from other electronic sources. === Business === ==== Continuous auditing ==== Continuous auditing uses advanced analytical tools to automate auditing processes. It can be utilized in the private sector by business enterprises and in the public sector by governmental organizations and municipalities. As artificial intelligence and machine learning continue to advance, accountants and auditors may make use of increasingly sophisticated algorithms which make decisions such as those involving determining what is anomalous, whether to notify personnel, and how to prioritize those tasks assigned to personnel. === Media and entertainment === Digital media, entertainment platforms, and information services increasingly provide content to audiences via automated recommender systems based on demographic information, previous selections, collaborative filtering or content-based filtering. This includes music and video platforms, publishing, health information, product databases and search engines. Many recommender systems also provide some agency to users in accepting recommendations and incorporate data-driven algorithmic feedback loops based on the actions of the system user. Large-scale machine learning language models and image creation programs being developed by companies such as OpenAI and Google in the 2020s have restricted access however they are likely to have widespread application in fields such as advertising, copywriting, stock imagery and gra

Generalized iterative scaling

In statistics, generalized iterative scaling (GIS) and improved iterative scaling (IIS) are two early algorithms used to fit log-linear models, notably multinomial logistic regression (MaxEnt) classifiers and extensions of it such as MaxEnt Markov models and conditional random fields. These algorithms have been largely surpassed by gradient-based methods such as L-BFGS and coordinate descent algorithms.

Soft independent modelling of class analogies

Soft independent modelling by class analogy (SIMCA) is a statistical method for supervised classification of data. The method requires a training data set consisting of samples (or objects) with a set of attributes and their class membership. The term soft refers to the fact the classifier can identify samples as belonging to multiple classes and not necessarily producing a classification of samples into non-overlapping classes. == Method == In order to build the classification models, the samples belonging to each class need to be analysed using principal component analysis (PCA); only the significant components are retained. For a given class, the resulting model then describes either a line (for one Principal Component or PC), plane (for two PCs) or hyper-plane (for more than two PCs). For each modelled class, the mean orthogonal distance of training data samples from the line, plane, or hyper-plane (calculated as the residual standard deviation) is used to determine a critical distance for classification. This critical distance is based on the F-distribution and is usually calculated using 95% or 99% confidence intervals. New observations are projected into each PC model and the residual distances calculated. An observation is assigned to the model class when its residual distance from the model is below the statistical limit for the class. The observation may be found to belong to multiple classes and a measure of goodness of the model can be found from the number of cases where the observations are classified into multiple classes. The classification efficiency is usually indicated by Receiver operating characteristics. In the original SIMCA method, the ends of the hyper-plane of each class are closed off by setting statistical control limits along the retained principal components axes (i.e., score value between plus and minus 0.5 times score standard deviation). More recent adaptations of the SIMCA method close off the hyper-plane by construction of ellipsoids (e.g. Hotelling's T2 or Mahalanobis distance). With such modified SIMCA methods, classification of an object requires both that its orthogonal distance from the model and its projection within the model (i.e. score value within the region defined by the ellipsoid) are not significant. == Application == SIMCA as a method of classification has gained widespread use especially in applied statistical fields such as chemometrics and spectroscopic data analysis.

Optical neural network

An optical neural network is a physical implementation of an artificial neural network with optical components. Early optical neural networks used a photorefractive Volume hologram to interconnect arrays of input neurons to arrays of output with synaptic weights in proportion to the multiplexed hologram's strength. Volume holograms were further multiplexed using spectral hole burning to add one dimension of wavelength to space to achieve four dimensional interconnects of two dimensional arrays of neural inputs and outputs. This research led to extensive research on alternative methods using the strength of the optical interconnect for implementing neuronal communications. Some artificial neural networks that have been implemented as optical neural networks include the Hopfield neural network and the Kohonen self-organizing map with liquid crystal spatial light modulators Optical neural networks can also be based on the principles of neuromorphic engineering, creating neuromorphic photonic systems. Typically, these systems encode information in the networks using spikes, mimicking the functionality of spiking neural networks in optical and photonic hardware. Photonic devices that have demonstrated neuromorphic functionalities include (among others) vertical-cavity surface-emitting lasers, integrated photonic modulators, optoelectronic systems based on superconducting Josephson junctions or systems based on resonant tunnelling diodes. == Electrochemical vs. optical neural networks == Biological neural networks function on an electrochemical basis, while optical neural networks use electromagnetic waves. Optical interfaces to biological neural networks can be created with optogenetics, but is not the same as an optical neural networks. In biological neural networks there exist a lot of different mechanisms for dynamically changing the state of the neurons, these include short-term and long-term synaptic plasticity. Synaptic plasticity is among the electrophysiological phenomena used to control the efficiency of synaptic transmission, long-term for learning and memory, and short-term for short transient changes in synaptic transmission efficiency. Implementing this with optical components is difficult, and ideally requires advanced photonic materials. Properties that might be desirable in photonic materials for optical neural networks include the ability to change their efficiency of transmitting light, based on the intensity of incoming light. == Rising Era of Optical Neural Networks == With the increasing significance of computer vision in various domains, the computational cost of these tasks has increased, making it more important to develop the new approaches of the processing acceleration. Optical computing has emerged as a potential alternative to GPU acceleration for modern neural networks, particularly considering the looming obsolescence of Moore's Law. Consequently, optical neural networks have garnered increased attention in the research community. Presently, two primary methods of optical neural computing are under research: silicon photonics-based and free-space optics. Each approach has its benefits and drawbacks; while silicon photonics may offer superior speed, it lacks the massive parallelism that free-space optics can deliver. Given the substantial parallelism capabilities of free-space optics, researchers have focused on taking advantage of it. One implementation, proposed by Lin et al., involves the training and fabrication of phase masks for a handwritten digit classifier. By stacking 3D-printed phase masks, light passing through the fabricated network can be read by a photodetector array of ten detectors, each representing a digit class ranging from 1 to 10. Although this network can achieve terahertz-range classification, it lacks flexibility, as the phase masks are fabricated for a specific task and cannot be retrained. An alternative method for classification in free-space optics, introduced by Cahng et al., employs a 4F system that is based on the convolution theorem to perform convolution operations. This system uses two lenses to execute the Fourier transforms of the convolution operation, enabling passive conversion into the Fourier domain without power consumption or latency. However, the convolution operation kernels in this implementation are also fabricated phase masks, limiting the device's functionality to specific convolutional layers of the network only. In contrast, Li et al. proposed a technique involving kernel tiling to use the parallelism of the 4F system while using a Digital Micromirror Device (DMD) instead of a phase mask. This approach allows users to upload various kernels into the 4F system and execute the entire network's inference on a single device. Unfortunately, modern neural networks are not designed for the 4F systems, as they were primarily developed during the CPU/GPU era. Mostly because they tend to use a lower resolution and a high number of channels in their feature maps. == Other Implementations == In 2007 there was one model of Optical Neural Network: the Programmable Optical Array/Analogic Computer (POAC). It had been implemented in the year 2000 and reported based on modified Joint Fourier Transform Correlator (JTC) and Bacteriorhodopsin (BR) as a holographic optical memory. Full parallelism, large array size and the speed of light are three promises offered by POAC to implement an optical CNN. They had been investigated during the last years with their practical limitations and considerations yielding the design of the first portable POAC version. The practical details – hardware (optical setups) and software (optical templates) – were published. However, POAC is a general purpose and programmable array computer that has a wide range of applications including: image processing pattern recognition target tracking real-time video processing document security optical switching == Progress in the 2020s == Taichi from Tsinghua University in Beijing is a hybrid ONN that combines the power efficiency and parallelism of optical diffraction and the configurability of optical interference. Taichi offers 13.96 million parameters. Taichi avoids the high error rates that afflict deep (multi-layer) networks by combining clusters of fewer-layer diffractive units with arrays of interferometers for reconfigurable computation. Its encoding protocol divides large network models into sub-models that can be distributed across multiple chiplets in parallel. Taichi achieved 91.89% accuracy in tests with the Omniglot database. It was also used to generate music Bach and generate images the styles of Van Gogh and Munch. The developers claimed energy efficiency of up to 160 trillion operations second−1 watt−1 and an area efficiency of 880 trillion multiply-accumulate operations mm−2 or 103 more energy efficient than the NVIDIA H100, and 102 times more energy efficient and 10 times more area efficient than previous ONNs. Time dimension has recently been introduced into diffractive neural network by fs laser lithography of perovskite hydration. The temporal behaviour of the neuron can be modulated by the fs laser at the nanoscale, enabling a programmable holographic neural network with temporal evolution functionality, i.e., the functionality can change with time under the hydration stimuli. An in-memory temporal inference functionality was demonstrated to mimic the function evolution of the human brain, i.e., the functionality can change from simple digit image classification to more complicated digit and clothing product image classification with time. This is the first time of introducing time dimension into the optical neural network, laying a foundation for future brain-like photonic chip development.

Resisting AI

Resisting AI: An Anti-fascist Approach to Artificial Intelligence is a book on artificial intelligence (AI) by Dan McQuillan, published in 2022 by Bristol University Press. == Content == Resisting AI takes the form of an extended essay, which contrasts optimistic visions about AI's potential by arguing that AI may best be seen as a continuation and reinforcement of bureaucratic forms of discrimination and violence, ultimately fostering authoritarian outcomes. For McQuillan, AI's promise of objective calculability is antithetical to an egalitarian and just society. McQuillan uses the expression "AI violence" to describe how – based on opaque algorithms – various actors can discriminate against categories of people in accessing jobs, loans, medical care, and other benefits. The book suggests that AI has a political resonance with soft eugenic approaches to the valuation of life by modern welfare states, and that AI exhibits eugenic features in its underlying logic, as well as in its technical operations. The parallel is with historical eugenicists achieving saving to the state by sterilizing defectives so the state would not have to care for their offspring. The analysis of McQuillan goes beyond the known critique of AI systems fostering precarious labour markets, addressing "necropolitics", the politics of who is entitled to live, and who to die. Although McQuillan offers a brief history of machine learning at the beginning of the book – with its need for "hidden and undercompensated labour", he is concerned more with the social impacts of AI rather than with its technical aspects. McQuillan sees AI as the continuation of existing bureaucratic systems that already marginalize vulnerable groups – aggravated by the fact that AI systems trained on existing data are likely to reinforce existing discriminations, e.g. in attempting to optimize welfare distribution based on existing data patterns, ultimately creating a system of "self-reinforcing social profiling". In elaborating on the continuation between existing bureaucratic violence and AI, McQuillan connects to Hannah Arendt's concept of the thoughtless bureaucrat in Eichmann in Jerusalem: A Report on the Banality of Evil, which now becomes the algorithm that, lacking intent, cannot be accountable, and is thus endowed with an "algorithmic thoughtlessness". McQuillan defends the "fascist" in the title of the work by arguing that while not all AI is fascist, this emerging technology of control may end up being deployed by fascist or authoritarian regimes. For McQuillan, AI can support the diffusion of states of exception, as a technology impossible to properly regulate and a mechanism for multiplying exceptions more widely. An example of a scenario where AI systems of surveillance could bring discrimination to a new high is the initiative to create LGBT-free zones in Poland. Skeptical of ethical regulations to control the technology, McQuillan suggests people's councils and workers' councils, and other forms of citizens' agency to resist AI. A chapter titled "Post-Machine Learning" makes an appeal for resistance via currents of thought from feminist science (standpoint theory), post-normal science (extended peer communities), and new materialism; McQuillan encourages the reader to question the meaning of "objectivity" and calls for the necessity of alternative ways of knowing. Among the virtuous examples of resistance – possibly to be adopted by the AI workers themselves – McQuillan notes the Lucas Plan of the workers of Lucas Aerospace Corporation, in which a workforce declared redundant took control, reorienting the enterprise toward useful products. McQuillan advocates for what he calls decomputing, an opposition to the sweeping application and expansion of artificial intelligence. Similar to degrowth, the approach criticizes AI as an outgrowth of the systemic issues within capitalist systems. McQuillan argues that a different future is possible, in which distance between people is reduced rather than increased through AI intermediaries. The work of McQuillan warns against "watered-down forms of engagement" with AI, such as citizen juries, which superficially look like democratic deliberation but may actually obscure important decisions about AI that are outside the purview of the engagement situation (McQuillan 2022, 128). In an interview about the book, McQuillan describes himself as an "AI abolitionist". == Reception == The book has been praised for how it "masterfully disassembles AI as an epistemological, social, and political paradigm". On the critical side, a review in the academic journal Justice, Power and Resistance took exception to the "nightmarish visions of Big Brother" offered by McQuillan, and argued that while many elements of AI may pose concern, a critique should not be based on a caricature of what AI is, concluding that McQuillan's work is "less of a theory and more of a Manifesto". Another review notes "a disconnect between the technical aspects of AI and the socio-political analysis McQuillan provides." Although the book was published before the ChatGPT and large language model debate heated up, the book has not lost relevance to the AI discussion. It is noted for suggesting a link between beliefs in artificial intelligence and beliefs in a racialised and gendered visions of intelligence overall, whereby a certain type of rational, measurable intelligence is privileged, leading to "historical notions of hierarchies of being". The blog Reboot praised McQuillan for offering a theory of harm of AI (why AI could end up hurting people and society) that does not just encourage tackling in isolation specific predicted problems with AI-centric systems: bias, non-inclusiveness, exploitativeness, environmental destructiveness, opacity, and non-contestability. For educational policies could also look at AI following the reading of McQuillan: In his book Resisting AI, Dan McQuillan argues that "When we're thinking about the actuality of AI, we can't separate the calculations in the code from the social context of its application" .... McQuillan's particular concern is how many contemporary applications of AI are amplifying existing inequalities and injustices as well as deepening social divisions and instabilities. His book makes a powerful case for anticipating these effects and actively resisting them for the good of societies. Videos and podcasts with an interest in AI and emerging technology have discussed the book.

Multilinear principal component analysis

Multilinear principal component analysis (MPCA) is a multilinear extension of principal component analysis (PCA) that is used to analyze M-way arrays, also informally referred to as "data tensors". M-way arrays may be modeled by linear tensor models, such as CANDECOMP/Parafac, or by multilinear tensor models, such as multilinear principal component analysis (MPCA) or multilinear (tensor) independent component analysis (MICA). In 2005, Vasilescu and Terzopoulos introduced the Multilinear PCA terminology as a way to better differentiate between multilinear data models that employed 2nd order statistics versus higher order statistics to compute a set of independent components for each mode, such as Multilinear ICA Multilinear PCA may be applied to compute the causal factors of data formation, or as signal processing tool on data tensors whose individual observation have either been vectorized, or whose observations are treated as a collection of column/row observations, an "observation as a matrix", and concatenated into a data tensor. The latter approach is suitable for compression and reducing redundancy in the rows, columns and fibers that are unrelated to the causal factors of data formation. Vasilescu and Terzopoulos in their paper "TensorFaces" introduced the M-mode SVD algorithm which are algorithms misidentified in the literature as the HOSVD or the Tucker which employ the power method or gradient descent, respectively. Vasilescu and Terzopoulos framed the data analysis, recognition and synthesis problems as multilinear tensor problems. Data is viewed as the compositional consequence of several causal factors, that are well suited for multi-modal tensor factor analysis. The power of the tensor framework was showcased by analyzing human motion joint angles, facial images or textures in the following papers: Human Motion Signatures (CVPR 2001, ICPR 2002), face recognition – TensorFaces, (ECCV 2002, CVPR 2003, etc.) and computer graphics – TensorTextures (Siggraph 2004). == The algorithm == The MPCA solution follows the alternating least square (ALS) approach. It is iterative in nature. As in PCA, MPCA works on centered data. Centering is a little more complicated for tensors, and it is problem dependent. == Feature selection == MPCA features: Supervised MPCA is employed in causal factor analysis that facilitates object recognition while a semi-supervised MPCA feature selection is employed in visualization tasks. == Extensions == Various extension of MPCA: Robust MPCA (RMPCA) Multi-Tensor Factorization, that also finds the number of components automatically (MTF)