AI Code Base

AI Code Base — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Texture compression

    Texture compression

    Texture compression is a specialized form of image compression designed for storing texture maps in 3D computer graphics rendering systems. Unlike conventional image compression algorithms, texture compression algorithms are optimized for random access. Texture compression can be applied to reduce memory usage at runtime. Texture data is often the largest source of memory usage in a mobile application. == Tradeoffs == In their seminal paper on texture compression, Beers, Agrawala and Chaddha list four features that tend to differentiate texture compression from other image compression techniques. These features are: Decoding Speed It is highly desirable to be able to render directly from the compressed texture data and so, in order not to impact rendering performance, decompression must be fast. Random Access Since predicting the order that a renderer accesses texels would be difficult, any texture compression scheme must allow fast random access to decompressed texture data. This tends to rule out many better-known image compression schemes such as JPEG or run-length encoding. Compression Rate and Visual Quality In a rendering system, lossy compression can be more tolerable than for other use cases. Some texture compression libraries, such as crunch, allow the developer to flexibly trade off compression rate vs. visual quality, using methods such as rate–distortion optimization (RDO). Encoding Speed Texture compression is more tolerant of asymmetric encoding/decoding rates as the encoding process is often done only once during the application authoring process. Given the above, most texture compression algorithms involve some form of fixed-rate lossy vector quantization of small fixed-size blocks of pixels into small fixed-size blocks of coding bits, sometimes with additional extra pre-processing and post-processing steps. Block Truncation Coding is a very simple example of this family of algorithms. Because their data access patterns are well-defined, texture decompression may be executed on-the-fly during rendering as part of the overall graphics pipeline, reducing overall bandwidth and storage needs throughout the graphics system. As well as texture maps, texture compression may also be used to encode other kinds of rendering map, including bump maps and surface normal maps. Texture compression may also be used together with other forms of map processing such as mipmaps and anisotropic filtering. == Availability == Some examples of practical texture compression systems are S3 Texture Compression (S3TC), PVRTC, Ericsson Texture Compression (ETC) and Adaptive Scalable Texture Compression (ASTC); these may be supported by special function units in modern graphics processing units (GPUs). OpenGL and OpenGL ES, as implemented on many video accelerator cards and mobile GPUs, can support multiple common kinds of texture compression - generally through the use of vendor extensions. == Supercompression == A compressed-texture can be further compressed in what is called "supercompression". Fixed-rate texture compression formats are optimized for random access and are much less efficient compared to image formats such as PNG. By adding further compression, a programmer can reduce the efficiency gap. The extra layer can be decompressed by the CPU so that the GPU receives a normal compressed texture, or in newer methods, decompressed by the GPU itself. Supercompression saves the same amount of VRAM as regular texture compression, but saves more disk space and download size. == Neural Texture Compression == Random-Access Neural Compression of Material Textures (Neural Texture Compression) is a Nvidia's technology which enables two additional levels of detail (16× more texels, so four times higher resolution) while maintaining similar storage requirements as traditional texture compression methods. The key idea is compressing multiple material textures and their mipmap chains together, and using a small neural network, that is optimized for each material, to decompress them.

    Read more →
  • Automated decision-making

    Automated decision-making

    Automated decision-making (ADM) is the use of data, machines and algorithms to make decisions in a range of contexts, including public administration, business, health, education, law, employment, transport, media and entertainment, with varying degrees of human oversight or intervention. ADM may involve large-scale data from a range of sources, such as databases, text, social media, sensors, images or speech, that is processed using various technologies including computer software, algorithms, machine learning, natural language processing, artificial intelligence, augmented intelligence and robotics. The increasing use of automated decision-making systems (ADMS) across a range of contexts presents many benefits and challenges to human society requiring consideration of the technical, legal, ethical, societal, educational, economic and health consequences. == Overview == There are different definitions of ADM based on the level of automation involved. Some definitions suggests ADM involves decisions made through purely technological means without human input, such as the EU's General Data Protection Regulation (Article 22). However, ADM technologies and applications can take many forms ranging from decision-support systems that make recommendations for human decision-makers to act on, sometimes known as augmented intelligence or 'shared decision-making', to fully automated decision-making processes that make decisions on behalf of individuals or organizations without human involvement. Models used in automated decision-making systems can be as simple as checklists and decision trees through to artificial intelligence and deep neural networks (DNN). Since the 1950s computers have gone from being able to do basic processing to having the capacity to undertake complex, ambiguous and highly skilled tasks such as image and speech recognition, gameplay, scientific and medical analysis and inferencing across multiple data sources. ADM is now being increasingly deployed across all sectors of society and many diverse domains from entertainment to transport. An ADM system (ADMS) may involve multiple decision points, data sets, and technologies (ADMT) and may sit within a larger administrative or technical system such as a criminal justice system or business process. == Data == Automated decision-making involves using data as input to be analyzed within a process, model, or algorithm or for learning and generating new models. ADM systems may use and connect a wide range of data types and sources depending on the goals and contexts of the system, for example, sensor data for self-driving cars and robotics, identity data for security systems, demographic and financial data for public administration, medical records in health, criminal records in law. This can sometimes involve vast amounts of data and computing power. === Data quality === The quality of the available data and its ability to be used in ADM systems is fundamental to the outcomes. It is often highly problematic for many reasons. Datasets are often highly variable; corporations or governments may control large-scale data, restricted for privacy or security reasons, incomplete, biased, limited in terms of time or coverage, measuring and describing terms in different ways, and many other issues. For machines to learn from data, large corpora are often required, which can be challenging to obtain or compute; however, where available, they have provided significant breakthroughs, for example, in diagnosing chest X-rays. == ADM technologies == Automated decision-making technologies (ADMT) are software-coded digital tools that automate the translation of input data to output data, contributing to the function of automated decision-making systems. There are a wide range of technologies in use across ADM applications and systems. ADMTs involving basic computational operations Search (includes 1-2-1, 1-2-many, data matching/merge) Matching (two different things) Mathematical Calculation (formula) ADMTs for assessment and grouping: User profiling Recommender systems Clustering Classification Feature learning Predictive analytics (includes forecasting) ADMTs relating to space and flows: Social network analysis (includes link prediction) Mapping Routing ADMTs for processing of complex data formats Image processing Audio processing Natural Language Processing (NLP) Other ADMT Business rules management systems Time series analysis Anomaly detection Modelling/Simulation === Machine learning === Machine learning (ML) involves training computer programs through exposure to large data sets and examples to learn from experience and solve problems. Machine learning can be used to generate and analyse data as well as make algorithmic calculations and has been applied to image and speech recognition, translations, text, data and simulations. While machine learning has been around for some time, it is becoming increasingly powerful due to recent breakthroughs in training deep neural networks (DNNs), and dramatic increases in data storage capacity and computational power with GPU coprocessors and cloud computing. Machine learning systems based on foundation models run on deep neural networks and use pattern matching to train a single huge system on large amounts of general data such as text and images. Early models tended to start from scratch for each new problem however since the early 2020s many are able to be adapted to new problems. Examples of these technologies include Open AI's DALL-E (an image creation program) and their various GPT language models, and Google's PaLM language model program. == Applications == ADM is being used to replace or augment human decision-making by both public and private-sector organisations for a range of reasons including to help increase consistency, improve efficiency, reduce costs and enable new solutions to complex problems. === Debate === Research and development are underway into uses of technology to assess argument quality, assess argumentative essays and judge debates. Potential applications of these argument technologies span education and society. Scenarios to consider, in these regards, include those involving the assessment and evaluation of conversational, mathematical, scientific, interpretive, legal, and political argumentation and debate. === Law === In legal systems around the world, algorithmic tools such as risk assessment instruments (RAI), are being used to supplement or replace the human judgment of judges, civil servants and police officers in many contexts. In the United States RAI are being used to generate scores to predict the risk of recidivism in pre-trial detention and sentencing decisions, evaluate parole for prisoners and to predict "hot spots" for future crime. These scores may result in automatic effects or may be used to inform decisions made by officials within the justice system. In Canada ADM has been used since 2014 to automate certain activities conducted by immigration officials and to support the evaluation of some immigrant and visitor applications. === Economics === Automated decision-making systems are used in certain computer programs to create buy and sell orders related to specific financial transactions and automatically submit the orders in the international markets. Computer programs can automatically generate orders based on predefined set of rules using trading strategies which are based on technical analyses, advanced statistical and mathematical computations, or inputs from other electronic sources. === Business === ==== Continuous auditing ==== Continuous auditing uses advanced analytical tools to automate auditing processes. It can be utilized in the private sector by business enterprises and in the public sector by governmental organizations and municipalities. As artificial intelligence and machine learning continue to advance, accountants and auditors may make use of increasingly sophisticated algorithms which make decisions such as those involving determining what is anomalous, whether to notify personnel, and how to prioritize those tasks assigned to personnel. === Media and entertainment === Digital media, entertainment platforms, and information services increasingly provide content to audiences via automated recommender systems based on demographic information, previous selections, collaborative filtering or content-based filtering. This includes music and video platforms, publishing, health information, product databases and search engines. Many recommender systems also provide some agency to users in accepting recommendations and incorporate data-driven algorithmic feedback loops based on the actions of the system user. Large-scale machine learning language models and image creation programs being developed by companies such as OpenAI and Google in the 2020s have restricted access however they are likely to have widespread application in fields such as advertising, copywriting, stock imagery and gra

    Read more →
  • Rabbit r1

    Rabbit r1

    The Rabbit r1 is an artificial intelligence personal assistant device developed by the American technology startup Rabbit Inc and co-designed by Teenage Engineering. It was announced at the 2024 Consumer Electronics Show as a handheld device intended to perform digital tasks through voice commands, touch interaction, and web-based AI agents. The r1 was marketed around Rabbit's concept of a "large action model" (LAM), which the company described as software able to operate websites and services on behalf of users. The device runs rabbitOS, an operating system based on the Android Open Source Project. Its services have included AI search, image recognition, voice interaction, music playback, rideshare and food-ordering integrations, and later experimental web-agent features such as LAM Playground and teach mode. Initial reviews were largely negative, with reviewers criticizing the device's limited functionality, bugs, and unclear advantages over a smartphone. Critics also questioned Rabbit's claims after the r1 software was shown to run on an Android phone. Rabbit continued to issue software updates after launch, including rabbitOS 2 in September 2025, which introduced a redesigned card-based interface, gesture navigation, and a "creations" feature for generating small software tools and experiences on the device. Rabbit Inc was founded by Jesse Lyu Cheng. == Hardware == Display: A 2.88-inch touchscreen for interactive user input. Input: push-to-talk button to activate voice commands; scroll wheel; Gyroscope; Magnetometer; Accelerometer; GPS. Camera: 8 MP single camera, with a resolution of 3264x2448, allowing for the connected external AI to use computer vision. Audio: Equipped with a speaker and dual microphones for audio interaction. Connectivity: Supports Wi-Fi and cellular connections via a SIM card slot to access internet services. Processor: Runs on a 2.3GHz MediaTek Helio P35 processor. Memory: Contains 4GB of RAM for operational tasks. Storage: Offers 128GB of internal storage for data. Ports: Utilizes a USB-C port for charging and data connections. == Software == The Rabbit r1 runs rabbitOS, which is based on the Android Open Source Project (AOSP), specifically Android 13. Rabbit founder Jesse Lyu described rabbitOS as a "very bespoke AOSP" after reports that the r1's software could be run on a conventional Android phone. Rabbit described the r1 as using a large action model (LAM), a type of AI agent intended to perform tasks across software interfaces rather than only answer questions. At launch, the device supported a limited set of services, including AI search, vision features, music playback, and some third-party integrations. Perplexity.ai was one of the AI services used to answer user queries. In 2024, Rabbit released several software updates that added features and attempted to address early criticism of the device. In July 2024, the company launched "beta rabbit", an advanced search and conversation mode for more complex queries. In October 2024, it released LAM Playground, a web-based agent feature intended to let the r1 operate websites on behalf of users. Reviewers found the feature experimental; Android Authority reported that it could perform some navigation tasks but struggled with CAPTCHAs, loops, and unintended behavior. In November 2024, Rabbit introduced a beta "teach mode", which allowed users to demonstrate web-based tasks in the Rabbithole web portal and later ask the r1 to repeat them. The company described teach mode as experimental, and The Verge noted that Rabbit warned users that results could be unpredictable and that CAPTCHA-protected sites could cause problems. Rabbit released rabbitOS 2 in September 2025. The update redesigned the interface around a card-based layout, added additional touchscreen gestures, and introduced "creations", a feature that lets users generate simple software tools, games, and interfaces through natural-language prompts. Coverage of the update described it as a major software overhaul rather than new hardware. == Reception == === Funding === Rabbit raised $20 million in funding from Khosla Ventures, Synergis Capital and Kakao Investment in October 2023. The company announced an additional $10 million in funding in December 2023. === Sales === Following its announcement at the 2024 Consumer Electronics Show, 130,000 units were sold. On August 13, 2024, Rabbit announced that sales of r1 had expanded to the entire European Union (except Malta) and United Kingdom. On August 21, 2024, sales of r1 expanded to Singapore. === Reviews === The r1 was met with strong criticism immediately after Rabbit began shipping the device. Some reviews questioned what the device was able to do that a smartphone could not, while comparing it to the similar Humane Ai Pin. YouTuber Marques Brownlee called the device "barely reviewable". Android Authority's Mishaal Rahman managed to install Rabbit r1's software on a Pixel 6a smartphone, after a tipster shared an APK file. The Verge echoed the claims made by Rahman. In response, Lyu published statements confirming its use of Android, but denying that the r1 is an Android app. Mashable called its Vision features impressive, but said that "these praise-worthy features are overshadowed by buggy performance". Ars Technica wrote a blog post claiming "the company is blocking access from bootleg APKs". TechCrunch gave a slightly more positive review, calling the device a "fun peep at a possible future", but could not "advise anyone to buy one now." Shortly after the launch of r1, Rabbit began a weekly cadence of software updates to address much of the criticism from the early reviews, including "battery and GPS performance, time zone selection, and more". Digital Trends said the Magic Camera feature "takes the most mundane, ordinary, and badly composed photos and makes something fun and eye-catching from them." Mashable said the "beta rabbit" feature "makes Rabbit R1 more conversational and intelligent". Later coverage noted that Rabbit continued to update the r1 after its poorly received launch. The Verge reported in September 2024 that about 5,000 of roughly 100,000 purchasers were using the device at any given moment, citing Lyu, and described the product as having launched before it was ready. In 2025, coverage of rabbitOS 2 described the update as an attempt to reset the device's software experience after the criticism of its original release. == Controversies == === GAMA project === Rabbit Inc has garnered attention due to allegations surrounding its funding and the company's past projects. The company came under scrutiny when Stephen Findeisen, known as Coffeezilla on YouTube, published a video in May 2024, alleging that Rabbit Incorporation was "built on a scam". Rabbit Incorporation, initially named Cyber Manufacturing Co, rebranded just two months before launching the Rabbit R1. The company, under its former name, raised $6 million in November 2021 for a project called GAMA, described as a "Next Generation NFT Project." Jesse Lyu, the CEO of Rabbit Incorporation, referred to GAMA as a "fun little project." Coffeezilla, who investigates influencer scams, highlighted old Clubhouse recordings of Jesse Lyu discussing the GAMA project. In these recordings, Lyu emphasized the substantial funding behind GAMA and its potential to be a revolutionary, carbon-negative cryptocurrency. Coffeezilla questioned the whereabouts of the funds raised for GAMA, estimating that approximately $1 million in refunds to investors remained unresolved. He suggested that the rebranding to Rabbit Incorporation and the shift to developing the Rabbit R1 were attempts to divert from the GAMA project's issues. In response to Coffeezilla's inquiries, Rabbit Incorporation stated that the $6 million raised was used for the GAMA project. The company said that NFTs cannot be refunded unless the owner agrees to "burn" them on the blockchain. Rabbit Incorporation also said that the GAMA project was open-sourced and returned to the community, aligning with community feedback. They also mentioned that efforts to buy back NFTs were made to counteract malicious trading and maintain market stability. === Security === In June 2024, Engadget reported that the Rabbitude team, a community reverse engineering project, had gained access to the r1's codebase revealing that r1's software contained several hardcoded API keys in its code for ElevenLabs, Microsoft Azure, Yelp, and Google Maps, potentially allowing unauthorized access to r1 responses, including those containing the users' personal information. For a short time, Rabbit immediately began revoking and rotating those secrets and confirmed that the code was leaked by an employee who had "been terminated and remains under investigation". In July 2024, the company revealed that all user chats and device pairing data were logged on the r1 with no ability to delete them. This meant that lost or stolen devices could be used to extract user

    Read more →
  • Meta-Labeling

    Meta-Labeling

    Meta-labeling, also known as corrective AI, is a machine learning (ML) technique utilized in quantitative finance to enhance the performance of investment and trading strategies, developed in 2017 by Marcos López de Prado at Guggenheim Partners and Cornell University. The core idea is to separate the decision of trade direction (side) from the decision of trade sizing, addressing the inefficiencies of simultaneously learning both side and size predictions. The side decision involves forecasting market movements (long, short, neutral), while the size decision focuses on risk management and profitability. It serves as a secondary decision-making layer that evaluates the signals generated by a primary predictive model. By assessing the confidence and likely profitability of those signals, meta-labeling allows investors and algorithms to dynamically size positions and suppress false positives. == Motivation == Meta-labeling is designed to improve precision without sacrificing recall. As noted by López de Prado, attempting to model both the direction and the magnitude of a trade using a single algorithm can result in poor generalization. By separating these tasks, meta-labeling enables greater flexibility and robustness: Enhances control over capital allocation. Reduces overfitting by limiting model complexity. Allows the use of interpretability tools and tailored thresholds to manage risk. Enables dynamic trade suppression in unfavorable regimes. == Applications == Meta-labeling has been applied in a variety of financial ML contexts, including: Algorithmic trading: Filtering and sizing trades to reduce false positives. Portfolio optimization: Scaling exposure across multiple signals with differing confidence levels. Risk management: Dynamically disabling strategies in adverse market conditions. Model validation: Interpreting when and why a model may be underperforming due to regime shifts. == General architecture == Meta-labeling decouples two core components of systematic trading strategies: directional prediction and position sizing. The process involves training a primary model to generate trade signals (e.g., buy, sell, or hold) and then training a secondary model to determine whether each signal is likely to lead to a profitable trade. The second model outputs a probability that is interpreted as the confidence in the forecast, which can be used to adjust the position size or to filter out unreliable trades. Meta-labeling is typically implemented as a three-stage process: Primary model (M1): Predicts the direction or label of a financial outcome using features such as market prices, returns, or volatility indicators. A typical output is directional, e.g., Y ∈ {−1,0,1}, representing short, neutral, or long positions. Secondary model (M2): A binary classifier trained to predict whether the primary model's prediction will be profitable. The target variable is a binary meta-label F ∈ { 0 , 1 } {\displaystyle F\in \{0,1\}} . Inputs can include features used in the primary model, performance diagnostics, or market regime data. Position sizing algorithm (M3): Translates the output probability of the secondary model into a position size. Higher confidence scores result in larger allocations, while lower confidence leads to reduced or zero exposure. === Stage 1: Forecasting side === Primary model architecture Figure 1 Figure 1 presents the architecture of a primary model. It focuses on forecasting the side of the trade. Following the example, this model (M1) takes in input data – such as open-high-low-close data and determines the side of the position to take: a negative number is a short position, and positive number is a long position, the range is set between −1 and 1 (the closer it is to −1 or 1, the stronger the models conviction is). When training the model, the labels are −1 and 1, based on the direction of forward returns for some predefined investment horizon. The researcher may decide to apply a recall check (τ: "Tau") by setting a minimum threshold that the initial output needs to be to qualify of a short or long position (if the threshold is not met, no side forecast is predicted, leading to closing of any open positions), this leads to the primary model output which is one of three possible side forecasts: −1, 0, or 1. The primary model also generates evaluation data which can be used by the secondary model, to improve performance of size forecasts. Some examples of evaluation data include rolling accuracy, F1, recall, precision, and AUC scores. === Stage 2: Filtering out false positives === General meta-labeling architecture Figure 2 Next comes the phase of filtering out false positives, by applying a secondary machine learning model (M2), which is a binary classifier trained to determine if the trade will be profitable or not. The model takes as input four general groupings of data: General input data which is predictive of a false positive. For example the last 30 days rolling volatility of the underlying asset. Evaluation data. Market state and regime data, one may find that macro economic data or clustering the market into regimes may help as specific trading strategies are known to perform better in particular regimes. Example: momentum based strategies perform best in periods with low volatility and strong directional moves. Primary models initial input which is a value between −1 and 1. This highlights the strength of the primary models conviction. The output of the model is a value between −1 and 1 (if using a Tanh function) which will indicate the strength of the conviction that a short or long position is profitable, or it could simply be between 0 and 1 (using a sigmoid function) if one only wanted to know if it made money or not. This output allows filtering out trades that are likely to lead to losses. One could stop at this point or use the outputs of the secondary model as inputs to a position sizing algorithm (M3) which could further enhance strategy performance metrics by translating the output probability of the secondary model into a position size. Higher confidence scores result in larger allocations, while lower confidence leads to reduced or zero exposure. === Stage 3: Optimizing position sizes === ==== Position sizing methods (M3) ==== Various algorithms have been proposed for transforming predicted probabilities into trade sizes: All-or-nothing: Allocate 100% of capital if the probability exceeds a predefined threshold (e.g., 0.5); otherwise, do not trade. Model confidence: Use the probability score directly as the fraction of capital allocated. Linear scaling: Rescale the model's probabilities using min-max normalization based on the training data. Normal CDF (NCDF): Use a normal cumulative distribution function applied to a z-statistic derived from the predicted probability. Empirical CDF (ECDF): Rank probabilities based on their percentile in the training data to ensure relative allocation. Sigmoid Optimal Position Sizing (SOPS): Applies a smooth non-linear sigmoid transformation optimized to maximize risk-adjusted returns (Sharpe ratio). ==== Model calibration ==== Each machine learning algorithm used in meta-labeling tends to produce outputs with different characteristic distributions; for example, some are approximately normally distributed, whereas others exhibit a pronounced U-shape, concentrating probabilities near the extremes. Due to these varying distributions, simply summing the outputs of different models can inadvertently lead to uneven weighting of signals, biasing trade decisions. To address this, model calibration techniques are essential to adjust the predicted probabilities towards frequentist probabilities, ensuring that model outputs reflect true likelihoods more accurately. Two common calibration techniques are: Platt scaling (Sigmoid scaling): Suitable for correcting S-shaped calibration plots typically produced by models such as support vector machines (SVMs). Isotonic regression: Fits a non-decreasing step function to probabilities and is effective particularly with larger datasets, though it can sometimes lead to overfitting. Transforming predictions to frequentist probabilities is crucial as it provides probabilistic outputs that are directly interpretable as the actual likelihood of an event occurring. Such calibration significantly enhances the effectiveness of fixed position sizing methods, reducing maximum drawdowns and increasing risk-adjusted returns. However, calibration has less impact on position sizing methods that directly estimate parameters from the training data, such as ECDF and SOPS, suggesting that calibration is a critical step mainly for fixed methods that rely heavily on raw model outputs. =

    Read more →
  • N-jet

    N-jet

    An N-jet is the set of (partial) derivatives of a function f ( x ) {\displaystyle f(x)} up to order N. Specifically, in the area of computer vision, the N-jet is usually computed from a scale space representation L {\displaystyle L} of the input image f ( x , y ) {\displaystyle f(x,y)} , and the partial derivatives of L {\displaystyle L} are used as a basis for expressing various types of visual modules. For example, algorithms for tasks such as feature detection, feature classification, stereo matching, tracking and object recognition can be expressed in terms of N-jets computed at one or several scales in scale space.

    Read more →
  • Singularity studies

    Singularity studies

    Singularity studies is an interdisciplinary academic field which examines the idea of technological singularity — the hypothesised point at which artificial intelligence may surpass human intelligence, might be attained by artificial intelligence (AI), robotics, and other technologies and sciences, and its social impacts. In this academic field, the study and research are conducted across a broad array of terrains such as information science, robotics, social informatics, economics, philosophy, and ethics. The primary aim of singularity studies is to gain an integrative understanding of the transformation of social systems occurring in tandem with the explosive evolution of AI and also the changes to be effected by such transformation in the view of humans, ethics, and legal systems. == History == An academic work on technological singurality has appeared in computer science, philosophy, sociology, and law since the early 1990s. Early discussions of an intelligence explosion were popularised by science-fiction writer Vernor Vinge in 1993 and later systematised by futurist Ray Kurzweil. Since the 2010s, universities such as Oxford, Stanford, and Keio have established dedicated programmes, while peer-reviewed journals have begun to publish scenario analyses and policy studies. Ongoing debates question the predictive value of singularity scenarios and warn against a deterministic view of technology. == Characteristics of research == Singularity studies extends beyond mere future predictions and offer an intellectual foundation for proactively designing and creating a desirable future. Principal research themes in this realm include: Ethics of AI; Social implications of technologies; Possibility of harmonious coexistence of humans and AI; Communication with AI; and Redesign of social systems. == Technologists and academics == Vernor Vinge: Propounded the concept of singularity in 1993, making a massive impact on the academic and science-fiction spheres. Ray Kurzweil: Predicted the advent around 2045 of the technological singularity in his 2005 book The Singularity Is Near. Nick Bostrom: Offered philosophical reflections on superintelligence and the risks posed by AI. He is the founding director of the now-dissolved Future of Humanity Institute at the University of Oxford. === Japan === Kento Sasano: A social informatician, AI educator, and inventor. He is the president of the Japan Society of Singularity Studies. == Challenges and outlook == Singularity studies is still evolving as an academic field, and quite a few challenges remain unresolved in regard to the systematization of their theories, research methods, and educational curricula. That said, in this day and age of accelerating technological and societal shifts, interdisciplinary approaches have gained in importance and are drawing much attention in the arenas of scholarly research, intercorporate collaboration, and policy planning.

    Read more →
  • Machine learning

    Machine learning

    Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without being explicitly programmed. Advances in the field of deep learning have allowed neural networks, a class of statistical algorithms, to surpass many previous machine learning approaches in performance. Statistics and mathematical optimisation methods compose the foundations of machine learning. Data mining is a related field of study, focusing on exploratory data analysis (EDA) through unsupervised learning. From a theoretical viewpoint, probably approximately correct learning provides a mathematical and statistical framework for describing machine learning. Most traditional machine learning and deep learning algorithms can be described as empirical risk minimisation under this framework. == History == The term machine learning was coined in 1959 by Arthur Samuel, an IBM employee and pioneer in the field of computer gaming and artificial intelligence. The synonym self-teaching computers was also used during this time period. The earliest machine learning program was introduced in the 1950s, when Samuel invented a computer program that calculated the chance of winning in checkers for each side, but the history of machine learning is rooted in decades of efforts to study human cognitive processes. In 1949, Canadian psychologist Donald Hebb published the book The Organization of Behavior, in which he introduced a theoretical neural structure formed by certain interactions among nerve cells. The Hebbian theory of neuron interaction set the groundwork for how many machine learning algorithms work, with connected artificial neurons changing the strength of their connections based on data. Other researchers who have studied human cognitive systems contributed to the modern machine learning technologies as well, including Walter Pitts and Warren McCulloch, who proposed the first mathematical model of neural networks including algorithms that mirror human thought processes. By the early 1960s, an experimental "learning machine" with punched tape memory, called Cybertron, had been developed by Raytheon Company to analyse sonar signals, electrocardiograms, and speech patterns using rudimentary reinforcement learning. It was repetitively "trained" by a human operator/teacher to recognise patterns and equipped with a "goof" button to cause it to reevaluate incorrect decisions. A representative book on research into machine learning during the 1960s was Nils Nilsson's book "Learning Machines", dealing mostly with machine learning for pattern classification. Interest related to pattern recognition continued into the 1970s, as described by Duda and Hart in 1973. In 1981, a report was given on using teaching strategies so that an artificial neural network learns to recognise 40 characters (26 letters, 10 digits, and 4 special symbols) from a computer terminal. Tom M. Mitchell provided a widely quoted, more formal definition of the algorithms studied in the machine learning field: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E." This definition of the tasks in which machine learning is concerned is fundamentally operational rather than defining the field in cognitive terms. This follows Alan Turing's proposal in his paper "Computing Machinery and Intelligence", in which the question, "Can machines think?", is replaced by asking whether machines can convincingly imitate a human in its responses to human-posed questions. In 2014 Ian Goodfellow and others introduced generative adversarial networks (GANs) which could produce realistic synthetic data. By 2016 AlphaGo had won against top human players in Go using reinforcement learning techniques. == Relationships to other fields == === Artificial intelligence === As a scientific endeavour, machine learning grew out of the quest for artificial intelligence (AI). In the early days of AI as an academic discipline, some researchers were interested in having machines learn from data. They attempted to approach the problem with various symbolic methods, as well as what were then termed "neural networks"; these were mostly perceptrons and other models that were later found to be reinventions of the generalised linear models of statistics. Probabilistic reasoning was also employed, especially in automated medical diagnosis. However, an increasing emphasis on the logical, knowledge-based approach caused a rift between AI and machine learning. Probabilistic systems were plagued by theoretical and practical problems of data acquisition and representation. By 1980, expert systems had come to dominate AI, and statistics was out of favour. Work on symbolic/knowledge-based learning continued within AI, leading to inductive logic programming (ILP), but the more statistical line of research was now outside the field of AI proper, in pattern recognition and information retrieval. Neural network research was abandoned by AI and computer science around the same time. This subfield, termed "connectionism", was continued by researchers from other disciplines, including John Hopfield, David Rumelhart, and Geoffrey Hinton. Their main success came in the mid-1980s with the reinvention of backpropagation. Machine learning (ML), reorganised and recognised as its own field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics, fuzzy logic, and probability theory. === Data compression === === Data mining === Machine learning and data mining often employ the same methods and overlap significantly, but while machine learning focuses on prediction based on known properties learned from the training data, data mining focuses on the discovery of previously unknown properties in the data (this is the analysis step of knowledge discovery in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to reproduce known knowledge, while in knowledge discovery and data mining (KDD) the key task is the discovery of previously unknown knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by other supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data. Machine learning also has intimate ties to optimization: Many learning problems are formulated as minimisation of some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a label to instances, and models are trained to correctly predict the preassigned labels of a set of examples). === Generalization === Characterizing the generalisation of various learning algorithms is an active topic of current research, especially for deep learning algorithms. === Statistics === Machine learning and statistics are closely related fields in terms of methods, but distinct in their principal goal: statistics draws population inferences from a sample, while machine learning finds generalisable predictive patterns. Conventional statistical analyses require the a priori selection of a model most suitable for the study data set. In addition, only significant or theoretically relevant variables based on previous experience are included for analysis. In contrast, machine learning is not built on a pre-structured model; rather, the data shape the model by detecting underlying patterns. The more variables (input) used to train the model, the more accurate the ultimate model will be. Leo Breiman distinguished two statistical modelling paradigms: the data model and the algorithmic model, wherein "algorithmic model" means more or less the machine learning algorithms like Random forest. Some statisticians have adopted methods from machine learning, producing the field of statistical learning. === Statistical physics === Analytical and computational techniques derived from deep-rooted physics of disordered systems can be extended to large-scale problems, including machine learning, e.g., to analyse the weight space of deep neural networks. Statistical physics is thus

    Read more →
  • Elements of AI

    Elements of AI

    Elements of AI is a massive open online course (MOOC) teaching the basics of artificial intelligence. The course, originally launched in 2018, is designed and organized by the University of Helsinki and learning technology company MinnaLearn. The course includes modules on machine learning, neural networks, the philosophy of artificial intelligence, and using artificial intelligence to solve problems. It consists of two parts: Introduction to AI and its sequel, Building AI, that was released in late 2020. In November 2019, the course was named one of four winners of MIT’s Inclusive Innovation Challenge. University of Helsinki's computer science department is known as the alma mater of Linus Torvalds, a Finnish-American software engineer who is the creator of the Linux kernel, which is the kernel for Linux operating systems. == EU’s AI pledge == The government of Finland has pledged to offer the course for all EU citizens by the end of 2021, as the course is made available in all the official EU languages. The initiative was launched as part of Finland's Presidency of the Council of the European Union in 2019, with the European Commission providing translations of the course materials. In 2017, Finland launched an AI strategy to stay competitive in the field of AI amid growing competition between China and the United States. With the support of private companies and the government, Finland's now-realized goal was to get 1 percent of its citizens to participate in Elements of AI. Other governments have also given their support to the course. For instance, Germany's Federal Minister for Economic Affairs and Energy Peter Altmeier has encouraged citizens to take part in the course to help Germany gain a competitive advantage in AI. Sweden's Minister for Energy and Minister for Digital Development Anders Ygeman has said that Sweden aims to teach 1 percent of its population the basics of AI like Finland has. == Participants == Elements of AI had enrolled more than 1 million students from more than 110 countries by May 2023. A quarter of the course's participants are aged 45 and over, and some 40 percent are women. Among Nordic participants, the share of women is nearly 60 percent. In September 2022, the course was available in Finnish, Swedish, Estonian, English, German, Latvian, Norwegian, French, Belgian, Czech, Greek, Slovakian, Slovenian, Latvian, Lithuanian, Portuguese, Spanish, Irish, Icelandic, Maltese, Croatian, Romanian, Italian, Dutch, Polish, and Danish.

    Read more →
  • New York Institute of Technology Computer Graphics Lab

    New York Institute of Technology Computer Graphics Lab

    The New York Institute of Technology Computer Graphics Lab is a computer lab located at the New York Institute of Technology (NYIT), founded by Alexander Schure. It was originally located at the "pink building" on the NYIT campus. It has played an important role in the history of computer graphics and animation, as founders of Pixar and Lucasfilm Limited, including Turing Award winners Edwin Catmull and Patrick Hanrahan, began their research there. It is the birthplace of entirely 3D CGI films. The lab was initially founded to produce a short high-quality feature film with the project name of The Works. The feature, which was never completed, was a 90-minute feature that was to be the first entirely computer-generated CGI movie. Production mainly focused around DEC PDP and VAX machines. Many of the original CGL team now form the elite of the CG and computer world with members going on to Silicon Graphics, Microsoft, Cisco, NVIDIA and others, including Pixar president, co-founder and Turing laureate Ed Catmull, Pixar co-founder and Microsoft graphics fellow Alvy Ray Smith, Pixar co-founder Ralph Guggenheim, Walt Disney Animation Studios chief scientist Lance Williams, Netscape and Silicon Graphics founder Jim Clark, Tableau co-founder and Turing laureate Pat Hanrahan, Microsoft graphics fellow Jim Blinn, Thad Beier, Oscar and Bafta nominee Jacques Stroweis, Andrew Glassner, and Tom Brigham. Systems programmer Bruce Perens went on to co-found the Open Source Initiative. Researchers at the New York Institute of Technology Computer Graphics Lab created the tools that made entirely 3D CGI films possible. Among NYIT CG Lab's many innovations was an eight-bit paint system to ease computer animation. NYIT CG Lab was regarded as the top computer animation research and development group in the world during the late 70s and early 80s. == The 21st century == The lab is presently located at NYIT's Long Island campus, and NYIT currently offers a Ph.D. program in Computer Science.

    Read more →
  • Virtual intelligence

    Virtual intelligence

    Virtual intelligence (VI) is the term given to artificial intelligence that exists within a virtual world. Many virtual worlds have options for persistent avatars that provide information, training, role-playing, and social interactions. The immersion in virtual worlds provides a platform for VI beyond the traditional paradigm of past user interfaces (UIs). What Alan Turing established as a benchmark for telling the difference between human and computerized intelligence was devoid of visual influences. With today's VI bots, virtual intelligence has evolved past the constraints of past testing into a new level of the machine's ability to demonstrate intelligence. The immersive features of these environments provide nonverbal elements that affect the realism provided by virtually intelligent agents. Virtual intelligence is the intersection of these two technologies: Virtual environments: Immersive 3D spaces provide for collaboration, simulations, and role-playing interactions for training. Many of these virtual environments are currently being used for government and academic projects, including Second Life, VastPark, Olive, OpenSim, Outerra, Oracle's Open Wonderland, Duke University's Open Cobalt, and many others. Some of the commercial virtual worlds are also taking this technology into new directions, including the high-definition virtual world Blue Mars. Artificial intelligence (AI): AI is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence. VI is a type of AI that operates within virtual environments to simulate human-like interactions and responses. == Applications == Cutlass Bomb Disposal Robot: Northrop Grumman developed a virtual training opportunity because of the prohibitive real-world cost and dangers associated with bomb disposal. By replicating a complicated system without having to learn advanced code, the virtual robot has no risk of damage, trainee safety hazards, or accessibility constraints. MyCyberTwin: NASA is among the companies that have used the MyCyberTwin AI technologies. They used it for the Phoenix rover in the virtual world Second Life. Their MyCyberTwin used a programmed profile to relay information about what the Phoenix rover was doing and its purpose. Second China: The University of Florida developed the "Second China" project as an immersive training experience for learning how to interact with the culture and language in a foreign country. Students are immersed in an environment that provides role-playing challenges coupled with language and cultural sensitivities magnified during country-level diplomatic missions or during times of potential conflict or regional destabilization. The virtual training provides participants with opportunities to access information, take part in guided learning scenarios, communicate, collaborate, and role-play. While China was the country for the prototype, this model can be modified for use with any culture to help better understand social and cultural interactions and see how other people think and what their actions imply. Duke School of Nursing Training Simulation: Extreme Reality developed virtual training to test critical thinking with a nurse performing trained procedures to identify critical data to make decisions and performing the correct steps for intervention. Bots are programmed to respond to the nurse's actions as the patient with their conditions improving if the nurse performs the correct actions.

    Read more →
  • Winner-take-all in action selection

    Winner-take-all in action selection

    Winner-take-all is a computer science concept that has been widely applied in behavior-based robotics as a method of action selection for intelligent agents. Winner-take-all systems work by connecting modules (task-designated areas) in such a way that when one action is performed it stops all other actions from being performed, so only one action is occurring at a time. The name comes from the idea that the "winner" action takes all of the motor system's power. == History == In the 1980s and 1990s, many roboticists and cognitive scientists were attempting to find speedier and more efficient alternatives to the traditional world modeling method of action selection. In 1982, Jerome A. Feldman and D.H. Ballard published the "Connectionist Models and Their Properties", referencing and explaining winner-take-all as a method of action selection. Feldman's architecture functioned on the simple rule that in a network of interconnected action modules, each module will set its own output to zero if it reads a higher input than its own in any other module. In 1986, Rodney Brooks introduced behavior-based artificial intelligence. Winner-take-all architectures for action selection soon became a common feature of behavior-based robots, because selection occurred at the level of the action modules (bottom-up) rather than at a separate cognitive level (top-down), producing a tight coupling of stimulus and reaction. == Types of winner-take-all architectures == === Hierarchy === In the hierarchical architecture, actions or behaviors are programmed in a high-to-low priority list, with inhibitory connections between all the action modules. The agent performs low-priority behaviors until a higher-priority behavior is stimulated, at which point the higher behavior inhibits all other behaviors and takes over the motor system completely. Prioritized behaviors are usually key to the immediate survival of the agent, while behaviors of lower priority are less time-sensitive. For example, "run away from predator" would be ranked above "sleep." While this architecture allows for clear programming of goals, many roboticists have moved away from the hierarchy because of its inflexibility. === Heterarchy and fully distributed === In the heterarchy and fully distributed architecture, each behavior has a set of pre-conditions to be met before it can be performed, and a set of post-conditions that will be true after the action has been performed. These pre- and post-conditions determine the order in which behaviors must be performed and are used to causally connect action modules. This enables each module to receive input from other modules as well as from the sensors, so modules can recruit each other. For example, if the agent's goal were to reduce thirst, the behavior "drink" would require the pre-condition of having water available, so the module would activate the module in charge of "find water". The activations organize the behaviors into a sequence, even though only one action is performed at a time. The distribution of larger behaviors across modules makes this system flexible and robust to noise. Some critics of this model hold that any existing set of division rules for the predecessor and conflictor connections between modules produce sub-par action selection. In addition, the feedback loop used in the model can in some circumstances lead to improper action selection. === Arbiter and centrally coordinated === In the arbiter and centrally coordinated architecture, the action modules are not connected to each other but to a central arbiter. When behaviors are triggered, they begin "voting" by sending signals to the arbiter, and the behavior with the highest number of votes is selected. In these systems, bias is created through the "voting weight", or how often a module is allowed to vote. Some arbiter systems take a different spin on this type of winner-take-all by using a "compromise" feature in the arbiter. Each module is able to vote for or against each smaller action in a set of actions, and the arbiter selects the action with the most votes, meaning that it benefits the most behavior modules. This can be seen as violating the general rule against creating representations of the world in behavior-based AI, established by Brooks. By performing command fusion, the system is creating a larger composite pool of knowledge than is obtained from the sensors alone, forming a composite inner representation of the environment. Defenders of these systems argue that forbidding world-modeling puts unnecessary constraints on behavior-based robotics, and that agents benefits from forming representations and can still remain reactive.

    Read more →
  • Histogram of oriented displacements

    Histogram of oriented displacements

    Histogram of oriented displacements (HOD) is a 2D trajectory descriptor. The trajectory is described using a histogram of the directions between each two consecutive points. Given a trajectory T = {P1, P2, P3, ..., Pn}, where Pt is the 2D position at time t. For each pair of positions Pt and Pt+1, calculate the direction angle θ(t, t+1). Value of θ is between 0 and 360. A histogram of the quantized values of θ is created. If the histogram is of 8 bins, the first bin represents all θs between 0 and 45. The histogram accumulates the lengths of the consecutive moves. For each θ, a specific histogram bin is determined. The length of the line between Pt and Pt+1 is then added to the specific histogram bin. To show the intuition behind the descriptor, consider the action of waving hands. At the end of the action, the hand falls down. When describing this down movement, the descriptor does not care about the position from which the hand started to fall. This fall will affect the histogram with the appropriate angles and lengths, regardless of the position where the hand started to fall. HOD records for each moving point: how much it moves in each range of directions. HOD has a clear physical interpretation. It proposes that, a simple way to describe the motion of an object, is to indicate how much distance it moves in each direction. If the movement in all directions are saved accurately, the movement can be repeated from the initial position to the final destination regardless of the displacements order. However, the temporal information will be lost, as the order of movements is not stored-this is what we solve by applying the temporal pyramid, as shown in section \ref{sec:temp-pyramid}. If the angles quantization range is small, classifiers that use the descriptor will overfit. Generalization needs some slack in directions-which can be done by increasing the quantization range.

    Read more →
  • Play Integrity API

    Play Integrity API

    Play Integrity API (formerly known as SafetyNet) consists of several application programming interfaces (APIs) offered by the Google Play Services to support security sensitive applications and enforce DRM. Currently, these APIs include device integrity verification, app verification, recaptcha and web address verification. It uses an environment called DroidGuard to perform the attestation. == Attestation == The SafetyNet Attestation API, one of the APIs under the SafetyNet umbrella, provides verification that the integrity of the device is not compromised. In practice, non-official ROMs such as LineageOS fail the hardware attestation and thus prevent the user from using a non-compliant ROM with third-party apps (mainly banking) that require the API. Due to this, some consider this a monopolistic practice deterring the entrance of competing mobile operating systems in the market. It requires a network connection to Google servers and validates the hardware signatures. Amongst the checks, the API looks for bootloader unlock status, ROM signatures, kernel strings, it also uses AVB2.0 and dm-verity attestations. Upon successful checks, Google Play will mark the device as Certified. The attestation runs in an environment called DroidGuard (com.google.android.gms.unstable). The SafetyNet Attestation API (one of the four APIs under the SafetyNet umbrella) has been deprecated. As of 6 October 2023, Google planned to replace it with the Play Integrity API by the end of January 2025. The transition ended on 20 May 2025, breaking applications which hadn't been updated. These attestations are offered by Google Play Services and thus are not available on free Android environments, like AOSP. Therefore, developers can require the API to be available and may refuse to execute on AOSP builds. == Google Play Protect == Under the same umbrella, Play Protect is a mechanism to find and remove "vulnerable" apps from one's Android device as well as store apps. Although it's meant to scan for malware-containing apps, it also looks for non-DRM compliant apps. == Criticism == Multiple groups have criticised SafetyNet and the Play Integrity API. Criticisms include that it offers weaker protection compared to alternatives such as Android's hardware attestation API, which provides a stronger form of verification while having the ability to remain compatible with more secure Android operating systems like GrapheneOS. Critics argued it undermines competition by effectively requiring developers to rely on Google's proprietary services, strengthening its monopoly over the Android ecosystem and disadvantaging alternative, privacy-focused operating systems. Users have also developed tools, such as the Play Integrity Fix module for Magisk/KernelSU/APatch, which tricks the attestation using leaked fingerprints of vulnerable devices. Furthermore, some have questioned the effectiveness of the attestation, claiming it does not deliver the level of security promised by Google and instead serves more as a form of vendor lock-in than a meaningful security measure. Activists have also raised concerns that it may violate antitrust and competition laws, like the Digital Markets Act.

    Read more →
  • Computational intelligence

    Computational intelligence

    In computer science, computational intelligence (CI) refers to concepts, paradigms, algorithms and implementations of systems that are designed to show "intelligent" behavior in complex and changing environments. These systems are aimed at mastering complex tasks in a wide variety of technical or commercial areas and offer solutions that recognize and interpret patterns, control processes, support decision-making or autonomously manoeuvre vehicles or robots in unknown environments, among other things. These concepts and paradigms are characterized by the ability to learn or adapt to new situations, to generalize, to abstract, to discover and associate. Nature-analog or nature-inspired methods play a key role in this. CI approaches primarily address those complex real-world problems for which traditional or mathematical modeling is not appropriate for various reasons: the processes cannot be described exactly with complete knowledge, the processes are too complex for mathematical reasoning, they contain some uncertainties during the process, such as unforeseen changes in the environment or in the process itself, or the processes are simply stochastic in nature. Thus, CI techniques are properly aimed at processes that are ill-defined, complex, nonlinear, time-varying and/or stochastic. A recent definition of the IEEE Computational Intelligence Societey describes CI as the theory, design, application and development of biologically and linguistically motivated computational paradigms. Traditionally the three main pillars of CI have been Neural Networks, Fuzzy Systems and Evolutionary Computation. ... CI is an evolving field and at present in addition to the three main constituents, it encompasses computing paradigms like ambient intelligence, artificial life, cultural learning, artificial endocrine networks, social reasoning, and artificial hormone networks. ... Over the last few years there has been an explosion of research on Deep Learning, in particular deep convolutional neural networks. Nowadays, deep learning has become the core method for artificial intelligence. In fact, some of the most successful AI systems are based on CI. However, as CI is an emerging and developing field there is no final definition of CI, especially in terms of the list of concepts and paradigms that belong to it. The general requirements for the development of an “intelligent system” are ultimately always the same, namely the simulation of intelligent thinking and action in a specific area of application. To do this, the knowledge about this area must be represented in a model so that it can be processed. The quality of the resulting system depends largely on how well the model was chosen in the development process. Sometimes data-driven methods are suitable for finding a good model and sometimes logic-based knowledge representations deliver better results. Hybrid models are usually used in real applications. According to actual textbooks, the following methods and paradigms, which largely complement each other, can be regarded as parts of CI: Fuzzy systems Neural networks and, in particular, convolutional neural networks Evolutionary computation and, in particular, multi-objective evolutionary optimization Swarm intelligence Bayesian networks Artificial immune systems Learning theory Probabilistic methods == Relationship between hard and soft computing and artificial and computational intelligence == Artificial intelligence (AI) is used in the media, but also by some of the scientists involved, as a kind of umbrella term for the various techniques associated with it or with CI. Craenen and Eiben state that attempts to define or at least describe CI can usually be assigned to one or more of the following groups: "Relative definition” comparing CI to AI Conceptual treatment of key notions and their roles in CI Listing of the (established) areas that belong to it The relationship between CI and AI has been a frequently discussed topic during the development of CI. While the above list implies that they are synonyms, the vast majority of AI/CI researchers working on the subject consider them to be distinct fields, where either CI is an alternative to AI AI includes CI CI includes AI The view of the first of the above three points goes back to Zadeh, the founder of the fuzzy set theory, who differentiated machine intelligence into hard and soft computing techniques, which are used in artificial intelligence on the one hand and computational intelligence on the other. In hard computing (HC) and traditional AI (e.g. expert systems), inaccuracy and uncertainty are undesirable characteristics of a system, while soft computing (SC) and thus CI focus on dealing with these characteristics. The adjacent figure illustrates this view and lists the most important CI techniques. Another frequently mentioned distinguishing feature is the representation of information in symbolic form in AI and in sub-symbolic form in CI techniques. Hard computing is a conventional computing method based on the principles of certainty and accuracy and it is deterministic. It requires a precisely stated analytical model of the task to be processed and a prewritten program, i.e. a fixed set of instructions. The models used are based on Boolean logic (also called crisp logic), where e.g. an element can be either a member of a set or not and there is nothing in between. When applied to real-world tasks, systems based on HC result in specific control actions defined by a mathematical model or algorithm. If an unforeseen situation occurs that is not included in the model or algorithm used, the action will most likely fail. Soft computing, on the other hand, is based on the fact that the human mind is capable of storing information and processing it in a goal-oriented way, even if it is imprecise and lacks certainty. SC is based on the model of the human brain with probabilistic thinking, fuzzy logic and multi-valued logic. Soft computing can process a wealth of data and perform a large number of computations, which may not be exact, in parallel. For hard problems for which no satisfying exact solutions based on HC are available, SC methods can be applied successfully. SC methods are usually stochastic in nature i.e., they are a randomly defined processes that can be analyzed statistically but not with precision. Up to now, the results of some CI methods, such as deep learning, cannot be verified and it is also not clear what they are based on. This problem represents an important scientific issue for the future. AI and CI are catchy terms, but they are also so similar that they can be confused. The meaning of both terms has developed and changed over a long period of time, with AI being used first. Bezdek describes this impressively and concludes that such buzzwords are frequently used and hyped by the scientific community, science management and (science) journalism. Not least because AI and biological intelligence are emotionally charged terms and it is still difficult to find a generally accepted definition for the basic term intelligence. == History == In 1950, Alan Turing, one of the founding fathers of computer science, developed a test for computer intelligence known as the Turing test. In this test, a person can ask questions via a keyboard and a monitor without knowing whether his counterpart is a human or a computer. A computer is considered intelligent if the interrogator cannot distinguish the computer from a human. This illustrates the discussion about intelligent computers at the beginning of the computer age. The term Computational Intelligence was first used as the title of the journal of the same name in 1985 and later by the IEEE Neural Networks Council (NNC), which was founded 1989 by a group of researchers interested in the development of biological and artificial neural networks. On November 21, 2001, the NNC became the IEEE Neural Networks Society, to become the IEEE Computational Intelligence Society two years later by including new areas of interest such as fuzzy systems and evolutionary computation. The NNC helped organize the first IEEE World Congress on Computational Intelligence in Orlando, Florida in 1994. On this conference the first clear definition of Computational Intelligence was introduced by Bezdek: A system is computationally intelligent when it: deals with only numerical (low-level) data, has pattern-recognition components, does not use knowledge in the AI sense; and additionally when it (begins to) exhibit (1) computational adaptivity; (2) computational fault tolerance; (3) speed approaching human-like turnaround and (4) error rates that approximate human performance. Today, with machine learning and deep learning in particular utilizing a breadth of supervised, unsupervised, and reinforcement learning approaches, the CI landscape has been greatly enhanced, with novell intelligent approaches. == The main algorithmic approaches of CI and their applicati

    Read more →
  • Embodied cognitive science

    Embodied cognitive science

    Embodied cognitive science is an interdisciplinary field of research, the aim of which is to explain the mechanisms underlying intelligent behavior. It comprises three main methodologies: the modeling of psychological and biological systems in a holistic manner that considers the mind and body as a single entity; the formation of a common set of general principles of intelligent behavior; and the experimental use of robotic agents in controlled environments. == Contributors == Embodied cognitive science borrows heavily from embodied philosophy and the related research fields of cognitive science, psychology, neuroscience and artificial intelligence. Contributors to the field include: From the perspective of neuroscience, Gerald Edelman of the Neurosciences Institute at La Jolla, Francisco Varela of CNRS in France, and J. A. Scott Kelso of Florida Atlantic University From the perspective of psychology, Lawrence Barsalou, Michael Turvey, Vittorio Guidano and Eleanor Rosch From the perspective of linguistics, Gilles Fauconnier, George Lakoff, Mark Johnson, Leonard Talmy and Mark Turner From the perspective of language acquisition, Eric Lenneberg and Philip Rubin at Haskins Laboratories From the perspective of anthropology, Edwin Hutchins, Bradd Shore, James Wertsch and Merlin Donald. From the perspective of autonomous agent design, early work is sometimes attributed to Rodney Brooks or Valentino Braitenberg From the perspective of artificial intelligence, Understanding Intelligence by Rolf Pfeifer and Christian Scheier or How the Body Shapes the Way We Think, by Rolf Pfeifer and Josh C. Bongard From the perspective of philosophy, Andy Clark, Dan Zahavi, Shaun Gallagher, and Evan Thompson In 1950, Alan Turing proposed that a machine may need a human-like body to think and speak: It can also be maintained that it is best to provide the machine with the best sense organs that money can buy, and then teach it to understand and speak English. That process could follow the normal teaching of a child. Things would be pointed out and named, etc. Again, I do not know what the right answer is, but I think both approaches should be tried. == Traditional cognitive theory == Embodied cognitive science is an alternative theory to cognition in which it minimizes appeals to computational theory of mind in favor of greater emphasis on how an organism's body determines how and what it thinks. Traditional cognitive theory is based mainly around symbol manipulation, in which certain inputs are fed into a processing unit that produces an output. These inputs follow certain rules of syntax, from which the processing unit finds semantic meaning. Thus, an appropriate output is produced. For example, a human's sensory organs are its input devices, and the stimuli obtained from the external environment are fed into the nervous system which serves as the processing unit. From here, the nervous system is able to read the sensory information because it follows a syntactic structure, thus an output is created. This output then creates bodily motions and brings forth behavior and cognition. Of particular note is that cognition is sealed away in the brain, meaning that mental cognition is cut off from the external world and is only possible by the input of sensory information. == The embodied cognitive approach == Embodied cognitive science differs from the traditionalist approach in that it denies the input-output system. This is chiefly due to the problems presented by the Homunculus argument, which concluded that semantic meaning could not be derived from symbols without some kind of inner interpretation. If some little man in a person's head interpreted incoming symbols, then who would interpret the little man's inputs? Because of the specter of an infinite regress, the traditionalist model began to seem less plausible. Thus, embodied cognitive science aims to avoid this problem by defining cognition in three ways. === Physical attributes of the body === The first aspect of embodied cognition examines the role of the physical body, particularly how its properties affect its ability to think. This part attempts to overcome the symbol manipulation component that is a feature of the traditionalist model. Depth perception, for instance, can be better explained under the embodied approach due to the sheer complexity of the action. Depth perception requires that the brain detect the disparate retinal images obtained by the distance of the two eyes. In addition, body and head cues complicate this further. When the head is turned in a given direction, objects in the foreground will appear to move against objects in the background. From this, it is said that some kind of visual processing is occurring without the need of any kind of symbol manipulation. This is because the objects appearing to move the foreground are simply appearing to move. This observation concludes then that depth can be perceived with no intermediate symbol manipulation necessary. A more poignant example exists through examining auditory perception. Generally speaking the greater the distance between the ears, the greater the possible auditory acuity. Also relevant is the amount of density in between the ears, for the strength of the frequency wave alters as it passes through a given medium. The brain's auditory system takes these factors into account as it process information, but again without any need for a symbolic manipulation system. This is because the distance between the ears for example does not need symbols to represent it. The distance itself creates the necessary opportunity for greater auditory acuity. The amount of density between the ears is similar, in that it is the actual amount itself that simply forms the opportunity for frequency alteration. Thus under consideration of the physical properties of the body, a symbolic system is unnecessary and an unhelpful metaphor. === The body's role in the cognitive process === The second aspect draws heavily from George Lakoff's and Mark Johnson's work on concepts. They argued that humans use metaphors whenever possible to better explain their external world. Humans also have a basic stock of concepts in which other concepts can be derived from. These basic concepts include spatial orientations such as up, down, front, and back. Humans can understand what these concepts mean because they can directly experience them from their own bodies. For example, because human movement revolves around standing erect and moving the body in an up-down motion, humans innately have these concepts of up and down. Lakoff and Johnson contend this is similar with other spatial orientations such as front and back too. As mentioned earlier, these basic stocks of spatial concepts are the basis in which other concepts are constructed. Happy and sad for instance are seen now as being up or down respectively. When someone says they are feeling down, what they are really saying is that they feel sad for example. Thus the point here is that true understanding of these concepts is contingent on whether one can have an understanding of the human body. So the argument goes that if one lacked a human body, they could not possibly know what up or down could mean, or how it could relate to emotional states. [I]magine a spherical being living outside of any gravitational field, with no knowledge or imagination of any other kind of experience. What could UP possibly mean to such a being? While this does not mean that such beings would be incapable of expressing emotions in other words, it does mean that they would express emotions differently from humans. Human concepts of happiness and sadness would be different because human would have different bodies. So then an organism's body directly affects how it can think, because it uses metaphors related to its body as the basis of concepts. === Interaction of local environment === A third component of the embodied approach looks at how agents use their immediate environment in cognitive processing. Meaning, the local environment is seen as an actual extension of the body's cognitive process. The example of a personal digital assistant (PDA) is used to better imagine this. Echoing functionalism (philosophy of mind), this point claims that mental states are individuated by their role in a much larger system. So under this premise, the information on a PDA is similar to the information stored in the brain. So then if one thinks information in the brain constitutes mental states, then it must follow that information in the PDA is a cognitive state too. Consider also the role of pen and paper in a complex multiplication problem. The pen and paper are so involved in the cognitive process of solving the problem that it seems ridiculous to say they are somehow different from the process, in very much the same way the PDA is used for information like the brain. Another example examines how humans control and manipulate their environment

    Read more →