AI Coding Scaffold

AI Coding Scaffold — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • EfficientNet

    EfficientNet is a family of convolutional neural networks (CNNs) for computer vision published by researchers at Google AI in 2019. Its key innovation is compound scaling, which uniformly scales all dimensions of depth, width, and resolution using a single parameter. EfficientNet models have been adopted in various computer vision tasks, including image classification, object detection, and segmentation.

    == Compound scaling ==

    Instead of scaling one dimension of the network at a time, such as depth (number of layers), width (number of channels), or resolution (input image size), EfficientNet uses a compound coefficient $\phi$ to scale all three dimensions simultaneously. Specifically, given a baseline network, the depth, width, and resolution are scaled according to the following equations:

    $$\begin{aligned}\text{depth multiplier: } d &= \alpha^{\phi} \\ \text{width multiplier: } w &= \beta^{\phi} \\ \text{resolution multiplier: } r &= \gamma^{\phi}\end{aligned}$$

    subject to $\alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2$ and $\alpha \geq 1, \beta \geq 1, \gamma \geq 1$. The condition $\alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2$ is chosen so that adding $\phi_{0}$ to $\phi$ multiplies the total FLOPs of running the network on an image by approximately $2^{\phi_{0}}$. The hyperparameters $\alpha$, $\beta$, and $\gamma$ are determined by a small grid search. The original paper suggested 1.2, 1.1, and 1.15, respectively.
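
    The scaling rule above can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not code from the EfficientNet authors: the function name `compound_scale` is hypothetical, and the relative FLOPs figure assumes cost grows as $d \cdot w^{2} \cdot r^{2}$, which is what the constraint on $\alpha$, $\beta$, $\gamma$ implies.

```python
# Values suggested in the original paper for alpha, beta, gamma.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Return (depth, width, resolution, relative FLOPs) multipliers for a given phi.

    Hypothetical helper: relative FLOPs are assumed proportional to
    depth * width**2 * resolution**2.
    """
    depth = alpha ** phi
    width = beta ** phi
    resolution = gamma ** phi
    flops = depth * width ** 2 * resolution ** 2
    return depth, width, resolution, flops


for phi in range(4):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}, FLOPs x{f:.2f}")
```

    With these constants the product $\alpha \cdot \beta^{2} \cdot \gamma^{2}$ is about 1.92, so each unit increase in $\phi$ roughly doubles the cost.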
    Architecturally, the authors optimized the choice of modules by neural architecture search (NAS), and found that the inverted bottleneck convolution (which they called MBConv) used in MobileNet worked well. The EfficientNet family is a stack of MBConv layers, with shapes determined by the compound scaling. The original publication consisted of 8 models, from EfficientNet-B0 to EfficientNet-B7, with increasing model size and accuracy. EfficientNet-B0 is the baseline network, and subsequent models are obtained by scaling the baseline network by increasing $\phi$.

    == Variants ==

    EfficientNet has been adapted by NAS for fast inference on edge TPUs and on centralized TPU or GPU clusters. EfficientNet V2 was published in June 2021. The architecture was improved by further NAS with more types of convolutional layers. It also introduced a training method that progressively increases image size during training and uses regularization techniques such as dropout, RandAugment, and Mixup. The authors claim this approach mitigates the accuracy drops often associated with progressive resizing.

  • Three-factor learning

    In neuroscience and machine learning, three-factor learning is the combination of Hebbian plasticity with a third modulatory factor to stabilise and enhance synaptic learning. This third factor can represent various signals such as reward, punishment, error, surprise, or novelty, often implemented through neuromodulators.

    == Description ==

    Three-factor learning introduces the concept of eligibility traces, which flag synapses for potential modification pending the arrival of the third factor. These traces help temporal credit assignment by bridging the gap between rapid neuronal firing and the slower behavioral timescales on which learning takes place. The biological basis of three-factor learning rules is supported by experimental evidence. This approach addresses the instability of classical Hebbian learning by minimizing autocorrelation and maximizing cross-correlation between inputs.
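
    The interplay of eligibility trace and third factor described above can be sketched as a toy simulation. This is an illustrative assumption rather than a specific published model: a single synapse, an exponentially decaying trace driven by the Hebbian product of pre- and postsynaptic activity, and a weight change equal to learning rate times modulator times trace. The names `simulate`, `eta` and `tau` are hypothetical.

```python
import math


def simulate(pre, post, modulator, eta=0.1, tau=5.0, dt=1.0):
    """Toy three-factor rule for one synapse (illustrative assumptions only).

    pre, post  : sequences of pre-/postsynaptic activity per time step
    modulator  : sequence of third-factor values (e.g. reward) per time step
    The eligibility trace decays with time constant tau and is bumped by the
    Hebbian coincidence pre * post; the weight only changes while the
    modulator is non-zero.
    """
    weight = 0.0
    trace = 0.0
    decay = math.exp(-dt / tau)
    for x, y, m in zip(pre, post, modulator):
        trace = trace * decay + x * y   # factors 1 and 2: Hebbian coincidence
        weight += eta * m * trace       # factor 3 gates the actual update
    return weight


# Pre and post fire together at t=0; the reward arrives three steps later.
pre = [1, 0, 0, 0, 0]
post = [1, 0, 0, 0, 0]
w_rewarded = simulate(pre, post, modulator=[0, 0, 0, 1, 0])
w_unrewarded = simulate(pre, post, modulator=[0, 0, 0, 0, 0])
print(w_rewarded, w_unrewarded)
```

    In this sketch the delayed reward still changes the weight because the trace has not fully decayed, while the same coincidence without a modulatory signal leaves the weight untouched.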

  • Quantification (machine learning)

    In machine learning, quantification (variously called learning to quantify, supervised prevalence estimation, or class prior estimation) is the task of using supervised learning to train models (quantifiers) that estimate the relative frequencies (also known as prevalence values) of the classes of interest in a sample of unlabelled data items. For instance, in a sample of 100,000 unlabelled tweets known to express opinions about a certain political candidate, a quantifier may be used to estimate the percentage of these tweets which belong to class 'Positive' (i.e., which manifest a positive stance towards this candidate), and to do the same for classes 'Neutral' and 'Negative'.

    Quantification may also be viewed as the task of training predictors that estimate a (discrete) probability distribution, i.e., that generate a predicted distribution that approximates the unknown true distribution of the items across the classes of interest. Quantification is different from classification, since the goal of classification is to predict the class labels of individual data items, while the goal of quantification is to predict the class prevalence values of sets of data items. Quantification is also different from regression, since in regression the training data items have real-valued labels, while in quantification the training data items have class labels.

    It has been shown in multiple research works that performing quantification by classifying all unlabelled instances and then counting the instances that have been attributed to each class (the 'classify and count' method) usually leads to suboptimal quantification accuracy. This suboptimality may be seen as a direct consequence of 'Vapnik's principle', which states:

    If you possess a restricted amount of information for solving some problem, try to solve the problem directly and never solve a more general problem as an intermediate step. It is possible that the available information is sufficient for a direct solution but is insufficient for solving a more general intermediate problem.

    In our case, the problem to be solved directly is quantification, while the more general intermediate problem is classification. As a result of the suboptimality of the 'classify and count' method, quantification has evolved as a task in its own right, different (in goals, methods, techniques, and evaluation measures) from classification.

    == Quantification tasks ==

    === Quantification tasks according to the set of classes ===

    The main variants of quantification, according to the characteristics of the set of classes used, are:

      • Binary quantification, corresponding to the case in which there are only $n=2$ classes and each data item belongs to exactly one of them;
      • Single-label multiclass quantification, corresponding to the case in which there are $n>2$ classes and each data item belongs to exactly one of them;
      • Multi-label multiclass quantification, corresponding to the case in which there are $n\geq 2$ classes and each data item can belong to zero, one, or several classes at the same time;
      • Ordinal quantification, corresponding to the single-label multiclass case in which a total order is defined on the set of classes;
      • Regression quantification, a task which stands to 'standard' quantification as regression stands to classification. Strictly speaking, this task is not a quantification task as defined above (since the individual items do not have class labels but are labelled by real values), but has enough commonalities with other quantification tasks to be considered one of them.

    Most known quantification methods address the binary case or the single-label multiclass case, and only a few of them address the multi-label, ordinal, and regression cases. Binary-only methods include the Mixture Model (MM) method, the HDy method, SVM(KLD), and SVM(Q).
    Methods that can deal with both the binary case and the single-label multiclass case include probabilistic classify and count (PCC), adjusted classify and count (ACC), probabilistic adjusted classify and count (PACC), the Saerens-Latinne-Decaestecker EM-based method (SLD), and KDEy. Methods for multi-label quantification include regression-based quantification (RQ) and label powerset-based quantification (LPQ). Methods for the ordinal case include ordinal versions of the above-mentioned ACC, PACC, SLD, and HDy methods. Methods for the regression case include Regress and splice and Adjusted regress and sum.

    === Quantification tasks according to the type of data ===

    Several subtasks of quantification may be identified according to the type of data involved. Examples of such tasks are:

      • Quantification of networked data. This task consists of performing quantification when the datapoints are members of a relation, i.e., are interlinked. As such, this task is a close relative of collective classification.
      • Quantification over time. This task consists of performing quantification on sets that become available in a temporal sequence, i.e., as a data stream, and finds application in contexts in which class prevalence values must be monitored over time.

    == Evaluation measures for quantification ==

    Several evaluation measures can be used for evaluating the error of a quantification method. Since quantification consists of generating a predicted probability distribution that estimates a true probability distribution, these evaluation measures are ones that compare two probability distributions. Most evaluation measures for quantification belong to the class of divergences.
    Evaluation measures for binary quantification, single-label multiclass quantification, and multi-label quantification are:

      • Absolute Error
      • Squared Error
      • Relative Absolute Error
      • Kullback–Leibler divergence
      • Pearson Divergence

    Evaluation measures for ordinal quantification are:

      • Normalized Match Distance (a particular case of the Earth Mover's Distance)
      • Root Normalized Order-Aware Distance

    == Applications ==

    Quantification is of special interest in fields such as the social sciences, epidemiology, market research, allocating resources, and ecological modelling, since these fields are inherently concerned with aggregate data. However, quantification is also useful as a building block for solving other downstream tasks, such as improving the accuracy of classifiers on out-of-distribution data, measuring classifier bias and ranker bias, and estimating the accuracy of classifiers on out-of-distribution data.

    == Resources ==

      • LQ 2021: the 1st International Workshop on Learning to Quantify
      • LQ 2022: the 2nd International Workshop on Learning to Quantify
      • LQ 2023: the 3rd International Workshop on Learning to Quantify
      • LQ 2024: the 4th International Workshop on Learning to Quantify
      • LQ 2025: the 5th International Workshop on Learning to Quantify
      • LeQua 2022: the 1st Data Challenge on Learning to Quantify
      • LeQua 2024: the 2nd Data Challenge on Learning to Quantify
      • QuaPy: An open-source Python-based software library for quantification
      • QuantificationLib: A Python library for quantification and prevalence estimation
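
    To make the gap between 'classify and count' and a dedicated quantifier concrete, the binary case can be sketched in plain Python. This is a hedged illustration, not the API of QuaPy or QuantificationLib: the function names are hypothetical, the adjustment is the usual ACC correction $(\mathrm{CC} - \mathrm{FPR}) / (\mathrm{TPR} - \mathrm{FPR})$, and the classifier's true and false positive rates are assumed to be known from held-out data.

```python
def classify_and_count(predictions):
    """Fraction of items predicted positive (predictions are 0/1 labels)."""
    return sum(predictions) / len(predictions)


def adjusted_classify_and_count(cc_estimate, tpr, fpr):
    """ACC correction of a classify-and-count estimate, clipped to [0, 1].

    tpr / fpr are the classifier's true and false positive rates, assumed
    to have been estimated on held-out labelled data.
    """
    if tpr == fpr:
        return cc_estimate  # classifier is uninformative; no correction possible
    return min(1.0, max(0.0, (cc_estimate - fpr) / (tpr - fpr)))


def absolute_error(true_prev, estimated_prev):
    """Mean absolute difference between two prevalence vectors."""
    return sum(abs(t - e) for t, e in zip(true_prev, estimated_prev)) / len(true_prev)


# Made-up sample: true positive prevalence 0.2, classifier with TPR 0.8, FPR 0.1.
# Such a classifier labels 0.2*0.8 + 0.8*0.1 = 24% of the items positive.
predictions = [1] * 240 + [0] * 760
cc = classify_and_count(predictions)
acc = adjusted_classify_and_count(cc, tpr=0.8, fpr=0.1)

print(cc, absolute_error([0.2, 0.8], [cc, 1 - cc]))
print(acc, absolute_error([0.2, 0.8], [acc, 1 - acc]))
```

    In this constructed example the raw count overestimates the prevalence (0.24 against a true 0.2), while the adjusted estimate recovers it; with estimated rather than exact rates the correction would only be approximate.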

  • Purged cross-validation

    Purged cross-validation is a variant of k-fold cross-validation designed to prevent look-ahead bias in time series and other structured data, developed in 2017 by Marcos López de Prado at Guggenheim Partners and Cornell University. It is primarily used in financial machine learning to ensure the independence of training and testing samples when labels depend on future events. It provides an alternative to conventional cross-validation and walk-forward backtesting methods, which often yield overly optimistic performance estimates due to information leakage and overfitting.

    == Motivation ==

    Standard cross-validation assumes that observations are independently and identically distributed (IID), which often does not hold in time series or financial datasets. If the label of a test sample overlaps in time with the features or labels in the training set, the result may be data leakage and overfitting. Purged cross-validation addresses this issue by removing overlapping observations and, optionally, adding a temporal buffer ("embargo") around the test set to further reduce the risk of leakage. The figure below illustrates standard 5-fold cross-validation.

    == Purging ==

    Purging removes from the training set any observation whose timestamp falls within the time range of formation of a label in the test set. This can be the case for training observations both before and after the test set. Their removal ensures that the algorithm cannot learn, at training time, information that will be used to assess its performance. See the figure below for an illustration of purging.

    == Embargoing ==

    Embargoing addresses a more subtle form of leakage: even if an observation does not directly overlap the test set, it may still be affected by test events due to market reaction lag or downstream dependencies. To guard against this, a percentage-based embargo is imposed after each test fold.
    For example, with a 5% embargo and 1000 observations, the 50 observations following each test fold are excluded from training. Unlike purging, embargoing can only occur after the test set. The figure below illustrates the application of embargo.

    == Applications ==

    Purged and embargoed cross-validation has been useful in:

      • Backtesting of trading strategies
      • Validation of classifiers on labeled event-driven returns
      • Any machine learning task with overlapping label horizons

    == Example ==

    To illustrate the effect of purging and embargoing, consider the figures below. Both diagrams show the structure of 5-fold cross-validation over a 20-day period. In each row, blue squares indicate training samples and red squares denote test samples. Each label is defined based on the value of the next two observations, hence creating an overlap. If this overlap is left untreated, test set information leaks into the training set. The second figure applies the Purged CV procedure. Notice how purging removes overlapping observations from the training set and the embargo widens the gap between test and training data. This approach ensures that the evaluation more closely resembles a true out-of-sample test and reduces the risk of backtest overfitting.

    == Combinatorial Purged Cross-Validation ==

    Walk-forward backtesting analysis, another common cross-validation technique in finance, preserves temporal order but evaluates the model on a single sequence of test sets. This leads to high variance in performance estimation, as results are contingent on a specific historical path. Combinatorial Purged Cross-Validation (CPCV) addresses this limitation by systematically constructing multiple train-test splits, purging overlapping samples, and enforcing an embargo period to prevent information leakage. The result is a distribution of out-of-sample performance estimates, enabling robust statistical inference and more realistic assessment of a model's predictive power.
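
    The purge-and-embargo step described in the sections above can be sketched for a single test fold as follows. This is a minimal illustration under stated assumptions, not López de Prado's reference implementation: observations are integer-indexed, the label of observation i is assumed to depend on observations i through i + horizon (as in the 20-day example, where horizon is 2), the embargo is counted from the end of the test fold, and the function name `purged_train_indices` is hypothetical.

```python
def purged_train_indices(n, test_start, test_stop, horizon, embargo):
    """Training indices for one fold after purging and embargoing.

    n                      : number of observations, indexed 0 .. n-1
    test_start, test_stop  : the test fold is [test_start, test_stop)
    horizon                : label of i is assumed to use observations i .. i+horizon
    embargo                : number of observations after the test fold to drop
    """
    # Last observation that any test label depends on.
    test_label_end = test_stop - 1 + horizon
    train = []
    for j in range(n):
        if test_start <= j < test_stop:
            continue  # observation belongs to the test fold
        # Purge: the label interval [j, j + horizon] overlaps [test_start, test_label_end].
        overlaps = (j + horizon >= test_start) and (j <= test_label_end)
        # Embargo: observation falls in the buffer right after the test fold.
        embargoed = test_stop <= j < test_stop + embargo
        if not overlaps and not embargoed:
            train.append(j)
    return train


# 20 observations, test fold covering days 8..11, labels look two days ahead,
# and a 3-observation embargo after the fold.
print(purged_train_indices(20, 8, 12, horizon=2, embargo=3))
```

    Here days 6 and 7 are purged because their labels reach into the test fold, days 12 and 13 because the test labels reach into them, and day 14 is removed only by the embargo.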
    === Methodology ===

    CPCV divides a time-series dataset into N sequential, non-overlapping groups. These groups preserve the temporal order of observations. Then, all combinations of k groups (where k < N) are selected as test sets, with the remaining N − k groups used for training. For each combination, the model is trained and evaluated under strict controls to prevent leakage. To eliminate potential contamination between training and test sets, CPCV introduces two additional mechanisms:

      • Purging: Any training observations whose label horizon overlaps with the test period are excluded. This ensures that future information does not influence model training.
      • Embargoing: After the end of each test period, a fixed number of observations (typically a small percentage) are removed from the training set. This prevents leakage due to delayed market reactions or auto-correlated features.

    Each data point appears in multiple test sets across different combinations. Because test groups are drawn combinatorially, this process produces multiple backtest "paths," each of which simulates a plausible market scenario. From these paths, practitioners can compute a distribution of performance statistics such as the Sharpe ratio, drawdown, or classification accuracy.

    === Formal definition ===

    Let N be the number of sequential groups into which the dataset is divided, and let k be the number of groups selected as the test set for each split. Then the number of unique train-test combinations is given by the binomial coefficient

    $$\binom{N}{k}$$

    Each split tests $k$ groups, so each group appears in the test set of $\frac{k}{N}\binom{N}{k}$ splits and contributes to $\varphi[N,k]$ unique backtest paths:

    $$\varphi[N,k] = \frac{k}{N}\binom{N}{k}$$

    This yields a distribution of performance metrics rather than a single point estimate, making it possible to apply Monte Carlo-based or probabilistic techniques to assess model robustness.
    === Illustrative example ===

    Consider the case where N = 6 and k = 2. The number of possible test set combinations is $\binom{6}{2} = 15$. Each of the six groups appears in five test splits. Consequently, five distinct backtest paths can be constructed, each incorporating one appearance from every group.

    ==== Test group assignment matrix ====

    This table shows the 15 test combinations. An "x" indicates that the corresponding group is included in the test set for that split.

    ==== Backtest path assignment ====

    Each group contributes to five different backtest paths. The number in each cell indicates the path to which the group's result is assigned for that split.

    === Advantages ===

    Combinatorial Purged Cross-Validation offers several key benefits over conventional methods:

      • It produces a distribution of performance metrics, enabling more rigorous statistical inference.
      • The method systematically eliminates lookahead bias through purging and embargoing.
      • By simulating multiple historical scenarios, it reduces the dependence on any single market regime or realization.
      • It supports high-confidence comparisons between competing models or strategies.

    CPCV is commonly used in quantitative strategy research, especially for evaluating predictive models such as classifiers, regressors, and portfolio optimizers. It has been applied to estimate realistic Sharpe ratios, assess the risk of overfitting, and support the use of statistical tools such as the Deflated Sharpe Ratio (DSR).

    === Limitations ===

    The main limitation of CPCV stems from its high computational cost. However, this cost can be managed by sampling a finite number of splits from the space of all possible combinations.
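
    The split enumeration and path bookkeeping from the illustrative example (N = 6, k = 2) can be reproduced with the standard library. This is a sketch under one assumption that the text does not spell out: the j-th time a group appears in a test set, its result is assigned to path j. The function name `cpcv_assignment` is hypothetical, and purging and embargoing are left out to keep the combinatorics visible.

```python
from itertools import combinations
from math import comb


def cpcv_assignment(n_groups, k):
    """Enumerate CPCV test splits and assign each tested group to a backtest path.

    Returns (splits, assignment): splits is a list of tuples of test-group
    indices, and assignment[i] maps each test group of split i to a path
    number (1-based). Assumed rule: a group's j-th test appearance goes to path j.
    """
    splits = list(combinations(range(n_groups), k))
    appearances = [0] * n_groups
    assignment = []
    for test_groups in splits:
        row = {}
        for g in test_groups:
            appearances[g] += 1
            row[g] = appearances[g]
        assignment.append(row)
    return splits, assignment


N, K = 6, 2
splits, assignment = cpcv_assignment(N, K)
n_paths = K * comb(N, K) // N  # phi[N, k] = (k / N) * C(N, k)

print(len(splits), "splits,", n_paths, "paths")
for test_groups, row in zip(splits, assignment):
    cells = [str(row[g]) if g in row else "." for g in range(N)]
    print(test_groups, " ".join(cells))
```

    Each printed row corresponds to one split; a number marks a tested group and the path receiving its result, so every path ends up with exactly one result per group.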

  • Personal cloud

    A personal cloud is a collection of digital content and services that are accessible from any device through the Internet. It is not a tangible entity, but a place that gives users the ability to store, synchronize, stream and share content on a relative core, moving from one platform, screen and location to another. Created on connected services and applications, it reflects and sets consumer expectations for how next-generation computing services will work. The four primary types of personal cloud in use today are the online cloud, the NAS device cloud, the server device cloud, and home-made clouds.

    == Online cloud ==

    The online cloud is sometimes referred to as the public cloud. It is the cloud computing model where online resources like software and data storage are made available over the Internet. Typically, an individual or organization has little control over the ecosystem in which the online cloud is hosted, and the core infrastructure is shared between many individuals and organizations. The data and applications provided by the service provider are logically segregated so that only those authorized are allowed access.

    == NAS device cloud ==

    A network-attached storage (NAS) device is a computer connected to a network that provides only file-based data storage services to other devices on the network. Although it may technically be possible to run other software on a NAS device, it is not designed to be a general-purpose server. Cloud NAS is remote storage that is accessed over the Internet as if it were local. A cloud NAS is often used for backups and archiving. One of the benefits of a cloud NAS is that data in the cloud can be accessed at any time from anywhere. The main drawback, however, is that the transfer rate is only as fast as the network connection over which the data is accessed, and can therefore be fairly slow.
    == Server device cloud ==

    In many ways cloud servers work in the same way as physical servers, but the functions they perform can be very different. Typically, the cloud server is an on-premises device that is connected to the Internet and gives users the functions available on the online cloud, with the added benefit and security of the files being under their control on their premises. Historically, server clouds have been deployed by businesses needing an in-house cloud. However, there are also in-house options available for individual users.

    == Home-made clouds ==

    For the more technologically proficient user, a common way to set up a personal cloud is to create a home-made cloud system by connecting an external USB hard drive to a Wi-Fi router. This enables both wired and wireless computers to access the USB hard drive and use it for storage or for retrieving files a user needs to share on the network, thereby acting like a cloud. Setting up a personal cloud requires a user to have particular skills in technology and network setup. One of the risks of improper setup is security: the files may be left accessible to anyone with technical knowledge. Not every router supports this type of access and modification.

  • Evaluation of binary classifiers

    Evaluation of a binary classifier typically assigns a numerical value, or values, to a classifier that represent its accuracy. An example is error rate, which measures how frequently the classifier makes a mistake. There are many metrics that can be used; different fields have different preferences. For example, in medicine sensitivity and specificity are often used, while in computer science precision and recall are preferred. An important distinction is between metrics that are independent of the prevalence or skew (how often each class occurs in the population), and metrics that depend on the prevalence. Both types are useful, but they have very different properties. Often, evaluation is used to compare two methods of classification, so that one can be adopted and the other discarded. Such comparisons are more directly achieved by a form of evaluation that results in a single unitary metric rather than a pair of metrics.

    == Contingency table ==

    Given a data set, a classification (the output of a classifier on that set) gives two numbers: the number of positives and the number of negatives, which add up to the total size of the set. To evaluate a classifier, one compares its output to another reference classification (ideally a perfect classification, but in practice the output of another gold standard test) and cross tabulates the data into a 2×2 contingency table, comparing the two classifications. One then evaluates the classifier relative to the gold standard by computing summary statistics of these 4 numbers. Generally these statistics will be scale invariant (scaling all the numbers by the same factor does not change the output), to make them independent of population size, which is achieved by using ratios of homogeneous functions, most simply homogeneous linear or homogeneous quadratic functions.

    Say we test some people for the presence of a disease. Some of these people have the disease, and our test correctly says they are positive.
    They are called true positives (TP). Some have the disease, but the test incorrectly claims they don't. They are called false negatives (FN). Some don't have the disease, and the test says they don't: these are true negatives (TN). Finally, there might be healthy people who have a positive test result: these are false positives (FP). These can be arranged into a 2×2 contingency table (confusion matrix), conventionally with the test result on the vertical axis and the actual condition on the horizontal axis.

    These numbers can then be totaled, yielding both a grand total and marginal totals. Totaling the entire table, the number of true positives, false negatives, true negatives, and false positives add up to 100% of the set. Totaling the columns (adding vertically), the number of true positives and false positives add up to 100% of the test positives, and likewise for negatives. Totaling the rows (adding horizontally), the number of true positives and false negatives add up to 100% of the condition positives (conversely for negatives). The basic marginal ratio statistics are obtained by dividing the 2×2=4 values in the table by the marginal totals (either rows or columns), yielding 2 auxiliary 2×2 tables, for a total of 8 ratios. These ratios come in 4 complementary pairs, each pair summing to 1, and so each of these derived 2×2 tables can be summarized as a pair of 2 numbers, together with their complements. Further statistics can be obtained by taking ratios of these ratios or more complicated functions.

    The contingency table and the most common derived ratios are summarized below; see sequel for details. Note that the rows correspond to the condition actually being positive or negative (or classified as such by the gold standard), as indicated by the color-coding, and the associated statistics are prevalence-independent, while the columns correspond to the test being positive or negative, and the associated statistics are prevalence-dependent.
    There are analogous likelihood ratios for prediction values, but these are less commonly used, and not depicted above.

    == Pairs of metrics ==

    Often accuracy is evaluated with a pair of metrics composed in a standard pattern.

    === Sensitivity and specificity ===

    The fundamental prevalence-independent statistics are sensitivity and specificity. Sensitivity or True Positive Rate (TPR), also known as recall, is the proportion of people that tested positive and are positive (True Positive, TP) of all the people that actually are positive (Condition Positive, CP = TP + FN). It can be seen as the probability that the test is positive given that the patient is sick. With higher sensitivity, fewer actual cases of disease go undetected (or, in the case of factory quality control, fewer faulty products go to the market).

    Specificity (SPC) or True Negative Rate (TNR) is the proportion of people that tested negative and are negative (True Negative, TN) of all the people that actually are negative (Condition Negative, CN = TN + FP). As with sensitivity, it can be looked at as the probability that the test result is negative given that the patient is not sick. With higher specificity, fewer healthy people are labeled as sick (or, in the factory case, fewer good products are discarded).

    The relationship between sensitivity and specificity, as well as the performance of the classifier, can be visualized and studied using the Receiver Operating Characteristic (ROC) curve. In theory, sensitivity and specificity are independent in the sense that it is possible to achieve 100% in both (such as in the red/blue ball example given above). In more practical, less contrived instances, however, there is usually a trade-off: improving one tends to come at the expense of the other. This is because we rarely measure the actual thing we would like to classify; rather, we generally measure an indicator of the thing we would like to classify, referred to as a surrogate marker.
    The reason why 100% is achievable in the ball example is that redness and blueness are determined by directly detecting redness and blueness. However, indicators are sometimes compromised, such as when non-indicators mimic indicators or when indicators are time-dependent, only becoming evident after a certain lag time. The following example of a pregnancy test makes use of such an indicator. Modern pregnancy tests do not use the pregnancy itself to determine pregnancy status; rather, human chorionic gonadotropin (hCG), present in the urine of gravid females, is used as a surrogate marker to indicate that a woman is pregnant. Because hCG can also be produced by a tumor, the specificity of modern pregnancy tests cannot be 100% (because false positives are possible). Also, because hCG is present in the urine in such small concentrations after fertilization and early embryogenesis, the sensitivity of modern pregnancy tests cannot be 100% (because false negatives are possible).

    === Positive and negative predictive values ===

    In addition to sensitivity and specificity, the performance of a binary classification test can be measured with positive predictive value (PPV), also known as precision, and negative predictive value (NPV). The positive predictive value answers the question "If the test result is positive, how well does that predict an actual presence of disease?". It is calculated as TP/(TP + FP); that is, it is the proportion of true positives out of all positive results. The negative predictive value is the analogue for negative results: TN/(TN + FN).

    ==== Impact of prevalence on predictive values ====

    Prevalence has a significant impact on predictive values. As an example, suppose there is a test for a disease with 99% sensitivity and 99% specificity. If 2000 people are tested and the prevalence (in the sample) is 50%, 1000 of them are sick and 1000 of them are healthy.
    Thus about 990 true positives and 990 true negatives are likely, with 10 false positives and 10 false negatives. The positive and negative predictive values would be 99%, so there can be high confidence in the result. However, if the prevalence is only 5%, so of the 2000 people only 100 are really sick, then the predictive values change significantly. The likely result is 99 true positives, 1 false negative, 1881 true negatives and 19 false positives. Of the 19+99 people tested positive, only 99 really have the disease. Intuitively, that means that given that a patient's test result is positive, there is only an 84% chance that they really have the disease. On the other hand, given that the patient's test result is negative, there is only 1 chance in 1882, or 0.05% probability, that the patient has the disease despite the test result.

    === Precision and recall ===

    Precision and recall can be interpreted as (estimated) conditional probabilities: precision is given by $P(C = P \mid \hat{C} = P)$ while recall is given by $P(\hat{C} = P \mid C = P)$, where $\hat{C}$ …
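
    The prevalence example above (99% sensitivity and specificity, 2000 people tested) can be checked numerically. The following sketch uses hypothetical helper names and works with expected counts, so the figures are idealized averages rather than the outcome of any particular sample.

```python
def expected_counts(n, prevalence, sensitivity, specificity):
    """Expected (TP, FN, TN, FP) when n people are tested."""
    sick = n * prevalence
    healthy = n - sick
    tp = sick * sensitivity
    fn = sick - tp
    tn = healthy * specificity
    fp = healthy - tn
    return tp, fn, tn, fp


def summary(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV from the four cell counts."""
    return {
        "sensitivity": tp / (tp + fn),  # TP / condition positives
        "specificity": tn / (tn + fp),  # TN / condition negatives
        "ppv": tp / (tp + fp),          # TP / test positives
        "npv": tn / (tn + fn),          # TN / test negatives
    }


balanced = summary(*expected_counts(2000, 0.50, 0.99, 0.99))
rare = summary(*expected_counts(2000, 0.05, 0.99, 0.99))

for name, stats in (("prevalence 50%", balanced), ("prevalence 5%", rare)):
    print(name, {key: round(value, 4) for key, value in stats.items()})
```

    Sensitivity and specificity are the same in both scenarios, while the positive predictive value drops from 99% to roughly 84% when the prevalence falls to 5%, which is the prevalence dependence the text describes.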

  • Isotropic position

    In the fields of machine learning, the theory of computation, and random matrix theory, a probability distribution over vectors is said to be in isotropic position if its covariance matrix is proportional to the identity matrix.

    == Formal definitions ==

    Let $D$ be a distribution over vectors in the vector space $\mathbb{R}^{n}$. Then $D$ is in isotropic position if, for vector $v$ sampled from the distribution,

    $$\mathbb{E}\, vv^{\mathsf{T}} = \mathrm{Id}.$$

    A set of vectors is said to be in isotropic position if the uniform distribution over that set is in isotropic position. In particular, every orthonormal set of vectors is isotropic.

    As a related definition, a convex body $K$ in $\mathbb{R}^{n}$ is called isotropic if it has volume $|K| = 1$, center of mass at the origin, and there is a constant $\alpha > 0$ such that

    $$\int_{K} \langle x, y \rangle^{2}\, dx = \alpha^{2} |y|^{2},$$

    for all vectors $y$ in $\mathbb{R}^{n}$; here $|\cdot|$ stands for the standard Euclidean norm.
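
    The definition for finite sets can be checked numerically. The sketch below assumes the "proportional to the identity" reading from the opening sentence and uses hypothetical helper names: it computes $\mathbb{E}\, vv^{\mathsf{T}}$ under the uniform distribution over a set and tests whether the result is a multiple of the identity. For an orthonormal basis of $\mathbb{R}^{n}$ that multiple is $1/n$.

```python
import math


def second_moment(vectors):
    """Matrix E[v v^T] under the uniform distribution over `vectors`."""
    m = len(vectors)
    n = len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) / m for j in range(n)]
            for i in range(n)]


def is_isotropic(vectors, tol=1e-9):
    """True if E[v v^T] is (numerically) a positive multiple of the identity."""
    matrix = second_moment(vectors)
    scale = matrix[0][0]
    if scale <= tol:
        return False
    n = len(matrix)
    return all(
        abs(matrix[i][j] - (scale if i == j else 0.0)) <= tol
        for i in range(n)
        for j in range(n)
    )


# A rotated orthonormal basis of R^2 (the angle is an arbitrary choice).
theta = 0.7
basis = [(math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta))]

# A set that over-represents one direction, for contrast.
lopsided = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

print(second_moment(basis), is_isotropic(basis))
print(second_moment(lopsided), is_isotropic(lopsided))
```

    The rotated basis yields half the identity matrix, whereas the lopsided set has unequal diagonal entries and is therefore not isotropic.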

  • Automated decision-making

    Automated decision-making

    Automated decision-making (ADM) is the use of data, machines and algorithms to make decisions in a range of contexts, including public administration, business, health, education, law, employment, transport, media and entertainment, with varying degrees of human oversight or intervention. ADM may involve large-scale data from a range of sources, such as databases, text, social media, sensors, images or speech, that is processed using various technologies including computer software, algorithms, machine learning, natural language processing, artificial intelligence, augmented intelligence and robotics. The increasing use of automated decision-making systems (ADMS) across a range of contexts presents many benefits and challenges to human society, requiring consideration of the technical, legal, ethical, societal, educational, economic and health consequences. == Overview == There are different definitions of ADM based on the level of automation involved. Some definitions suggest that ADM involves decisions made through purely technological means without human input, as in Article 22 of the EU's General Data Protection Regulation. However, ADM technologies and applications can take many forms, ranging from decision-support systems that make recommendations for human decision-makers to act on, sometimes known as augmented intelligence or 'shared decision-making', to fully automated decision-making processes that make decisions on behalf of individuals or organizations without human involvement. Models used in automated decision-making systems can range from simple checklists and decision trees through to artificial intelligence and deep neural networks (DNNs). Since the 1950s, computers have gone from being able to do basic processing to having the capacity to undertake complex, ambiguous and highly skilled tasks such as image and speech recognition, gameplay, scientific and medical analysis, and inferencing across multiple data sources.
ADM is now being increasingly deployed across all sectors of society and in many diverse domains, from entertainment to transport. An ADM system (ADMS) may involve multiple decision points, data sets, and technologies (ADMT) and may sit within a larger administrative or technical system such as a criminal justice system or business process. == Data == Automated decision-making involves using data as input to be analyzed within a process, model, or algorithm, or for learning and generating new models. ADM systems may use and connect a wide range of data types and sources depending on the goals and contexts of the system, for example, sensor data for self-driving cars and robotics, identity data for security systems, demographic and financial data for public administration, medical records in health, and criminal records in law. This can sometimes involve vast amounts of data and computing power. === Data quality === The quality of the available data and its ability to be used in ADM systems is fundamental to the outcomes, yet it is often highly problematic for many reasons. Datasets are often highly variable: large-scale data may be controlled by corporations or governments, restricted for privacy or security reasons, incomplete, biased, limited in terms of time or coverage, or inconsistent in how terms are measured and described, among many other issues. For machines to learn from data, large corpora are often required, which can be challenging to obtain or compute; however, where available, they have provided significant breakthroughs, for example, in diagnosing chest X-rays. == ADM technologies == Automated decision-making technologies (ADMT) are software-coded digital tools that automate the translation of input data to output data, contributing to the function of automated decision-making systems. There is a wide range of technologies in use across ADM applications and systems.
ADMTs involving basic computational operations: search (including 1-2-1, 1-2-many, and data matching/merge), matching (of two different things), and mathematical calculation (formulas). ADMTs for assessment and grouping: user profiling, recommender systems, clustering, classification, feature learning, and predictive analytics (including forecasting). ADMTs relating to space and flows: social network analysis (including link prediction), mapping, and routing. ADMTs for processing complex data formats: image processing, audio processing, and natural language processing (NLP). Other ADMTs: business rules management systems, time series analysis, anomaly detection, and modelling/simulation. === Machine learning === Machine learning (ML) involves training computer programs through exposure to large data sets and examples to learn from experience and solve problems. Machine learning can be used to generate and analyse data as well as make algorithmic calculations and has been applied to image and speech recognition, translations, text, data and simulations. While machine learning has been around for some time, it is becoming increasingly powerful due to recent breakthroughs in training deep neural networks (DNNs), and dramatic increases in data storage capacity and computational power with GPU coprocessors and cloud computing. Machine learning systems based on foundation models run on deep neural networks and use pattern matching to train a single huge system on large amounts of general data such as text and images. Early models tended to start from scratch for each new problem; however, since the early 2020s many can be adapted to new problems. Examples of these technologies include OpenAI's DALL-E (an image creation program) and their various GPT language models, and Google's PaLM language model program.
== Applications == ADM is being used to replace or augment human decision-making by both public and private-sector organisations for a range of reasons, including to help increase consistency, improve efficiency, reduce costs and enable new solutions to complex problems. === Debate === Research and development are underway into uses of technology to assess argument quality, assess argumentative essays and judge debates. Potential applications of these argument technologies span education and society. Scenarios to consider in this regard include those involving the assessment and evaluation of conversational, mathematical, scientific, interpretive, legal, and political argumentation and debate. === Law === In legal systems around the world, algorithmic tools such as risk assessment instruments (RAI) are being used to supplement or replace the human judgment of judges, civil servants and police officers in many contexts. In the United States, RAI are being used to generate scores to predict the risk of recidivism in pre-trial detention and sentencing decisions, to evaluate parole for prisoners, and to predict "hot spots" for future crime. These scores may result in automatic effects or may be used to inform decisions made by officials within the justice system. In Canada, ADM has been used since 2014 to automate certain activities conducted by immigration officials and to support the evaluation of some immigrant and visitor applications. === Economics === Automated decision-making systems are used in certain computer programs to create buy and sell orders related to specific financial transactions and automatically submit the orders in the international markets. Computer programs can automatically generate orders based on a predefined set of rules, using trading strategies that are based on technical analyses, advanced statistical and mathematical computations, or inputs from other electronic sources.
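To make the idea of generating orders from a predefined set of rules more concrete, here is a minimal sketch in Python. The moving-average crossover rule, the function names, the window sizes, and the price series are all illustrative assumptions, not a description of any real trading system; the sketch only produces order labels and does not submit anything to a market.

```python
def moving_average(prices, window):
    """Simple moving average of the last `window` prices."""
    return sum(prices[-window:]) / window

def generate_orders(prices, short=3, long=5):
    """Emit (index, 'BUY' or 'SELL') when the short average crosses the long one."""
    orders = []
    previous_diff = None
    for t in range(long, len(prices) + 1):
        history = prices[:t]
        diff = moving_average(history, short) - moving_average(history, long)
        if previous_diff is not None:
            if previous_diff <= 0 < diff:
                orders.append((t - 1, "BUY"))    # short average crossed above
            elif previous_diff >= 0 > diff:
                orders.append((t - 1, "SELL"))   # short average crossed below
        previous_diff = diff
    return orders

# Hypothetical price series: flat, then rising, then falling.
prices = [10, 10, 10, 10, 10, 11, 12, 13, 12, 10, 8, 7]
orders = generate_orders(prices)
print(orders)
```

The rule here is deliberately simple; the point is only that the decision (buy, sell, or do nothing) follows mechanically from the input data and the fixed rule, with no human in the loop.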
=== Business === ==== Continuous auditing ==== Continuous auditing uses advanced analytical tools to automate auditing processes. It can be utilized in the private sector by business enterprises and in the public sector by governmental organizations and municipalities. As artificial intelligence and machine learning continue to advance, accountants and auditors may make use of increasingly sophisticated algorithms that make decisions such as determining what is anomalous, whether to notify personnel, and how to prioritize the tasks assigned to personnel. === Media and entertainment === Digital media, entertainment platforms, and information services increasingly provide content to audiences via automated recommender systems based on demographic information, previous selections, collaborative filtering or content-based filtering. This includes music and video platforms, publishing, health information, product databases and search engines. Many recommender systems also provide some agency to users in accepting recommendations and incorporate data-driven algorithmic feedback loops based on the actions of the system user. Large-scale machine learning language models and image creation programs being developed by companies such as OpenAI and Google in the 2020s have restricted access; however, they are likely to have widespread application in fields such as advertising, copywriting, stock imagery and gra

  • Digital image processing

    Digital image processing

    Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); and third, increased demand from a wide range of applications in environment, agriculture, military, industry and medical science. == History == Many of the techniques of digital image processing, or digital picture processing as it often was called, were developed in the 1960s, at Bell Laboratories, the Jet Propulsion Laboratory, Massachusetts Institute of Technology, University of Maryland, and a few other research facilities, with application to satellite imagery, wire-photo standards conversion, medical imaging, videophone, character recognition, and photograph enhancement. The purpose of early image processing was to improve the quality of the image. In image processing, the input is a low-quality image, and the output is an image with improved quality. Common image processing includes image enhancement, restoration, encoding, and compression.
The first successful application was at the American Jet Propulsion Laboratory (JPL), which used image processing techniques such as geometric correction, gradation transformation, and noise removal on the thousands of lunar photos sent back by the space probe Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The successful computer mapping of the Moon's surface was a great success. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, so that the topographic map, color map and panoramic mosaic of the Moon were obtained, which achieved extraordinary results and laid a solid foundation for the human landing on the Moon. The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real-time, for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest. === Image sensors === The basis for modern image sensors is metal–oxide–semiconductor (MOS) technology, invented at Bell Labs between 1955 and 1960. This led to the development of digital semiconductor image sensors, including the charge-coupled device (CCD) and later the CMOS sensor. The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969. While researching MOS technology, they realized that an electric charge was analogous to the magnetic bubble and that it could be stored on a tiny MOS capacitor.
As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting. The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels. The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985. The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993. By 2007, sales of CMOS sensors had surpassed CCD sensors. MOS image sensors are widely used in optical mouse technology. The first optical mouse, invented by Richard F. Lyon at Xerox in 1980, used a 5 μm NMOS integrated circuit sensor chip. Since the first commercial optical mouse, the IntelliMouse introduced in 1999, most optical mouse devices use CMOS sensors. === Image compression === An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972. DCT compression became the basis for JPEG, which was introduced by the Joint Photographic Experts Group in 1992. JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format on the Internet. Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of digital images and digital photos, with several billion JPEG images produced every day as of 2015. Medical imaging techniques produce very large amounts of data, especially from CT, MRI and PET modalities. As a result, storage and communications of electronic image data are prohibitive without the use of compression. 
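The energy-compaction idea behind DCT-based compression can be illustrated with a one-dimensional sketch in Python. This is an unoptimized, orthonormal DCT-II written out from its textbook definition, not the JPEG codec itself (which works on two-dimensional blocks and adds quantization and entropy coding); the function names and the sample row of pixel values are assumptions made for illustration.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out

def idct(coeffs):
    """Inverse of `dct` (an orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

row = [52, 55, 61, 66, 70, 61, 64, 73]    # hypothetical row of pixel intensities
coeffs = dct(row)
kept = coeffs[:4] + [0.0] * 4             # lossy step: drop high-frequency terms
approx = idct(kept)                       # smoothed reconstruction of `row`
print([round(c, 2) for c in coeffs])
print([round(x, 1) for x in approx])
```

Discarding the trailing coefficients is the lossy step: the reconstruction keeps the overall shape of the row while storing only half as many numbers.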
JPEG 2000 image compression is used by the DICOM standard for storage and transmission of medical images. The cost and feasibility of accessing large image data sets over low or various bandwidths are further addressed by use of another DICOM standard, called JPIP, to enable efficient streaming of the JPEG 2000 compressed image data. === Digital signal processor (DSP) === Electronic signal processing was revolutionized by the wide adoption of MOS technology in the 1970s. MOS integrated circuit technology was the basis for the first single-chip microprocessors and microcontrollers in the early 1970s, and then the first single-chip digital signal processor (DSP) chips in the late 1970s. DSP chips have since been widely used in digital image processing. The discrete cosine transform (DCT) image compression algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals, signaling, analog-to-digital conversion, formatting luminance and color differences, and color formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as motion estimation, motion compensation, inter-frame prediction, quantization, perceptual weighting, entropy encoding, variable encoding, and motion vectors, and decoding operations such as the inverse operation between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder chips. == Tasks == Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analogue means. 
In particular, digital image processing is a concrete application of, and a practical technology based on, classification, feature extraction, multi-scale signal analysis, pattern recognition, and projection. Some techniques that are used in digital image processing include anisotropic diffusion, hidden Markov models, image editing, image restoration, independent component analysis, linear filtering, neural networks, partial differential equations, pixelation, point feature matching, principal components analysis, self-organizing maps, and wavelets. == Digital image transformations == === Filtering === Digital filters are used to blur and sharpen digital images. Filtering can be performed by convolution with specifically designed kernels (filter arrays) in the spatial domain, or by masking specific frequency regions in the frequency (Fourier) domain. The following examples show both methods: ==== Image padding in Fourier domain filtering ==== Images are typically padded before being transformed to the Fourier space; the highpass-filtered images below illustrate the consequences of different padding techniques: Notice that the highpass-filtered image shows extra edges with zero padding compared to repeated-edge padding. ==== Filtering code examples ==== MATLAB example for spatial domain highpass filtering. === Affine transformations === Affine transformations enable basic image transformations including scale, rotate, translate, mirror and shear, as is shown in the following examples: To apply the affine

  • Operation Serenata de Amor

    Operation Serenata de Amor

    Operation Serenata de Amor is an artificial intelligence project designed to analyze public spending in Brazil. The project has been funded by a recurrent financing campaign since September 7, 2016, and came in the wake of major scandals of misappropriation of public funds in Brazil, such as the Mensalão scandal and the revelations of the Operation Car Wash investigations. The analysis began with data from the National Congress, then expanded to other types of budget and government bodies, such as the Federal Senate. The project is built through collaboration on GitHub and through a public Telegram group with more than 600 participants. The name "Serenata de Amor," which means "serenade of love," was taken from a popular cashew cream bonbon produced by Chocolates Garoto in Brazil. == Modules == Throughout the development of the project, new modules have been introduced in addition to the main repository. The main repository, serenata-de-amor, serves as the starting point for investigative work. Rosie is the robot programmed to identify discrepancies in public spending, starting with CEAP (Quota for Exercise of Parliamentary Activity); it analyzes each reimbursement requested by deputies and senators and indicates the reasons that lead it to believe a claim is suspicious. Rosie gave rise to whistleblower, which tweets under the name @RosieDaSerenata, distributing the results on social media. Jarbas (GitHub repository) is a data visualization tool that shows a complete list of reimbursements made available by the Chamber of Deputies and mined by Rosie. Toolbox is an installable Python package that supports the development of Serenata de Amor and Rosie. == History == Operation Serenata de Amor is an artificial intelligence project for the analysis of public expenditures. It was conceived in March 2016 by data scientist Irio Musskopf, sociologist Eduardo Cuducos and entrepreneur Felipe Cabral.
The project was crowdfunded on the Catarse platform, where it reached 131% of its funding goal, paying for 3 months of project development. Ana Schwendler, also a data scientist; Pedro Vilanova ("Tonny"), a data journalist; Bruno Pazzim, a software engineer; Filipe Linhares, a frontend engineer; Leandro Devegili, an entrepreneur; and André Pinho took the first steps towards constructing the platform, such as collecting and structuring the first datasets. Jessica Temporal, a data scientist; Yasodara Córdova ("Yaso"), a researcher; and Tatiana Balachova ("Russa"), a UX designer, joined the project after the financing took place. The members created a recurring financing campaign, expanding the analysis of public spending to the Federal Senate. Donors make monthly payments ranging from 5 BRL to 200 BRL to maintain group activities. The monthly amount collected is around 10,000 BRL. == Results == In January 2017, concluding the period financed by the initial campaign, the group carried out an investigation into the suspicious activities found by the data analysis system. 629 complaints were made to the Ombudsman's Office of the Chamber of Deputies, questioning expenses of 216 federal deputies. In addition, the project's Facebook page has more than 25,000 followers, and users frequently cite the operation as a benchmark for transparency in the Brazilian government. One example of the results obtained by the operation is the case of a deputy who had to return about 700 BRL to the House after his expenses were analyzed by the platform. The platform was able to analyze more than 3 million notes, raising about 8,000 suspected cases in public spending. The community that supports the work of the team benefits from open source repositories, with licenses that permit collaboration. The two main data scientists of the project presented it at the CivicTechFest in Taipei, obtaining several mentions in the international press.
The technical leader presented the project in Poland during DevConf2017 in Kraków. It was also presented at the Google News Lab in 2017. It was presented by Yaso, when she was the Director of the initiative, at the MIT Media Lab/Berkman Klein Center Initiative for Artificial Intelligence ethics, and at the Artificial Intelligence and Inclusion Symposium, an initiative of the Global Network of Internet & Society Centers (NoC). It was also presented by both Irio and Yaso at a lunch seminar at the Digital Harvard Kennedy School, which discussed the transparency of the platform and the main solutions found, including keeping the code and data always available so that its suitability can be verified. [Infographic: first results of Operation Serenata de Amor, a project that analyzes open data on public spending to find discrepancies.] The project was presented by Yaso to the House Audit and Control Committee of the Chamber of Deputies in August 2017, and raised the interest of House officials who work with open data. The operation has been a source of inspiration for other civic projects with similar goals, demonstrating the broader impact of artificial intelligence in industry in Brazil as well. Records of the participation of several team members in events throughout Brazil and abroad can be found on the Internet, including presentations at OpenDataDay (held at Calango Hackerspace in the Federal District), Campus Party Bahia, Campus Party Brasilia, Friends of Tomorrow, the XIII National Meeting of Internal Control, and the event USP Talks Hackfest against corruption in João Pessoa, the latter also being highlighted in the national press.

  • Intelligent agent

    Intelligent agent

    In artificial intelligence, an intelligent agent is an entity that perceives its environment, takes actions autonomously to achieve goals, and may improve its performance through machine learning or by acquiring knowledge. AI textbooks define artificial intelligence as the "study and design of intelligent agents," emphasizing that goal-directed behavior is central to intelligence. A specialized subset of intelligent agents, agentic AI (also known as an AI agent or simply agent), expands this concept by proactively pursuing goals, making decisions, and taking actions over extended periods. Intelligent agents can range from simple to highly complex. A basic thermostat or control system is considered an intelligent agent, as is a human being, or any other system that meets the same criteria, such as a firm, a state, or a biome. Intelligent agents operate based on an objective function, which encapsulates their goals. They are designed to create and execute plans that maximize the expected value of this function upon completion. For example, a reinforcement learning agent has a reward function, which allows programmers to shape its desired behavior. Similarly, an evolutionary algorithm's behavior is guided by a fitness function. Intelligent agents in artificial intelligence are closely related to agents in economics, and versions of the intelligent agent paradigm are studied in cognitive science, ethics, and the philosophy of practical reason, as well as in many interdisciplinary socio-cognitive modeling and computer social simulations. Intelligent agents are often described schematically as abstract functional systems similar to computer programs. To distinguish theoretical models from real-world implementations, abstract descriptions of intelligent agents are called abstract intelligent agents. Intelligent agents are also closely related to software agents, autonomous computer programs that carry out tasks on behalf of users.
They are also referred to using a term borrowed from economics: a "rational agent". == Intelligent agents as the foundation of AI == The concept of intelligent agents provides a foundational lens through which to define and understand artificial intelligence. For instance, the influential textbook Artificial Intelligence: A Modern Approach (Russell & Norvig) describes: Agent: Anything that perceives its environment (using sensors) and acts upon it (using actuators). E.g., a robot with cameras and wheels, or a software program that reads data and makes recommendations. Rational Agent: An agent that strives to achieve the best possible outcome based on its knowledge and past experiences. "Best" is defined by a performance measure – a way of evaluating how well the agent is doing. Artificial Intelligence (as a field): The study and creation of these rational agents. Other researchers and definitions build upon this foundation. Padgham & Winikoff emphasize that intelligent agents should react to changes in their environment in a timely way, proactively pursue goals, and be flexible and robust (able to handle unexpected situations). Some also suggest that ideal agents should be "rational" in the economic sense (making optimal choices) and capable of complex reasoning, like having beliefs, desires, and intentions (BDI model). Kaplan and Haenlein offer a similar definition, focusing on a system's ability to understand external data, learn from that data, and use what is learned to achieve goals through flexible adaptation. Defining AI in terms of intelligent agents offers several key advantages: Avoids Philosophical Debates: It sidesteps arguments about whether AI is "truly" intelligent or conscious, like those raised by the Turing test or Searle's Chinese Room. It focuses on behavior and goal achievement, not on replicating human thought. Objective Testing: It provides a clear, scientific way to evaluate AI systems. 
Researchers can compare different approaches by measuring how well they maximize a specific "goal function" (or objective function). This allows for direct comparison and combination of techniques. Interdisciplinary Communication: It creates a common language for AI researchers to collaborate with other fields like mathematical optimization and economics, which also use concepts like "goals" and "rational agents." == Objective function == An objective function (or goal function) specifies the goals of an intelligent agent. An agent is deemed more intelligent if it consistently selects actions that yield outcomes better aligned with its objective function. In effect, the objective function serves as a measure of success. The objective function may be: Simple: For example, in a game of Go, the objective function might assign a value of 1 for a win and 0 for a loss. Complex: It might require the agent to evaluate and learn from past actions, adapting its behavior based on patterns that have proven effective. The objective function encapsulates all of the goals the agent is designed to achieve. For rational agents, it also incorporates the trade-offs between potentially conflicting goals. For instance, a self-driving car's objective function might balance factors such as safety, speed, and passenger comfort. Different terms are used to describe this concept, depending on the context. These include: Utility function: Often used in economics and decision theory, representing the desirability of a state. Objective function: A general term used in optimization. Loss function: Typically used in machine learning, where the goal is to minimize the loss (error). Reward Function: Used in reinforcement learning. Fitness Function: Used in evolutionary systems. Goals, and therefore the objective function, can be: Explicitly defined: Programmed directly into the agent. Induced: Learned or evolved over time. 
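As a hedged illustration of an objective function that trades off potentially conflicting goals, the sketch below scores candidate actions for a hypothetical self-driving car and picks the highest-scoring one. The weights, action names, and per-action outcome estimates are invented for this example; a real system would obtain them very differently.

```python
# Hypothetical weights expressing the trade-off between goals.
WEIGHTS = {"safety": 0.6, "speed": 0.3, "comfort": 0.1}

def objective(outcome):
    """Weighted sum of goal scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[goal] * outcome[goal] for goal in WEIGHTS)

def choose_action(candidates):
    """Pick the action whose predicted outcome maximizes the objective."""
    return max(candidates, key=lambda name: objective(candidates[name]))

# Assumed predicted outcomes for three candidate actions.
candidates = {
    "brake_hard":  {"safety": 0.9, "speed": 0.1, "comfort": 0.2},
    "slow_gently": {"safety": 0.8, "speed": 0.5, "comfort": 0.9},
    "keep_speed":  {"safety": 0.3, "speed": 1.0, "comfort": 1.0},
}
best = choose_action(candidates)
print(best)
```

Changing the weights changes which action wins, which is one simple way to see how the objective function encodes the trade-offs between goals.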
In reinforcement learning, a "reward function" provides feedback, encouraging desired behaviors and discouraging undesirable ones. The agent learns to maximize its cumulative reward. In evolutionary systems, a "fitness function" determines which agents are more likely to reproduce. This is analogous to natural selection, where organisms evolve to maximize their chances of survival and reproduction. Some AI systems, such as nearest-neighbor, reason by analogy rather than being explicitly goal-driven. However, even these systems can have goals implicitly defined within their training data. Such systems can still be benchmarked by framing the non-goal system as one whose "goal" is to accomplish its narrow classification task. Systems not traditionally considered agents, like knowledge-representation systems, are sometimes included in the paradigm by framing them as agents with a goal of, for example, answering questions accurately. Here, the concept of an "action" is extended to encompass the "act" of providing an answer. As a further extension, mimicry-driven systems can be framed as agents optimizing a "goal function" based on how closely the agent mimics the desired behavior. In generative adversarial networks (GANs) of the 2010s, an "encoder"/"generator" component attempts to mimic and improvise human text composition. The generator tries to maximize a function representing how well it can fool an antagonistic "predictor"/"discriminator" component. While symbolic AI systems often use an explicit goal function, the paradigm also applies to neural networks and evolutionary computing. Reinforcement learning can generate intelligent agents that appear to act in ways intended to maximize a "reward function". Sometimes, instead of setting the reward function directly equal to the desired benchmark evaluation function, machine learning programmers use reward shaping to initially give the machine rewards for incremental progress. 
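Reward shaping can be sketched as follows. This example uses potential-based shaping, one common form of the idea chosen here for illustration; the corridor environment, the goal position, the potential function, and all names are assumptions rather than anything prescribed by the text above.

```python
GOAL = 5       # hypothetical goal position in a 1-D corridor
GAMMA = 1.0    # discount factor (assumed undiscounted for simplicity)

def base_reward(next_state):
    """Sparse benchmark reward: 1 only when the goal is reached."""
    return 1.0 if next_state == GOAL else 0.0

def potential(state):
    """Heuristic potential: less negative the closer the state is to the goal."""
    return -abs(GOAL - state)

def shaped_reward(state, next_state):
    """Base reward plus a potential-based bonus for incremental progress."""
    return base_reward(next_state) + GAMMA * potential(next_state) - potential(state)

print(shaped_reward(2, 3))   # step towards the goal
print(shaped_reward(3, 2))   # step away from the goal
print(shaped_reward(4, 5))   # step that reaches the goal
```

With the sparse base reward alone, every step short of the goal looks identical to the learner; the shaping term gives it a signal for each step of progress.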
Yann LeCun stated in 2018, "Most of the learning algorithms that people have come up with essentially consist of minimizing some objective function." AlphaZero chess had a simple objective function: +1 point for each win, and -1 point for each loss. A self-driving car's objective function would be more complex. Evolutionary computing can evolve intelligent agents that appear to act in ways intended to maximize a "fitness function" influencing how many descendants each agent is allowed to leave. The mathematical formalism of AIXI was proposed as a maximally intelligent agent in this paradigm. However, AIXI is uncomputable. In the real world, an intelligent agent is constrained by finite time and hardware resources, and scientists compete to produce algorithms that achieve progressively higher scores on benchmark tests with existing hardware. == Agent function == An intelligent agent's behavior can be described mathematically by an agent function. This function determines what the agent does based on what it has seen. A percept refers to the agent's sensory inputs at a single point in time. For example, a self-driving car's percepts might include camera images, lidar data, GPS coordinates, and speed r

  • Intelligent decision support system

    Intelligent decision support system

    An intelligent decision support system (IDSS) is a decision support system that makes extensive use of artificial intelligence (AI) techniques. Use of AI techniques in management information systems has a long history – indeed terms such as "Knowledge-based systems" (KBS) and "intelligent systems" have been used since the early 1980s to describe components of management systems, but the term "Intelligent decision support system" is thought to originate with Clyde Holsapple and Andrew Whinston in the late 1970s. Examples of specialized intelligent decision support systems include Flexible manufacturing systems (FMS), intelligent marketing decision support systems and medical diagnosis systems. Ideally, an intelligent decision support system should behave like a human consultant: supporting decision makers by gathering and analysing evidence, identifying and diagnosing problems, proposing possible courses of action and evaluating such proposed actions. The aim of the AI techniques embedded in an intelligent decision support system is to enable these tasks to be performed by a computer, while emulating human capabilities as closely as possible. Many IDSS implementations are based on expert systems, a well established type of KBS that encode knowledge and emulate the cognitive behaviours of human experts using predicate logic rules, and have been shown to perform better than the original human experts in some circumstances. Expert systems emerged as practical applications in the 1980s based on research in artificial intelligence performed during the late 1960s and early 1970s. They typically combine knowledge of a particular application domain with an inference capability to enable the system to propose decisions or diagnoses. Accuracy and consistency can be comparable to (or even exceed) that of human experts when the decision parameters are well known (e.g. 
if a common disease is being diagnosed), but performance can be poor when novel or uncertain circumstances arise. Research in AI focused on enabling systems to respond to novelty and uncertainty in more flexible ways is starting to be used in IDSS. For example, intelligent agents that perform complex cognitive tasks without any need for human intervention have been used in a range of decision support applications. Capabilities of these intelligent agents include knowledge sharing, machine learning, data mining, and automated inference. A range of AI techniques such as case-based reasoning, rough sets and fuzzy logic has also been used to enable decision support systems to perform better in uncertain conditions. A 2009 study proposed a multi-artificial-intelligence system named IILS to automate problem-solving processes within the logistics industry. The system integrates intelligence modules based on case-based reasoning, multi-agent systems, fuzzy logic, and artificial neural networks, aiming to offer advanced logistics solutions and to support well-informed, high-quality decisions that address a wide range of customer needs and challenges.

  • Record sealing

    Record sealing

    Record sealing is the process of making public records inaccessible to the public. In many cases, a person with a sealed record gains the legal right to deny or not acknowledge anything to do with the arrest and the legal proceedings from the case itself. Records are commonly sealed in a number of situations: sealed birth records (typically after adoption or determination of paternity); juvenile criminal records; other types of cases involving juveniles, which may be sealed, anonymized, or pseudonymized ("impounded"), e.g., child sex offense or custody cases; cases using witness protection information, which may be partly sealed; cases involving trade secrets; and cases involving state secrets. == Filing under seal in US court == Normally, records should not be filed under seal without court permission. However, FRCP 5.2 requires that sensitive text, such as Social Security numbers, Taxpayer Identification Numbers, birthdays, bank accounts, and children's names, be redacted from filings made with the court and from accompanying exhibits. A person making a redacted filing can file an unredacted copy under seal, or the court can later order that an additional filing be made under seal without redaction. Alternatively, the filing party may ask the court's permission to file some exhibits completely under seal. When a document is filed "under seal", it should carry a clear indication for the court clerk to file it separately, most often the words "Filed Under Seal" stamped at the bottom of each page. The person making the filing should also instruct the court clerk that the document needs to be filed "under seal". Courts often set specific requirements for these filings in their Local Rules.
== Difference from expungement == Expungement, which is a physical destruction, namely a complete erasure of one's criminal records, and therefore usually carries a higher standard, differs from record sealing, which is only to restrict the public's access to records, so that only certain law enforcement agencies or courts, under special circumstances, will have access to them. A record seal will greatly improve the chance of employment, as employers will not have access to damning records. There are occasions, like expungement, where one can truthfully state under oath that they have never been convicted before. Most of the time, a record seal has more relaxed requirements than an expungement. If an expungement is not allowed with a case, then sealing a record may be the best bet. Different states have different terms for what constitutes sealing of a record. == Cybersecurity incidents involving sealed records == Several cybersecurity incidents have demonstrated that sealed court documents are not always secure in practice, with vulnerabilities and data breaches exposing sensitive information. In January 2021, following the SolarWinds cyber attack, the U.S. Bankruptcy Court United States District Court for the District of Nevada announced that its Case Management/Electronic Case Files CM/ECF system had been potentially compromised. The judiciary stated that additional safeguards were being implemented to protect filings, and that the review of the incident and its impact was ongoing. Reports noted that the breach raised concerns about exposure of highly sensitive and sealed documents submitted through the CM/ECF system. In 2023, security researcher Jason Parker, following a tip from an activist, identified flaws in online court systems that exposed sealed records including confidential testimony and medical records through publicly accessible portals. 
In 2024, a cyber intrusion targeting attorneys in a civil case involving Representative Matt Gaetz led to the unauthorized access and leak of sealed depositions and related records. The breach exposed confidential testimony and financial records, some of which were later reported by news outlets, raising concerns about the security of electronically stored legal materials and the handling of sealed filings. In 2025, multiple reports confirmed that the federal judiciary's CM/ECF and PACER filing systems were compromised, exposing sealed indictments, confidential informant information, and other sensitive filings. Some courts temporarily reverted to paper-based filing to mitigate the risks of further disclosure. The FBI later confirmed that the breach had exposed sealed records, and investigators suspected foreign state actors were involved. == GAO publications referencing sealed records == Closed Criminal Plea and Sentencing Proceedings (1983) – Reviewed Department of Justice policies on closing plea and sentencing hearings. GAO noted that sealed transcripts should be unsealed once the reasons for closure no longer applied. Information on Plea Agreements and Settlements in Defense Procurement Fraud Cases (1992) – Examined outcomes of procurement fraud prosecutions. GAO observed that in some instances the results were sealed from public access. Military Recruiting: More Needs to Be Done to Better Screen Applicants and Detect Fraud (1999) – Investigated fraudulent enlistments in the armed forces. The report highlighted that sealed juvenile records often prevented recruiters from discovering prior offenses. Social Security Numbers: Governments Could Do More to Reduce Display in Public Records (2004) – Analyzed risks associated with SSN availability in state and local records. GAO pointed out that some categories of records, such as adoption proceedings, were sealed and less likely to expose identifiers.
Social Security Numbers: Stronger Safeguards Needed to Protect Privacy (2005 testimony) – Testimony before Congress reiterating concerns over SSN exposure in public records, while noting that sealed categories (e.g., adoption) were exceptions. U.S. Supreme Court: Policies and Perspectives on Video and Audio Coverage of Appellate Court Proceedings (2016) – Surveyed appellate court policies on courtroom media coverage. The report acknowledged distinctions between public filings, confidential submissions, and sealed materials. Evictions: National Data Are Limited and Challenging to Collect (2024) – Examined nationwide eviction data. GAO reported that in some states eviction records may be sealed or expunged, limiting researchers' ability to compile datasets. DOD Fraud Risk Management: Enhanced Data and Collaboration Could Improve Efforts (2024) – Reviewed Department of Defense fraud-risk management. GAO noted that some adjudicative records in its dataset were sealed, restricting completeness of oversight data.

  • Inception score

    Inception score

    The Inception Score (IS) is an algorithm used to assess the quality of images created by a generative image model such as a generative adversarial network (GAN). The score is calculated based on the output of a separate, pretrained Inception v3 image classification model applied to a sample of (typically around 30,000) images generated by the generative model. The Inception Score is maximized when the following conditions are true: (1) The entropy of the distribution of labels predicted by the Inception v3 model for the generated images is minimized. In other words, the classification model confidently predicts a single label for each image. Intuitively, this corresponds to the desideratum of generated images being "sharp" or "distinct". (2) The predictions of the classification model are evenly distributed across all possible labels. This corresponds to the desideratum that the output of the generative model is "diverse". It has been somewhat superseded by the related Fréchet inception distance. While the Inception Score only evaluates the distribution of generated images, the FID compares the distribution of generated images with the distribution of a set of real images ("ground truth"). == Definition == Let there be two spaces, the space of images $\Omega_X$ and the space of labels $\Omega_Y$. The space of labels is finite. Let $p_{gen}$ be a probability distribution over $\Omega_X$ that we wish to judge. Let a discriminator be a function of type $p_{dis}:\Omega_X\to M(\Omega_Y)$, where $M(\Omega_Y)$ is the set of all probability distributions on $\Omega_Y$. For any image $x$ and any label $y$, let $p_{dis}(y|x)$ be the probability that image $x$ has label $y$, according to the discriminator.
The discriminator is usually implemented as an Inception-v3 network trained on ImageNet. The Inception Score of $p_{gen}$ relative to $p_{dis}$ is
$$IS(p_{gen},p_{dis}):=\exp\left(\mathbb{E}_{x\sim p_{gen}}\left[D_{KL}\left(p_{dis}(\cdot|x)\,\Big\|\,\int p_{dis}(\cdot|x)\,p_{gen}(x)\,dx\right)\right]\right)$$
Equivalent rewrites include
$$\ln IS(p_{gen},p_{dis}):=\mathbb{E}_{x\sim p_{gen}}\left[D_{KL}\left(p_{dis}(\cdot|x)\,\Big\|\,\mathbb{E}_{x\sim p_{gen}}[p_{dis}(\cdot|x)]\right)\right]$$
$$\ln IS(p_{gen},p_{dis}):=H\left[\mathbb{E}_{x\sim p_{gen}}[p_{dis}(\cdot|x)]\right]-\mathbb{E}_{x\sim p_{gen}}\left[H[p_{dis}(\cdot|x)]\right]$$
$\ln IS$ is nonnegative by Jensen's inequality. Pseudocode:
INPUT discriminator $p_{dis}$.
INPUT generator $g$.
Sample images $x_i$ from the generator.
Compute $p_{dis}(\cdot|x_i)$, the probability distribution over labels conditional on image $x_i$.
Average the results to obtain $\hat{p}$, an empirical estimate of $\int p_{dis}(\cdot|x)\,p_{gen}(x)\,dx$.
Sample more images $x_i$ from the generator, and for each, compute $D_{KL}\left(p_{dis}(\cdot|x_i)\,\|\,\hat{p}\right)$.
Average the results, and take the exponential.
RETURN the result.
=== Interpretation === A higher inception score is interpreted as "better", as it means that $p_{gen}$ is a "sharp and distinct" collection of pictures. $\ln IS(p_{gen},p_{dis})\in[0,\ln N]$, where $N$ is the total number of possible labels. $\ln IS(p_{gen},p_{dis})=0$ iff for almost all $x\sim p_{gen}$,
$$p_{dis}(\cdot|x)=\int p_{dis}(\cdot|x)\,p_{gen}(x)\,dx$$
That means $p_{gen}$ is completely "indistinct". That is, for any image $x$ sampled from $p_{gen}$, the discriminator returns exactly the same label predictions $p_{dis}(\cdot|x)$. The highest inception score $N$ is achieved if and only if the two conditions are both true: (1) For almost all $x\sim p_{gen}$, the distribution $p_{dis}(y|x)$ is concentrated on one label. That is, $H_y[p_{dis}(y|x)]=0$. That is, every image sampled from $p_{gen}$ is exactly classified by the discriminator. (2) For every label $y$, the proportion of generated images labelled as $y$ is exactly $\mathbb{E}_{x\sim p_{gen}}[p_{dis}(y|x)]=\frac{1}{N}$. That is, the generated images are equally distributed over all labels.
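
As a rough illustration of the definition and its two extreme cases, the following Python sketch estimates the score from a list of per-image label distributions. The function name `inception_score` and the toy inputs are hypothetical. It assumes each row is already a probability vector produced by some classifier (no real Inception v3 model is called), and for simplicity it reuses one sample of images for both the marginal estimate and the KL average, whereas the pseudocode draws a second sample.

```python
import math


def inception_score(probs):
    """Estimate the Inception Score from per-image label distributions.

    probs: list of rows; row i is p_dis(. | x_i) over the N labels.
    """
    n_images = len(probs)
    n_labels = len(probs[0])

    # Empirical marginal p_hat: the average label distribution over images.
    marginal = [sum(row[j] for row in probs) / n_images for j in range(n_labels)]

    # Average KL(p_dis(. | x_i) || p_hat), using the convention 0 * log 0 = 0.
    total_kl = 0.0
    for row in probs:
        total_kl += sum(p * math.log(p / q) for p, q in zip(row, marginal) if p > 0)

    return math.exp(total_kl / n_images)


n_labels = 3

# Every image gets the same (uniform) prediction: completely "indistinct".
indistinct = [[1 / n_labels] * n_labels for _ in range(6)]

# Every image gets one confident label, and labels are used equally often.
sharp_diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]] * 2

print(inception_score(indistinct))     # close to 1, i.e. ln IS = 0
print(inception_score(sharp_diverse))  # close to 3, i.e. ln IS = ln N
```

The two toy inputs correspond to the lower bound ($\ln IS = 0$) and the upper bound ($\ln IS = \ln N$) discussed above; real classifier outputs fall somewhere in between.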

  • Concept drift

    Concept drift

    In predictive analytics, data science, machine learning and related fields, concept drift or drift is an evolution of data that invalidates the data model. It happens when the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions become less accurate as time passes. Drift detection and drift adaptation are of paramount importance in fields that involve dynamically changing data and data models. == Predictive model decay == In machine learning and predictive analytics this drift phenomenon is called concept drift. In machine learning, a common element of a data model is its statistical properties, such as the probability distribution of the actual data. If these deviate from the statistical properties of the training data set, the learned predictions may become invalid unless the drift is addressed. == Data configuration decay == Another important area is software engineering, where three types of data drift affecting data fidelity may be recognized. Changes in the software environment ("infrastructure drift") may invalidate software infrastructure configuration. "Structural drift" happens when the data schema changes, which may invalidate databases. "Semantic drift" is a change in the meaning of data while the structure does not change. In many cases this may happen in complicated applications when many independent developers introduce changes without proper awareness of the effects of their changes on other areas of the software system. For many application systems, the nature of the data on which they operate is subject to change for various reasons, e.g., due to changes in business model, system updates, or switching the platform on which the system operates. In the case of cloud computing, infrastructure drift that may affect applications running in the cloud may be caused by updates to the cloud software.
There are several types of detrimental effects of data drift on data fidelity. Data corrosion is the passing of drifted data into the system undetected. Data loss happens when valid data are ignored due to non-conformance with the applied schema. Squandering is the phenomenon in which new data fields are introduced upstream in the data processing pipeline, but somewhere downstream these data fields are absent. == Inconsistent data == "Data drift" may refer to the phenomenon in which database records fail to match the real-world data due to changes in the latter over time. This is a common problem with databases involving people, such as customers, employees, citizens, residents, etc. Human data drift may be caused by unrecorded changes in personal data, such as place of residence or name, as well as by errors during data input. "Data drift" may also refer to inconsistency of data elements between several replicas of a database. The reasons can be difficult to identify. A simple drift detection method is to run checksums regularly. However, the remedy may not be so easy. == Examples == The behavior of the customers in an online shop may change over time. For example, suppose weekly merchandise sales are to be predicted, and a predictive model has been developed that works satisfactorily. The model may use inputs such as the amount of money spent on advertising, promotions being run, and other metrics that may affect sales. The model is likely to become less and less accurate over time; this is concept drift. In the merchandise sales application, one reason for concept drift may be seasonality, which means that shopping behavior changes seasonally. Perhaps there will be higher sales in the winter holiday season than during the summer, for example.
Concept drift generally occurs when the covariates that comprise the data set begin to explain the variation of the target set less accurately; confounding variables may have emerged that one simply cannot account for, causing model accuracy to decrease progressively over time. Generally, it is advised to perform health checks as part of the post-production analysis and to retrain the model with new assumptions upon signs of concept drift. == Possible remedies == To prevent deterioration in prediction accuracy because of concept drift, reactive and tracking solutions can be adopted. Reactive solutions retrain the model in reaction to a triggering mechanism, such as a change-detection test or control charts from statistical process control, to explicitly detect concept drift as a change in the statistics of the data-generating process. When concept drift is detected, the current model is no longer up to date and must be replaced by a new one to restore prediction accuracy. A shortcoming of reactive approaches is that performance may decay until the change is detected. Tracking solutions seek to track the changes in the concept by continually updating the model. Methods for achieving this include online machine learning, frequent retraining on the most recently observed samples, and maintaining an ensemble of classifiers where one new classifier is trained on the most recent batch of examples and replaces the oldest classifier in the ensemble. Contextual information, when available, can be used to better explain the causes of the concept drift: for instance, in the sales prediction application, concept drift might be compensated for by adding information about the season to the model. By providing information about the time of the year, the rate of deterioration of the model is likely to decrease, but concept drift is unlikely to be eliminated altogether.
This is because actual shopping behavior does not follow any static, finite model. New factors that influence shopping behavior may arise at any time, and the influence of the known factors or their interactions may change. Concept drift cannot be avoided for complex phenomena that are not governed by fixed laws of nature. All processes that arise from human activity, such as socioeconomic processes, and biological processes are likely to experience concept drift. Therefore, periodic retraining, also known as refreshing, of any model is necessary. === Remedy methods === DDM (Drift Detection Method): detects drift by monitoring the model's error rate over time. When the error rate passes a set threshold, it enters a warning phase, and if it passes another threshold, it enters a drift phase. EDDM (Early Drift Detection Method): improves DDM's detection rate by tracking the average distance between two errors instead of only the error rate. ADWIN (Adaptive Windowing): dynamically stores a window of recent data and warns the user if it detects a significant change between the statistics of the window's earlier data and those of its more recent data. KSWIN (Kolmogorov–Smirnov Windowing): detects drift based on the Kolmogorov–Smirnov statistical test. DDM and EDDM are online supervised concept drift detection methods that rely on sequential error monitoring to estimate the evolving error rate. ADWIN and KSWIN are windowing methods: each maintains a "window", a subset of the most recent data from the data stream, and checks it for statistical differences. == Applications in security == Concept drift is a recurring issue in security analytics, especially in malware and intrusion detection. In these systems, models are often trained on past logs, binaries or network traces, but the behaviour of attackers changes over time as new malware families, obfuscation techniques and campaigns appear.
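
The two-threshold error monitoring described for DDM above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the class name `SimpleDDM`, the 2- and 3-standard-deviation levels, the 30-sample warm-up, and the synthetic error stream are all assumptions chosen for the example, and unlike practical detectors the sketch does not reset itself after signalling drift.

```python
import math


class SimpleDDM:
    """Illustrative error-rate monitor in the spirit of DDM (hypothetical helper)."""

    def __init__(self, warning_level=2.0, drift_level=3.0, min_samples=30):
        self.warning_level = warning_level  # assumed: 2 standard deviations
        self.drift_level = drift_level      # assumed: 3 standard deviations
        self.min_samples = min_samples      # assumed warm-up length
        self.n = 0
        self.errors = 0
        self.best_p = math.inf  # error rate at the best point seen so far
        self.best_s = math.inf  # its standard deviation

    def update(self, is_error):
        """Feed one prediction outcome; return 'stable', 'warning' or 'drift'."""
        self.n += 1
        self.errors += int(is_error)
        p = self.errors / self.n               # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)  # its binomial standard deviation
        if self.n < self.min_samples:
            return "stable"
        # Remember the point where error rate plus deviation was lowest.
        if p + s < self.best_p + self.best_s:
            self.best_p, self.best_s = p, s
        level = p + s
        if level > self.best_p + self.drift_level * self.best_s:
            return "drift"
        if level > self.best_p + self.warning_level * self.best_s:
            return "warning"
        return "stable"


# Synthetic stream: 10% errors for 200 predictions, then 50% errors.
stream = [i % 10 == 0 for i in range(200)] + [i % 2 == 0 for i in range(200)]

detector = SimpleDDM()
states = [detector.update(is_error) for is_error in stream]

first_warning = states.index("warning")
first_drift = states.index("drift")
print("first warning at index", first_warning)
print("first drift at index", first_drift)
```

On this stream the detector stays out of the drift state while the error rate is steady, and passes through the warning phase before reporting drift once the error rate rises. In a reactive setup, the drift signal would be the trigger for retraining or replacing the model.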
When the data no longer resemble the training set, the decision boundaries learned by classifiers or anomaly detectors can become misaligned with the current threat landscape, and detection performance can drop unless the models are updated or replaced. Several studies of Windows malware model detection as an evolving data stream and track how performance changes as time passes. They show that classifiers trained on a fixed time window can perform well on nearby data but deteriorate quickly when evaluated on samples collected months or years later, even when large amounts of training data are available. To keep up with this, security systems often use sliding or adaptive windows, which restrict training to the most recent portion of the data so that older, less relevant examples are gradually discarded. They also employ drift detectors such as ADWIN and KSWIN that monitor error rates or changes in the distribution of recent observations and signal when the statistics of the incoming stream differ significantly from the past, prompting retraining or model replacement. Related problems appear in spam filtering, fraud detection and intrusion detection, where adversaries change content, patterns of activity or network behavior to evade models trained on historical data. In these settings drift can be gradual, as new types of spam or fraud emerge, or abrupt, after a sudden shift in attack techniques. Common strategies to remain eff
