A.I.s

A.I.s

A.I.s is a themed anthology of science fiction short works edited by American writers Jack Dann and Gardner Dozois. It was first published in paperback by Ace Books in December 2004. It was reissued as an ebook by Baen Books in June 2013. The book collects ten novelettes and short stories by various science fiction authors, together with a preface by the editors. == Contents == "Preface" (Jack Dann and Gardner Dozois) "Antibodies" (Charles Stross) "Trojan Horse" (Michael Swanwick) "Birth Day" (Robert Reed) "The Hydrogen Wall" (Gregory Benford) "The Turing Test" (Chris Beckett) "Dante Dreams" (Stephen Baxter) "The Names of All the Spirits" (J. R. Dunn) "From the Corner of My Eye" (Alexander Glass) "Halfjack" (Roger Zelazny) "Computer Virus" (Nancy Kress)

Microsoft To Do

Microsoft To Do (previously styled as Microsoft To-Do) is a cloud-based task management application. It allows users to manage their tasks from a smartphone, tablet and computer. The technology is produced by the team behind Wunderlist, which was acquired by Microsoft, and the stand-alone apps feed into the existing Tasks feature of the Outlook product range. == History == Microsoft To Do was first launched as a preview with basic features in April 2017. Later more features were added including Task list sharing in June 2018. In September 2019, a major update to the app was unveiled, adopting a new user interface with a closer resemblance to Wunderlist. The name was also slightly updated by removing the hyphen from To-Do. In May 2020, Microsoft officially closed the doors on Wunderlist, ending its active service in favor of improving and expanding Microsoft To Do.

Spatial Analysis of Principal Components

Spatial Principal Component Analysis (sPCA) is a multivariate statistical technique that complements the traditional Principal Component Analysis (PCA) by incorporating spatial information into the analysis of genetic variation. While traditional PCA can be used to find spatial patterns, it focuses on reducing data dimensionality by identifying uncorrelated principal components that capture maximum variance, thus often lacking power to identify non-trivial spatial genetic patterns. By accounting for spatial autocorrelation, sPCA is able to uncover spatial patterns in the data and find the spatial structure of datasets where observations are either geographically or topologically linked. This statistical power improvement allows the investigation of cryptic spatial patterns of genetic variability otherwise overlooked. sPCA has been applied in various fields, including geography, ecology and genetics. == History == sPCA was introduced in 2008 by Thibaut Jombart, Sébastien Devillard, Anne-Béatrice Dufour, and D. Pontier as a spatially explicit method to investigate the spatial pattern of genetic variation among individuals or populations. In 2017, Valeria Montano and Thibaut Jombart published an alternative non-parametric test to evaluate the significance of global and local spatial genetic patterns with improved statistical power. == Details == sPCA modifies the PCA framework by integrating spatial weights, typically in the form of connectivity matrices or spatial adjacency graphs. It identifies principal components (PCs) that maximize both genentic variance and spatial autocorreation, as measured by Moran's I. These weights represent relationships between observations based on geographic distance or other spatial criteria. The method decomposes variance into two components: Global structures, correspond to positive autocorrelation, that is, reflect broad-scale spatial patterns where similar values cluster over large regions. Local structures, correspond to negative autocorrelation, that is, capture fine-scale spatial variations or localized patterns. The core of sPCA relies on the eigenanalysis of a spatially weighted covariance or correlation matrix. The spatial weight matrix can be constructed using techniques such as Delaunay triangulation, nearest-neighbor graphs, or distance-based criteria. Applications of sPCA should be used only as an explorative tool. == Applications == sPCA has been widely used in many fields, including: Ecology: To find spatial patterns in species distributions and environmental gradients. Genetics: Population structure and gene flow analysis while allowing for spatial autocorrelation considerations. Biogeography: To identify historical dispersal routes, and barriers to gene flow, providing insights into species distribution patterns and evolutionary history. == Software/Source Code == sPCA implementations are available in R in adegenet and ntbox . These tools facilitate the application of sPCA by providing functions for constructing spatial weight matrices, performing eigenanalysis, and obtaining spatial principal components in an easy-to-read form.

Dynamic time warping

In time series analysis, dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences, which may vary in speed. For instance, similarities in walking could be detected using DTW, even if one person was walking faster than the other, or if there were accelerations and decelerations during the course of an observation. DTW has been applied to temporal sequences of video, audio, and graphics data — indeed, any data that can be turned into a one-dimensional sequence can be analyzed with DTW. A well-known application has been automatic speech recognition, to cope with different speaking speeds. Other applications include speaker recognition and online signature recognition. It can also be used in partial shape matching applications. In general, DTW is a method that calculates an optimal match between two given sequences (e.g. time series) with certain restriction and rules: Every index from the first sequence must be matched with one or more indices from the other sequence, and vice versa The first index from the first sequence must be matched with the first index from the other sequence (but it does not have to be its only match) The last index from the first sequence must be matched with the last index from the other sequence (but it does not have to be its only match) The mapping of the indices from the first sequence to indices from the other sequence must be monotonically increasing, and vice versa, i.e. if j > i {\displaystyle j>i} are indices from the first sequence, then there must not be two indices l > k {\displaystyle l>k} in the other sequence, such that index i {\displaystyle i} is matched with index l {\displaystyle l} and index j {\displaystyle j} is matched with index k {\displaystyle k} , and vice versa We can plot each match between the sequences 1 : M {\displaystyle 1:M} and 1 : N {\displaystyle 1:N} as a path in a M × N {\displaystyle M\times N} matrix from ( 1 , 1 ) {\displaystyle (1,1)} to ( M , N ) {\displaystyle (M,N)} , such that each step is one of ( 0 , 1 ) , ( 1 , 0 ) , ( 1 , 1 ) {\displaystyle (0,1),(1,0),(1,1)} . In this formulation, we see that the number of possible matches is the Delannoy number. The optimal match is denoted by the match that satisfies all the restrictions and the rules and that has the minimal cost, where the cost is computed as the sum of absolute differences, for each matched pair of indices, between their values. The sequences are "warped" non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension. This sequence alignment method is often used in time series classification. Although DTW measures a distance-like quantity between two given sequences, it doesn't guarantee the triangle inequality to hold. In addition to a similarity measure between the two sequences (a so called "warping path" is produced), by warping according to this path the two signals may be aligned in time. The signal with an original set of points X(original), Y(original) is transformed to X(warped), Y(warped). This finds applications in genetic sequence and audio synchronisation. In a related technique sequences of varying speed may be averaged using this technique see the average sequence section. This is conceptually very similar to the Needleman–Wunsch algorithm. == Implementation == This example illustrates the implementation of the dynamic time warping algorithm when the two sequences s and t are strings of discrete symbols. For two symbols x and y, d ( x , y ) {\displaystyle d(x,y)} is a distance between the symbols, e.g., d ( x , y ) = | x − y | {\displaystyle d(x,y)=|x-y|} . int DTWDistance(s: array [1..n], t: array [1..m]) { DTW := array [0..n, 0..m] for i := 0 to n for j := 0 to m DTW[i, j] := infinity DTW[0, 0] := 0 for i := 1 to n for j := 1 to m cost := d(s[i], t[j]) DTW[i, j] := cost + minimum(DTW[i-1, j ], // insertion DTW[i , j-1], // deletion DTW[i-1, j-1]) // match return DTW[n, m] } where DTW[i, j] is the distance between s[1:i] and t[1:j] with the best alignment. We sometimes want to add a locality constraint. That is, we require that if s[i] is matched with t[j], then | i − j | {\displaystyle |i-j|} is no larger than w, a window parameter. We can easily modify the above algorithm to add a locality constraint (differences marked). However, the above given modification works only if | n − m | {\displaystyle |n-m|} is no larger than w, i.e. the end point is within the window length from diagonal. In order to make the algorithm work, the window parameter w must be adapted so that | n − m | ≤ w {\displaystyle |n-m|\leq w} (see the line marked with () in the code). int DTWDistance(s: array [1..n], t: array [1..m], w: int) { DTW := array [0..n, 0..m] w := max(w, abs(n-m)) // adapt window size () for i := 0 to n for j:= 0 to m DTW[i, j] := infinity DTW[0, 0] := 0 for i := 1 to n for j := max(1, i-w) to min(m, i+w) DTW[i, j] := 0 for i := 1 to n for j := max(1, i-w) to min(m, i+w) cost := d(s[i], t[j]) DTW[i, j] := cost + minimum(DTW[i-1, j ], // insertion DTW[i , j-1], // deletion DTW[i-1, j-1]) // match return DTW[n, m] } == Warping properties == The DTW algorithm produces a discrete matching between existing elements of one series to another. In other words, it does not allow time-scaling of segments within the sequence. Other methods allow continuous warping. For example, Correlation Optimized Warping (COW) divides the sequence into uniform segments that are scaled in time using linear interpolation, to produce the best matching warping. The segment scaling causes potential creation of new elements, by time-scaling segments either down or up, and thus produces a more sensitive warping than DTW's discrete matching of raw elements. == Complexity == The time complexity of the DTW algorithm is O ( N M ) {\displaystyle O(NM)} , where N {\displaystyle N} and M {\displaystyle M} are the lengths of the two input sequences. The 50 years old quadratic time bound was broken in 2016: an algorithm due to Gold and Sharir enables computing DTW in O ( N 2 / log ⁡ log ⁡ N ) {\displaystyle O({N^{2}}/\log \log N)} time and space for two input sequences of length N {\displaystyle N} . This algorithm can also be adapted to sequences of different lengths. Despite this improvement, it was shown that a strongly subquadratic running time of the form O ( N 2 − ϵ ) {\displaystyle O(N^{2-\epsilon })} for some ϵ > 0 {\displaystyle \epsilon >0} cannot exist unless the Strong exponential time hypothesis fails. While the dynamic programming algorithm for DTW requires O ( N M ) {\displaystyle O(NM)} space in a naive implementation, the space consumption can be reduced to O ( min ( N , M ) ) {\displaystyle O(\min(N,M))} using Hirschberg's algorithm. == Fast computation == Fast techniques for computing DTW include PrunedDTW, SparseDTW, FastDTW, and the MultiscaleDTW. A common task, retrieval of similar time series, can be accelerated by using lower bounds such as LB_Keogh, LB_Improved, or LB_Petitjean. However, the Early Abandon and Pruned DTW algorithm reduces the degree of acceleration that lower bounding provides and sometimes renders it ineffective. In a survey, Wang et al. reported slightly better results with the LB_Improved lower bound than the LB_Keogh bound, and found that other techniques were inefficient. Subsequent to this survey, the LB_Enhanced bound was developed that is always tighter than LB_Keogh while also being more efficient to compute. LB_Petitjean is the tightest known lower bound that can be computed in linear time. == Average sequence == Averaging for dynamic time warping is the problem of finding an average sequence for a set of sequences. NLAAF is an exact method to average two sequences using DTW. For more than two sequences, the problem is related to that of multiple alignment and requires heuristics. DBA is currently a reference method to average a set of sequences consistently with DTW. COMASA efficiently randomizes the search for the average sequence, using DBA as a local optimization process. == Supervised learning == A nearest-neighbour classifier can achieve state-of-the-art performance when using dynamic time warping as a distance measure. == Amerced Dynamic Time Warping == Amerced Dynamic Time Warping (ADTW) is a variant of DTW designed to better control DTW's permissiveness in the alignments that it allows. The windows that classical DTW uses to constrain alignments introduce a step function. Any warping of the path is allowed within the window and none beyond it. In contrast, ADTW employs an additive penalty that is incurred each time that the path is warped. Any amount of warping is allowed, but each warping action incurs a direct penalty. ADTW significantly outperforms DTW with windowing when applied as a nearest neighbor classifier on a set of benchmark time series classification tasks. == Alternative approaches == In functional data analysis, time series are regarde

PVLV

The primary value learned value (PVLV) model is a possible explanation for the reward-predictive firing properties of dopamine (DA) neurons. It simulates behavioral and neural data on Pavlovian conditioning and the midbrain dopaminergic neurons that fire in proportion to unexpected rewards. It is an alternative to the temporal-differences (TD) algorithm. It is used as part of Leabra.

Identi.ca

identi.ca is a free and open-source social networking and blogging service based on the pump.io software, using the Activity Streams protocol. Identi.ca stopped accepting new registrations in 2013, but continues to operate alongside several other pump.io-based hosts provided by E14N which continue to accept new registrations. == Features == Identi.ca is similar to social networking sites like Facebook and Google+, allowing unlimited length status updates, rich text, and images. The Activity Streams protocol supports many kinds of activities such as games. OpenFarmGame is a prototype application for an Activity Streams-based game. Previous features from its StatusNet version such as hashtags, groups, and global search are not supported. == History == === StatusNet === The service received more than 8,000 registrations and 19,000 updates within the first 24 hours of publicly launching on July 2, 2008, and reached its 1,000,000th notice on November 4, 2008. In January 2009, identi.ca received investment funds from venture capital group Montreal Start Up. On March 30, 2009, Control Yourself (since renamed StatusNet Inc) announced that Identi.ca was to become part of a hosted microblogging service called status.net to be launched in May 2009. Status.net offers individual microblogs under a subdomain to be chosen by the customer. Identi.ca will remain a free service. All notices will be published under the Creative Commons Attribution 3.0 license by default, but paying customers will be free to choose a different license. Formerly based on StatusNet, a micro-blogging software package built on the OStatus specification (and earlier based on the OpenMicroBlogging specification), Identi.ca allowed users to send text updates (known as "notices") up to 140 characters long. While similar to Twitter in both concept and operation, Identi.ca/StatusNet provided many features not currently implemented by Twitter, including XMPP support and personal tag clouds. In addition, Identi.ca/StatusNet allowed free export and exchange of personal and "friend" data based on the FOAF standard; therefore, notices could be fed into a Twitter account or other service, and also ported in to a private system similar to Yammer. === pump.io === Developer Evan Prodromou chose to change the site to the pump.io software platform in development, because pump.io offers more features making it technically more advanced. Registration on Identi.ca was closed in December 2012 in preparation for the switch to pump.io software (the popularity of Identi.ca and "official" Status.net hosting were considered a hindrance to the creation of a federated social network). The conversion was completed on 12 July 2013. The 140 character per post limit was removed (in StatusNet, it was a setting, not an inherent limitation); now the blog posts can contain formatting and images. Groups, hashtags, and a page listing popular posts are not yet implemented in pump.io.

Operational taxonomic unit

An operational taxonomic unit (OTU) is an operational definition used to classify groups of closely related individuals. The term was originally introduced in 1963 by Robert R. Sokal and Peter H. A. Sneath in the context of numerical taxonomy, where an "operational taxonomic unit" is simply the group of organisms currently being studied. In this sense, an OTU is a pragmatic definition to group individuals by similarity, equivalent to but not necessarily in line with classical Linnaean taxonomy or modern evolutionary taxonomy. Nowadays, however, the term is commonly used in a different context and refers to clusters of (uncultivated or unknown) organisms, grouped by DNA sequence similarity of a specific taxonomic marker gene (originally coined as mOTU; molecular OTU). In other words, OTUs are pragmatic proxies for "species" at different taxonomic levels, in the absence of traditional systems of biological classification as are available for macroscopic organisms. For several years, OTUs have been the most commonly used units of diversity, especially when analysing small subunit 16S (for prokaryotes) or 18S rRNA (for eukaryotes) marker gene sequence datasets. == Molecular OTU by clustering of marker gene sequences == In the approach represented by DNA barcoding, a particular locus is chosen to be used as the marker gene for classification. This locus should be universally present in the scope selected, variable enough to be different among close-related species, and be flanked by conservative sequences that allow for easy amplification and detection. There are databases containing sequences for such marker genes from many different species, allowing for comparison. (Sometimes only using one locus does not provide sufficient resolution, so multiple marker genes are used. This is the case for plants, where rbcL+matK is common.) Sequences obtained this way can be clustered according to their similarity to one another, and operational taxonomic units are defined based on the similarity threshold set by the researcher. The exact threshold depends on the taxa in question and the mutational rates of the selected locus in the taxon. 97–99% are commonly used, but "it is now recognized to be somewhat arbitrary as sequence variation within and among species varies across taxa". 100% similarity (fully identical) is also common, also known as single variants. It remains debatable how well this commonly used method recapitulates true microbial species phylogeny or ecology. Although OTUs can be calculated differently when using different algorithms or thresholds, research by Schmidt et al. (2014) demonstrated that 16S-derived microbial OTUs were generally ecologically consistent across habitats and several clustering approaches. The number of OTUs defined may be inflated due to errors in DNA sequencing. === OTU clustering approaches === There are three main approaches to clustering OTUs: De novo, for which the clustering is based on similarities between sequencing reads. Closed-reference, for which the clustering is performed against a reference database of sequences. Open-reference, where clustering is first performed against a reference database of sequences, then any remaining sequences that could not be mapped to the reference are clustered de novo. Using a reference provides taxonomic context for the OTUs found. Alternatively, taxonomic context can be found after the construction of clusters by comparing representative sequences from clusters against a reference database. There are also specialized classifiers for this purpose which are much faster than naive comparison using BLAST. === OTU clustering algorithms === Hierarchical clustering algorithms (HCA): uclust & cd-hit & ESPRIT Bayesian clustering: CROP == Molecular OTU by other methods == In addition to similarity-based grouping, marker gene sequences can be sorted into OTUs using molecular phylogeny, k-mer composition, or hybrid methods combining these methods with similarity. There are also Bayesian tree-less methods and machine learning approaches. Using phylogeny often involves manually assigning terminal clades or single nodes to an OTU, so this is usually only done for refinement. Genome skimming can be used to obtain high-copy DNA without the need to choose marker genes or to design PCR primers for the chosen genes. It can provide fairly good coverage of organelle DNA and repetitive elements such as ribosomal DNA, both of which can be used like marker genes in OTU analysis. Whole-genome sequencing is more expensive and involves the production and processing of more data. By considering the entire genome, many (sometimes over 100) marker genes can be used at the same time, producing highly resolved phylogenies that correctly identify problematic taxa. It is also possible to use entire genomes for OTU assignment. For example, genomes from different bacterial species almost always have an average nucleotide identity lower than 95%, a fact that can be used to define new OTUs (and likely new species).