AI Writing Tools

Explore the best AI Writing Tools — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

  • PCPaint

    PCPaint

    PCPaint was one of the first IBM PC-based mouse-driven GUI paint programs, released in 1984. It followed after Microsoft Doodle, released in 1983 with the Microsoft Mouse version 1 drivers for DOS, and around the same time as Digital Research’s Draw program. It was developed and created by John Bridges and Doug Wolfgram. It was later developed into Pictor Paint. The hardware manufacturer Mouse Systems bundled PCPaint with millions of computer mice that they sold, making PCPaint one of the best-selling DOS-based paint programs of the mid 1980s. == History == In 1983, Doug Wolfgram bought a Microsoft Mouse and decided to write a drawing program for it. They named it “Mouse Draw”. The interface was primitive but the program functioned well. Wolfgram traveled to SoftCon in New Orleans where he demonstrated the program to Mouse Systems. Mouse Systems was developing an optical mouse and they wanted to bundle a painting program so they agreed to publish Mouse Draw. The original program was written entirely in assembly language with primitive graphics routines developed by Wolfgram. John Bridges worked for an educational software company, Classroom Consortia Media, Inc., developing and writing Apple and IBM graphics libraries for CCM's software. Bridges and Wolfgram were friends who had been connected through a bulletin board system developed and run by Wolfgram. The two collaborated cross country via the BBS, Wolfram in California and Bridges in New York. Mouse Systems wanted the paint program to capture the look and feel of MacPaint. John Bridges and Doug Wolfgram started reworking Mouse Draw into what became PCPaint. The program was completely re-written using Bridge's graphics library and the top-level elements were written in C rather than assembly language. Bridges developed the core graphics code for the first version of PCPaint while Wolfgram worked on the user interface and top-level code. Mouse Systems signed an exclusive agreement with Wolfgram's company, Microtex Industries, Inc., to bundle PCPaint with every mouse they sold. They began publishing PCPaint with their mice in 1984. Microsoft responded in 1985 by bundling a competing product, PC Paintbrush, with version 4 of its DOS drivers for the Microsoft Mouse, replacing its in-house Microsoft Doodle program which it published with version 1 of the DOS drivers in mid-1983. Microsoft’s mouse began to outsell Mouse Systems mouse. In November 1985 Microsoft bundled a cut-down version of PC Paintbrush with Windows 1.0 (called Microsoft Paint), later bundling an updated version of PC Paintbrush with Windows 3.0 (as Paintbrush), impacting PCPaint’s marketshare. In early 1987, Mouse Systems decided that PCPaint wasn't helping to sell mice any longer so they discontinued the bundle deal and returned rights to the code to MicroTex Industries, but retained rights to the name, PCPaint. Wolfgram then combined the paint program with a new animation system he was developing (called GRASP) and Paul Mace Software bought publishing rights to the animation system and PCPaint, which was to be renamed Pictor. Bridges again got involved and took over programming responsibilities for GRASP as well as PCPaint while Wolfgram focused on more of the business details. In creating the first version of PCPaint, Doug had a dual-floppy machine with a Computer Innovations compiler on one disk and source code on the other. John had the "luxury" of a 10MB hard disk in his XT. Data was exchanged daily via 1200, then 2400 baud modems. === Authorship and Ownership === John Bridges and Wolfgram continued to work on PCPaint and GRASP on behalf of Paul Mace Software until 1990. Also in that year, Doug Wolfgram sold his remaining rights to PCPaint (and its animation system, GRASP) to John Bridges. In 1994, GRASP development stopped and so did development of Pictor Paint. John Bridges terminated his GRASP publishing contract with Paul Mace Software, and went off to create GLPro (the next generation of GRASP) with GMEDIA. Along with GLPro, came GLPaint, the successor to PCPaint and Pictor Paint. == Versions == In June 1984, Mouse Systems shipped PCPaint 1.0, the first GUI based Paint program for the IBM PC family of computers. John Bridges and Doug Wolfgram, were the co-authors of PCPaint 1.0. PCPaint 1.0 saved its graphics in a modified BSaved image format with the extension of ".PIC". The release of PCPaint Version 1.5 followed in late 1984, with the additions of graphics image compression for the .PIC format and support for "larger-than-screen" images. PCjr support was also added in this version after overcoming severe memory shortage problems getting PCPaint to run on the 128k PCjr. October 1985 saw the release of PCPaint 2.0. EGA support and publishing features were added to this version. The .PIC format was further refined, offering support for the rapidly expanding graphics capabilities of the PC and efficient image compression. PCPaint 3.1 was released in 1989. Unlike previous versions, it was not bundled with mice but was sold as a stand-alone software product. PCPaint 3.1 offered improved text and image handling, provided 36 types of flood and fill, worked with VGA adapters in hi-res 16-color and 256-color modes, allowed the user to save and retrieve files in a variety of intercompatible formats (.PIC, .GIF, .PCX, .IMG), and printed selected portions of images on color or black-and-white dot matrix, ink jet, and laser printers such as PostScript and HP Laser Jet. PCPaint 3.1 is still in use today by some users of DOS emulation programs like DOSBox and available for free download. Pictor Paint was an improved version, written by John Bridges, and bundled with GRASP GRaphical System for Presentation also written by John Bridges. It was also called "The Painter's Easel". GLPaint, released in 1995, was the last in this series of paint programs written by John Bridges. By 1998 version 7.0 provided support for TrueColor images and the Pictor PIC format was expanded to handle these. == Pictor PIC Image Format == PCPaint 1.0 saved its graphics in a modified BSAVE image format (which was popular at the time) with the file type (extension) of ".PIC". By PCPaint 1.5 this format was extended further to accommodate image compression. With the release of version 2.0 the PICtor PIC image format was developed almost to its present state, with no similarity to the BSAVE format used by earlier versions. Pictor Paint saved its files in a compressed format with the file extension PIC, which was the same format used by PCPaint.

    Read more →
  • MarkLogic Server

    MarkLogic Server

    MarkLogic Server is a document-oriented database developed by MarkLogic. It is a NoSQL multi-model database that evolved from an XML database to natively store JSON documents and RDF triples, the data model for semantics. MarkLogic is designed to be a data hub for operational and analytical data. == History == MarkLogic Server was built to address shortcomings with existing search and data products. The product first focused on using XML as the document markup standard and XQuery as the query standard for accessing collections of documents up to hundreds of terabytes in size. Currently the MarkLogic platform is widely used in publishing, government, finance and other sectors. MarkLogic's customers are mostly Global 2000 companies. == Technology == MarkLogic uses documents without upfront schemas to maintain a flexible data model. In addition to having a flexible data model, MarkLogic uses a distributed, scale-out architecture that can handle hundreds of billions of documents and hundreds of terabytes of data. It has received Common Criteria certification, and has high availability and disaster recovery. MarkLogic is designed to run on-premises and within public or private cloud environments like Amazon Web Services. == Features == Indexing MarkLogic indexes the content and structure of documents including words, phrases, relationships, and values in over 200 languages with tokenization, collation, and stemming for core languages. Functionality includes the ability to toggle range indexes, geospatial indexes, the RDF triple index, and reverse indexes on or off based on your data, the kinds of queries that you will run, and your desired performance. Full-text search MarkLogic supports search across its data and metadata using a word or phrase and incorporates Boolean logic, stemming, wildcards, case sensitivity, punctuation sensitivity, diacritic sensitivity, and search term weighting. Data can be searched using JavaScript, XQuery, SPARQL, and SQL. Semantics MarkLogic uses RDF triples to provide semantics for ease of storing metadata and querying. ACID Unlike other NoSQL databases, MarkLogic maintains ACID consistency for transactions. Replication MarkLogic provides high availability with replica sets. Scalability MarkLogic scales horizontally using sharding. MarkLogic can run over multiple servers, balancing the load or replicating data to keep the system up and running in the event of hardware failure. Security MarkLogic has built in security features such as element-level permissions and data redaction. Optic API for Relational Operations An API that lets developers view their data as documents, graphs or rows. Security MarkLogic provides redaction, encryption, and element-level security (allowing for control on read and write rights on parts of a document). == Applications == Banking Big Data Fraud prevention Insurance Claims Management and Underwriting Master data management Recommendation engines == Licensing == MarkLogic is available under various licensing and delivery models, namely a free Developer or an Essential Enterprise license.[3] Licenses are available from MarkLogic or directly from cloud marketplaces such as Amazon Web Services and Microsoft Azure. == Releases == 2001 – Cerisent XQE 1: ACID transactions, Full-text search, XML Storage, XQuery, Role-based security 2004 – Cerisent XQE 2: Scale-out architecture, Enhanced search (stemming, thesaurus, wildcard), Backup and restore 2005 – MarkLogic Server 3: Continuing search improvements, Content Processing Framework (including PDF, Word, Excel, PPT), Failover 2008 – MarkLogic Server 4: Geospatial search, entity extraction, advanced XQuery, performance, scalability enhancements, auditing 2011 – MarkLogic Server 5: Flexible replication / DDIL, real-time indexing, advanced search, improved analytics, concurrency enhancements 2012 – MarkLogic Server 6: REST and Java APIs, App Builder, enhanced UI, improved search 2013 – MarkLogic Server 7: Semantic graph, bitemporal data, tiered storage, improved search, better management 2015 – MarkLogic Server 8: A Native JSON storage, Server-side JavaScript, Bitemporal, Node.js client API, Incremental backup, Flexible replication[16] 2017 – MarkLogic Server 9: Data integration across Relational and Non-Relational data, Advanced Encryption, Element Level Security, Redaction 2019 – MarkLogic Server 10: Enhanced Data Hub, improved SQL, security, analytics performance, cloud support 2022 – MarkLogic Server 11: MarkLogic Ops Director (Monitoring and Administration Improvements), expanded PKI 2025 – MarkLogic Server 12: Generative AI and Native Vector Search, Graph Algorithm Support, Virtual TDEs (relational views on the fly)

    Read more →
  • Least-squares spectral analysis

    Least-squares spectral analysis

    Least-squares spectral analysis (LSSA) is a class of methods for estimating a frequency spectrum by fitting sinusoids to data using a least-squares fit. Unlike Fourier analysis, the most widely used spectral method in science, data need not be equally spaced to use LSSA. Furthermore, while Fourier analysis generally amplifies long-period noise in long or gapped records, LSSA mitigates such problems. The first strictly least-squares LSSA method was developed in 1969 and 1971, and is known as the Vaníček method or the Gauss–Vaniček method, after its inventor Petr Vaníček and Carl Friedrich Gauss, the inventor of the least-squares method for error minimization. A widely known LSSA variant is the Lomb method or the Lomb–Scargle periodogram, based on dated computational simplifications of the Vaníček method introduced in the 1970s and 1980s, first by Nicholas R. Lomb and later by Jeffrey D. Scargle. Other LSSA variants have been subsequently developed. == Historical background == The close connections between Fourier analysis, the periodogram, and the least-squares fitting of sinusoids have been known for a long time. However, most developments are restricted to complete data sets of equally spaced samples. In 1963, Freek J. M. Barning of Mathematisch Centrum, Amsterdam, handled unequally spaced data by similar techniques, including both a periodogram analysis equivalent to what nowadays is called the Lomb method and least-squares fitting of selected frequencies of sinusoids determined from such periodograms — and connected by a procedure known today as the matching pursuit with post-back fitting or the orthogonal matching pursuit. Petr Vaníček, a Canadian geophysicist and geodesist of the University of New Brunswick, proposed in 1969 also the matching-pursuit approach for equally and unequally spaced data, which he called "successive spectral analysis" and the result a "least-squares periodogram". He generalized this method to account for any systematic components beyond a simple mean, such as a "predicted linear (quadratic, exponential, ...) secular trend of unknown magnitude", and applied it to a variety of samples, in 1971. Vaníček's strictly least-squares method was then simplified in 1976 by Nicholas R. Lomb of the University of Sydney, who pointed out its close connection to periodogram analysis. Subsequently, the definition of a periodogram of unequally spaced data was modified and analyzed by Jeffrey D. Scargle of NASA Ames Research Center, who showed that, with minor changes, it becomes identical to Lomb's least-squares formula for fitting individual sinusoid frequencies. Scargle states that his paper "does not introduce a new detection technique, but instead studies the reliability and efficiency of detection with the most commonly used technique, the periodogram, in the case where the observation times are unevenly spaced," and further points out regarding least-squares fitting of sinusoids compared to periodogram analysis, that his paper "establishes, apparently for the first time, that (with the proposed modifications) these two methods are exactly equivalent." Press summarizes the development this way: A completely different method of spectral analysis for unevenly sampled data, one that mitigates these difficulties and has some other very desirable properties, was developed by Lomb, based in part on earlier work by Barning and Vanicek, and additionally elaborated by Scargle. In 1989, Michael J. Korenberg of Queen's University in Kingston, Ontario, developed the "fast orthogonal search" method of more quickly finding a near-optimal decomposition of spectra or other problems, similar to the technique that later became known as the orthogonal matching pursuit. == Development of LSSA and variants == === The Vaníček method === In the Vaníček method, a discrete data set is approximated by a weighted sum of sinusoids of progressively determined frequencies using a standard linear regression or least-squares fit. The frequencies are chosen using a method similar to Barning's, but going further in optimizing the choice of each successive new frequency by picking the frequency that minimizes the residual after least-squares fitting (equivalent to the fitting technique now known as matching pursuit with pre-backfitting). The number of sinusoids must be less than or equal to the number of data samples (counting sines and cosines of the same frequency as separate sinusoids). The relationship between the DFT and the approximation of trigonometric functions using the least-squares method is well explained in (Strutz, 2017). A data vector Φ is represented as a weighted sum of sinusoidal basis functions, tabulated in a matrix A by evaluating each function at the sample times, with weight vector x: ϕ ≈ A x , {\displaystyle \phi \approx {\textbf {A}}x,} where the weights vector x is chosen to minimize the sum of squared errors in approximating Φ. The solution for x is closed-form, using standard linear regression: x = ( A T A ) − 1 A T ϕ . {\displaystyle x=({\textbf {A}}^{\mathrm {T} }{\textbf {A}})^{-1}{\textbf {A}}^{\mathrm {T} }\phi .} Here the matrix A can be based on any set of functions mutually independent (not necessarily orthogonal) when evaluated at the sample times; functions used for spectral analysis are typically sines and cosines evenly distributed over the frequency range of interest. If we choose too many frequencies in a too-narrow frequency range, the functions will be insufficiently independent, the matrix ill-conditioned, and the resulting spectrum meaningless. When the basis functions in A are orthogonal (that is, not correlated, meaning the columns have zero pair-wise dot products), the matrix ATA is diagonal; when the columns all have the same power (sum of squares of elements), then that matrix is an identity matrix times a constant, so the inversion is trivial. The latter is the case when the sample times are equally spaced and sinusoids chosen as sines and cosines equally spaced in pairs on the frequency interval 0 to a half cycle per sample (spaced by 1/N cycles per sample, omitting the sine phases at 0 and maximum frequency where they are identically zero). This case is known as the discrete Fourier transform, slightly rewritten in terms of measurements and coefficients. x = A T ϕ {\displaystyle x={\textbf {A}}^{\mathrm {T} }\phi } — DFT case for N equally spaced samples and frequencies, within a scalar factor. === The Lomb method === Trying to lower the computational burden of the Vaníček method in 1976 (no longer an issue), Lomb proposed using the above simplification in general, except for pair-wise correlations between sine and cosine bases of the same frequency, since the correlations between pairs of sinusoids are often small, at least when they are not tightly spaced. This formulation is essentially that of the traditional periodogram but adapted for use with unevenly spaced samples. The vector x is a reasonably good estimate of an underlying spectrum, but since we ignore any correlations, Ax is no longer a good approximation to the signal, and the method is no longer a least-squares method — yet in the literature continues to be referred to as such. Rather than just taking dot products of the data with sine and cosine waveforms directly, Scargle modified the standard periodogram formula so to find a time delay τ {\displaystyle \tau } first, such that this pair of sinusoids would be mutually orthogonal at sample times t j {\displaystyle t_{j}} and also adjusted for the potentially unequal powers of these two basis functions, to obtain a better estimate of the power at a frequency. This procedure made his modified periodogram method exactly equivalent to Lomb's method. Time delay τ {\displaystyle \tau } by definition equals to tan ⁡ 2 ω τ = ∑ j sin ⁡ 2 ω t j ∑ j cos ⁡ 2 ω t j . {\displaystyle \tan {2\omega \tau }={\frac {\sum _{j}\sin 2\omega t_{j}}{\sum _{j}\cos 2\omega t_{j}}}.} Then the periodogram at frequency ω {\displaystyle \omega } is estimated as: P x ( ω ) = 1 2 [ [ ∑ j X j cos ⁡ ω ( t j − τ ) ] 2 ∑ j cos 2 ⁡ ω ( t j − τ ) + [ ∑ j X j sin ⁡ ω ( t j − τ ) ] 2 ∑ j sin 2 ⁡ ω ( t j − τ ) ] , {\displaystyle P_{x}(\omega )={\frac {1}{2}}\left[{\frac {\left[\sum _{j}X_{j}\cos \omega (t_{j}-\tau )\right]^{2}}{\sum _{j}\cos ^{2}\omega (t_{j}-\tau )}}+{\frac {\left[\sum _{j}X_{j}\sin \omega (t_{j}-\tau )\right]^{2}}{\sum _{j}\sin ^{2}\omega (t_{j}-\tau )}}\right],} which, as Scargle reports, has the same statistical distribution as the periodogram in the evenly sampled case. At any individual frequency ω {\displaystyle \omega } , this method gives the same power as does a least-squares fit to sinusoids of that frequency and of the form: ϕ ( t ) = A sin ⁡ ω t + B cos ⁡ ω t . {\displaystyle \phi (t)=A\sin \omega t+B\cos \omega t.} In practice, it is always difficult to judge if a given Lomb peak is significant or not, especially when the nature of the noise is unknown, so for example a false-alarm spectr

    Read more →
  • Knuth–Eve algorithm

    Knuth–Eve algorithm

    In computer science, the Knuth–Eve algorithm is an algorithm for polynomial evaluation. It preprocesses the coefficients of the polynomial to reduce the number of multiplications required at runtime. Ideas used in the algorithm were originally proposed by Donald Knuth in 1962. His procedure opportunistically exploits structure in the polynomial being evaluated. In 1964, James Eve determined for which polynomials this structure exists, and gave a simple method of "preconditioning" polynomials (explained below) to endow them with that structure. == Algorithm == === Preliminaries === Consider an arbitrary polynomial p ∈ R [ x ] {\displaystyle p\in \mathbb {R} [x]} of degree n {\displaystyle n} . Assume that n ≥ 3 {\displaystyle n\geq 3} . Define m {\displaystyle m} such that: if n {\displaystyle n} is odd then n = 2 m + 1 {\displaystyle n=2m+1} , and if n {\displaystyle n} is even then n = 2 m + 2 {\displaystyle n=2m+2} . Unless otherwise stated, all variables in this article represent either real numbers or univariate polynomials with real coefficients. All operations in this article are done over R {\displaystyle \mathbb {R} } . Again, the goal is to create an algorithm that returns p ( x ) {\displaystyle p(x)} given any x {\displaystyle x} . The algorithm is allowed to depend on the polynomial p {\displaystyle p} itself, since its coefficients are known in advance. === Overview === ==== Key idea ==== Using polynomial long division, we can write p ( x ) = q ( x ) ⋅ ( x 2 − α ) + ( β x + γ ) , {\displaystyle p(x)=q(x)\cdot (x^{2}-\alpha )+(\beta x+\gamma ),} where x 2 − α {\displaystyle x^{2}-\alpha } is the divisor. Picking a value for α {\displaystyle \alpha } fixes both the quotient q {\displaystyle q} and the coefficients in the remainder β {\displaystyle \beta } and γ {\displaystyle \gamma } . The key idea is to cleverly choose α {\displaystyle \alpha } such that β = 0 {\displaystyle \beta =0} , so that p ( x ) = q ( x ) ⋅ ( x 2 − α ) + γ . {\displaystyle p(x)=q(x)\cdot (x^{2}-\alpha )+\gamma .} This way, no operations are needed to compute the remainder polynomial, since it's just a constant. We apply this procedure recursively to q {\displaystyle q} , expressing p ( x ) = ( ( q ( x ) ⋅ ( x 2 − α m ) + γ m ) ⋯ ) ⋅ ( x 2 − α 1 ) + γ 1 . {\displaystyle p(x)=\left(\left(q(x)\cdot (x^{2}-\alpha _{m})+\gamma _{m}\right)\cdots \right)\cdot (x^{2}-\alpha _{1})+\gamma _{1}.} After m {\displaystyle m} recursive calls, the quotient q {\displaystyle q} is either a linear or a quadratic polynomial. In this base case, the polynomial can be evaluated with (say) Horner's method. ==== "Preconditioning" ==== For arbitrary p {\displaystyle p} , it may not be possible to force β = 0 {\displaystyle \beta =0} at every step of the recursion. Consider the polynomials p e {\displaystyle p^{e}} and p o {\displaystyle p^{o}} with coefficients taken from the even and odd terms of p {\displaystyle p} respectively, so that p ( x ) = p e ( x 2 ) + x ⋅ p o ( x 2 ) . {\displaystyle p(x)=p^{e}(x^{2})+x\cdot p^{o}(x^{2}).} If every root of p o {\displaystyle p^{o}} is real, then it is possible to write p {\displaystyle p} in the form given above. Each α i {\displaystyle \alpha _{i}} is a different root of p o {\displaystyle p^{o}} , counting multiple roots as distinct. Furthermore, if at least n − 1 {\displaystyle n-1} roots of p {\displaystyle p} lie in one half of the complex plane, then every root of p o {\displaystyle p^{o}} is real. Ultimately, it may be necessary to "precondition" p {\displaystyle p} by shifting it — by setting p ( x ) ← p ( x + t ) {\displaystyle p(x)\gets p(x+t)} for some t {\displaystyle t} — to endow it with the structure that most of its roots lie in one half of the complex plane. At runtime, this shift has to be "undone" by first setting x ← x − t {\displaystyle x\gets x-t} . === Preprocessing step === The following algorithm is run once for a given polynomial p {\displaystyle p} . At this point, the values of x {\displaystyle x} that p {\displaystyle p} will be evaluated on are not known. ==== Better choice of t ==== While any t ≥ Re ( r 2 ) {\displaystyle t\geq {\text{Re}}(r_{2})} can work, it is possible to remove one addition during evaluation if t {\displaystyle t} is also chosen such that two roots of p ( x + t ) {\displaystyle p(x+t)} are symmetric about the origin. In that case, α 1 {\displaystyle \alpha _{1}} can be chosen such that the shifted polynomial has a factor of x 2 − α 1 {\displaystyle x^{2}-\alpha _{1}} , so γ 1 = 0 {\displaystyle \gamma _{1}=0} . It is always possible to find such a t {\displaystyle t} . One possible algorithm for choosing t {\displaystyle t} is: === Evaluation step === The following algorithm evaluates p {\displaystyle p} at some, now known, point x {\displaystyle x} . Assuming t {\displaystyle t} is chosen optimally, γ 1 = 0 {\displaystyle \gamma _{1}=0} . So, the final iteration of the loop can instead run y ← y ⋅ ( s − α i ) , {\displaystyle y\gets y\cdot (s-\alpha _{i}),} saving an addition. == Analysis == In total, evaluation using the Knuth–Eve algorithm for a polynomial of degree n {\displaystyle n} requires n {\displaystyle n} additions and ⌊ n / 2 ⌋ + 2 {\displaystyle \lfloor n/2\rfloor +2} multiplications, assuming t {\displaystyle t} is chosen optimally. No algorithm to evaluate a given polynomial of degree n {\displaystyle n} can use fewer than n {\displaystyle n} additions or fewer than ⌈ n / 2 ⌉ {\displaystyle \lceil n/2\rceil } multiplications during evaluation. This result assumes only addition and multiplication are allowed during both preprocessing and evaluation. The Knuth–Eve algorithm is not well-conditioned.

    Read more →
  • List of .NET libraries and frameworks

    List of .NET libraries and frameworks

    This article contains a list of libraries that can be used in .NET languages. These languages require .NET Framework, Mono, or .NET, which provide a basis for software development, platform independence, language interoperability and extensive framework libraries. Standard Libraries (including the Base Class Library) are not included in this article. == Introduction == Apps created with .NET Framework or .NET run in a software environment known as the Common Language Runtime (CLR), an application virtual machine that provides services such as security, memory management, and exception handling. The framework includes a large class library called Framework Class Library (FCL). Thanks to the hosting virtual machine, different languages that are compliant with the .NET Common Language Infrastructure (CLI) can operate on the same kind of data structures. These languages can therefore use the FCL and other .NET libraries that are also written in one of the CLI compliant languages. When the source code of such languages are compiled, the compiler generates platform-independent code in the Common Intermediate Language (CIL, also referred to as bytecode), which is stored in CLI assemblies. When a .NET app runs, the just-in-time compiler (JIT) turns the CIL code into platform-specific machine code. To improve performance, .NET Framework also comes with the Native Image Generator (NGEN), which performs ahead-of-time compilation to machine code. This architecture provides language interoperability. Each language can use code written in other languages. Calls from one language to another are exactly the same as would be within a single programming language. If a library is written in one CLI language, it can be used in other CLI languages. Moreover, apps that consist only of pure .NET assemblies, can be transferred to any platform that contains an implementation of CLI and run on that platform. For example, apps written using .NET can run on Windows, macOS, and various versions of Linux. .NET apps or their libraries, however, may depend on native platform features, e.g. COM. As such, platform independence of .NET apps depends on the ability to transfer necessary native libraries to target platforms. In 2019, the Windows Forms and Windows Presentation Foundation portions of .NET Framework were made open source. === .NET implementations === There are four primary .NET implementations that are actively developed and maintained: .NET Framework: The original .NET implementation that has existed since 2002. While not yet discontinued, Microsoft does not plan on releasing its next major version, 5.0. Mono: A cross-platform implementation of .NET Framework by Ximian, introduced in 2004. It is free and open-source. It is now developed by Xamarin, a subsidiary of Microsoft. Universal Windows Platform (UWP): An implementation of .NET used for building UWP apps. It's designed to unify development for different targeted types of devices, including PCs, tablets, phablets, phones, and the Xbox. .NET: A cross-platform re-implementation of .NET Framework, introduced in 2016 and initially called .NET Core. It is free and open-source. .NET superseded .NET Framework with the release of .NET 5. Each implementation of .NET includes the following components: One or more runtime environments, e.g. Common Language Runtime (CLR) for .NET Framework and CoreCLR for .NET A class library The .NET Standard is a set of common APIs that are implemented in the Base Class Library of any .NET implementation. The class library of each implementation must implement the .NET Standard, but may also implement additional APIs. Traditionally, .NET apps targeted a certain version of a .NET implementation, e.g. .NET Framework 4.6. Starting with the .NET Standard, an app can target a version of the .NET Standard and then it could be used (without recompiling) by any implementation that supports that level of the standard. This enables portability across different .NET implementations. The following table lists the .NET implementations that adhere to the .NET Standard and the version number at which each implementation became compliant with a given version of .NET Standard. For example, according to this table, .NET Core 3.0 was the first version of .NET Core that adhered to .NET Standard 2.1. This means that any version of .NET Core bigger than 3.0 (e.g. .NET Core 3.1) also adheres to .NET Standard 2.1. == Web frameworks == === ASP.NET === First released in 2002, ASP.NET is an open-source server-side web application framework designed for web development to produce dynamic web pages. It is the successor to Microsoft's Active Server Pages (ASP) technology, built on the Common Language Runtime (CLR). === ASP.NET Core === ASP.NET was completely rewritten in 2016 as a modular web framework, together with other frameworks like Entity Framework. The re-written framework uses the new open-source .NET Compiler Platform (also known by its codename "Roslyn") and is cross platform. The programming models ASP.NET MVC, ASP.NET Web API, and ASP.NET Web Pages (a model using only Razor pages) were merged into a unified MVC 6. === Blazor === Blazor is a free and open-source web framework that enables developers to create Single-page Web apps using C# and HTML in ASP.NET Razor pages ("components"). Blazor is part of the ASP.NET Core framework. Blazor Server apps are hosted on a web server, while Blazor WebAssembly apps are downloaded to the client's web browser before running. In addition, a Blazor Hybrid framework is available with server-based and client-based application components. == Numerical libraries == === Open-source numerical libraries === ==== AForge.NET ==== This is a computer vision and artificial intelligence library. It implements a number of genetic, fuzzy logic and machine learning algorithms with several architectures of artificial neural networks with corresponding training algorithms. ==== ALGLIB ==== This is a cross-platform open source numerical analysis and data processing library. It consists of algorithm collections written in different programming languages (C++, C#, FreePascal, Delphi, VBA) and has dual licensing – commercial and GPL. ==== Math.NET Numerics ==== This library aims to provide methods and algorithms for numerical computations in science, engineering and everyday use. Covered topics include special functions, linear algebra, probability models, random numbers, interpolation, integral transforms and more. MIT/X11 license. ==== Meta.Numerics ==== This is a library for advanced scientific computation in the .NET Framework. ==== ML.NET ==== This is a free software machine learning library. The preview release of ML.NET included transforms for feature engineering like n-gram creation, and learners to handle binary classification, multi-class classification, and regression tasks. Additional ML tasks like anomaly detection and recommendation systems have since been added, and other approaches like deep learning will be included in future versions. === Proprietary numerical libraries === ==== ILNumerics.Net ==== This is a high performance, typesafe numerical array set of classes and functions for general math, FFT and linear algebra. The library, developed for .NET/Mono, aims to provide 32- and 64-bit script-like syntax in C#, 2D & 3D plot controls, and efficient memory management. It is released under GPLv3 or commercial license. ==== Measurement Studio ==== This is an integrated suite of UI controls and class libraries for use in developing test and measurement applications. The analysis class libraries provide various digital signal processing, signal filtering, signal generation, peak detection, and other general mathematical functionality. ==== NMath ==== This is a numerical component library for the .NET platform developed by CenterSpace Software. It includes signal processing (FFT) classes, a linear algebra (LAPACK & BLAS) framework, and a statistics package. == 3D graphics == === Open-source 3D graphics === ==== Open Toolkit (OpenTK) ==== This is a low-level C# binding for OpenGL, OpenGL ES and OpenAL. It runs on Windows, Linux, Mac OS X, BSD, Android and iOS. It can be used standalone or integrated into a GUI. ==== Windows Presentation Foundation (WPF) ==== This is a graphical subsystem for rendering user interfaces, developed by Microsoft. It also contains a 3D rendering engine. In addition, interactive 2D content can be overlaid on 3D surfaces natively. It only runs on Windows operating systems. === Proprietary 3D graphics === ==== Unity ==== This is a cross-platform game engine developed by Unity Technologies and used to develop video games for PC, consoles, mobile devices and websites. == Image processing == === AForge.NET === This is a computer vision and artificial intelligence library. It implements a number of image processing algorithms and filters. It is released under the LGPLv3 and partly GPLv3 license. Majority of the library is written in C# and th

    Read more →
  • Divide-and-conquer algorithm

    Divide-and-conquer algorithm

    In computer science, divide and conquer is an algorithm design paradigm. A divide-and-conquer algorithm recursively breaks down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem. The divide-and-conquer technique is the basis of efficient algorithms for many problems, such as sorting (e.g., quicksort, merge sort), multiplying large numbers (e.g., the Karatsuba algorithm), finding the closest pair of points, syntactic analysis (e.g., top-down parsers), SAT solving, and computing the discrete Fourier transform (FFT). Designing efficient divide-and-conquer algorithms can be difficult. As in mathematical induction, it is often necessary to generalize the problem to make it amenable to a recursive solution. The correctness of a divide-and-conquer algorithm is usually proved by mathematical induction, and its computational cost is often determined by solving recurrence relations. == Divide and conquer == The divide-and-conquer paradigm is often used to find an optimal solution of a problem. Its basic idea is to decompose a given problem into two or more similar, but simpler, subproblems, to solve them in turn, and to compose their solutions to solve the given problem. Problems of sufficient simplicity are solved directly. For example, to sort a given list of n natural numbers, split it into two lists of about n/2 numbers each, sort each of them in turn, and interleave both results appropriately to obtain the sorted version of the given list (see the picture). This approach is known as the merge sort algorithm. The name "divide and conquer" is sometimes applied to algorithms that reduce each problem to only one sub-problem, such as the binary search algorithm for finding a record in a sorted list (or its analogue in numerical computing, the bisection algorithm for root finding). These algorithms can be implemented more efficiently than general divide-and-conquer algorithms; in particular, if they use tail recursion, they can be converted into simple loops. Under this broad definition, however, every algorithm that uses recursion or loops could be regarded as a "divide-and-conquer algorithm". Therefore, some authors consider that the name "divide and conquer" should be used only when each problem may generate two or more subproblems. The name decrease and conquer has been proposed instead for the single-subproblem class. An important application of divide and conquer is in optimization, where if the search space is reduced ("pruned") by a constant factor at each step, the overall algorithm has the same asymptotic complexity as the pruning step, with the constant depending on the pruning factor (by summing the geometric series); this is known as prune and search. == Early historical examples == Early examples of these algorithms are primarily decrease and conquer – the original problem is successively broken down into single subproblems, and indeed can be solved iteratively. Binary search, a decrease-and-conquer algorithm where the subproblems are of roughly half the original size, has a long history. While a clear description of the algorithm on computers appeared in 1946 in an article by John Mauchly, the idea of using a sorted list of items to facilitate searching dates back at least as far as Babylonia in 200 BC. Another ancient decrease-and-conquer algorithm is the Euclidean algorithm to compute the greatest common divisor of two numbers by reducing the numbers to smaller and smaller equivalent subproblems, which dates to several centuries BC. An early example of a divide-and-conquer algorithm with multiple subproblems is Gauss's 1805 description of what is now called the Cooley–Tukey fast Fourier transform (FFT) algorithm, although he did not analyze its operation count quantitatively, and FFTs did not become widespread until they were rediscovered over a century later. An early two-subproblem D&C algorithm that was specifically developed for computers and properly analyzed is the merge sort algorithm, invented by John von Neumann in 1945. Another notable example is the algorithm invented by Anatolii A. Karatsuba in 1960 that could multiply two n-digit numbers in O ( n log 2 ⁡ 3 ) {\displaystyle O(n^{\log _{2}3})} operations (in Big O notation). This algorithm disproved Andrey Kolmogorov's 1956 conjecture that Ω ( n 2 ) {\displaystyle \Omega (n^{2})} operations would be required for that task. As another example of a divide-and-conquer algorithm that did not originally involve computers, Donald Knuth gives the method a post office typically uses to route mail: letters are sorted into separate bags for different geographical areas, each of these bags is itself sorted into batches for smaller sub-regions, and so on until they are delivered. This is related to a radix sort, described for punch-card sorting machines as early as 1929. == Advantages == === Solving difficult problems === Divide and conquer is a powerful tool for solving conceptually difficult problems: all it requires is a way of breaking the problem into sub-problems, of solving the trivial cases, and of combining sub-problems to the original problem. Similarly, decrease and conquer only requires reducing the problem to a single smaller problem, such as the classic Tower of Hanoi puzzle, which reduces moving a tower of height n {\displaystyle n} to move a tower of height n − 1 {\displaystyle n-1} . === Algorithm efficiency === The divide-and-conquer paradigm often helps in the discovery of efficient algorithms. It was the key, for example, to Karatsuba's fast multiplication method, the quicksort and mergesort algorithms, the Strassen algorithm for matrix multiplication, and fast Fourier transforms. In all these examples, the D&C approach led to an improvement in the asymptotic cost of the solution. For example, if (a) the base cases have constant-bounded size, the work of splitting the problem and combining the partial solutions is proportional to the problem's size n {\displaystyle n} , and (b) there is a bounded number p {\displaystyle p} of sub-problems of size ~ n p {\displaystyle {\frac {n}{p}}} at each stage, then the cost of the divide-and-conquer algorithm will be O ( n log p ⁡ n ) {\displaystyle O(n\log _{p}n)} . For other types of divide-and-conquer approaches, running times can also be generalized. For example, when a) the work of splitting the problem and combining the partial solutions take c n {\displaystyle cn} time, where n {\displaystyle n} is the input size and c {\displaystyle c} is some constant; b) when n < 2 {\displaystyle n<2} , the algorithm takes time upper-bounded by c {\displaystyle c} , and c) there are q {\displaystyle q} subproblems where each subproblem has size ~ n 2 {\displaystyle {\frac {n}{2}}} . Then, the running times are as follows: if the number of subproblems q > 2 {\displaystyle q>2} , then the divide-and-conquer algorithm's running time is bounded by O ( n log 2 ⁡ q ) {\displaystyle O(n^{\log _{2}q})} . if the number of subproblems is exactly one, then the divide-and-conquer algorithm's running time is bounded by O ( n ) {\displaystyle O(n)} . If, instead, the work of splitting the problem and combining the partial solutions take c n 2 {\displaystyle cn^{2}} time, and there are 2 subproblems where each has size n 2 {\displaystyle {\frac {n}{2}}} , then the running time of the divide-and-conquer algorithm is bounded by O ( n 2 ) {\displaystyle O(n^{2})} . === Parallelism === Divide-and-conquer algorithms are naturally adapted for execution in multi-processor machines, especially shared-memory systems where the communication of data between processors does not need to be planned in advance because distinct sub-problems can be executed on different processors. === Memory access === Divide-and-conquer algorithms naturally tend to make efficient use of memory caches. The reason is that once a sub-problem is small enough, it and all its sub-problems can, in principle, be solved within the cache, without accessing the slower main memory. An algorithm designed to exploit the cache in this way is called cache-oblivious, because it does not contain the cache size as an explicit parameter. Moreover, D&C algorithms can be designed for important algorithms (e.g., sorting, FFTs, and matrix multiplication) to be optimal cache-oblivious algorithms–they use the cache in a probably optimal way, in an asymptotic sense, regardless of the cache size. In contrast, the traditional approach to exploiting the cache is blocking, as in loop nest optimization, where the problem is explicitly divided into chunks of the appropriate size—this can also use the cache optimally, but only when the algorithm is tuned for the specific cache sizes of a particular machine. The same advantage exists with regards to other hierarchical storage systems, such as NUMA or virtual memory, as well as for multip

    Read more →
  • Metadata

    Metadata

    Metadata (or metainformation) is data (or information) that defines and describes the characteristics of other data. It often helps to describe, explain, locate, or otherwise make data easier to retrieve, use, or manage. For example, the title, author, and publication date of a book are metadata about the book. But, while a data asset is finite, its metadata is infinite. As such, efforts to define, classify types, or structure metadata are expressed as examples in the context of its use. The term "metadata" has a history dating to the 1960s where it occurred in computer science and in popular culture. Different types of metadata serve different functions. For example, descriptive metadata for a document might include the author, creation date, file size and keywords. Metadata has various purposes. It can help users find relevant information and discover resources. It can also help organize electronic resources, provide digital identification, and archive and preserve resources. Metadata allows users to access resources by "allowing resources to be found by relevant criteria, identifying resources, bringing similar resources together, distinguishing dissimilar resources, and giving location information". Metadata of telecommunication activities including Internet traffic is very widely collected by various national governmental organizations. This data is used for the purposes of traffic analysis and can be used for mass surveillance. Unique metadata standards exist for different disciplines (e.g., museum collections, digital audio files, websites, etc.). Describing the contents and context of data or data files increases its usefulness. For example, a web page may include metadata specifying what software language the page is written in (e.g., HTML), what tools were used to create it, what subjects the page is about, and where to find more information about the subject. This metadata can automatically improve the reader's experience and make it easier for users to find the web page online. A CD may include metadata providing information about the musicians, singers, and songwriters whose work appears on the disc. In many countries, government organizations routinely store metadata about emails, telephone calls, web pages, video traffic, IP connections, and cell phone locations. == Types == There are many distinct types of metadata, including: Descriptive metadata – the descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords. Structural metadata – metadata about containers of data and indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships, and other characteristics of digital materials. Administrative metadata – the information to help manage a resource, like resource type, and permissions, and when and how it was created. Reference metadata – the information about the contents and quality of statistical data. Statistical metadata – also called process data, may describe processes that collect, process, or produce statistical data. Legal metadata – provides information about the creator, copyright holder, and public licensing, if provided. Metadata is not strictly bound to one of these categories, as it can describe a piece of data in many other ways. While the metadata application is manifold, covering a large variety of fields, there are specialized and well-accepted models to specify types of metadata. Bretherton & Singley (1994) distinguish between two distinct classes: structural/control metadata and guide metadata. Structural metadata describes the structure of database objects such as tables, columns, keys and indexes. Guide metadata helps humans find specific items and is usually expressed as a set of keywords in a natural language. According to Ralph Kimball, metadata can be divided into three categories: technical metadata (or internal metadata), business metadata (or external metadata), and process metadata. Dan Linstedt, creator of the data vault methodology, says business metadata "...provide[s] definition of the functionality, definition of the data, definition of the elements, and definition of how the data is used within business...business metadata includes business requirements, time-lines, business metrics, business process flows, and business terminology." Business metadata is important because it can greatly facilitate the usefulness of the data to business people. A simple example of business metadata is a glossary entry. Hover functionality in an application or web form can enable a glossary definition to be shown when cursor is on a field or term. Other examples of business metadata include annotation ability within applications. For example, a business user may be viewing a business intelligence (BI) report and notice a trend in the data. The user may have background knowledge as to why this trend occurs. Some business intelligence tools enable the user to create an annotation within the report that explains the trend. Such an annotation can enhance other users' understanding of the data. This example is especially powerful because it is created by a business user for the use of other business people. NISO distinguishes three types of metadata: descriptive, structural, and administrative. Descriptive metadata is typically used for discovery and identification, as information to search and locate an object, such as title, authors, subjects, keywords, and publisher. Structural metadata describes how the components of an object are organized. An example of structural metadata would be how pages are ordered to form chapters of a book. Finally, administrative metadata gives information to help manage the source. Administrative metadata refers to the technical information, such as file type, or when and how the file was created. Two sub-types of administrative metadata are rights management metadata and preservation metadata. Rights management metadata explains intellectual property rights, while preservation metadata contains information to preserve and save a resource. Statistical data repositories have their own requirements for metadata in order to describe not only the source and quality of the data but also what statistical processes were used to create the data, which is of particular importance to the statistical community in order to both validate and improve the process of statistical data production. An additional type of metadata beginning to be more developed is accessibility metadata. Accessibility metadata is not a new concept to libraries; however, advances in universal design have raised its profile. Projects like Cloud4All and GPII identified the lack of common terminologies and models to describe the needs and preferences of users and information that fits those needs as a major gap in providing universal access solutions. Those types of information are accessibility metadata. The Schema.org website has incorporated several accessibility properties based on IMS Global Access for All Information Model Data Element Specification. While the efforts to describe and standardize the varied accessibility needs of information seekers are beginning to become more robust, their adoption into established metadata schemas has not been as developed. For example, while Dublin Core (DC)'s "audience" and MARC 21's "reading level" could be used to identify resources suitable for users with dyslexia and DC's "format" could be used to identify resources available in braille, audio, or large print formats, there is more work to be done. == History == Metadata was traditionally used in the card catalogs of libraries until the 1980s when libraries converted their catalog data to digital databases. In the 2000s, as data and information were increasingly stored digitally, this digital data was described using metadata standards. An early description of "meta data" for computer systems was written by David Griffel and Stuart McIntosh at the MIT Center for International Studies in 1967: "In summary then, we have statements in an object language about subject descriptions of data and token codes for the data. We also have statements in a meta language describing the data relationships and transformations, and ought/is relations between norm and data." == Definition == Metadata means "data about data". Metadata is defined as the data providing information about one or more aspects of the data; it is used to summarize basic information about data that can make tracking and working with specific data easier. Some examples include: Means of creation of the data Source of the data Time and date of creation Creator or author of the data Location on a computer network where the data was created Standards used Data quality For example, a digital image may include metadata that describes the size of the image, its color depth, resolution,

    Read more →
  • Novell Storage Manager

    Novell Storage Manager

    Novell Storage Manager is a system software package released by Novell in 2004 that uses identity, policy and directory events to automate full lifecycle management of file storage for individual users and organizational groups. By tying storage management to an organization's existing identity infrastructure, it has been pointed out, Novell Storage Manager enables the administration of users across all file servers "as a single pool rather than [in] separate independently managed domains." Novell Storage Manager is a component of the Novell File Management Suite. == How It Works == Novell Storage Manager dynamically manages and provisions storage based on user and group events that occur in the directory, including user creations, group assignments, moves, renames, and deletions. When a change happens in the directory that affects a user’s file storage needs or user storage policy, Storage Manager applies the appropriate policy and makes the necessary changes at the file system level to address those storage needs. The following key components comprise Novell Storage Manager's identity and policy-driven state machine architecture: Directory services; Storage policies; Novell Storage Manager event monitors; Novell Storage Manager policy engine; Novell Storage Manager agents; and Action objects. This state machine architecture enables the engine to properly deal with transient waits with directory synchronization issues. It also allows recovery from failures involving network communications, a target server or a server running a component of Storage Manager—including the policy engine itself. If a failure or interruption occurs at any point during operation, Storage Manager will be able to successfully continue the operation from where it was when the interruption occurred. == Reviews == Jon Toigo called Novell Storage Manager "a robust and smart approach to corralling user files... into an organized and efficient management scheme". He also said it was "best in class" of the products he'd reviewed.

    Read more →
  • WiPay

    WiPay

    WiPay is a Caribbean-based payment technology company that specializes in electronic payments for businesses. WiPay was founded in 2016 by Aldwyn Wayne Jr., a Trinidadian businessman and graduate of Georgia Tech Institute. In September 2019, WiPay partnered with MasterCard. As a result, WiPay became the only licensed Payment Facilitator (PAYFAC) on both the MasterCard and Visa networks in the region.

    Read more →
  • OpenSMILE

    OpenSMILE

    openSMILE is source-available software for automatic extraction of features from audio signals and for classification of speech and music signals. "SMILE" stands for "Speech & Music Interpretation by Large-space Extraction". The software is mainly applied in the area of automatic emotion recognition and is widely used in the affective computing research community. The openSMILE project exists since 2008 and is maintained by the German company audEERING GmbH since 2013. openSMILE is provided free of charge for research purposes and personal use under a source-available license. For commercial use of the tool, the company audEERING offers custom license options. == Application Areas == openSMILE is used for academic research as well as for commercial applications in order to automatically analyze speech and music signals in real-time. In contrast to automatic speech recognition which extracts the spoken content out of a speech signal, openSMILE is capable of recognizing the characteristics of a given speech or music segment. Examples for such characteristics encoded in human speech are a speaker's emotion, age, gender, and personality, as well as speaker states like depression, intoxication, or vocal pathological disorders. The software further includes music classification technology for automatic music mood detection and recognition of chorus segments, key, chords, tempo, meter, dance-style, and genre. The openSMILE toolkit serves as benchmark in manifold research competitions such as Interspeech ComParE, AVEC, MediaEval, and EmotiW. == History == The openSMILE project was started in 2008 by Florian Eyben, Martin Wöllmer, and Björn Schuller at the Technical University of Munich within the European Union research project SEMAINE. The goal of the SEMAINE project was to develop a virtual agent with emotional and social intelligence. In this system, openSMILE was applied for real-time analysis of speech and emotion. The final SEMAINE software release is based on openSMILE version 1.0.1. In 2009, the emotion recognition toolkit (openEAR) was published based on openSMILE. "EAR" stands for "Emotion and Affect Recognition". In 2010, openSMILE version 1.0.1 was published and was introduced and awarded at the ACM Multimedia Open-Source Software Challenge. Between 2011 and 2013, the technology of openSMILE was extended and improved by Florian Eyben and Felix Weninger in the context of their doctoral thesis at the Technical University of Munich. The software was also applied for the project ASC-Inclusion, which was funded by the European Union. For this project, the software was extended by Erik Marchi in order to teach emotional expression to autistic children, based on automatic emotion recognition and visualization. In 2013, the company audEERING acquired the rights to the code-base from the Technical University of Munich and version 2.0 was published under a source-available research license. Until 2016, openSMILE was downloaded more than 50,000 times worldwide and has established itself as a standard toolkit for emotion recognition. == Awards == openSMILE was awarded in 2010 in the context of the ACM Multimedia Open Source Competition. The software tool is applied in numerous scientific publications on automatic emotion recognition. openSMILE and its extension openEAR have been cited in more than 1000 scientific publications until today.

    Read more →
  • Explore-then-commit algorithm

    Explore-then-commit algorithm

    Explore Then Commit (ETC) is an algorithm for the multi-armed bandit problem foc,used on finding the best trade-off between exploration and exploitation. == Multi-armed bandit problem == The multi-armed bandit problem is a sequential game where one player has to choose at each turn between K {\displaystyle K} actions (arms). Behind every arm a {\displaystyle a} is an unknown distribution ν a {\displaystyle \nu _{a}} that lies in a set D {\displaystyle {\mathcal {D}}} known by the player (for example, D {\displaystyle {\mathcal {D}}} can be the set of Gaussian distributions or Bernoulli distributions). At each turn t {\displaystyle t} the player chooses (pulls) an arm a t {\displaystyle a_{t}} , they then get an observation X t {\displaystyle X_{t}} of the distribution ν a t {\displaystyle \nu _{a_{t}}} . === Regret minimization === The goal is to minimize the regret at time T {\displaystyle T} that is defined as R T := ∑ a = 1 K Δ a E [ N a ( T ) ] {\displaystyle R_{T}:=\sum _{a=1}^{K}\Delta _{a}\mathbb {E} [N_{a}(T)]} where μ a := E [ ν a ] {\displaystyle \mu _{a}:=\mathbb {E} [\nu _{a}]} is the mean of arm a {\displaystyle a} μ ∗ := max a μ a {\displaystyle \mu ^{}:=\max _{a}\mu _{a}} is the highest mean Δ a := μ ∗ − μ a {\displaystyle \Delta _{a}:=\mu ^{}-\mu _{a}} N a ( t ) {\displaystyle N_{a}(t)} is the number of pulls of arm a {\displaystyle a} up to turn t {\displaystyle t} The player has to find an algorithm that chooses at each turn t {\displaystyle t} which arm to pull based on the previous actions and observations ( a s , X s ) s < t {\displaystyle (a_{s},X_{s})_{s Read more →

  • KiSAO

    KiSAO

    The Kinetic Simulation Algorithm Ontology (KiSAO) supplies information about existing algorithms available for the simulation of systems biology models, their characterization and interrelationships. KiSAO is part of the BioModels.net project and of the COMBINE initiative. == Structure == KiSAO consists of three main branches: simulation algorithm simulation algorithm characteristic simulation algorithm parameter The elements of each algorithm branch are linked to characteristic and parameter branches using has characteristic and has parameter relationships accordingly. The algorithm branch itself is hierarchically structured using relationships which denote that the descendant algorithms were derived from, or specify, more general ancestors.

    Read more →
  • Solomonoff's theory of inductive inference

    Solomonoff's theory of inductive inference

    Solomonoff's theory of inductive inference proves that, under its common sense assumptions (axioms), the best possible scientific model is the shortest algorithm that generates the empirical data under consideration. In addition to the choice of data, other assumptions are that, to avoid the post-hoc fallacy, the programming language must be chosen prior to the data and that the environment being observed is generated by an unknown algorithm. This is also called a theory of induction. Due to its basis in the dynamical (state-space model) character of Algorithmic Information Theory, it encompasses statistical as well as dynamical information criteria for model selection. It was introduced by Ray Solomonoff, based on probability theory and theoretical computer science. In essence, Solomonoff's induction derives the posterior probability of any computable theory, given a sequence of observed data. This posterior probability is derived from Bayes' rule and some universal prior, that is, a prior that assigns a positive probability to any computable theory. Solomonoff proved that this induction is incomputable (or more precisely, lower semi-computable), but noted that "this incomputability is of a very benign kind", and that it "in no way inhibits its use for practical prediction" (as it can be approximated from below more accurately with more computational resources). It is only "incomputable" in the benign sense that no scientific consensus is able to prove that the best current scientific theory is the best of all possible theories. However, Solomonoff's theory does provide an objective criterion for deciding among the current scientific theories explaining a given set of observations. Solomonoff's induction naturally formalizes Occam's razor by assigning larger prior credences to theories that require a shorter algorithmic description. == Origin == === Philosophical === The theory is based in philosophical foundations, and was founded by Ray Solomonoff around 1960. It is a mathematically formalized combination of Occam's razor and the Principle of Multiple Explanations. All computable theories which perfectly describe previous observations are used to calculate the probability of the next observation, with more weight put on the shorter computable theories. Marcus Hutter's universal artificial intelligence builds upon this to calculate the expected value of an action. === Principle === Solomonoff's induction has been argued to be the computational formalization of pure Bayesianism. To understand, recall that Bayesianism derives the posterior probability P [ T | D ] {\displaystyle \mathbb {P} [T|D]} of a theory T {\displaystyle T} given data D {\displaystyle D} by applying Bayes rule, which yields P [ T | D ] = P [ D | T ] P [ T ] P [ D | T ] P [ T ] + ∑ A ≠ T P [ D | A ] P [ A ] {\displaystyle \mathbb {P} [T|D]={\frac {\mathbb {P} [D|T]\mathbb {P} [T]}{\mathbb {P} [D|T]\mathbb {P} [T]+\sum _{A\neq T}\mathbb {P} [D|A]\mathbb {P} [A]}}} where theories A {\displaystyle A} are alternatives to theory T {\displaystyle T} . For this equation to make sense, the quantities P [ D | T ] {\displaystyle \mathbb {P} [D|T]} and P [ D | A ] {\displaystyle \mathbb {P} [D|A]} must be well-defined for all theories T {\displaystyle T} and A {\displaystyle A} . In other words, any theory must define a probability distribution over observable data D {\displaystyle D} . Solomonoff's induction essentially boils down to demanding that all such probability distributions be computable. Interestingly, the set of computable probability distributions is a subset of the set of all programs, which is countable. Similarly, the sets of observable data considered by Solomonoff were finite. Without loss of generality, we can thus consider that any observable data is a finite bit string. As a result, Solomonoff's induction can be defined by only invoking discrete probability distributions. Solomonoff's induction then allows to make probabilistic predictions of future data F {\displaystyle F} , by simply obeying the laws of probability. Namely, we have P [ F | D ] = E T [ P [ F | T , D ] ] = ∑ T P [ F | T , D ] P [ T | D ] {\displaystyle \mathbb {P} [F|D]=\mathbb {E} _{T}[\mathbb {P} [F|T,D]]=\sum _{T}\mathbb {P} [F|T,D]\mathbb {P} [T|D]} . This quantity can be interpreted as the average predictions P [ F | T , D ] {\displaystyle \mathbb {P} [F|T,D]} of all theories T {\displaystyle T} given past data D {\displaystyle D} , weighted by their posterior credences P [ T | D ] {\displaystyle \mathbb {P} [T|D]} . === Mathematical === The proof of the "razor" is based on the known mathematical properties of a probability distribution over a countable set. These properties are relevant because the infinite set of all programs is a denumerable set. The sum S of the probabilities of all programs must be exactly equal to one (as per the definition of probability) thus the probabilities must roughly decrease as we enumerate the infinite set of all programs, otherwise S will be strictly greater than one. To be more precise, for every ϵ {\displaystyle \epsilon } > 0, there is some length l such that the probability of all programs longer than l is at most ϵ {\displaystyle \epsilon } . This does not, however, preclude very long programs from having very high probability. Fundamental ingredients of the theory are the concepts of algorithmic probability and Kolmogorov complexity. The universal prior probability of any prefix p of a computable sequence x is the sum of the probabilities of all programs (for a universal computer) that compute something starting with p. Given some p and any computable but unknown probability distribution from which x is sampled, the universal prior and Bayes' theorem can be used to predict the yet unseen parts of x in optimal fashion. == Mathematical guarantees == === Solomonoff's completeness === The remarkable property of Solomonoff's induction is its completeness. In essence, the completeness theorem guarantees that the expected cumulative errors made by the predictions based on Solomonoff's induction are upper-bounded by the Kolmogorov complexity of the (stochastic) data generating process. The errors can be measured using the Kullback–Leibler divergence or the square of the difference between the induction's prediction and the probability assigned by the (stochastic) data generating process. === Solomonoff's uncomputability === Unfortunately, Solomonoff also proved that Solomonoff's induction is uncomputable. In fact, he showed that computability and completeness are mutually exclusive: any complete theory must be uncomputable. The proof of this is derived from a game between the induction and the environment. Essentially, any computable induction can be tricked by a computable environment, by choosing the computable environment that negates the computable induction's prediction. This fact can be regarded as an instance of the no free lunch theorem. == Modern applications == === Artificial intelligence === Though Solomonoff's inductive inference is not computable, several AIXI-derived algorithms approximate it in order to make it run on a modern computer. The more computing power they are given, the closer their predictions are to the predictions of inductive inference (their mathematical limit is Solomonoff's inductive inference). Another direction of inductive inference is based on E. Mark Gold's model of learning in the limit from 1967 and has developed since then more and more models of learning. The general scenario is the following: Given a class S of computable functions, is there a learner (that is, recursive functional) which for any input of the form (f(0),f(1),...,f(n)) outputs a hypothesis (an index e with respect to a previously agreed on acceptable numbering of all computable functions; the indexed function may be required consistent with the given values of f). A learner M learns a function f if almost all its hypotheses are the same index e, which generates the function f; M learns S if M learns every f in S. Basic results are that all recursively enumerable classes of functions are learnable while the class REC of all computable functions is not learnable. Many related models have been considered and also the learning of classes of recursively enumerable sets from positive data is a topic studied from Gold's pioneering paper in 1967 onwards. A far reaching extension of the Gold’s approach is developed by Schmidhuber's theory of generalized Kolmogorov complexities, which are kinds of super-recursive algorithms.

    Read more →
  • Kunstweg

    Kunstweg

    Bürgi's Kunstweg is a set of algorithms developed by Jost Bürgi in the late 16th century. They are used to calculate sines to arbitrary precision.. Bürgi used these algorithms to calculate a Canon Sinuum, a sine table in increments of 2 arc seconds. It is believed that the table featured values accurate to eight sexagesimal places. Some authors have speculated that the table only covered the range from 0° to 45°, although there is no evidence supporting this claim. Such tables were crucial for maritime navigation. Johannes Kepler described the Canon Sinuum as the most precise sine table known at the time. Bürgi explained his algorithms in his work Fundamentum Astronomiae, which he presented to Emperor Rudolf II in 1592. The Kunstweg algorithm calculates sine values iteratively. In each step, the value of a cell is the sum of the two preceding cells in the same column. The final cell's value is halved before beginning the next iteration. Ultimately, the values in the last column are normalized. Accurate sine approximations are achieved after only a few iterations. In 2015, Menso Folkerts and coworkers demonstrated that this iterative process does indeed converge toward the true sine values. According to them this was the first step towards differential calculus.

    Read more →
  • Data (word)

    Data (word)

    The word data is most often used as a singular collective mass noun in educated everyday usage. However, due to the history and etymology of the word, considerable controversy has existed on whether it should be considered a mass noun used with verbs conjugated in the singular, or should be treated as the plural of the now-rarely-used datum. == Usage in English == In one sense, data is the plural form of datum. Datum actually can also be a count noun with the plural datums (see usage in datum article) that can be used with cardinal numbers (e.g., "80 datums"); data (originally a Latin plural) is not used like a normal count noun with cardinal numbers and can be plural with plural determiners such as these and many, or it can be used as a mass noun with a verb in the singular form. Even when a very small quantity of data is referenced (one number, for example), the phrase piece of data is often used, as opposed to datum. The debate over appropriate usage continues, but "data" as a singular form is far more common. In English, the word datum is still used in the general sense of "an item given". In cartography, geography, nuclear magnetic resonance and technical drawing, it is often used to refer to a single specific reference datum from which distances to all other data are measured. Any measurement or result is a datum, though data point is now far more common. Data is indeed most often used as a singular mass noun in educated everyday usage. Some major newspapers, such as The New York Times, use it either in the singular or plural. In The New York Times, the phrases "the survey data are still being analyzed" and "the first year for which data is available" have appeared within one day. The Wall Street Journal explicitly allows this usage in its style guide. The Associated Press style guide classifies data as a collective noun that takes the singular when treated as a unit but the plural when referring to individual items (e.g., "The data is sound" and "The data have been carefully collected"). In scientific writing, data is often treated as a plural, as in These data do not support the conclusions, but the word is also used as a singular mass entity like information (e.g., in computing and related disciplines). British usage now widely accepts treating data as singular in standard English, including everyday newspaper usage at least in non-scientific use. UK scientific publishing still prefers treating it as a plural. Some UK university style guides recommend using data for both singular and plural use, and others recommend treating it only as a singular in connection with computers. The IEEE Computer Society allows usage of data as either a mass noun or plural based on author preference, while IEEE in the editorial style manual indicates to always use the plural form. Some professional organizations and style guides require that authors treat data as a plural noun. For example, the Air Force Flight Test Center once stated that the word data is always plural, never singular.

    Read more →