AI Detector Generator

AI Detector Generator — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • XLeratorDB

    XLeratorDB

    XLeratorDB is a suite of database function libraries that enable Microsoft SQL Server to perform a wide range of additional (non-native) business intelligence and ad hoc analytics. The libraries, which are embedded and run centrally on the database, include more than 450 individual functions similar to those found in Microsoft Excel spreadsheets. The individual functions are grouped and sold as six separate libraries based on usage: finance, statistics, math, engineering, unit conversions and strings. WestClinTech, the company that developed XLeratorDB, claims it is "the first commercial function package add-in for Microsoft SQL Server." == Company history == WestClinTech (LLC), founded by software industry veterans Charles Flock and Joe Stampf in 2008, is located in Irvington, New York, United States. Flock was a co-founder of The Frustum Group, developer of the OPICS enterprise banking and trading platform, which was acquired by London-based Misys, PLC in 1996. Stampf joined Frustum in 1994 and with Flock remained active with the company after acquisition, helping to develop successive generations of OPICS now employed by over 150 leading financial institutions worldwide. Following a full year of research, development and testing, WestClinTech introduced and recorded its first commercial sale of XLeratorDB in April 2009. In September 2009, XLeratorDB became available to all Federal agencies through NASA's Strategic Enterprise-Wide Procurement (SEWP-IV) program, a government-wide acquisition contract. == Technology == XLeratorDB uses Microsoft SQL CLR(Common Language Runtime) technology. SQL CLR allows managed code to be hosted by, and run in, the Microsoft SQL Server environment. SQL CLR relies on the creation, deployment and registration of .NET Framework assemblies that are physically stored in managed code dynamic-link libraries (DLL). The assemblies may contain .NET namespaces, classes, functions, and properties. Because managed code compiles to native code prior to execution, functions using SQL CLR can achieve significant performance increases versus the equivalent functions written in T-SQL in some scenarios. XLeratorDB requires Microsoft SQL Server 2005 or SQL Server 2005 Express editions, or later (compatibility mode 90 or higher). The product installs with PERMISSION_SET=SAFE. SAFE mode, the most restrictive permission set, is accessible by all users. Code executed by an assembly with SAFE permissions cannot access external system resources such as files, the network, the internet, environment variables, or the registry. == Functions == In computer science, a function is a portion of code within a larger program which performs a specific task and is relatively independent of the remaining code. As used in database and spreadsheet applications these functions generally represent mathematical formulas widely used across a variety of fields. While this code may be user-generated, it is also embedded as a pre-written sub-routine in applications. These functions are typically identified by common nomenclature which corresponds to their underlying operations: e.g. IRR identifies the function which calculates Internal Rate of Return on a series of periodic cash flows. === Function uses === As subroutines, functions can be integrated and used in a variety of ways, and as part of larger, more complicated applications. Within large enterprise applications they may, for example, play an important role in defining business rules or risk management parameters, while remaining virtually invisible to end users. Within database management systems and spreadsheets, however, these kinds of functions also represent discrete sets of tools; they can be accessed directly and utilized on a stand-alone basis, or in more complex, user-defined configurations. In this context, functions can be used for business intelligence and ad hoc analysis of data in fields such as finance, statistics, engineering, math, etc. === Function types === XLeratorDB uses three kinds of functions to perform analytic operations: scalar, aggregate, and a hybrid form which WestClinTech calls Range Queries. Scalar functions take a single value, perform an operation and return a single value. An example of this type of function is LOG, which returns the logarithm of a number to a specified base. Aggregate functions operate on a series of values but return a single, summarizing value. An example of this type of function is AVG, which returns the average of values in a specified group. In XLeratorDB there are some functions which have characteristics of aggregate functions (operating on multiple series of values) but cannot be processed in SQL CLR using single column inputs, such as AVG does. For example, irregular internal rate of return (XIRR), a financial function, operates on a collection of cash flow values from one column, but must also apply variable period lengths from another column and an initial iterative assumption from a third, in order to return a single, summarizing value. WestClinTech documentation notes that Range Queries specify the data to be included in the result set of the function independently of the WHERE clause associated with the T-SQL statement, by incorporating a SELECT statement into the function as a string argument; the function then traps that SELECT statement, executes it internally and processes the result. Some XLeratorDB functions that employ Range Queries are: NPV, XNPV, IRR, XIRR, MIRR, MULTINOMIAL, and SERIESSUM. Within the application these functions are identified by a "_q" naming convention: e.g. NPV_q, IRR_q, etc. == Analytic functions == === SQL Server functions === Microsoft SQL Server is the #3 selling database management system (DBMS), behind Oracle and IBM. (While versions of SQL Server have been on the market since 1987, XLeratorDB is compatible with only the 2005 edition and later.) Like all major DBMS, SQL Server performs a variety of data mining operations by returning or arraying data in different views (also known as drill-down). In addition, SQL Server uses Transact-SQL (T-SQL) to execute four major classes of pre-defined functions in native mode. Functions operating on the DBMS offer several advantages over client layer applications like Excel: they utilize the most up-to-date data available; they can process far larger quantities of data; and, the data is not subject to exporting and transcription errors. SQL Server 2008 includes a total of 58 functions that perform relatively basic aggregation (12), math (23) and string manipulation (23) operations useful for analytics; it includes no native functions that perform more complex operations directly related to finance, statistics or engineering. === Excel functions === Microsoft Excel, a component of Microsoft Office suite, is one of the most widely used spreadsheet applications on the market today. In addition to its inherent utility as a stand-alone desktop application, Excel overlaps and complements the functionality of DBMS in several ways: storing and arraying data in rows and columns; performing certain basic tasks such as pivot table and aggregating values; and facilitating sharing, importing and exporting of database data. Excel's chief limitation relative to a true database is capacity; Excel 2003 is limited to some 65k rows and 256 columns; Excel 2007 extends this capacity to roughly 1million rows and 16k columns. By comparison, SQL Server is able to manage over 500k terabytes of memory. Excel offers, however, an extensive library of specialized pre-written functions which are useful for performing ad hoc analysis on database data. Excel 2007 includes over 300 of these pre-defined functions, although customized functions can also be created by users, or imported from third party developers as add-ons. Excel functions are grouped by type: === Excel business intelligence functions === Operating on the client computing layer Excel plays an important role as a business intelligence tool because it: performs a wide array of complex analytic functions not native to most DBMS software offers far greater ad hoc reporting and analytic flexibility than most enterprise software provides a medium for sharing and collaborating because of its ubiquity throughout the enterprise Microsoft reinforces this positioning with Business Intelligence documentation that positions Excel in a clearly pivotal role. === XLeratorDB vs. Excel functions === While operating within the database environment, XLeratorDB functions utilize the same naming conventions and input formats, and in most cases, return the same calculation results as Excel functions. XLeratorDB, coupled with SQL Server's native capabilities, compares to Excel's function sets as follows:

    Read more →
  • Lancichinetti–Fortunato–Radicchi benchmark

    Lancichinetti–Fortunato–Radicchi benchmark

    Lancichinetti–Fortunato–Radicchi benchmark is an algorithm that generates benchmark networks (artificial networks that resemble real-world networks). They have a priori known communities and are used to compare different community detection methods. The advantage of the benchmark over other methods is that it accounts for the heterogeneity in the distributions of node degrees and of community sizes. == The algorithm == The node degrees and the community sizes are distributed according to a power law, with different exponents. The benchmark assumes that both the degree and the community size have power law distributions with different exponents, γ {\displaystyle \gamma } and β {\displaystyle \beta } , respectively. N {\displaystyle N} is the number of nodes and the average degree is ⟨ k ⟩ {\displaystyle \langle k\rangle } . There is a mixing parameter μ {\displaystyle \mu } , which is the average fraction of neighboring nodes of a node that do not belong to any community that the benchmark node belongs to. This parameter controls the fraction of edges that are between communities. Thus, it reflects the amount of noise in the network. At the extremes, when μ = 0 {\displaystyle \mu =0} all links are within community links, if μ = 1 {\displaystyle \mu =1} all links are between nodes belonging to different communities. One can generate the benchmark network using the following steps. Step 1: Generate a network with nodes following a power law distribution with exponent γ {\displaystyle \gamma } and choose extremes of the distribution k min {\displaystyle k_{\min }} and k max {\displaystyle k_{\max }} to get desired average degree is ⟨ k ⟩ {\displaystyle \langle k\rangle } . Step 2: ( 1 − μ ) {\displaystyle (1-\mu )} fraction of links of every node is with nodes of the same community, while fraction μ {\displaystyle \mu } is with the other nodes. Step 3: Generate community sizes from a power law distribution with exponent β {\displaystyle \beta } . The sum of all sizes must be equal to N {\displaystyle N} . The minimal and maximal community sizes s min {\displaystyle s_{\min }} and s max {\displaystyle s_{\max }} must satisfy the definition of community so that every non-isolated node is in at least in one community: s min > k min {\displaystyle s_{\min }>k_{\min }} s max > k max {\displaystyle s_{\max }>k_{\max }} Step 4: Initially, no nodes are assigned to communities. Then, each node is randomly assigned to a community. As long as the number of neighboring nodes within the community does not exceed the community size a new node is added to the community, otherwise stays out. In the following iterations the “homeless” node is randomly assigned to some community. If that community is complete, i.e. the size is exhausted, a randomly selected node of that community must be unlinked. Stop the iteration when all the communities are complete and all the nodes belong to at least one community. Step 5: Implement rewiring of nodes keeping the same node degrees but only affecting the fraction of internal and external links such that the number of links outside the community for each node is approximately equal to the mixing parameter μ {\displaystyle \mu } . == Testing == Consider a partition into communities that do not overlap. The communities of randomly chosen nodes in each iteration follow a p ( C ) {\displaystyle p(C)} distribution that represents the probability that a randomly picked node is from the community C {\displaystyle C} . Consider a partition of the same network that was predicted by some community finding algorithm and has p ( C 2 ) {\displaystyle p(C_{2})} distribution. The benchmark partition has p ( C 1 ) {\displaystyle p(C_{1})} distribution. The joint distribution is p ( C 1 , C 2 ) {\displaystyle p(C_{1},C_{2})} . The similarity of these two partitions is captured by the normalized mutual information. I n = ∑ C 1 , C 2 p ( C 1 , C 2 ) log 2 ⁡ p ( C 1 , C 2 ) p ( C 1 ) p ( C 2 ) 1 2 H ( { p ( C 1 ) } ) + 1 2 H ( { p ( C 2 ) } ) {\displaystyle I_{n}={\frac {\sum _{C_{1},C_{2}}p(C_{1},C_{2})\log _{2}{\frac {p(C_{1},C_{2})}{p(C_{1})p(C_{2})}}}{{\frac {1}{2}}H(\{p(C_{1})\})+{\frac {1}{2}}H(\{p(C_{2})\})}}} If I n = 1 {\displaystyle I_{n}=1} the benchmark and the detected partitions are identical, and if I n = 0 {\displaystyle I_{n}=0} then they are independent of each other.

    Read more →
  • List of algorithms

    List of algorithms

    An algorithm is a fundamental set of rules or defined procedures that are typically designed and used to be a simpler way to solve a specific problem or a broad set of problems. Simply speaking, algorithms define different processes, sets of rules and regulations, or methodologies that are to be followed through in calculations, data processing, data mining, pattern recognition, automated reasoning or other problem-solving operations. With the increasing automation of services, more and more decisions are being made by algorithms. Some general examples are risk assessments, anticipatory policing, and pattern recognition technology. The following is a list of well-known algorithms. == Automated planning == == Combinatorial algorithms == === General combinatorial algorithms === Brent's algorithm: finds a cycle in function value iterations using only two iterators Floyd's cycle-finding algorithm: finds a cycle in function value iterations Gale–Shapley algorithm: solves the stable matching problem Pseudorandom number generators (uniformly distributed—see also List of pseudorandom number generators for other PRNGs with varying degrees of convergence and varying statistical quality): ACORN generator Blum Blum Shub Lagged Fibonacci generator Linear congruential generator Mersenne Twister === Graph algorithms === Blossom algorithm: algorithm for constructing maximum-cardinality matching on graphs. Coloring algorithm: algorithms for graph (vertex or edge) coloring (subject to constraints, e.g. proper coloring or list coloring) Hopcroft–Karp algorithm: convert a bipartite graph to a maximum-cardinality matching Hungarian algorithm: algorithm for finding a perfect matching Prüfer coding: conversion between a labeled tree and its Prüfer sequence Tarjan's off-line lowest common ancestors algorithm: computes lowest common ancestors for pairs of nodes in a tree Topological sort: finds linear order of nodes (e.g. jobs) based on their dependencies. ==== Graph drawing ==== Coin graph drawing algorithms for finite connected planar graphs (approximately computing the theoretical circle-packing given by the Koebe-Andreev-Thurston theorem). See also Fáry's theorem on straight-line drawings of planar graphs. Force-based algorithms (also known as force-directed algorithms or spring-based algorithms) Spectral layout ==== Network theory ==== Network analysis Link analysis Girvan–Newman algorithm: detect communities in complex systems Web link analysis Hyperlink-Induced Topic Search (HITS) (also known as Hubs and authorities) PageRank TrustRank Flow networks Dinic's algorithm: is a strongly polynomial algorithm for computing the maximum flow in a flow network. Edmonds–Karp algorithm: implementation of Ford–Fulkerson Ford–Fulkerson algorithm: computes the maximum flow in a graph Karger's algorithm: a Monte Carlo method to compute the minimum cut of a connected graph Push–relabel algorithm: computes a maximum flow in a graph ==== Routing for graphs ==== Edmonds' algorithm (also known as Chu–Liu/Edmonds' algorithm): find maximum or minimum branchings Euclidean minimum spanning tree: algorithms for computing the minimum spanning tree of a set of points in the plane Longest path problem: find a simple path of maximum length in a given graph Minimum spanning tree Borůvka's algorithm Kruskal's algorithm Prim's algorithm Reverse-delete algorithm Nonblocking minimal spanning switch say, for a telephone exchange Shortest path problem Bellman–Ford algorithm: computes shortest paths in a weighted graph (where some of the edge weights may be negative) Dijkstra's algorithm: computes shortest paths in a graph with non-negative edge weights Floyd–Warshall algorithm: solves the all pairs shortest path problem in a weighted, directed graph Johnson's algorithm: all pairs shortest path algorithm in sparse weighted directed graph Transitive closure problem: find the transitive closure of a given binary relation Traveling salesman problem Christofides algorithm Nearest neighbour algorithm Vehicle routing problem Clarke and Wright Saving algorithm Warnsdorff's rule: a heuristic method for solving the Knight's tour problem ==== Graph search ==== A: special case of best-first search that uses heuristics to improve speed B: a best-first graph search algorithm that finds the least-cost path from a given initial node to any goal node (out of one or more possible goals) Backtracking: abandons partial solutions when they are found not to satisfy a complete solution Beam search: is a heuristic search algorithm that is an optimization of best-first search that reduces its memory requirement Beam stack search: integrates backtracking with beam search Best-first search: traverses a graph in the order of likely importance using a priority queue Bidirectional search: find the shortest path from an initial vertex to a goal vertex in a directed graph Breadth-first search: traverses a graph level by level Brute-force search: an exhaustive and reliable search method, but computationally inefficient in many applications D: an incremental heuristic search algorithm Depth-first search: traverses a graph branch by branch Dijkstra's algorithm: a special case of A for which no heuristic function is used General Problem Solver: a seminal theorem-proving algorithm intended to work as a universal problem solver machine. Iterative deepening depth-first search (IDDFS): a state space search strategy Jump point search: an optimization to A which may reduce computation time by an order of magnitude using further heuristics Lexicographic breadth-first search (also known as Lex-BFS): a linear time algorithm for ordering the vertices of a graph SSS: state space search traversing a game tree in a best-first fashion similar to that of the A search algorithm Uniform-cost search: a tree search that finds the lowest-cost route where costs vary ==== Subgraphs ==== Cliques Bron–Kerbosch algorithm: a technique for finding maximal cliques in an undirected graph MaxCliqueDyn maximum clique algorithm: find a maximum clique in an undirected graph Strongly connected components Kosaraju's algorithm Path-based strong component algorithm Tarjan's strongly connected components algorithm Subgraph isomorphism problem === Sequence algorithms === ==== Approximate sequence matching ==== Bitap algorithm: fuzzy algorithm that determines if strings are approximately equal. Phonetic algorithms Daitch–Mokotoff Soundex: a Soundex refinement which allows matching of Slavic and Germanic surnames Double Metaphone: an improvement on Metaphone Match rating approach: a phonetic algorithm developed by Western Airlines Metaphone: an algorithm for indexing words by their sound, when pronounced in English NYSIIS: phonetic algorithm, improves on Soundex Soundex: a phonetic algorithm for indexing names by sound, as pronounced in English String metrics: computes a similarity or dissimilarity (distance) score between two pairs of text strings Damerau–Levenshtein distance: computes a distance measure between two strings, improves on Levenshtein distance Dice's coefficient (also known as the Dice coefficient): a similarity measure related to the Jaccard index Hamming distance: sum number of positions which are different Jaro–Winkler distance: is a measure of similarity between two strings Levenshtein edit distance: computes a metric for the amount of difference between two sequences Trigram search: search for text when the exact syntax or spelling of the target object is not precisely known ==== Selection algorithms ==== Introselect Quickselect ==== Sequence search ==== Linear search: locates an item in an unsorted sequence Selection algorithm: finds the kth largest item in a sequence Sorted lists Binary search algorithm: locates an item in a sorted sequence Eytzinger binary search: cache friendly binary search algorithm Fibonacci search technique: search a sorted sequence using a divide and conquer algorithm that narrows down possible locations with the aid of Fibonacci numbers Jump search (or block search): linear search on a smaller subset of the sequence Predictive search: binary-like search which factors in magnitude of search term versus the high and low values in the search. Sometimes called dictionary search or interpolated search. Uniform binary search: an optimization of the classic binary search algorithm Ternary search: a technique for finding the minimum or maximum of a function that is either strictly increasing and then strictly decreasing or vice versa ==== Sequence merging ==== k-way merge algorithm Simple merge algorithm Union (merge, with elements on the output not repeated) ==== Sequence permutations ==== Fisher–Yates shuffle (also known as the Knuth shuffle): randomly shuffle a finite set Heap's permutation generation algorithm: interchange elements to generate next permutation Schensted algorithm: constructs a pair of Young tableaux from a permutation Steinhaus–Johnson–Trotter algorithm (also known as the Johnson–Trotter algorithm):

    Read more →
  • Artificial Intelligence Applications Institute

    Artificial Intelligence Applications Institute

    The Artificial Intelligence Applications Institute (AIAI) at the School of Informatics at the University of Edinburgh is a non-profit technology transfer organisation that promoted research in the field of artificial intelligence. == History == The Artificial Intelligence Applications Institute (AIAI) was founded in 1983 at the University of Edinburgh as a specialist research and technology-transfer unit focusing on the practical uses of artificial intelligence (AI). The institute was established by Professor Jim Howe and colleagues from the Science and Engineering Research Council (SERC) Special Interest Group in AI in the Department of Artificial Intelligence, with a mission to apply AI techniques to solve real-world industrial and governmental problems. Under the directorship of Austin Tate, who served from 1985 to 2019, AIAI became one of the leading UK research centres devoted to AI programming systems, intelligent planning systems, decision support, and knowledge-based engineering. It collaborated with both academic partners and international organisations such as the European Space Agency and the UK Ministry of Defence. In 2001, AIAI joined the newly created Centre for Intelligent Systems and their Applications (CISA) within the University's School of Informatics. In December 2019, the institute was renamed the Artificial Intelligence and its Applications Institute to reflect a broader integration of fundamental and applied AI research. == Research programmes == AIAI’s research spans multiple areas of artificial intelligence, including: AI programming Systems - Edinburgh Prolog, Edinburgh Common Lisp, Logo; Knowledge representation and reasoning – development of ontologies, rule-based inference, and semantic modelling; Automated planning and scheduling – intelligent task management systems used in aerospace, manufacturing, and emergency response; Natural language processing and intelligent agents – interaction frameworks for human–computer collaboration; AI ethics and decision-making – research into responsible deployment and evaluation of autonomous systems. The institute also contributes to interdisciplinary fields such as computational creativity, explainable AI, and human–AI interaction. AIAI maintains close collaboration with the Bayes Centre and the Alan Turing Institute through joint research programmes and doctoral training initiatives. == Technology transfer and impact == From its inception, AIAI has combined academic research with technology-transfer activity, offering professional training, industrial consultancy, and bespoke software systems. It pioneered one of the earliest knowledge-based project-management systems, O-Plan, later evolved into the I-Plan framework used for autonomous planning and workflow management.

    Read more →
  • Ordered dithering

    Ordered dithering

    Ordered dithering is any image dithering algorithm which uses a pre-set threshold map tiled across an image. It is commonly used to display a continuous image on a display of smaller color depth. For example, Microsoft Windows uses it in 16-color graphics modes. With the most common "Bayer" threshold map, the algorithm is characterized by noticeable crosshatch patterns in the result. == Threshold map == The algorithm reduces the number of colors by applying a threshold map M to the pixels displayed, causing some pixels to change color, depending on the distance of the original color from the available color entries in the reduced palette. The first threshold maps were designed by hand to minimise the perceptual difference between a grayscale image and its two-bit quantisation for up to a 4x4 matrix. An optimal threshold matrix is one that for any possible quantisation of color has the minimum possible texture so that the greatest impression of the underlying feature comes from the image being quantised. It can be proven that for matrices whose side length is a power of two there is an optimal threshold matrix. The map may be rotated or mirrored without affecting the effectiveness of the algorithm. This threshold map (for sides with length as power of two) is also known as a Bayer matrix or, when unscaled, an index matrix. For threshold maps whose dimensions are a power of two, the map can be generated recursively via: M 2 n = 1 ( 2 n ) 2 [ 4 M n 4 M n + 2 J n 4 M n + 3 J n 4 M n + J n ] = J 2 ⊗ M n + 1 n 2 M 2 ⊗ J n , {\displaystyle \mathbf {M} _{2n}={\frac {1}{(2n)^{2}}}{\begin{bmatrix}4\mathbf {M} _{n}&4\mathbf {M} _{n}+2\mathbf {J} _{n}\\4\mathbf {M} _{n}+3\mathbf {J} _{n}&4\mathbf {M} _{n}+\mathbf {J} _{n}\end{bmatrix}}=\mathbf {J} _{2}\otimes \mathbf {M} _{n}+{\frac {1}{n^{2}}}\mathbf {M} _{2}\otimes \mathbf {J} _{n},} where J n {\displaystyle \mathbf {J} _{n}} are n × n {\displaystyle n\times n} matrices of ones and ⊗ {\displaystyle \otimes } is the Kronecker product. While the metric for texture that Bayer proposed could be used to find optimal matrices for sizes that are not a power of two, such matrices are uncommon as no simple formula for finding them exists, and relatively small matrix sizes frequently give excellent practical results (especially when combined with other modifications to the dithering algorithm). This function can also be expressed using only bit arithmetic: M(i, j) = bit_reverse(bit_interleave(bitwise_xor(i, j), i)) / n ^ 2 == Pre-calculated threshold maps == Rather than storing the threshold map as a matrix of n {\displaystyle n} × n {\displaystyle n} integers from 0 to n 2 {\displaystyle n^{2}} , depending on the exact hardware used to perform the dithering, it may be beneficial to pre-calculate the thresholds of the map into a floating point format, rather than the traditional integer matrix format shown above. For this, the following formula can be used: Mpre(i,j) = Mint(i,j) / n^2 This generates a standard threshold matrix. for the 2×2 map: this creates the pre-calculated map: Additionally, normalizing the values to average out their sum to 0 (as done in the dithering algorithm shown below) can be done during pre-processing as well by subtracting 1⁄2 of the largest value from every value: Mpre(i,j) = Mint(i,j) / n^2 – 0.5 maxValue creating the pre-calculated map: == Algorithm == The ordered dithering algorithm renders the image normally, but for each pixel, it offsets its color value with a corresponding value from the threshold map according to its location, causing the pixel's value to be quantized to a different color if it exceeds the threshold. For most dithering purposes, it is sufficient to simply add the threshold value to every pixel (without performing normalization by subtracting 1⁄2), or equivalently, to compare the pixel's value to the threshold: if the brightness value of a pixel is less than the number in the corresponding cell of the matrix, plot that pixel black, otherwise, plot it white. This lack of normalization slightly increases the average brightness of the image, and causes almost-white pixels to not be dithered. This is not a problem when using a gray scale palette (or any palette where the relative color distances are (nearly) constant), and it is often even desired, since the human eye perceives differences in darker colors more accurately than lighter ones, however, it produces incorrect results especially when using a small or arbitrary palette, so proper normalization should be preferred. In other words, the algorithm performs the following transformation on each color c of every pixel: c ′ = n e a r e s t _ p a l e t t e _ c o l o r ( c + r × ( M ( x mod n , y mod n ) − 1 / 2 ) ) {\displaystyle c'=\mathrm {nearest\_palette\_color} {\mathopen {}}\left(c+r\times \left(M(x{\bmod {n}},y{\bmod {n}})-1/2\right){\mathclose {}}\right)} where M(i, j) is the threshold map on the i-th row and j-th column, c′ is the transformed color, and r is the amount of spread in color space. Assuming an RGB palette with 23N evenly distanced colors where each color (a triple of red, green and blue values) is represented by an octet from 0 to 255, one would typically choose r ≈ 255 N {\textstyle r\approx {\frac {255}{N}}} . (1⁄2 is again the normalizing term.) Because the algorithm operates on single pixels and has no conditional statements, it is very fast and suitable for real-time transformations. Additionally, because the location of the dithering patterns always stays the same relative to the display frame, it is less prone to jitter than error-diffusion methods, making it suitable for animations. Because the patterns are more repetitive than error-diffusion method, an image with ordered dithering compresses better. Ordered dithering is more suitable for line-art graphics as it will result in straighter lines and fewer anomalies. The values read from the threshold map should preferably scale into the same range as the minimal difference between distinct colors in the target palette. Equivalently, the size of the map selected should be equal to or larger than the ratio of source colors to target colors. For example, when quantizing a 24 bpp image to 15 bpp (256 colors per channel to 32 colors per channel), the smallest map one would choose would be 4×2, for the ratio of 8 (256:32). This allows expressing each distinct tone of the input with different dithering patterns. === A variable palette: pattern dithering === == Non-Bayer approaches == The above thresholding matrix approach describes the Bayer family of ordered dithering algorithms. A number of other algorithms are also known; they generally involve changes in the threshold matrix, which changes the distribution of the "noise" introduced by all kinds of dithering (the difference between the original image and the dithered image). === Halftone === Halftone dithering performs a form of clustered dithering, creating a look similar to halftone patterns, using a specially crafted matrix. === Void and cluster === The Void and cluster algorithm uses a pre-generated blue noise as the matrix for the dithering process. The blue noise matrix keeps the Bayer's good high frequency content, but with a more uniform coverage of all the frequencies involved shows a much lower amount of patterning. The "voids-and-cluster" method gets its name from the matrix generation procedure, where a black image with randomly initialized white pixels is gaussian-blurred to find the brightest and darkest parts, corresponding to voids and clusters. After a few swaps have evenly distributed the bright and dark parts, the pixels are numbered by importance. It takes significant computational resources to generate the blue noise matrix: on a modern computer a 64×64 matrix requires a couple seconds using the original algorithm. This algorithm can be extended to make animated dither masks which also consider the axis of time. This is done by running the algorithm in three dimensions and using a kernel which is a product of a two-dimensional gaussian kernel on the XY plane, and a one-dimensional Gaussian kernel on the Z axis. === Simulated Annealing === Simulated annealing can generate dither masks by starting with a flat histogram and swapping values to optimize a loss function. The loss function controls the spectral properties of the mask, allowing it to make blue noise or noise patterns meant to be filtered by specific filters. The algorithm can also be extended over time for animated dither masks with chosen temporal properties.

    Read more →
  • Adaptive algorithm

    Adaptive algorithm

    An adaptive algorithm is an algorithm that changes its behavior at the time it is run, based on information available and on a priori defined reward mechanism (or criterion). Such information could be the story of recently received data, information on the available computational resources, or other run-time acquired (or a priori known) information related to the environment in which it operates. Among the most used adaptive algorithms is the Widrow-Hoff’s least mean squares (LMS), which represents a class of stochastic gradient-descent algorithms used in adaptive filtering and machine learning. In adaptive filtering the LMS is used to mimic a desired filter by finding the filter coefficients that relate to producing the least mean square of the error signal (difference between the desired and the actual signal). For example, stable partition, using no additional memory is O(n lg n) but given O(n) memory, it can be O(n) in time. As implemented by the C++ Standard Library, stable_partition is adaptive and so it acquires as much memory as it can get (up to what it would need at most) and applies the algorithm using that available memory. Another example is adaptive sort, whose behavior changes upon the presortedness of its input. An example of an adaptive algorithm in radar systems is the constant false alarm rate (CFAR) detector. In machine learning and optimization, many algorithms are adaptive or have adaptive variants, which usually means that the algorithm parameters such as learning rate are automatically adjusted according to statistics about the optimisation thus far (e.g. the rate of convergence). Examples include adaptive simulated annealing, adaptive coordinate descent, adaptive quadrature, AdaBoost, Adagrad, Adadelta, RMSprop, and Adam. In data compression, adaptive coding algorithms such as Adaptive Huffman coding or Prediction by partial matching can take a stream of data as input, and adapt their compression technique based on the symbols that they have already encountered. In signal processing, the Adaptive Transform Acoustic Coding (ATRAC) codec used in MiniDisc recorders is called "adaptive" because the window length (the size of an audio "chunk") can change according to the nature of the sound being compressed, to try to achieve the best-sounding compression strategy.

    Read more →
  • Controlled vocabulary

    Controlled vocabulary

    A controlled vocabulary provides a way to organize knowledge for subsequent retrieval. Controlled vocabularies are used in subject indexing schemes, subject headings, thesauri, taxonomies and other knowledge organization systems. Controlled vocabulary schemes mandate the use of predefined, preferred terms that have been preselected by the designers of the schemes, in contrast to natural language vocabularies, which have no such restriction. == In library and information science == In library and information science, controlled vocabulary is a carefully selected list of words and phrases, which are used to tag units of information (document or work) so that they may be more easily retrieved by a search. Controlled vocabularies solve the problems of homographs, synonyms and polysemes by a bijection between concepts and preferred terms. In short, controlled vocabularies reduce unwanted ambiguity inherent in normal human languages where the same concept can be given different names and ensure consistency. For example, in the Library of Congress Subject Headings (a subject heading system that uses a controlled vocabulary), preferred terms—subject headings in this case—have to be chosen to handle choices between variant spellings of the same word (American versus British), choice among scientific and popular terms (cockroach versus Periplaneta americana), and choices between synonyms (automobile versus car), among other difficult issues. Choices of preferred terms are based on the principles of user warrant (what terms users are likely to use), literary warrant (what terms are generally used in the literature and documents), and structural warrant (terms chosen by considering the structure, scope of the controlled vocabulary). Controlled vocabularies also typically handle the problem of homographs with qualifiers. For example, the term pool has to be qualified to refer to either swimming pool or the game pool to ensure that each preferred term or heading refers to only one concept. === Types used in libraries === There are two main kinds of controlled vocabulary tools used in libraries: subject headings and thesauri. While the differences between the two are diminishing, there are still some minor differences: Historically, subject headings were designed to describe books in library catalogs by catalogers while thesauri were used by indexers to apply index terms to documents and articles. Subject headings tend to be broader in scope describing whole books, while thesauri tend to be more specialized covering very specific disciplines. Because of the card catalog system, subject headings tend to have terms that are in indirect order (though with the rise of automated systems this is being removed), while thesaurus terms are always in direct order. Subject headings tend to use more pre-coordination of terms such that the designer of the controlled vocabulary will combine various concepts together to form one preferred subject heading. (e.g., children and terrorism) while thesauri tend to use singular direct terms. Thesauri list not only equivalent terms but also narrower, broader terms and related terms among various preferred and non-preferred (but potentially synonymous) terms, while historically most subject headings did not. For example, the Library of Congress Subject Heading itself did not have much syndetic structure until 1943, and it was not until 1985 when it began to adopt the thesauri type term "Broader term" and "Narrow term". The terms are chosen and organized by trained professionals (including librarians and information scientists) who possess expertise in the subject area. Controlled vocabulary terms can accurately describe what a given document is actually about, even if the terms themselves do not occur within the document's text. Well known subject heading systems include the Library of Congress system, Medical Subject Headings (MeSH) created by the United States National Library of Medicine, and Sears. Well known thesauri include the Art and Architecture Thesaurus and the ERIC Thesaurus. When selecting terms for a controlled vocabulary, the designer has to consider the specificity of the term chosen, whether to use direct entry, inter consistency and stability of the language. Lastly the amount of pre-coordination (in which case the degree of enumeration versus synthesis becomes an issue) and post-coordination in the system is another important issue. Controlled vocabulary elements (terms/phrases) employed as tags, to aid in the content identification process of documents, or other information system entities (e.g. DBMS, Web Services) qualifies as metadata. == Indexing languages == There are three main types of indexing languages. Controlled indexing language – only approved terms can be used by the indexer to describe the document Natural language indexing language – any term from the document in question can be used to describe the document Free indexing language – any term (not only from the document) can be used to describe the document When indexing a document, the indexer also has to choose the level of indexing exhaustivity, the level of detail in which the document is described. For example, using low indexing exhaustivity, minor aspects of the work will not be described with index terms. In general the higher the indexing exhaustivity, the more terms indexed for each document. In recent years free text search as a means of access to documents has become popular. This involves using natural language indexing with an indexing exhaustively set to maximum (every word in the text is indexed). These methods have been compared in some studies, such as the 2007 article, "A Comparative Evaluation of Full-text, Concept-based, and Context-sensitive Search". === Advantages === Controlled vocabularies are often claimed to improve the accuracy of free text searching, such as to reduce irrelevant items in the retrieval list. These irrelevant items (false positives) are often caused by the inherent ambiguity of natural language. Take the English word football for example. Football is the name given to a number of different team sports. Worldwide the most popular of these team sports is association football, which also happens to be called soccer in several countries. The word football is also applied to rugby football (rugby union and rugby league), American football, Australian rules football, Gaelic football, and Canadian football. A search for football therefore will retrieve documents that are about several completely different sports. Controlled vocabulary solves this problem by tagging the documents in such a way that the ambiguities are eliminated. Compared to free text searching, the use of a controlled vocabulary can dramatically increase the performance of an information retrieval system, if performance is measured by precision (the percentage of documents in the retrieval list that are actually relevant to the search topic). In some cases controlled vocabulary can enhance recall as well, because unlike natural language schemes, once the correct preferred term is searched, there is no need to search for other terms that might be synonyms of that term. === Disadvantages === A controlled vocabulary search may lead to unsatisfactory recall, in that it will fail to retrieve some documents that are actually relevant to the search question. This is particularly problematic when the search question involves terms that are sufficiently tangential to the subject area such that the indexer might have decided to tag it using a different term (but the searcher might consider the same). Essentially, this can be avoided only by an experienced user of controlled vocabulary whose understanding of the vocabulary coincides with that of the indexer. Another possibility is that the article is just not tagged by the indexer because indexing exhaustivity is low. For example, an article might mention football as a secondary focus, and the indexer might decide not to tag it with "football" because it is not important enough compared to the main focus. But it turns out that for the searcher that article is relevant and hence recall fails. A free text search would automatically pick up that article regardless. On the other hand, free text searches have high exhaustivity (every word is searched) so although it has much lower precision, it has potential for high recall as long as the searcher overcome the problem of synonyms by entering every combination. Controlled vocabularies may become outdated rapidly in fast developing fields of knowledge, unless the preferred terms are updated regularly. Even in an ideal scenario, a controlled vocabulary is often less specific than the words of the text itself. Indexers trying to choose the appropriate index terms might misinterpret the author, while this precise problem is not a factor in a free text, as it uses the author's own words. The use of controlled vocabularies can be costly compared to free

    Read more →
  • Zassenhaus algorithm

    Zassenhaus algorithm

    In mathematics, the Zassenhaus algorithm is a method to calculate a basis for the intersection and sum of two subspaces of a vector space. It is named after Hans Zassenhaus, but no publication of this algorithm by him is known. It is used in computer algebra systems. == Algorithm == === Input === Let V be a vector space and U, W two finite-dimensional subspaces of V with the following spanning sets: U = ⟨ u 1 , … , u n ⟩ {\displaystyle U=\langle u_{1},\ldots ,u_{n}\rangle } and W = ⟨ w 1 , … , w k ⟩ . {\displaystyle W=\langle w_{1},\ldots ,w_{k}\rangle .} Finally, let B 1 , … , B m {\displaystyle B_{1},\ldots ,B_{m}} be linearly independent vectors so that u i {\displaystyle u_{i}} and w i {\displaystyle w_{i}} can be written as u i = ∑ j = 1 m a i , j B j {\displaystyle u_{i}=\sum _{j=1}^{m}a_{i,j}B_{j}} and w i = ∑ j = 1 m b i , j B j . {\displaystyle w_{i}=\sum _{j=1}^{m}b_{i,j}B_{j}.} === Output === The algorithm computes the base of the sum U + W {\displaystyle U+W} and a base of the intersection U ∩ W {\displaystyle U\cap W} . === Algorithm === The algorithm creates the following block matrix of size ( ( n + k ) × ( 2 m ) ) {\displaystyle ((n+k)\times (2m))} : ( a 1 , 1 a 1 , 2 ⋯ a 1 , m a 1 , 1 a 1 , 2 ⋯ a 1 , m ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ a n , 1 a n , 2 ⋯ a n , m a n , 1 a n , 2 ⋯ a n , m b 1 , 1 b 1 , 2 ⋯ b 1 , m 0 0 ⋯ 0 ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ b k , 1 b k , 2 ⋯ b k , m 0 0 ⋯ 0 ) {\displaystyle {\begin{pmatrix}a_{1,1}&a_{1,2}&\cdots &a_{1,m}&a_{1,1}&a_{1,2}&\cdots &a_{1,m}\\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\a_{n,1}&a_{n,2}&\cdots &a_{n,m}&a_{n,1}&a_{n,2}&\cdots &a_{n,m}\\b_{1,1}&b_{1,2}&\cdots &b_{1,m}&0&0&\cdots &0\\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\b_{k,1}&b_{k,2}&\cdots &b_{k,m}&0&0&\cdots &0\end{pmatrix}}} Using elementary row operations, this matrix is transformed to the row echelon form. Then, it has the following shape: ( c 1 , 1 c 1 , 2 ⋯ c 1 , m ∙ ∙ ⋯ ∙ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ c q , 1 c q , 2 ⋯ c q , m ∙ ∙ ⋯ ∙ 0 0 ⋯ 0 d 1 , 1 d 1 , 2 ⋯ d 1 , m ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ 0 0 ⋯ 0 d ℓ , 1 d ℓ , 2 ⋯ d ℓ , m 0 0 ⋯ 0 0 0 ⋯ 0 ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ 0 0 ⋯ 0 0 0 ⋯ 0 ) {\displaystyle {\begin{pmatrix}c_{1,1}&c_{1,2}&\cdots &c_{1,m}&\bullet &\bullet &\cdots &\bullet \\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\c_{q,1}&c_{q,2}&\cdots &c_{q,m}&\bullet &\bullet &\cdots &\bullet \\0&0&\cdots &0&d_{1,1}&d_{1,2}&\cdots &d_{1,m}\\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\0&0&\cdots &0&d_{\ell ,1}&d_{\ell ,2}&\cdots &d_{\ell ,m}\\0&0&\cdots &0&0&0&\cdots &0\\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\0&0&\cdots &0&0&0&\cdots &0\end{pmatrix}}} Here, ∙ {\displaystyle \bullet } stands for arbitrary numbers, and the vectors ( c p , 1 , c p , 2 , … , c p , m ) {\displaystyle (c_{p,1},c_{p,2},\ldots ,c_{p,m})} for every p ∈ { 1 , … , q } {\displaystyle p\in \{1,\ldots ,q\}} and ( d p , 1 , … , d p , m ) {\displaystyle (d_{p,1},\ldots ,d_{p,m})} for every p ∈ { 1 , … , ℓ } {\displaystyle p\in \{1,\ldots ,\ell \}} are nonzero. Then ( y 1 , … , y q ) {\displaystyle (y_{1},\ldots ,y_{q})} with y i := ∑ j = 1 m c i , j B j {\displaystyle y_{i}:=\sum _{j=1}^{m}c_{i,j}B_{j}} is a basis of U + W {\displaystyle U+W} and ( z 1 , … , z ℓ ) {\displaystyle (z_{1},\ldots ,z_{\ell })} with z i := ∑ j = 1 m d i , j B j {\displaystyle z_{i}:=\sum _{j=1}^{m}d_{i,j}B_{j}} is a basis of U ∩ W {\displaystyle U\cap W} . === Proof of correctness === First, we define π 1 : V × V → V , ( a , b ) ↦ a {\displaystyle \pi _{1}:V\times V\to V,(a,b)\mapsto a} to be the projection to the first component. Let H := { ( u , u ) ∣ u ∈ U } + { ( w , 0 ) ∣ w ∈ W } ⊆ V × V . {\displaystyle H:=\{(u,u)\mid u\in U\}+\{(w,0)\mid w\in W\}\subseteq V\times V.} Then π 1 ( H ) = U + W {\displaystyle \pi _{1}(H)=U+W} and H ∩ ( 0 × V ) = 0 × ( U ∩ W ) {\displaystyle H\cap (0\times V)=0\times (U\cap W)} . Also, H ∩ ( 0 × V ) {\displaystyle H\cap (0\times V)} is the kernel of π 1 | H {\displaystyle {\pi _{1}|}_{H}} , the projection restricted to H. Therefore, dim ⁡ ( H ) = dim ⁡ ( U + W ) + dim ⁡ ( U ∩ W ) {\displaystyle \dim(H)=\dim(U+W)+\dim(U\cap W)} . The Zassenhaus algorithm calculates a basis of H. In the first m columns of this matrix, there is a basis y i {\displaystyle y_{i}} of U + W {\displaystyle U+W} . The rows of the form ( 0 , z i ) {\displaystyle (0,z_{i})} (with z i ≠ 0 {\displaystyle z_{i}\neq 0} ) are obviously in H ∩ ( 0 × V ) {\displaystyle H\cap (0\times V)} . Because the matrix is in row echelon form, they are also linearly independent. All rows which are different from zero ( ( y i , ∙ ) {\displaystyle (y_{i},\bullet )} and ( 0 , z i ) {\displaystyle (0,z_{i})} ) are a basis of H, so there are dim ⁡ ( U ∩ W ) {\displaystyle \dim(U\cap W)} such z i {\displaystyle z_{i}} s. Therefore, the z i {\displaystyle z_{i}} s form a basis of U ∩ W {\displaystyle U\cap W} . == Example == Consider the two subspaces U = ⟨ ( 1 − 1 0 1 ) , ( 0 0 1 − 1 ) ⟩ {\displaystyle U=\left\langle \left({\begin{array}{r}1\\-1\\0\\1\end{array}}\right),\left({\begin{array}{r}0\\0\\1\\-1\end{array}}\right)\right\rangle } and W = ⟨ ( 5 0 − 3 3 ) , ( 0 5 − 3 − 2 ) ⟩ {\displaystyle W=\left\langle \left({\begin{array}{r}5\\0\\-3\\3\end{array}}\right),\left({\begin{array}{r}0\\5\\-3\\-2\end{array}}\right)\right\rangle } of the vector space R 4 {\displaystyle \mathbb {R} ^{4}} . Using the standard basis, we create the following matrix of dimension ( 2 + 2 ) × ( 2 ⋅ 4 ) {\displaystyle (2+2)\times (2\cdot 4)} : ( 1 − 1 0 1 1 − 1 0 1 0 0 1 − 1 0 0 1 − 1 5 0 − 3 3 0 0 0 0 0 5 − 3 − 2 0 0 0 0 ) . {\displaystyle \left({\begin{array}{rrrrrrrr}1&-1&0&1&&1&-1&0&1\\0&0&1&-1&&0&0&1&-1\\\\5&0&-3&3&&0&0&0&0\\0&5&-3&-2&&0&0&0&0\end{array}}\right).} Using elementary row operations, we transform this matrix into the following matrix: ( 1 0 0 0 ∙ ∙ ∙ ∙ 0 1 0 − 1 ∙ ∙ ∙ ∙ 0 0 1 − 1 ∙ ∙ ∙ ∙ 0 0 0 0 1 − 1 0 1 ) {\displaystyle \left({\begin{array}{rrrrrrrrr}1&0&0&0&&\bullet &\bullet &\bullet &\bullet \\0&1&0&-1&&\bullet &\bullet &\bullet &\bullet \\0&0&1&-1&&\bullet &\bullet &\bullet &\bullet \\\\0&0&0&0&&1&-1&0&1\end{array}}\right)} (Some entries have been replaced by " ∙ {\displaystyle \bullet } " because they are irrelevant to the result.) Therefore ( ( 1 0 0 0 ) , ( 0 1 0 − 1 ) , ( 0 0 1 − 1 ) ) {\displaystyle \left(\left({\begin{array}{r}1\\0\\0\\0\end{array}}\right),\left({\begin{array}{r}0\\1\\0\\-1\end{array}}\right),\left({\begin{array}{r}0\\0\\1\\-1\end{array}}\right)\right)} is a basis of U + W {\displaystyle U+W} , and ( ( 1 − 1 0 1 ) ) {\displaystyle \left(\left({\begin{array}{r}1\\-1\\0\\1\end{array}}\right)\right)} is a basis of U ∩ W {\displaystyle U\cap W} .

    Read more →
  • PenTile matrix family

    PenTile matrix family

    PenTile matrix is a family of patented subpixel matrix schemes used in electronic device displays. PenTile is a trademark of Samsung. PenTile matrices are used in AMOLED and LCD displays. These subpixel layouts are specifically designed to operate with proprietary algorithms for subpixel rendering embedded in the display driver, allowing plug and play compatibility with conventional RGB (Red-Green-Blue) stripe panels. == Overview == "PenTile Matrix" (a neologism from penta-, meaning "five" in Greek and tile) describes the geometric layout of the prototypical subpixel arrangement developed in the early 1990s. The layout consists of a quincunx comprising two red subpixels, two green subpixels, and one central blue subpixel in each unit cell. It was inspired by biomimicry of the human retina, which has nearly equal numbers of L and M type cone cells, but significantly fewer S cones. As the S cones are primarily responsible for perceiving blue colors, which do not appreciably affect the perception of luminance, reducing the number of blue subpixels with respect to the red and green subpixels in a display does not reduce the image quality. However, the layout may cause color leakage image distortion, which can be reduced by filters. In some cases the layout causes reduced moiré and blockiness compared to conventional RGB layouts. The PenTile layout is specifically designed to work with and be dependent upon subpixel rendering that uses only one and a quarter subpixel per pixel, on average, to render an image. That is, that any given input pixel is mapped to either a red-centered logical pixel, or a green-centered logical pixel. === History === PenTile was invented by Candice H. Brown Elliott, for which she was awarded the Society for Information Display's Otto Schade Prize in 2014. The technology was licensed by the company Clairvoyante from 2000 until 2008, during which time several prototype PenTile displays were developed by a number of Asian liquid crystal display (LCD) manufacturers. In March 2008, Samsung Electronics acquired Clairvoyante's PenTile IP assets. Samsung then funded a new company, Nouvoyance, Inc. to continue development of the PenTile technology. == PenTile RGBG == PenTile RGBG layout used in AMOLED and plasma displays uses green pixels interleaved with alternating red and blue pixels. The human eye is most sensitive to green, especially for high resolution luminance information. The green subpixels are mapped to input pixels on a one-to-one basis. The red and blue subpixels are subsampled, reconstructing the chroma signal at a lower resolution. The luminance signal is processed using adaptive subpixel rendering filters to optimize reconstruction of high spatial frequencies from the input image, wherein the green subpixels provide the majority of the reconstruction. The red and blue subpixels are capable of reconstructing the horizontal and vertical spatial frequencies, but not the highest of the diagonal. Diagonal high spatial frequency information in the red and blue channels of the input image are transferred to the green subpixels for image reconstruction. Thus the RG-BG scheme creates a color display with one third fewer subpixels than a traditional RGB-RGB scheme but with the same measured luminance display resolution. This is similar to the Bayer filter commonly used in digital cameras. === Devices === As of 2021, "almost all" OLED screens in portable consumer devices use some form of Pentile subpixel layout. == PenTile RGBW == PenTile RGBW technology, used in LCD, adds an extra subpixel to the traditional red, green and blue subpixels that is a clear area without color filtering material and with the only purpose of letting backlight come through, hence W for white. This makes it possible to produce a brighter image compared to an RGB-matrix while using the same amount of power, or produce an equally bright image while using less power. The PenTile RGBW layout uses each red, green, blue and white subpixel to present high-resolution luminance information to the human eyes' red-sensing and green-sensing cone cells, while using the combined effect of all the color subpixels to present lower-resolution chroma (color) information to all three cone cell types. Combined, this optimizes the match of display technology to the biological mechanisms of human vision. The layout uses one third fewer subpixels for the same resolution as the RGB stripe (RGB-RGB) layout, in spite of having four color primaries instead of the conventional three, using subpixel rendering combined with metamer rendering. Metamer rendering optimizes the energy distribution between the white subpixel and the combined red, green, and blue subpixels: W <> RGB, to improve image sharpness. The display driver chip has an RGB to RGBW color vector space converter and gamut mapping algorithm, followed by metamer and subpixel rendering algorithms. In order to maintain saturated color quality, to avoid simultaneous contrast error between saturated colors and peak white brightness, while simultaneously reducing backlight power requirements, the display backlight brightness is under control of the PenTile driver engine. When the image is mostly desaturated colors, those near white or grey, the backlight brightness is significantly reduced, often to less than 50% peak, while the LCD levels are increased to compensate. When the image has very bright saturated colors, the backlight brightness is maintained at higher levels. The PenTile RGBW also has an optional high-brightness mode that doubles the brightness of the desaturated color image areas, such as black-and-white text, for improved outdoor viewability. === Devices === Motorola MC65 Motorola ES55 Motorola ES400 Motorola Atrix 4G Samsung Galaxy Note 10.1 2014 version Lenovo Yoga 2 Pro Lenovo Yoga 3 Pro HP ENVY TouchSmart 14-k022tx Sleekbook MSI GS60 Ghost Pro 4K Lenovo IdeaPad Y50 4K Asus ZenBook UX303LN 4K Asus ZenBook Pro UX501JW LG UH7500/6500/6100 LG ThinQ G7/G7+ Oculus Quest 1 == Controversy == An ongoing controversy regarding the definition or measurement of resolution of color subpixelated flat panel displays led many people to question the resolution claims of PenTile display products. Journalists have noted that in "just about every flat-panel TV in existence, each pixel is composed of one red, one green, and one blue subpixel (RGB), all of uniform size". In traditional flat-panel screens, the resolution is defined by the number of red, green, and blue subpixels, in groups of three, in an array in each axis. As a result, each pixel or group of subpixels can render any colour on the screen, regardless of neighbouring pixels. This is not the case with PenTile screens. The Video Electronics Standards Association (VESA) method of measuring and defining resolution in color displays is to measure the contrast of line pairs, requiring a minimum of 50% Michelson contrast for displays intended for rendering text. The developers of PenTile displays use this VESA criterion for contrast of line pairs to calculate the resolutions specified. In the RGBG layout the alternate red and blue subpixels are 'shared' or sub-sampled with neighboring pixels. Due to the one third lower subpixel density on PenTile displays the pixel structure may be more visible when compared to RGB stripe displays with the same pixel density. The loss of subpixels for a given resolution specification has led some journalists to describe the use of PenTile as "shady practice" and "sort of cheating". For a given size and resolution specification, the PenTile screen can appear grainy, pixelated, speckled, with blurred text on some saturated colors and backgrounds when compared to RGB stripe color. This effect is understood to be caused by the restriction of the number of subpixels that may participate in the image reconstruction when colors are highly saturated to primaries. In the RGBW case, this is caused as the W subpixel will not be available in order to maintain the saturated color. In the RGBG case, this effect will occur when the color boundary is primarily red or blue, as the fully populated (one green per pixel) sub-pixel cannot contribute. For all other cases, text and especially full color images are effectively reconstructed. == Advantages and disadvantages == The PenTile layout reduces the number of subpixels needed to create a specified resolution. Consequently it is possible to achieve an HD resolution on a PenTile AMOLED screen at lower cost than other technologies, and most reviewers note that "300 ppi" (as per VESA - not full pixels) resolution displays (such as Samsung Galaxy S III) make the PenTile effect less obvious than lower resolution PenTile displays (Droid Razr). The second advantage is lower power consumption: the HTC One S's use of a PenTile display makes it more energy efficient and thinner than equivalent LCD screens, giving it better battery life than the HTC One X's IPS LCD. A PenTile AMOLED screen is also

    Read more →
  • MarkLogic Server

    MarkLogic Server

    MarkLogic Server is a document-oriented database developed by MarkLogic. It is a NoSQL multi-model database that evolved from an XML database to natively store JSON documents and RDF triples, the data model for semantics. MarkLogic is designed to be a data hub for operational and analytical data. == History == MarkLogic Server was built to address shortcomings with existing search and data products. The product first focused on using XML as the document markup standard and XQuery as the query standard for accessing collections of documents up to hundreds of terabytes in size. Currently the MarkLogic platform is widely used in publishing, government, finance and other sectors. MarkLogic's customers are mostly Global 2000 companies. == Technology == MarkLogic uses documents without upfront schemas to maintain a flexible data model. In addition to having a flexible data model, MarkLogic uses a distributed, scale-out architecture that can handle hundreds of billions of documents and hundreds of terabytes of data. It has received Common Criteria certification, and has high availability and disaster recovery. MarkLogic is designed to run on-premises and within public or private cloud environments like Amazon Web Services. == Features == Indexing MarkLogic indexes the content and structure of documents including words, phrases, relationships, and values in over 200 languages with tokenization, collation, and stemming for core languages. Functionality includes the ability to toggle range indexes, geospatial indexes, the RDF triple index, and reverse indexes on or off based on your data, the kinds of queries that you will run, and your desired performance. Full-text search MarkLogic supports search across its data and metadata using a word or phrase and incorporates Boolean logic, stemming, wildcards, case sensitivity, punctuation sensitivity, diacritic sensitivity, and search term weighting. Data can be searched using JavaScript, XQuery, SPARQL, and SQL. Semantics MarkLogic uses RDF triples to provide semantics for ease of storing metadata and querying. ACID Unlike other NoSQL databases, MarkLogic maintains ACID consistency for transactions. Replication MarkLogic provides high availability with replica sets. Scalability MarkLogic scales horizontally using sharding. MarkLogic can run over multiple servers, balancing the load or replicating data to keep the system up and running in the event of hardware failure. Security MarkLogic has built in security features such as element-level permissions and data redaction. Optic API for Relational Operations An API that lets developers view their data as documents, graphs or rows. Security MarkLogic provides redaction, encryption, and element-level security (allowing for control on read and write rights on parts of a document). == Applications == Banking Big Data Fraud prevention Insurance Claims Management and Underwriting Master data management Recommendation engines == Licensing == MarkLogic is available under various licensing and delivery models, namely a free Developer or an Essential Enterprise license.[3] Licenses are available from MarkLogic or directly from cloud marketplaces such as Amazon Web Services and Microsoft Azure. == Releases == 2001 – Cerisent XQE 1: ACID transactions, Full-text search, XML Storage, XQuery, Role-based security 2004 – Cerisent XQE 2: Scale-out architecture, Enhanced search (stemming, thesaurus, wildcard), Backup and restore 2005 – MarkLogic Server 3: Continuing search improvements, Content Processing Framework (including PDF, Word, Excel, PPT), Failover 2008 – MarkLogic Server 4: Geospatial search, entity extraction, advanced XQuery, performance, scalability enhancements, auditing 2011 – MarkLogic Server 5: Flexible replication / DDIL, real-time indexing, advanced search, improved analytics, concurrency enhancements 2012 – MarkLogic Server 6: REST and Java APIs, App Builder, enhanced UI, improved search 2013 – MarkLogic Server 7: Semantic graph, bitemporal data, tiered storage, improved search, better management 2015 – MarkLogic Server 8: A Native JSON storage, Server-side JavaScript, Bitemporal, Node.js client API, Incremental backup, Flexible replication[16] 2017 – MarkLogic Server 9: Data integration across Relational and Non-Relational data, Advanced Encryption, Element Level Security, Redaction 2019 – MarkLogic Server 10: Enhanced Data Hub, improved SQL, security, analytics performance, cloud support 2022 – MarkLogic Server 11: MarkLogic Ops Director (Monitoring and Administration Improvements), expanded PKI 2025 – MarkLogic Server 12: Generative AI and Native Vector Search, Graph Algorithm Support, Virtual TDEs (relational views on the fly)

    Read more →
  • Informationist

    Informationist

    An informationist (or information specialist in context) provides research and knowledge management services in the context of clinical care or biomedical research. Although there is no one educational pathway or formalized set of skills or knowledge for informationists, one way to think of the informationist is as one who possesses the knowledge and skill of a medical librarian with extensive research specialization and some formal clinical or public health education that goes beyond on-the-job osmosis. Medical librarians and other biomedical professional organizations have been exploring the possibilities for evaluating how informationists are being used and whether their activities supplement or replace medical library activity. More generally, an informationist is a professional who works with information within a particular business, analytic or scientific context to drive toward outcomes based on evidence, analysis, prediction and execution. For example, an extension of the term is increasingly emerging in financial services, life sciences and health care industries. Though still nascently in use, its adoption applies to individuals with extensive industry expertise, acute familiarity with organizational structures and processes, deep domain level information mastery and information systems technical savvy. Informationists in this context support transformational initiatives within and across functional areas of an enterprise as architects, governance experts, continuous improvement advocates and strategists. == Background == The term was proposed in 2000 by Davidoff & Florance. Their editorial suggested that physicians should be delegating their information needs to informationists, just as they currently order CT scans from radiologists or cardiac catheterizations from cardiologists. They conceived of an information professional who was embedded in (and indeed, supported by) the clinical departments. Supporters of the concept see it as a means for librarians to reinvigorate connections with the faculty/clinicians, as well as provide superior service by dint of informationists' biomedical training. Critics complained that the idea is nothing new; librarians already provide in-depth, high quality information services and clinical medical librarians have been working alongside physicians, nurses and other clinicians for years. Large informationist programs in the U.S. exist at the National Institutes of Health and at Vanderbilt University. Welch Medical Library at Johns Hopkins University (JHU) is developing an informationist service model in which its 10 clinical and public health librarians are moving from serving as liaison librarians for assigned departments toward becoming embedded informationists within their departments. To prepare for the embedded informationist role, librarians are undertaking education as needed to supplement their backgrounds. For example, librarians bring experience in clinical behavior counseling, public health, nursing, and more. Informationist training can then focus upon filling gaps in research methods knowledge more so than on gaining additional knowledge in the librarian's area of expertise. Courses, seminars and workshops being undertaken include those covering systematic reviews, evidence-based medicine, critical appraisal, medical language, anatomy and physiology, biostatistics, and clinical research. The term informationist is related to that of informatician—also informaticist—and many informationists do possess skills in clinical topics, bioinformatics, and biomedical informatics. Harvard University, the University of Pittsburgh, and Washington University in St. Louis are examples of institutional libraries which have hired PhD-level scientists (who may or may not have library degrees) to provide informatics support for biomedical research.

    Read more →
  • Artificial intelligence in government

    Artificial intelligence in government

    Artificial intelligence (AI) has a range of uses in government. It can be used to further public policy objectives (in areas such as emergency services, health and welfare), as well as assist the public to interact with the government (through the use of virtual assistants, for example). According to the Harvard Business Review, "Applications of artificial intelligence to the public sector are broad and growing, with early experiments taking place around the world." Hila Mehr from the Ash Center for Democratic Governance and Innovation at Harvard University notes that AI in government is not new, with postal services using machine methods in the late 1990s to recognise handwriting on envelopes to automatically route letters. The use of AI in government comes with significant benefits, including efficiencies resulting in cost savings (for instance by reducing the number of front office staff) and reducing the opportunities for corruption. However, it also carries risks (described below). == Uses of AI in government == The potential uses of AI in government are wide and varied, with Deloitte considering that "Cognitive technologies could eventually revolutionize every facet of government operations". Mehr suggests that six types of government problems are appropriate for AI applications: Resource allocation—such as where administrative support is required to complete tasks more quickly. Large datasets—where these are too large for employees to work efficiently and multiple datasets could be combined to provide greater insights. Expert shortage—including where basic questions could be answered and niche issues can be learned. Predictable scenario—historical data makes the situation predictable. Procedural tasks refer to repetitive tasks in which the answers to inputs or outputs are binary. Diverse data—where data takes various forms (such as visual and linguistic) and needs to be summarized regularly. Mehr states that "While applications of AI in government work have not kept pace with the rapid expansion of AI in the private sector, the potential use cases in the public sector mirror common applications in the private sector." Potential and actual uses of AI in government can be divided into three broad categories: those that contribute to public policy objectives, those that assist public interactions with the government, and other uses. === Contributing to public policy objectives === There are a range of examples of where AI can contribute to public policy objectives. These include: Receiving benefits at job loss, retirement, bereavement and child birth almost immediately, in an automated way (thus without requiring any actions from citizens at all) Social insurance service provision Classifying emergency calls based on their urgency (like the system used by the Cincinnati Fire Department in the United States) Detecting and preventing the spread of diseases Assisting public servants in making welfare payments and immigration decisions Adjudicating bail hearings Triaging health care cases Monitoring social media for public feedback on policies Monitoring social media to identify emergency situations Identifying fraudulent benefits claims Predicting a crime and recommending optimal police presence Predicting traffic congestion and car accidents Anticipating road maintenance requirements Identifying breaches of health regulations Providing personalised education to students Marking exam papers Assisting with defence and national security (see Artificial intelligence § Military and Applications of artificial intelligence § Other fields in which AI methods are implemented respectively) Artificial Intelligence in China has been used to drive both political and economic markets. In 2019, Shanghai’s government rolled out 100 billion yuan to assist in funding enterprises that used AI to introduce 22 new policy agendas. Shanghai invested in these enterprises to attract top international talent in order to set up the Shanghai Municipal Big Data Center. City Brain AI is an urban management platform made by Alibaba. China uses City Brain AI to maintain a significant share of capital investment through public and state owned enterprises. The synergy between public and private sectors are more than capital-driven with City Brain AI. The blend of both public and private shareholding is only made out to be through the role of provincial and sub-provincial governments. Both hold control over the direction that City Brain AI makes both socially and economically. === Assisting public interactions with government === AI can be used to assist members of the public to interact with government and access government services, for example by: Answering questions using virtual assistants or chatbots (see below) Directing requests to the appropriate area within government Filling out forms Assisting with searching documents (e.g. IP Australia's trade mark search) Scheduling appointments Various governments, including those of Australia and Estonia, have implemented virtual assistants to aid citizens in navigating services, with applications ranging from tax inquiries to life-event registrations. === Gerrymandering === Gerrymandering is a method of influencing political process by drawing map boundaries in favor of incumbent parties. Academic researchers Wendy Tam Cho and Bruce Cain have proposed partially automating the map-drawing process with an AI system to reduce partisan gerrymandering. Even with this AI system, the process may still be manipulated to favor partisan interests, so the researchers emphasized the importance of transparency and human involvement. === Other uses === Other uses of AI in government include: Translation Language interpretation pioneered by the European Commission's Directorate General for Interpretation and Florika Fink-Hooijer. Drafting documents == Potential benefits == AI offers potential efficiencies and cost savings for the government. For example, Deloitte has estimated that automation could save US Government employees between 96.7 million to 1.2 billion hours a year, resulting in potential savings of between $3.3 billion to $41.1 billion a year. The Harvard Business Review has stated that while this may lead a government to reduce employee numbers, "Governments could instead choose to invest in the quality of its services. They can re-employ workers' time towards more rewarding work that requires lateral thinking, empathy, and creativity—all things at which humans continue to outperform even the most sophisticated AI program." == Risks == Risks associated with the use of AI in government include AI becoming susceptible to bias, a lack of transparency in how an AI application may make decisions, and the accountability for any such decisions. For example, a 2026 lawsuit alleged that the U.S. Department of Government Efficiency used ChatGPT to flag and cancel federal humanities grants, including projects on Jewish history and Israeli culture, over some objections from NEH officials, illustrating how automated decision-making could affect funding outcomes.

    Read more →
  • Rapid prototyping

    Rapid prototyping

    Rapid prototyping is a group of techniques used to quickly fabricate a scale model of a physical part or assembly using three-dimensional computer aided design (CAD) data. Construction of the part or assembly is usually done using 3D printing or "additive layer manufacturing" technology. The first methods for rapid prototyping became available in mid 1987 and were used to produce models and prototype parts. Today, they are used for a wide range of applications and are used to manufacture production-quality parts in relatively small numbers if desired without the typical unfavorable short-run economics. This economy has encouraged online service bureaus. Historical surveys of RP technology start with discussions of simulacra production techniques used by 19th-century sculptors. Some modern sculptors use the progeny technology to produce exhibitions and various objects. The ability to reproduce designs from a dataset has given rise to issues of rights, as it is now possible to interpolate volumetric data from 2D images. As with CNC subtractive methods, the computer-aided-design – computer-aided manufacturing CAD -CAM workflow in the traditional rapid prototyping process starts with the creation of geometric data, either as a 3D solid using a CAD workstation, or 2D slices using a scanning device. For rapid prototyping this data must represent a valid geometric model; namely, one whose boundary surfaces enclose a finite volume, contain no holes exposing the interior, and do not fold back on themselves. In other words, the object must have an "inside". The model is valid if for each point in 3D space the computer can determine uniquely whether that point lies inside, on, or outside the boundary surface of the model. CAD post-processors will approximate the application vendors' internal CAD geometric forms (e.g., B-splines) with a simplified mathematical form, which in turn is expressed in a specified data format which is a common feature in additive manufacturing: STL file format, a de facto standard for transferring solid geometric models to SFF machines. To obtain the necessary motion control trajectories to drive the actual SFF, rapid prototyping, 3D printing or additive manufacturing mechanism, the prepared geometric model is typically sliced into layers, and the slices are scanned into lines (producing a "2D drawing" used to generate trajectory as in CNC's toolpath), mimicking in reverse the layer-to-layer physical building process. == Application areas == Rapid prototyping is also commonly applied in software engineering to try out new business models and application architectures such as Aerospace, Automotive, Financial Services, Product development, and Healthcare. Aerospace design and industrial teams rely on prototyping in order to create new AM methodologies in the industry. Using SLA they can quickly make multiple versions of their projects in a few days and begin testing quicker. Rapid Prototyping allows designers/developers to provide an accurate idea of how the finished product will turn out before putting too much time and money into the prototype. 3D printing being used for Rapid Prototyping allows for Industrial 3D printing to take place. With this, you could have large-scale moulds to spare parts being pumped out quickly within a short period of time. == Types of Rapid Prototyping == Stereolithography (SLA) → a laser-cured photopolymer for materials such as thermoplastic-like photopolymers. Selective Laser Sintering (SLS) → a laser-sintered powder for materials such as Nylon or TPU. Direct Metal Laser Sintering (DMLS) → laser-sintered metal powder for materials like stainless steel, titanium, chrome, and aluminum. Fused Deposition Modeling (FDM) → fused extrusions of filaments like ABS, PC, and PPCU. Multi Jet Fusion (MJF) → it is an inkjet array selective fusing across bed of nylon powder for Black Nylon 12. PolyJet (PJET) → it is a uv-cured jetted photopolymer to work with acrylic-based and elastomeric photopolymers. Computer Numerical Controlled Machine (CNC) → it is used for manipulating engineering-grade thermoplastics and metals. Injection Molding (IM) → the injection is done using aluminum molds and it is used for thermoplastics, metals and liquid silicone rubber. Vacuum Casting→ is a manufacturing process used to create high-quality prototypes and small batches of parts. == History == In the 1970s, Joseph Henry Condon and others at Bell Labs developed the Unix Circuit Design System (UCDS), automating the laborious and error-prone task of manually converting drawings to fabricate circuit boards for the purposes of research and development. By the 1980s, U.S. policy makers and industrial managers were forced to take note that America's dominance in the field of machine tool manufacturing evaporated, in what was named the machine tool crisis. Numerous projects sought to counter these trends in the traditional CNC CAM area, which had begun in the US. Later when Rapid Prototyping Systems moved out of labs to be commercialized, it was recognized that developments were already international and U.S. rapid prototyping companies would not have the luxury of letting a lead slip away. The National Science Foundation was an umbrella for the National Aeronautics and Space Administration (NASA), the US Department of Energy, the US Department of Commerce NIST, the US Department of Defense, Defense Advanced Research Projects Agency (DARPA), and the Office of Naval Research coordinated studies to inform strategic planners in their deliberations. One such report was the 1997 Rapid Prototyping in Europe and Japan Panel Report in which Joseph J. Beaman founder of DTM Corporation [DTM RapidTool pictured] provides a historical perspective: The roots of rapid prototyping technology can be traced to practices in topography and photosculpture. Within TOPOGRAPHY Blanther (1892) suggested a layered method for making a mold for raised relief paper topographical maps .The process involved cutting the contour lines on a series of plates which were then stacked. Matsubara (1974) of Mitsubishi proposed a topographical process with a photo-hardening photopolymer resin to form thin layers stacked to make a casting mold. PHOTOSCULPTURE was a 19th-century technique to create exact three-dimensional replicas of objects. Most famously Francois Willeme (1860) placed 24 cameras in a circular array and simultaneously photographed an object. The silhouette of each photograph was then used to carve a replica. Morioka (1935, 1944) developed a hybrid photo sculpture and topographic process using structured light to photographically create contour lines of an object. The lines could then be developed into sheets and cut and stacked, or projected onto stock material for carving. The Munz (1956) Process reproduced a three-dimensional image of an object by selectively exposing, layer by layer, a photo emulsion on a lowering piston. After fixing, a solid transparent cylinder contains an image of the object. "The Origins of Rapid Prototyping - RP stems from the ever-growing CAD industry, more specifically, the solid modeling side of CAD. Before solid modeling was introduced in the late 1980's, three-dimensional models were created with wire frames and surfaces. But not until the development of true solid modeling could innovative processes such as RP be developed. Charles Hull, who helped found 3D Systems in 1986, developed the first RP process. This process, called stereolithography, builds objects by curing thin consecutive layers of certain ultraviolet light-sensitive liquid resins with a low-power laser. With the introduction of RP, CAD solid models could suddenly come to life". The technologies referred to as Solid Freeform Fabrication are what we recognize today as rapid prototyping, 3D printing or additive manufacturing: Swainson (1977), Schwerzel (1984) worked on polymerization of a photosensitive polymer at the intersection of two computer controlled laser beams. Ciraud (1972) considered magnetostatic or electrostatic deposition with electron beam, laser or plasma for sintered surface cladding. These were all proposed but it is unknown if working machines were built. Hideo Kodama of Nagoya Municipal Industrial Research Institute was the first to publish an account of a solid model fabricated using a photopolymer rapid prototyping system (1981). The first 3D rapid prototyping system relying on Fused Deposition Modeling (FDM) was made in April 1992 by Stratasys but the patent did not issue until June 9, 1992. Sanders Prototype, Inc introduced the first desktop inkjet 3D Printer (3DP) using an invention from August 4, 1992 (Helinski), Modelmaker 6Pro in late 1993 and then the larger industrial 3D printer, Modelmaker 2, in 1997. Z-Corp using the MIT 3DP powder binding for Direct Shell Casting (DSP) invented 1993 was introduced to the market in 1995. Even at that early date the technology was seen as having a place in manufacturing practice. A low resol

    Read more →
  • Pointer algorithm

    Pointer algorithm

    In computer science, a pointer algorithm (sometimes called a pointer machine, or a reference machine; see the article Pointer machine for a close but non-identical concept) is a type of algorithm that manages a linked data structure. This concept is used as a model for lower-bound proofs and specific restrictions on the linked data structure and on the algorithm's access to the structure vary. This model has been used extensively with problems related to the disjoint-set data structure. Thus, Tarjan and La Poutré used this model to prove lower bounds on the amortized complexity of a disjoint-set data structure (La Poutré also addressed the interval split-find problem). Blum used this model to prove a lower bound on the single operation worst-case time of disjoint set data structure. Blum and Rochow proved a worst-case lower bound for the interval union-find problem. == Example == In Tarjan's lower bound for the disjoint set union problem, the assumptions on the algorithm are: The algorithm maintains a linked structure of nodes. Each element of the problem is associated with a node. Each set is represented by a node. The nodes of each set constitute a distinct connected component in the structure (this property is called separability). The find operation is performed by following links from the element node to the set node. Under these assumptions, the lower bound of Ω ( m α ( m , n ) ) {\displaystyle \Omega (m\alpha (m,n))} on the cost of a sequence of m operations is proven.

    Read more →
  • Software intelligence

    Software intelligence

    Software intelligence is insight into the inner workings and structural condition of software assets produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in information technology environments. Similarly to business intelligence (BI), software intelligence is produced by a set of software tools and techniques for the mining of data and the software's inner-structure. Results are automatically produced and feed a knowledge base containing technical documentation and blueprints of the innerworking of applications, and make it available to all to be used by business and software stakeholders to make informed decisions, measure the efficiency of software development organizations, communicate about the software health, prevent software catastrophes. == History == Software intelligence has been used by Kirk Paul Lafler, an American engineer, entrepreneur, and consultant, and founder of Software Intelligence Corporation in 1979. At that time, it was mainly related to SAS activities, in which he has been an expert since 1979. In the early 1980s, Victor R. Basili participated in different papers detailing a methodology for collecting valid software engineering data relating to software engineering, evaluation of software development, and variations. In 2004, different software vendors in software analysis started using the terms as part of their product naming and marketing strategy. Then in 2010, Ahmed E. Hassan and Tao Xie defined software intelligence as a "practice offering software practitioners up-to-date and pertinent information to support their daily decision-making processes and Software Intelligence should support decision-making processes throughout the lifetime of a software system". They go on by defining software intelligence as a "strong impact on modern software practice" for the upcoming decades. == Capabilities == Because of the complexity and wide range of components and subjects implied in software, software intelligence is derived from different aspects of software: Software composition is the construction of software application components. Components result from software coding, as well as the integration of the source code from external components: Open source, 3rd party components, or frameworks. Other components can be integrated using application programming interface call to libraries or services. Software architecture refers to the structure and organization of elements of a system, relations, and properties among them. Software flaws designate problems that can cause security, stability, resiliency, and unexpected results. There is no standard definition of software flaws but the most accepted is from The MITRE Corporation where common flaws are cataloged as Common Weakness Enumeration. Software grades assess attributes of the software. Historically, the classification and terminology of attributes have been derived from the ISO 9126-3 and the subsequent ISO 25000:2005 quality model. Software economics refers to the resource evaluation of software in the past, present, or future to make decisions and to govern. == Components == The capabilities of software intelligence platforms include an increasing number of components: Code analyzer to serve as an information basis for other software intelligence components identifying objects created by the programming language, external objects from Open source, third parties objects, frameworks, API, or services Graphical visualization and blueprinting of the inner structure of the software product or application considered including dependencies, from data acquisition (automated and real-time data capture, end-user entries) up to data storage, the different layers within the software, and the coupling between all elements. Navigation capabilities within components and impact analysis features List of flaws, architectural and coding violations, against standardized best practices, cloud blocker preventing migration to a Cloud environment, and rogue data-call entailing the security and integrity of software Grades or scores of the structural and software quality aligned with industry-standard like OMG, CISQ or SEI assessing the reliability, security, efficiency, maintainability, and scalability to cloud or other systems. Metrics quantifying and estimating software economics including work effort, sizing, and technical debt Industry references and benchmarking allowing comparisons between outputs of analysis and industry standards == User aspect == Some considerations must be made in order to successfully integrate the usage of software Intelligence systems in a company. Ultimately the software intelligence system must be accepted and utilized by the users in order for it to add value to the organization. If the system does not add value to the users' mission, they simply don't use it as stated by M. Storey in 2003. At the code level and system representation, software intelligence systems must provide a different level of abstractions: an abstract view for designing, explaining and documenting and a detailed view for understanding and analyzing the software system. At the governance level, the user acceptance for software intelligence covers different areas related to the inner functioning of the system as well as the output of the system. It encompasses these requirements: Comprehensive: missing information may lead to a wrong or inappropriate decision, as well as it is a factor influencing the user acceptance of a system. Accurate: accuracy depends on how the data is collected to ensure fair and indisputable opinion and judgment. Precise: precision is usually judged by comparing several measurements from the same or different sources. Scalable: lack of scalability in the software industry is a critical factor leading to failure. Credible: outputs must be trusted and believed. Deploy-able and usable. == Applications == Software intelligence has many applications in all businesses relating to the software environment, whether it is software for professionals, individuals, or embedded software. Depending on the association and the usage of the components, applications will relate to: Change and modernization: uniform documentation and blueprinting on all inner components, external code integrated, or call to internal or external components of the software Resiliency and security: measuring against industry standards to diagnose structural flaws in an IT environment. Compliance validation regarding security, specific regulations or technical matters. Decisions making and governance: Providing analytics about the software itself or stakeholders involved in the development of the software, e.g. productivity measurement to inform business and IT leaders about progress towards business goals. Assessment and Benchmarking to help business and IT leaders to make informed, fact-based decision about software. == Marketplace == Software intelligence is a high-level discipline and has been gradually growing covering the applications listed above. There are several markets driving the need for it: Application Portfolio Analysis (APA) aiming at improving the enterprise performance. Software Assessment for producing the software KPI and improving quality and productivity. Software security and resiliency measures and validation. Software evolution or legacy modernization, for which blueprinting the software systems are needed nor tools improving and facilitating modifications.

    Read more →